linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc
@ 2022-07-20 19:32 Melissa Wen
  2022-07-20 19:32 ` [PATCH 1/5] drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family " Melissa Wen
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Melissa Wen @ 2022-07-20 19:32 UTC (permalink / raw)
  To: airlied, alexander.deucher, christian.koenig, daniel,
	harry.wentland, Rodrigo.Siqueira, sunpeng.li, Xinhui.Pan
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen,
	amd-gfx, dri-devel, linux-kernel

An initial report from Guenter[1] shows some soft-fp vs hard-fp error
from DCN31 clk mgr for powerpc. I was not able to reproduce it
cross-compiling with gcc-powerpc-linux-gnu and gcc-11.3, but thanks to
Maíra tips, I can reproduce the issue using make.cross, as follows:

- wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
- chmod +x ~/bin/make.cross
- mkdir build_dir
- COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 ~/make.cross O=build_dir ARCH=powerpc SHELL=/bin/bash

with a config file generate by allmodconfig

So, the first patch fix the issue reported by Guenter. The second is
just a cleanup in dcn31_resource file to remove useless DC_FP_ wrapper.
Finally, the last three patches I'm removing the -mno-gnu-attribute
option, that was just hiding FPU-associated code in clk mgr files of
dcn21/30/301, and moving them to DML folder. This series doesn't cover
recent drivers dcn32/314.

Thanks Guenter, Maíra, Siqueira and Alex for all inputs on this
debugging process. Let me know your thoughts on this approach.

Melissa

[1] https://lore.kernel.org/amd-gfx/20220618232737.2036722-1-linux@roeck-us.net/

Melissa Wen (5):
  drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family for powerpc
  drm/amd/display: remove useless FPU protection wrapper from
    dcn31_resource file
  drm/amd/display: move FPU code on dcn21 clk_mgr
  drm/amd/display: move FPU code from dcn30 clk mgr to DML folder
  drm/amd/display: move FPU code from dcn301 clk mgr to DML folder

 .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  18 --
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 234 +----------------
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h |   7 +
 .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c  |  63 +----
 .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c    |  86 +------
 .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h    |   3 +
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  11 +-
 .../amd/display/dc/dcn315/dcn315_resource.c   |   5 +-
 .../amd/display/dc/dcn316/dcn316_resource.c   |   5 +-
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  | 235 ++++++++++++++++++
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h  |   2 +
 .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.c  |  63 ++++-
 .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.h  |   1 +
 .../amd/display/dc/dml/dcn301/dcn301_fpu.c    |  74 ++++++
 .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c  |  11 +
 .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.h  |   3 +
 16 files changed, 423 insertions(+), 398 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/5] drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family for powerpc
  2022-07-20 19:32 [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Melissa Wen
@ 2022-07-20 19:32 ` Melissa Wen
  2022-07-21 18:54   ` Rodrigo Siqueira Jordao
  2022-07-20 19:32 ` [PATCH 2/5] drm/amd/display: remove useless FPU protection wrapper from dcn31_resource file Melissa Wen
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Melissa Wen @ 2022-07-20 19:32 UTC (permalink / raw)
  To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen,
	amd-gfx, dri-devel, linux-kernel

Move remaining FPU code to DML folder that caused compilation error for
powerpc. This patch depends on [1] to prevent the error below:

/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o uses soft float
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o uses soft float
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o uses soft float
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o

[1] https://lore.kernel.org/amd-gfx/20220716195144.342960-1-mwen@igalia.com/

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c |  5 +++--
 .../gpu/drm/amd/display/dc/dcn315/dcn315_resource.c   |  5 +++--
 .../gpu/drm/amd/display/dc/dcn316/dcn316_resource.c   |  5 +++--
 drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c  | 11 +++++++++++
 drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h  |  3 +++
 5 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
index 178d40c0d70a..929b712cbada 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
@@ -1663,11 +1663,12 @@ int dcn31_populate_dml_pipes_from_context(
 		pipes[pipe_cnt].pipe.src.immediate_flip = true;
 		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
 		pipes[pipe_cnt].pipe.src.gpuvm = true;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
 		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
 		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
 		pipes[pipe_cnt].dout.dsc_input_bpc = 0;
+		DC_FP_START();
+		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
+		DC_FP_END();
 
 		if (dc->debug.dml_hostvm_override == DML_HOSTVM_NO_OVERRIDE)
 			pipes[pipe_cnt].pipe.src.hostvm = dc->res_pool->hubbub->riommu_active;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
index df2abd8fe2eb..1a5f5977f962 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
@@ -1658,11 +1658,12 @@ static int dcn315_populate_dml_pipes_from_context(
 
 		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
 		pipes[pipe_cnt].pipe.src.gpuvm = true;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
 		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
 		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
 		pipes[pipe_cnt].dout.dsc_input_bpc = 0;
+		DC_FP_START();
+		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
+		DC_FP_END();
 
 		if (pipes[pipe_cnt].dout.dsc_enable) {
 			switch (timing->display_color_depth) {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c b/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c
index 070fe10a004e..53dea466348f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c
@@ -1661,11 +1661,12 @@ static int dcn316_populate_dml_pipes_from_context(
 
 		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
 		pipes[pipe_cnt].pipe.src.gpuvm = true;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
 		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
 		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
 		pipes[pipe_cnt].dout.dsc_input_bpc = 0;
+		DC_FP_START();
+		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
+		DC_FP_END();
 
 		if (pipes[pipe_cnt].dout.dsc_enable) {
 			switch (timing->display_color_depth) {
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
index facac3daeaca..e36cfa5985ea 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
@@ -435,8 +435,19 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_16_soc = {
 	.urgent_latency_adjustment_fabric_clock_reference_mhz = 0,
 };
 
+void dcn31_zero_pipe_dcc_fraction(display_e2e_pipe_params_st *pipes,
+				  int pipe_cnt)
+{
+	dc_assert_fp_enabled();
+
+	pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
+	pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
+}
+
 void dcn31_update_soc_for_wm_a(struct dc *dc, struct dc_state *context)
 {
+	dc_assert_fp_enabled();
+
 	if (dc->clk_mgr->bw_params->wm_table.entries[WM_A].valid) {
 		context->bw_ctx.dml.soc.dram_clock_change_latency_us = dc->clk_mgr->bw_params->wm_table.entries[WM_A].pstate_latency_us;
 		context->bw_ctx.dml.soc.sr_enter_plus_exit_time_us = dc->clk_mgr->bw_params->wm_table.entries[WM_A].sr_enter_plus_exit_time_us;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
index 0a10de80c1a4..4372f17b55d4 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
@@ -31,6 +31,9 @@
 #define DCN3_15_MIN_COMPBUF_SIZE_KB 128
 #define DCN3_16_DEFAULT_DET_SIZE 192
 
+void dcn31_zero_pipe_dcc_fraction(display_e2e_pipe_params_st *pipes,
+				  int pipe_cnt);
+
 void dcn31_update_soc_for_wm_a(struct dc *dc, struct dc_state *context);
 
 void dcn31_calculate_wm_and_dlg_fp(
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/5] drm/amd/display: remove useless FPU protection wrapper from dcn31_resource file
  2022-07-20 19:32 [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Melissa Wen
  2022-07-20 19:32 ` [PATCH 1/5] drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family " Melissa Wen
@ 2022-07-20 19:32 ` Melissa Wen
  2022-07-21 18:55   ` Rodrigo Siqueira Jordao
  2022-07-20 19:32 ` [PATCH 3/5] drm/amd/display: move FPU code on dcn21 clk_mgr Melissa Wen
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Melissa Wen @ 2022-07-20 19:32 UTC (permalink / raw)
  To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen,
	amd-gfx, dri-devel, linux-kernel

Many lines of code in dcn31_resource_construct are wrapped by DC_FP
macro to protect FPU operations; however, there is no FPU in this
region. Therefore, just remove the wrapper for clarity.

Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
index 929b712cbada..6d25fcf865bf 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
@@ -1863,8 +1863,6 @@ static bool dcn31_resource_construct(
 	struct dc_context *ctx = dc->ctx;
 	struct irq_service_init_data init_data;
 
-	DC_FP_START();
-
 	ctx->dc_bios->regs = &bios_regs;
 
 	pool->base.res_cap = &res_cap_dcn31;
@@ -2175,13 +2173,9 @@ static bool dcn31_resource_construct(
 
 	dc->dcn_ip->max_num_dpp = dcn3_1_ip.max_num_dpp;
 
-	DC_FP_END();
-
 	return true;
 
 create_fail:
-
-	DC_FP_END();
 	dcn31_resource_destruct(pool);
 
 	return false;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/5] drm/amd/display: move FPU code on dcn21 clk_mgr
  2022-07-20 19:32 [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Melissa Wen
  2022-07-20 19:32 ` [PATCH 1/5] drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family " Melissa Wen
  2022-07-20 19:32 ` [PATCH 2/5] drm/amd/display: remove useless FPU protection wrapper from dcn31_resource file Melissa Wen
@ 2022-07-20 19:32 ` Melissa Wen
  2022-07-21 18:57   ` Rodrigo Siqueira Jordao
  2022-07-20 19:32 ` [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder Melissa Wen
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Melissa Wen @ 2022-07-20 19:32 UTC (permalink / raw)
  To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen,
	amd-gfx, dri-devel, linux-kernel

The -mno-gnu-attribute option in dcn21 clk mgr makefile hides a soft vs
hard fp error for powerpc. After removing this flag, we can see some FPU
code remains there:

/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
hard float,
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn21/rn_clk_mgr.o uses
soft float

Therefore, remove the -mno-gnu-attribute flag for dcn21/powerpc and move
FPU-associated code to DML folder.

Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |   6 -
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 234 +----------------
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h |   7 +
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  | 235 ++++++++++++++++++
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h  |   2 +
 5 files changed, 248 insertions(+), 236 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
index a48453612d10..66dc02c426e9 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
@@ -107,12 +107,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN201)
 ###############################################################################
 CLK_MGR_DCN21 = rn_clk_mgr.o rn_clk_mgr_vbios_smu.o
 
-# prevent build errors regarding soft-float vs hard-float FP ABI tags
-# this code is currently unused on ppc64, as it applies to Renoir APUs only
-ifdef CONFIG_PPC64
-CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
-endif
-
 AMD_DAL_CLK_MGR_DCN21 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn21/,$(CLK_MGR_DCN21))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21)
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
index cf1b5f354ae9..0202dc682682 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
@@ -26,10 +26,9 @@
 #include "dccg.h"
 #include "clk_mgr_internal.h"
 
-
 #include "dcn20/dcn20_clk_mgr.h"
 #include "rn_clk_mgr.h"
-
+#include "dml/dcn20/dcn20_fpu.h"
 
 #include "dce100/dce_clk_mgr.h"
 #include "rn_clk_mgr_vbios_smu.h"
@@ -45,7 +44,6 @@
 
 /* Constants */
 
-#define LPDDR_MEM_RETRAIN_LATENCY 4.977 /* Number obtained from LPDDR4 Training Counter Requirement doc */
 #define SMU_VER_55_51_0 0x373300 /* SMU Version that is able to set DISPCLK below 100MHz */
 
 /* Macros */
@@ -613,228 +611,6 @@ static struct clk_bw_params rn_bw_params = {
 
 };
 
-static struct wm_table ddr4_wm_table_gs = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 7.09,
-			.sr_enter_plus_exit_time_us = 8.14,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 10.12,
-			.sr_enter_plus_exit_time_us = 11.48,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 10.12,
-			.sr_enter_plus_exit_time_us = 11.48,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 10.12,
-			.sr_enter_plus_exit_time_us = 11.48,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table lpddr4_wm_table_gs = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 5.32,
-			.sr_enter_plus_exit_time_us = 6.38,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.82,
-			.sr_enter_plus_exit_time_us = 11.196,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.89,
-			.sr_enter_plus_exit_time_us = 11.24,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.748,
-			.sr_enter_plus_exit_time_us = 11.102,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table lpddr4_wm_table_with_disabled_ppt = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 8.32,
-			.sr_enter_plus_exit_time_us = 9.38,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.82,
-			.sr_enter_plus_exit_time_us = 11.196,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.89,
-			.sr_enter_plus_exit_time_us = 11.24,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.748,
-			.sr_enter_plus_exit_time_us = 11.102,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table ddr4_wm_table_rn = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 11.90,
-			.sr_enter_plus_exit_time_us = 12.80,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.18,
-			.sr_enter_plus_exit_time_us = 14.30,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.18,
-			.sr_enter_plus_exit_time_us = 14.30,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.18,
-			.sr_enter_plus_exit_time_us = 14.30,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table ddr4_1R_wm_table_rn = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.90,
-			.sr_enter_plus_exit_time_us = 14.80,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.90,
-			.sr_enter_plus_exit_time_us = 14.80,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.90,
-			.sr_enter_plus_exit_time_us = 14.80,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.90,
-			.sr_enter_plus_exit_time_us = 14.80,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table lpddr4_wm_table_rn = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 7.32,
-			.sr_enter_plus_exit_time_us = 8.38,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.82,
-			.sr_enter_plus_exit_time_us = 11.196,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.89,
-			.sr_enter_plus_exit_time_us = 11.24,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.748,
-			.sr_enter_plus_exit_time_us = 11.102,
-			.valid = true,
-		},
-	}
-};
-
 static unsigned int find_socclk_for_voltage(struct dpm_clocks *clock_table, unsigned int voltage)
 {
 	int i;
@@ -914,12 +690,10 @@ static void rn_clk_mgr_helper_populate_bw_params(struct clk_bw_params *bw_params
 		/*
 		 * WM set D will be re-purposed for memory retraining
 		 */
-		bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
-		bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
-		bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING;
-		bw_params->wm_table.entries[WM_D].valid = true;
+		DC_FP_START();
+		dcn21_clk_mgr_set_bw_params_wm_table(bw_params);
+		DC_FP_END();
 	}
-
 }
 
 void rn_clk_mgr_construct(
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h
index e4322fa5475b..2e088c5171b2 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h
@@ -29,6 +29,13 @@
 #include "clk_mgr.h"
 #include "dm_pp_smu.h"
 
+extern struct wm_table ddr4_wm_table_gs;
+extern struct wm_table lpddr4_wm_table_gs;
+extern struct wm_table lpddr4_wm_table_with_disabled_ppt;
+extern struct wm_table ddr4_wm_table_rn;
+extern struct wm_table ddr4_1R_wm_table_rn;
+extern struct wm_table lpddr4_wm_table_rn;
+
 struct rn_clk_registers {
 	uint32_t CLK1_CLK0_CURRENT_CNT; /* DPREFCLK */
 };
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
index dc60b835e938..eeeae52fe6fc 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
@@ -42,6 +42,9 @@
 #define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
 #endif
 
+/* Constant */
+#define LPDDR_MEM_RETRAIN_LATENCY 4.977 /* Number obtained from LPDDR4 Training Counter Requirement doc */
+
 /**
  * DOC: DCN2x FPU manipulation Overview
  *
@@ -650,6 +653,228 @@ struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc = {
 	.num_states = 8
 };
 
+struct wm_table ddr4_wm_table_gs = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 7.09,
+			.sr_enter_plus_exit_time_us = 8.14,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 10.12,
+			.sr_enter_plus_exit_time_us = 11.48,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 10.12,
+			.sr_enter_plus_exit_time_us = 11.48,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 10.12,
+			.sr_enter_plus_exit_time_us = 11.48,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table lpddr4_wm_table_gs = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 5.32,
+			.sr_enter_plus_exit_time_us = 6.38,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.82,
+			.sr_enter_plus_exit_time_us = 11.196,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.89,
+			.sr_enter_plus_exit_time_us = 11.24,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.748,
+			.sr_enter_plus_exit_time_us = 11.102,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table lpddr4_wm_table_with_disabled_ppt = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 8.32,
+			.sr_enter_plus_exit_time_us = 9.38,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.82,
+			.sr_enter_plus_exit_time_us = 11.196,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.89,
+			.sr_enter_plus_exit_time_us = 11.24,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.748,
+			.sr_enter_plus_exit_time_us = 11.102,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table ddr4_wm_table_rn = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 11.90,
+			.sr_enter_plus_exit_time_us = 12.80,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.18,
+			.sr_enter_plus_exit_time_us = 14.30,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.18,
+			.sr_enter_plus_exit_time_us = 14.30,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.18,
+			.sr_enter_plus_exit_time_us = 14.30,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table ddr4_1R_wm_table_rn = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.90,
+			.sr_enter_plus_exit_time_us = 14.80,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.90,
+			.sr_enter_plus_exit_time_us = 14.80,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.90,
+			.sr_enter_plus_exit_time_us = 14.80,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.90,
+			.sr_enter_plus_exit_time_us = 14.80,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table lpddr4_wm_table_rn = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 7.32,
+			.sr_enter_plus_exit_time_us = 8.38,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.82,
+			.sr_enter_plus_exit_time_us = 11.196,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.89,
+			.sr_enter_plus_exit_time_us = 11.24,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.748,
+			.sr_enter_plus_exit_time_us = 11.102,
+			.valid = true,
+		},
+	}
+};
+
 void dcn20_populate_dml_writeback_from_context(struct dc *dc,
 					       struct resource_context *res_ctx,
 					       display_e2e_pipe_params_st *pipes)
@@ -2068,3 +2293,13 @@ void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params
 
 	dml_init_instance(&dc->dml, &dcn2_1_soc, &dcn2_1_ip, DML_PROJECT_DCN21);
 }
+
+void dcn21_clk_mgr_set_bw_params_wm_table(struct clk_bw_params *bw_params)
+{
+	dc_assert_fp_enabled();
+
+	bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
+	bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
+	bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING;
+	bw_params->wm_table.entries[WM_D].valid = true;
+}
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h
index aa892193e485..a6e1ad0f38e9 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h
@@ -82,4 +82,6 @@ bool dcn21_validate_bandwidth_fp(struct dc *dc,
 				 bool fast_validate);
 void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params);
 
+void dcn21_clk_mgr_set_bw_params_wm_table(struct clk_bw_params *bw_params);
+
 #endif /* __DCN20_FPU_H__ */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder
  2022-07-20 19:32 [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Melissa Wen
                   ` (2 preceding siblings ...)
  2022-07-20 19:32 ` [PATCH 3/5] drm/amd/display: move FPU code on dcn21 clk_mgr Melissa Wen
@ 2022-07-20 19:32 ` Melissa Wen
  2022-07-21 18:58   ` Rodrigo Siqueira Jordao
  2022-07-20 19:32 ` [PATCH 5/5] drm/amd/display: move FPU code from dcn301 " Melissa Wen
  2022-07-21 19:07 ` [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Rodrigo Siqueira Jordao
  5 siblings, 1 reply; 13+ messages in thread
From: Melissa Wen @ 2022-07-20 19:32 UTC (permalink / raw)
  To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen,
	amd-gfx, dri-devel, linux-kernel

The -mno-gnu-attribute option in clk mgr makefile for dcn30 hides a soft
vs hard fp error for powerpc. After removing this flag, we can see some
FPU code remains there:

gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
hard float,
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.o
uses soft float

Therefore, remove the -mno-gnu-attribute flag for dcn30/powerpc and move
FPU-associated code to DML folder.

Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  6 --
 .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c  | 63 ++-----------------
 .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.c  | 63 ++++++++++++++++++-
 .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.h  |  1 +
 4 files changed, 68 insertions(+), 65 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
index 66dc02c426e9..15b660a951a5 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
@@ -115,12 +115,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21)
 ###############################################################################
 CLK_MGR_DCN30 = dcn30_clk_mgr.o dcn30_clk_mgr_smu_msg.o
 
-# prevent build errors regarding soft-float vs hard-float FP ABI tags
-# this code is currently unused on ppc64, as it applies to VanGogh APUs only
-ifdef CONFIG_PPC64
-CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn30/dcn30_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
-endif
-
 AMD_DAL_CLK_MGR_DCN30 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn30/,$(CLK_MGR_DCN30))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30)
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c
index 914708cefc79..3ce0ee0d012f 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c
@@ -29,6 +29,7 @@
 #include "dcn20/dcn20_clk_mgr.h"
 #include "dce100/dce_clk_mgr.h"
 #include "dcn30/dcn30_clk_mgr.h"
+#include "dml/dcn30/dcn30_fpu.h"
 #include "reg_helper.h"
 #include "core_types.h"
 #include "dm_helpers.h"
@@ -97,65 +98,11 @@ static void dcn3_init_single_clock(struct clk_mgr_internal *clk_mgr, uint32_t cl
 	}
 }
 
-static noinline void dcn3_build_wm_range_table(struct clk_mgr_internal *clk_mgr)
+static void dcn3_build_wm_range_table(struct clk_mgr_internal *clk_mgr)
 {
-	/* defaults */
-	double pstate_latency_us = clk_mgr->base.ctx->dc->dml.soc.dram_clock_change_latency_us;
-	double sr_exit_time_us = clk_mgr->base.ctx->dc->dml.soc.sr_exit_time_us;
-	double sr_enter_plus_exit_time_us = clk_mgr->base.ctx->dc->dml.soc.sr_enter_plus_exit_time_us;
-	uint16_t min_uclk_mhz = clk_mgr->base.bw_params->clk_table.entries[0].memclk_mhz;
-
-	/* Set A - Normal - default values*/
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].valid = true;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.pstate_latency_us = pstate_latency_us;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.sr_exit_time_us = sr_exit_time_us;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_dcfclk = 0;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_dcfclk = 0xFFFF;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_uclk = min_uclk_mhz;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_uclk = 0xFFFF;
-
-	/* Set B - Performance - higher minimum clocks */
-//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].valid = true;
-//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us = pstate_latency_us;
-//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.sr_exit_time_us = sr_exit_time_us;
-//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
-//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE;
-//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_dcfclk = TUNED VALUE;
-//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_dcfclk = 0xFFFF;
-//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_uclk = TUNED VALUE;
-//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_uclk = 0xFFFF;
-
-	/* Set C - Dummy P-State - P-State latency set to "dummy p-state" value */
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].valid = true;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us = 0;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.sr_exit_time_us = sr_exit_time_us;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.wm_type = WATERMARKS_DUMMY_PSTATE;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_dcfclk = 0;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_dcfclk = 0xFFFF;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_uclk = min_uclk_mhz;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_uclk = 0xFFFF;
-	clk_mgr->base.bw_params->dummy_pstate_table[0].dram_speed_mts = 1600;
-	clk_mgr->base.bw_params->dummy_pstate_table[0].dummy_pstate_latency_us = 38;
-	clk_mgr->base.bw_params->dummy_pstate_table[1].dram_speed_mts = 8000;
-	clk_mgr->base.bw_params->dummy_pstate_table[1].dummy_pstate_latency_us = 9;
-	clk_mgr->base.bw_params->dummy_pstate_table[2].dram_speed_mts = 10000;
-	clk_mgr->base.bw_params->dummy_pstate_table[2].dummy_pstate_latency_us = 8;
-	clk_mgr->base.bw_params->dummy_pstate_table[3].dram_speed_mts = 16000;
-	clk_mgr->base.bw_params->dummy_pstate_table[3].dummy_pstate_latency_us = 5;
-
-	/* Set D - MALL - SR enter and exit times adjusted for MALL */
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].valid = true;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us = pstate_latency_us;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.sr_exit_time_us = 2;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us = 4;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.wm_type = WATERMARKS_MALL;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_dcfclk = 0;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_dcfclk = 0xFFFF;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_uclk = min_uclk_mhz;
-	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_uclk = 0xFFFF;
+	DC_FP_START();
+	dcn3_fpu_build_wm_range_table(&clk_mgr->base);
+	DC_FP_END();
 }
 
 void dcn3_init_clocks(struct clk_mgr *clk_mgr_base)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c
index a8db1306750e..c00f759fdded 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c
@@ -29,7 +29,7 @@
 #include "dcn20/dcn20_resource.h"
 #include "dcn30/dcn30_resource.h"
 
-
+#include "clk_mgr/dcn30/dcn30_smu11_driver_if.h"
 #include "display_mode_vba_30.h"
 #include "dcn30_fpu.h"
 
@@ -616,4 +616,65 @@ void dcn30_fpu_update_bw_bounding_box(struct dc *dc,
 
 }
 
+void dcn3_fpu_build_wm_range_table(struct clk_mgr *base)
+{
+	/* defaults */
+	double pstate_latency_us = base->ctx->dc->dml.soc.dram_clock_change_latency_us;
+	double sr_exit_time_us = base->ctx->dc->dml.soc.sr_exit_time_us;
+	double sr_enter_plus_exit_time_us = base->ctx->dc->dml.soc.sr_enter_plus_exit_time_us;
+	uint16_t min_uclk_mhz = base->bw_params->clk_table.entries[0].memclk_mhz;
 
+	dc_assert_fp_enabled();
+
+	/* Set A - Normal - default values*/
+	base->bw_params->wm_table.nv_entries[WM_A].valid = true;
+	base->bw_params->wm_table.nv_entries[WM_A].dml_input.pstate_latency_us = pstate_latency_us;
+	base->bw_params->wm_table.nv_entries[WM_A].dml_input.sr_exit_time_us = sr_exit_time_us;
+	base->bw_params->wm_table.nv_entries[WM_A].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
+	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE;
+	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_dcfclk = 0;
+	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_dcfclk = 0xFFFF;
+	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_uclk = min_uclk_mhz;
+	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_uclk = 0xFFFF;
+
+	/* Set B - Performance - higher minimum clocks */
+//	base->bw_params->wm_table.nv_entries[WM_B].valid = true;
+//	base->bw_params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us = pstate_latency_us;
+//	base->bw_params->wm_table.nv_entries[WM_B].dml_input.sr_exit_time_us = sr_exit_time_us;
+//	base->bw_params->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
+//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE;
+//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_dcfclk = TUNED VALUE;
+//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_dcfclk = 0xFFFF;
+//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_uclk = TUNED VALUE;
+//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_uclk = 0xFFFF;
+
+	/* Set C - Dummy P-State - P-State latency set to "dummy p-state" value */
+	base->bw_params->wm_table.nv_entries[WM_C].valid = true;
+	base->bw_params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us = 0;
+	base->bw_params->wm_table.nv_entries[WM_C].dml_input.sr_exit_time_us = sr_exit_time_us;
+	base->bw_params->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
+	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.wm_type = WATERMARKS_DUMMY_PSTATE;
+	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_dcfclk = 0;
+	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_dcfclk = 0xFFFF;
+	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_uclk = min_uclk_mhz;
+	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_uclk = 0xFFFF;
+	base->bw_params->dummy_pstate_table[0].dram_speed_mts = 1600;
+	base->bw_params->dummy_pstate_table[0].dummy_pstate_latency_us = 38;
+	base->bw_params->dummy_pstate_table[1].dram_speed_mts = 8000;
+	base->bw_params->dummy_pstate_table[1].dummy_pstate_latency_us = 9;
+	base->bw_params->dummy_pstate_table[2].dram_speed_mts = 10000;
+	base->bw_params->dummy_pstate_table[2].dummy_pstate_latency_us = 8;
+	base->bw_params->dummy_pstate_table[3].dram_speed_mts = 16000;
+	base->bw_params->dummy_pstate_table[3].dummy_pstate_latency_us = 5;
+
+	/* Set D - MALL - SR enter and exit times adjusted for MALL */
+	base->bw_params->wm_table.nv_entries[WM_D].valid = true;
+	base->bw_params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us = pstate_latency_us;
+	base->bw_params->wm_table.nv_entries[WM_D].dml_input.sr_exit_time_us = 2;
+	base->bw_params->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us = 4;
+	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.wm_type = WATERMARKS_MALL;
+	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_dcfclk = 0;
+	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_dcfclk = 0xFFFF;
+	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_uclk = min_uclk_mhz;
+	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_uclk = 0xFFFF;
+}
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h
index dedfe7b5f173..c2024052a497 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h
@@ -63,5 +63,6 @@ void dcn30_fpu_update_bw_bounding_box(struct dc *dc,
 	unsigned int *dcfclk_mhz,
 	unsigned int *dram_speed_mts);
 
+void dcn3_fpu_build_wm_range_table(struct clk_mgr *base);
 
 #endif /* __DCN30_FPU_H__*/
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/5] drm/amd/display: move FPU code from dcn301 clk mgr to DML folder
  2022-07-20 19:32 [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Melissa Wen
                   ` (3 preceding siblings ...)
  2022-07-20 19:32 ` [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder Melissa Wen
@ 2022-07-20 19:32 ` Melissa Wen
  2022-07-21 17:26   ` Maíra Canal
  2022-07-21 18:59   ` Rodrigo Siqueira Jordao
  2022-07-21 19:07 ` [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Rodrigo Siqueira Jordao
  5 siblings, 2 replies; 13+ messages in thread
From: Melissa Wen @ 2022-07-20 19:32 UTC (permalink / raw)
  To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen,
	amd-gfx, dri-devel, linux-kernel

The -mno-gnu-attribute option in dcn301 clk mgr makefile hides a soft vs
hard fp error for powerpc. After removing this flag, we can see some FPU
code remains there:

gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
hard float,
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn301/vg_clk_mgr.o
uses soft float

Therefore, remove the -mno-gnu-attribute flag for dcn301/powerpc and
move FPU-associated code to DML folder.

Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  6 --
 .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c    | 86 ++-----------------
 .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h    |  3 +
 .../amd/display/dc/dml/dcn301/dcn301_fpu.c    | 74 ++++++++++++++++
 4 files changed, 84 insertions(+), 85 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
index 15b660a951a5..271d8e573181 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
@@ -123,12 +123,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30)
 ###############################################################################
 CLK_MGR_DCN301 = vg_clk_mgr.o dcn301_smu.o
 
-# prevent build errors regarding soft-float vs hard-float FP ABI tags
-# this code is currently unused on ppc64, as it applies to VanGogh APUs only
-ifdef CONFIG_PPC64
-CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn301/vg_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
-endif
-
 AMD_DAL_CLK_MGR_DCN301 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn301/,$(CLK_MGR_DCN301))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN301)
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
index f310b0d25a07..65f224af03c0 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
@@ -32,6 +32,10 @@
 // For dcn20_update_clocks_update_dpp_dto
 #include "dcn20/dcn20_clk_mgr.h"
 
+// For DML FPU code
+#include "dml/dcn20/dcn20_fpu.h"
+#include "dml/dcn301/dcn301_fpu.h"
+
 #include "vg_clk_mgr.h"
 #include "dcn301_smu.h"
 #include "reg_helper.h"
@@ -526,81 +530,6 @@ static struct clk_bw_params vg_bw_params = {
 
 };
 
-static struct wm_table ddr4_wm_table = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 6.09,
-			.sr_enter_plus_exit_time_us = 7.14,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 10.12,
-			.sr_enter_plus_exit_time_us = 11.48,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 10.12,
-			.sr_enter_plus_exit_time_us = 11.48,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 10.12,
-			.sr_enter_plus_exit_time_us = 11.48,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table lpddr5_wm_table = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 13.5,
-			.sr_enter_plus_exit_time_us = 16.5,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 13.5,
-			.sr_enter_plus_exit_time_us = 16.5,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 13.5,
-			.sr_enter_plus_exit_time_us = 16.5,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 13.5,
-			.sr_enter_plus_exit_time_us = 16.5,
-			.valid = true,
-		},
-	}
-};
-
-
 static unsigned int find_dcfclk_for_voltage(const struct vg_dpm_clocks *clock_table,
 		unsigned int voltage)
 {
@@ -670,10 +599,9 @@ static void vg_clk_mgr_helper_populate_bw_params(
 		/*
 		 * WM set D will be re-purposed for memory retraining
 		 */
-		bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
-		bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
-		bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING;
-		bw_params->wm_table.entries[WM_D].valid = true;
+		DC_FP_START();
+		dcn21_clk_mgr_set_bw_params_wm_table(bw_params);
+		DC_FP_END();
 	}
 
 }
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
index 7255477307f1..75884f572989 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
@@ -29,6 +29,9 @@
 
 struct watermarks;
 
+extern struct wm_table ddr4_wm_table;
+extern struct wm_table lpddr5_wm_table;
+
 struct smu_watermark_set {
 	struct watermarks *wm_set;
 	union large_integer mc_address;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
index e4863f0bf0f6..7ef66e511ec8 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
@@ -214,6 +214,80 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_01_soc = {
 	.urgent_latency_adjustment_fabric_clock_reference_mhz = 0,
 };
 
+struct wm_table ddr4_wm_table = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 6.09,
+			.sr_enter_plus_exit_time_us = 7.14,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 10.12,
+			.sr_enter_plus_exit_time_us = 11.48,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 10.12,
+			.sr_enter_plus_exit_time_us = 11.48,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 10.12,
+			.sr_enter_plus_exit_time_us = 11.48,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table lpddr5_wm_table = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 13.5,
+			.sr_enter_plus_exit_time_us = 16.5,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 13.5,
+			.sr_enter_plus_exit_time_us = 16.5,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 13.5,
+			.sr_enter_plus_exit_time_us = 16.5,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 13.5,
+			.sr_enter_plus_exit_time_us = 16.5,
+			.valid = true,
+		},
+	}
+};
+
 static void calculate_wm_set_for_vlevel(int vlevel,
 		struct wm_range_table_entry *table_entry,
 		struct dcn_watermarks *wm_set,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/5] drm/amd/display: move FPU code from dcn301 clk mgr to DML folder
  2022-07-20 19:32 ` [PATCH 5/5] drm/amd/display: move FPU code from dcn301 " Melissa Wen
@ 2022-07-21 17:26   ` Maíra Canal
  2022-07-21 18:59   ` Rodrigo Siqueira Jordao
  1 sibling, 0 replies; 13+ messages in thread
From: Maíra Canal @ 2022-07-21 17:26 UTC (permalink / raw)
  To: Melissa Wen, harry.wentland, sunpeng.li, Rodrigo.Siqueira,
	alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, kernel-dev, amd-gfx, dri-devel, linux-kernel

Hi Melissa,

On 7/20/22 16:32, Melissa Wen wrote:
> The -mno-gnu-attribute option in dcn301 clk mgr makefile hides a soft vs
> hard fp error for powerpc. After removing this flag, we can see some FPU
> code remains there:
> 
> gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
> hard float,
> drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn301/vg_clk_mgr.o
> uses soft float
> 
> Therefore, remove the -mno-gnu-attribute flag for dcn301/powerpc and
> move FPU-associated code to DML folder.
> 
> Signed-off-by: Melissa Wen <mwen@igalia.com>
> ---
>  .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  6 --
>  .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c    | 86 ++-----------------
>  .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h    |  3 +
>  .../amd/display/dc/dml/dcn301/dcn301_fpu.c    | 74 ++++++++++++++++
>  4 files changed, 84 insertions(+), 85 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> index 15b660a951a5..271d8e573181 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> @@ -123,12 +123,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30)
>  ###############################################################################
>  CLK_MGR_DCN301 = vg_clk_mgr.o dcn301_smu.o
>  
> -# prevent build errors regarding soft-float vs hard-float FP ABI tags
> -# this code is currently unused on ppc64, as it applies to VanGogh APUs only
> -ifdef CONFIG_PPC64
> -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn301/vg_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
> -endif
> -
>  AMD_DAL_CLK_MGR_DCN301 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn301/,$(CLK_MGR_DCN301))
>  
>  AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN301)
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> index f310b0d25a07..65f224af03c0 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> @@ -32,6 +32,10 @@
>  // For dcn20_update_clocks_update_dpp_dto
>  #include "dcn20/dcn20_clk_mgr.h"
>  
> +// For DML FPU code
> +#include "dml/dcn20/dcn20_fpu.h"
> +#include "dml/dcn301/dcn301_fpu.h"
> +

I guess the "dml/dcn301/dcn301_fpu.h" header is not needed, as you only
use dcn21_clk_mgr_set_bw_params_wm_table and the structs are on the
source file.

Besides that, to the whole series:
Reviewed-by: Maíra Canal <mairacanal@riseup.net>

Best Regards,
- Maíra Canal

>  #include "vg_clk_mgr.h"
>  #include "dcn301_smu.h"
>  #include "reg_helper.h"
> @@ -526,81 +530,6 @@ static struct clk_bw_params vg_bw_params = {
>  
>  };
>  
> -static struct wm_table ddr4_wm_table = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 6.09,
> -			.sr_enter_plus_exit_time_us = 7.14,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -static struct wm_table lpddr5_wm_table = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -
>  static unsigned int find_dcfclk_for_voltage(const struct vg_dpm_clocks *clock_table,
>  		unsigned int voltage)
>  {
> @@ -670,10 +599,9 @@ static void vg_clk_mgr_helper_populate_bw_params(
>  		/*
>  		 * WM set D will be re-purposed for memory retraining
>  		 */
> -		bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
> -		bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
> -		bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING;
> -		bw_params->wm_table.entries[WM_D].valid = true;
> +		DC_FP_START();
> +		dcn21_clk_mgr_set_bw_params_wm_table(bw_params);
> +		DC_FP_END();
>  	}
>  
>  }
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
> index 7255477307f1..75884f572989 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
> @@ -29,6 +29,9 @@
>  
>  struct watermarks;
>  
> +extern struct wm_table ddr4_wm_table;
> +extern struct wm_table lpddr5_wm_table;
> +
>  struct smu_watermark_set {
>  	struct watermarks *wm_set;
>  	union large_integer mc_address;
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
> index e4863f0bf0f6..7ef66e511ec8 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
> @@ -214,6 +214,80 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_01_soc = {
>  	.urgent_latency_adjustment_fabric_clock_reference_mhz = 0,
>  };
>  
> +struct wm_table ddr4_wm_table = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 6.09,
> +			.sr_enter_plus_exit_time_us = 7.14,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +	}
> +};
> +
> +struct wm_table lpddr5_wm_table = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +	}
> +};
> +
>  static void calculate_wm_set_for_vlevel(int vlevel,
>  		struct wm_range_table_entry *table_entry,
>  		struct dcn_watermarks *wm_set,

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family for powerpc
  2022-07-20 19:32 ` [PATCH 1/5] drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family " Melissa Wen
@ 2022-07-21 18:54   ` Rodrigo Siqueira Jordao
  0 siblings, 0 replies; 13+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-07-21 18:54 UTC (permalink / raw)
  To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel,
	linux-kernel



On 2022-07-20 15:32, Melissa Wen wrote:
> Move remaining FPU code to DML folder that caused compilation error for
> powerpc. This patch depends on [1] to prevent the error below:
> 
> /gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o uses soft float
> /gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
> /gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o uses soft float
> /gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
> /gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o uses soft float
> /gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o
> 
> [1] https://lore.kernel.org/amd-gfx/20220716195144.342960-1-mwen@igalia.com/
> 
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Melissa Wen <mwen@igalia.com>
> ---
>   drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c |  5 +++--
>   .../gpu/drm/amd/display/dc/dcn315/dcn315_resource.c   |  5 +++--
>   .../gpu/drm/amd/display/dc/dcn316/dcn316_resource.c   |  5 +++--
>   drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c  | 11 +++++++++++
>   drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h  |  3 +++
>   5 files changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> index 178d40c0d70a..929b712cbada 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> @@ -1663,11 +1663,12 @@ int dcn31_populate_dml_pipes_from_context(
>   		pipes[pipe_cnt].pipe.src.immediate_flip = true;
>   		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
>   		pipes[pipe_cnt].pipe.src.gpuvm = true;
> -		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
> -		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
>   		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
>   		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
>   		pipes[pipe_cnt].dout.dsc_input_bpc = 0;
> +		DC_FP_START();
> +		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
> +		DC_FP_END();
>   
>   		if (dc->debug.dml_hostvm_override == DML_HOSTVM_NO_OVERRIDE)
>   			pipes[pipe_cnt].pipe.src.hostvm = dc->res_pool->hubbub->riommu_active;
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
> index df2abd8fe2eb..1a5f5977f962 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
> @@ -1658,11 +1658,12 @@ static int dcn315_populate_dml_pipes_from_context(
>   
>   		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
>   		pipes[pipe_cnt].pipe.src.gpuvm = true;
> -		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
> -		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
>   		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
>   		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
>   		pipes[pipe_cnt].dout.dsc_input_bpc = 0;
> +		DC_FP_START();
> +		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
> +		DC_FP_END();
>   
>   		if (pipes[pipe_cnt].dout.dsc_enable) {
>   			switch (timing->display_color_depth) {
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c b/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c
> index 070fe10a004e..53dea466348f 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c
> @@ -1661,11 +1661,12 @@ static int dcn316_populate_dml_pipes_from_context(
>   
>   		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
>   		pipes[pipe_cnt].pipe.src.gpuvm = true;
> -		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
> -		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
>   		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
>   		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
>   		pipes[pipe_cnt].dout.dsc_input_bpc = 0;
> +		DC_FP_START();
> +		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
> +		DC_FP_END();
>   
>   		if (pipes[pipe_cnt].dout.dsc_enable) {
>   			switch (timing->display_color_depth) {
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
> index facac3daeaca..e36cfa5985ea 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
> @@ -435,8 +435,19 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_16_soc = {
>   	.urgent_latency_adjustment_fabric_clock_reference_mhz = 0,
>   };
>   
> +void dcn31_zero_pipe_dcc_fraction(display_e2e_pipe_params_st *pipes,
> +				  int pipe_cnt)
> +{
> +	dc_assert_fp_enabled();
> +
> +	pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
> +	pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
> +}
> +
>   void dcn31_update_soc_for_wm_a(struct dc *dc, struct dc_state *context)
>   {
> +	dc_assert_fp_enabled();
> +
>   	if (dc->clk_mgr->bw_params->wm_table.entries[WM_A].valid) {
>   		context->bw_ctx.dml.soc.dram_clock_change_latency_us = dc->clk_mgr->bw_params->wm_table.entries[WM_A].pstate_latency_us;
>   		context->bw_ctx.dml.soc.sr_enter_plus_exit_time_us = dc->clk_mgr->bw_params->wm_table.entries[WM_A].sr_enter_plus_exit_time_us;
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
> index 0a10de80c1a4..4372f17b55d4 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
> @@ -31,6 +31,9 @@
>   #define DCN3_15_MIN_COMPBUF_SIZE_KB 128
>   #define DCN3_16_DEFAULT_DET_SIZE 192
>   
> +void dcn31_zero_pipe_dcc_fraction(display_e2e_pipe_params_st *pipes,
> +				  int pipe_cnt);
> +
>   void dcn31_update_soc_for_wm_a(struct dc *dc, struct dc_state *context);
>   
>   void dcn31_calculate_wm_and_dlg_fp(

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/5] drm/amd/display: remove useless FPU protection wrapper from dcn31_resource file
  2022-07-20 19:32 ` [PATCH 2/5] drm/amd/display: remove useless FPU protection wrapper from dcn31_resource file Melissa Wen
@ 2022-07-21 18:55   ` Rodrigo Siqueira Jordao
  0 siblings, 0 replies; 13+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-07-21 18:55 UTC (permalink / raw)
  To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel,
	linux-kernel



On 2022-07-20 15:32, Melissa Wen wrote:
> Many lines of code in dcn31_resource_construct are wrapped by DC_FP
> macro to protect FPU operations; however, there is no FPU in this
> region. Therefore, just remove the wrapper for clarity.
> 
> Signed-off-by: Melissa Wen <mwen@igalia.com>
> ---
>   drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c | 6 ------
>   1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> index 929b712cbada..6d25fcf865bf 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> @@ -1863,8 +1863,6 @@ static bool dcn31_resource_construct(
>   	struct dc_context *ctx = dc->ctx;
>   	struct irq_service_init_data init_data;
>   
> -	DC_FP_START();
> -
>   	ctx->dc_bios->regs = &bios_regs;
>   
>   	pool->base.res_cap = &res_cap_dcn31;
> @@ -2175,13 +2173,9 @@ static bool dcn31_resource_construct(
>   
>   	dc->dcn_ip->max_num_dpp = dcn3_1_ip.max_num_dpp;
>   
> -	DC_FP_END();
> -
>   	return true;
>   
>   create_fail:
> -
> -	DC_FP_END();
>   	dcn31_resource_destruct(pool);
>   
>   	return false;

Very nice catch!

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/5] drm/amd/display: move FPU code on dcn21 clk_mgr
  2022-07-20 19:32 ` [PATCH 3/5] drm/amd/display: move FPU code on dcn21 clk_mgr Melissa Wen
@ 2022-07-21 18:57   ` Rodrigo Siqueira Jordao
  0 siblings, 0 replies; 13+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-07-21 18:57 UTC (permalink / raw)
  To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel,
	linux-kernel



On 2022-07-20 15:32, Melissa Wen wrote:
> The -mno-gnu-attribute option in dcn21 clk mgr makefile hides a soft vs
> hard fp error for powerpc. After removing this flag, we can see some FPU
> code remains there:
> 
> /gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
> hard float,
> drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn21/rn_clk_mgr.o uses
> soft float
> 
> Therefore, remove the -mno-gnu-attribute flag for dcn21/powerpc and move
> FPU-associated code to DML folder.
> 
> Signed-off-by: Melissa Wen <mwen@igalia.com>
> ---
>   .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |   6 -
>   .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 234 +----------------
>   .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h |   7 +
>   .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  | 235 ++++++++++++++++++
>   .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h  |   2 +
>   5 files changed, 248 insertions(+), 236 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> index a48453612d10..66dc02c426e9 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> @@ -107,12 +107,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN201)
>   ###############################################################################
>   CLK_MGR_DCN21 = rn_clk_mgr.o rn_clk_mgr_vbios_smu.o
>   
> -# prevent build errors regarding soft-float vs hard-float FP ABI tags
> -# this code is currently unused on ppc64, as it applies to Renoir APUs only
> -ifdef CONFIG_PPC64
> -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
> -endif
> -
>   AMD_DAL_CLK_MGR_DCN21 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn21/,$(CLK_MGR_DCN21))
>   
>   AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21)
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
> index cf1b5f354ae9..0202dc682682 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
> @@ -26,10 +26,9 @@
>   #include "dccg.h"
>   #include "clk_mgr_internal.h"
>   
> -
>   #include "dcn20/dcn20_clk_mgr.h"
>   #include "rn_clk_mgr.h"
> -
> +#include "dml/dcn20/dcn20_fpu.h"
>   
>   #include "dce100/dce_clk_mgr.h"
>   #include "rn_clk_mgr_vbios_smu.h"
> @@ -45,7 +44,6 @@
>   
>   /* Constants */
>   
> -#define LPDDR_MEM_RETRAIN_LATENCY 4.977 /* Number obtained from LPDDR4 Training Counter Requirement doc */
>   #define SMU_VER_55_51_0 0x373300 /* SMU Version that is able to set DISPCLK below 100MHz */
>   
>   /* Macros */
> @@ -613,228 +611,6 @@ static struct clk_bw_params rn_bw_params = {
>   
>   };
>   
> -static struct wm_table ddr4_wm_table_gs = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 7.09,
> -			.sr_enter_plus_exit_time_us = 8.14,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -static struct wm_table lpddr4_wm_table_gs = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 5.32,
> -			.sr_enter_plus_exit_time_us = 6.38,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 9.82,
> -			.sr_enter_plus_exit_time_us = 11.196,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 9.89,
> -			.sr_enter_plus_exit_time_us = 11.24,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 9.748,
> -			.sr_enter_plus_exit_time_us = 11.102,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -static struct wm_table lpddr4_wm_table_with_disabled_ppt = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 8.32,
> -			.sr_enter_plus_exit_time_us = 9.38,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 9.82,
> -			.sr_enter_plus_exit_time_us = 11.196,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 9.89,
> -			.sr_enter_plus_exit_time_us = 11.24,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 9.748,
> -			.sr_enter_plus_exit_time_us = 11.102,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -static struct wm_table ddr4_wm_table_rn = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 11.90,
> -			.sr_enter_plus_exit_time_us = 12.80,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 13.18,
> -			.sr_enter_plus_exit_time_us = 14.30,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 13.18,
> -			.sr_enter_plus_exit_time_us = 14.30,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 13.18,
> -			.sr_enter_plus_exit_time_us = 14.30,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -static struct wm_table ddr4_1R_wm_table_rn = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 13.90,
> -			.sr_enter_plus_exit_time_us = 14.80,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 13.90,
> -			.sr_enter_plus_exit_time_us = 14.80,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 13.90,
> -			.sr_enter_plus_exit_time_us = 14.80,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 13.90,
> -			.sr_enter_plus_exit_time_us = 14.80,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -static struct wm_table lpddr4_wm_table_rn = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 7.32,
> -			.sr_enter_plus_exit_time_us = 8.38,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 9.82,
> -			.sr_enter_plus_exit_time_us = 11.196,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 9.89,
> -			.sr_enter_plus_exit_time_us = 11.24,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 9.748,
> -			.sr_enter_plus_exit_time_us = 11.102,
> -			.valid = true,
> -		},
> -	}
> -};
> -
>   static unsigned int find_socclk_for_voltage(struct dpm_clocks *clock_table, unsigned int voltage)
>   {
>   	int i;
> @@ -914,12 +690,10 @@ static void rn_clk_mgr_helper_populate_bw_params(struct clk_bw_params *bw_params
>   		/*
>   		 * WM set D will be re-purposed for memory retraining
>   		 */
> -		bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
> -		bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
> -		bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING;
> -		bw_params->wm_table.entries[WM_D].valid = true;
> +		DC_FP_START();
> +		dcn21_clk_mgr_set_bw_params_wm_table(bw_params);
> +		DC_FP_END();
>   	}
> -
>   }
>   
>   void rn_clk_mgr_construct(
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h
> index e4322fa5475b..2e088c5171b2 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h
> @@ -29,6 +29,13 @@
>   #include "clk_mgr.h"
>   #include "dm_pp_smu.h"
>   
> +extern struct wm_table ddr4_wm_table_gs;
> +extern struct wm_table lpddr4_wm_table_gs;
> +extern struct wm_table lpddr4_wm_table_with_disabled_ppt;
> +extern struct wm_table ddr4_wm_table_rn;
> +extern struct wm_table ddr4_1R_wm_table_rn;
> +extern struct wm_table lpddr4_wm_table_rn;
> +
>   struct rn_clk_registers {
>   	uint32_t CLK1_CLK0_CURRENT_CNT; /* DPREFCLK */
>   };
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
> index dc60b835e938..eeeae52fe6fc 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
> @@ -42,6 +42,9 @@
>   #define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
>   #endif
>   
> +/* Constant */
> +#define LPDDR_MEM_RETRAIN_LATENCY 4.977 /* Number obtained from LPDDR4 Training Counter Requirement doc */
> +
>   /**
>    * DOC: DCN2x FPU manipulation Overview
>    *
> @@ -650,6 +653,228 @@ struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc = {
>   	.num_states = 8
>   };
>   
> +struct wm_table ddr4_wm_table_gs = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 7.09,
> +			.sr_enter_plus_exit_time_us = 8.14,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +	}
> +};
> +
> +struct wm_table lpddr4_wm_table_gs = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 5.32,
> +			.sr_enter_plus_exit_time_us = 6.38,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 9.82,
> +			.sr_enter_plus_exit_time_us = 11.196,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 9.89,
> +			.sr_enter_plus_exit_time_us = 11.24,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 9.748,
> +			.sr_enter_plus_exit_time_us = 11.102,
> +			.valid = true,
> +		},
> +	}
> +};
> +
> +struct wm_table lpddr4_wm_table_with_disabled_ppt = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 8.32,
> +			.sr_enter_plus_exit_time_us = 9.38,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 9.82,
> +			.sr_enter_plus_exit_time_us = 11.196,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 9.89,
> +			.sr_enter_plus_exit_time_us = 11.24,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 9.748,
> +			.sr_enter_plus_exit_time_us = 11.102,
> +			.valid = true,
> +		},
> +	}
> +};
> +
> +struct wm_table ddr4_wm_table_rn = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 11.90,
> +			.sr_enter_plus_exit_time_us = 12.80,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 13.18,
> +			.sr_enter_plus_exit_time_us = 14.30,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 13.18,
> +			.sr_enter_plus_exit_time_us = 14.30,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 13.18,
> +			.sr_enter_plus_exit_time_us = 14.30,
> +			.valid = true,
> +		},
> +	}
> +};
> +
> +struct wm_table ddr4_1R_wm_table_rn = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 13.90,
> +			.sr_enter_plus_exit_time_us = 14.80,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 13.90,
> +			.sr_enter_plus_exit_time_us = 14.80,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 13.90,
> +			.sr_enter_plus_exit_time_us = 14.80,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 13.90,
> +			.sr_enter_plus_exit_time_us = 14.80,
> +			.valid = true,
> +		},
> +	}
> +};
> +
> +struct wm_table lpddr4_wm_table_rn = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 7.32,
> +			.sr_enter_plus_exit_time_us = 8.38,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 9.82,
> +			.sr_enter_plus_exit_time_us = 11.196,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 9.89,
> +			.sr_enter_plus_exit_time_us = 11.24,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 9.748,
> +			.sr_enter_plus_exit_time_us = 11.102,
> +			.valid = true,
> +		},
> +	}
> +};
> +
>   void dcn20_populate_dml_writeback_from_context(struct dc *dc,
>   					       struct resource_context *res_ctx,
>   					       display_e2e_pipe_params_st *pipes)
> @@ -2068,3 +2293,13 @@ void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params
>   
>   	dml_init_instance(&dc->dml, &dcn2_1_soc, &dcn2_1_ip, DML_PROJECT_DCN21);
>   }
> +
> +void dcn21_clk_mgr_set_bw_params_wm_table(struct clk_bw_params *bw_params)
> +{
> +	dc_assert_fp_enabled();
> +
> +	bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
> +	bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
> +	bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING;
> +	bw_params->wm_table.entries[WM_D].valid = true;
> +}
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h
> index aa892193e485..a6e1ad0f38e9 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h
> @@ -82,4 +82,6 @@ bool dcn21_validate_bandwidth_fp(struct dc *dc,
>   				 bool fast_validate);
>   void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params);
>   
> +void dcn21_clk_mgr_set_bw_params_wm_table(struct clk_bw_params *bw_params);
> +
>   #endif /* __DCN20_FPU_H__ */

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder
  2022-07-20 19:32 ` [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder Melissa Wen
@ 2022-07-21 18:58   ` Rodrigo Siqueira Jordao
  0 siblings, 0 replies; 13+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-07-21 18:58 UTC (permalink / raw)
  To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel,
	linux-kernel



On 2022-07-20 15:32, Melissa Wen wrote:
> The -mno-gnu-attribute option in clk mgr makefile for dcn30 hides a soft
> vs hard fp error for powerpc. After removing this flag, we can see some
> FPU code remains there:
> 
> gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
> hard float,
> drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.o
> uses soft float
> 
> Therefore, remove the -mno-gnu-attribute flag for dcn30/powerpc and move
> FPU-associated code to DML folder.
> 
> Signed-off-by: Melissa Wen <mwen@igalia.com>
> ---
>   .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  6 --
>   .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c  | 63 ++-----------------
>   .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.c  | 63 ++++++++++++++++++-
>   .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.h  |  1 +
>   4 files changed, 68 insertions(+), 65 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> index 66dc02c426e9..15b660a951a5 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> @@ -115,12 +115,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21)
>   ###############################################################################
>   CLK_MGR_DCN30 = dcn30_clk_mgr.o dcn30_clk_mgr_smu_msg.o
>   
> -# prevent build errors regarding soft-float vs hard-float FP ABI tags
> -# this code is currently unused on ppc64, as it applies to VanGogh APUs only
> -ifdef CONFIG_PPC64
> -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn30/dcn30_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
> -endif
> -
>   AMD_DAL_CLK_MGR_DCN30 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn30/,$(CLK_MGR_DCN30))
>   
>   AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30)
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c
> index 914708cefc79..3ce0ee0d012f 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c
> @@ -29,6 +29,7 @@
>   #include "dcn20/dcn20_clk_mgr.h"
>   #include "dce100/dce_clk_mgr.h"
>   #include "dcn30/dcn30_clk_mgr.h"
> +#include "dml/dcn30/dcn30_fpu.h"
>   #include "reg_helper.h"
>   #include "core_types.h"
>   #include "dm_helpers.h"
> @@ -97,65 +98,11 @@ static void dcn3_init_single_clock(struct clk_mgr_internal *clk_mgr, uint32_t cl
>   	}
>   }
>   
> -static noinline void dcn3_build_wm_range_table(struct clk_mgr_internal *clk_mgr)
> +static void dcn3_build_wm_range_table(struct clk_mgr_internal *clk_mgr)
>   {
> -	/* defaults */
> -	double pstate_latency_us = clk_mgr->base.ctx->dc->dml.soc.dram_clock_change_latency_us;
> -	double sr_exit_time_us = clk_mgr->base.ctx->dc->dml.soc.sr_exit_time_us;
> -	double sr_enter_plus_exit_time_us = clk_mgr->base.ctx->dc->dml.soc.sr_enter_plus_exit_time_us;
> -	uint16_t min_uclk_mhz = clk_mgr->base.bw_params->clk_table.entries[0].memclk_mhz;
> -
> -	/* Set A - Normal - default values*/
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].valid = true;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.pstate_latency_us = pstate_latency_us;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.sr_exit_time_us = sr_exit_time_us;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_dcfclk = 0;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_dcfclk = 0xFFFF;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_uclk = min_uclk_mhz;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_uclk = 0xFFFF;
> -
> -	/* Set B - Performance - higher minimum clocks */
> -//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].valid = true;
> -//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us = pstate_latency_us;
> -//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.sr_exit_time_us = sr_exit_time_us;
> -//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
> -//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE;
> -//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_dcfclk = TUNED VALUE;
> -//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_dcfclk = 0xFFFF;
> -//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_uclk = TUNED VALUE;
> -//	clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_uclk = 0xFFFF;
> -
> -	/* Set C - Dummy P-State - P-State latency set to "dummy p-state" value */
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].valid = true;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us = 0;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.sr_exit_time_us = sr_exit_time_us;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.wm_type = WATERMARKS_DUMMY_PSTATE;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_dcfclk = 0;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_dcfclk = 0xFFFF;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_uclk = min_uclk_mhz;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_uclk = 0xFFFF;
> -	clk_mgr->base.bw_params->dummy_pstate_table[0].dram_speed_mts = 1600;
> -	clk_mgr->base.bw_params->dummy_pstate_table[0].dummy_pstate_latency_us = 38;
> -	clk_mgr->base.bw_params->dummy_pstate_table[1].dram_speed_mts = 8000;
> -	clk_mgr->base.bw_params->dummy_pstate_table[1].dummy_pstate_latency_us = 9;
> -	clk_mgr->base.bw_params->dummy_pstate_table[2].dram_speed_mts = 10000;
> -	clk_mgr->base.bw_params->dummy_pstate_table[2].dummy_pstate_latency_us = 8;
> -	clk_mgr->base.bw_params->dummy_pstate_table[3].dram_speed_mts = 16000;
> -	clk_mgr->base.bw_params->dummy_pstate_table[3].dummy_pstate_latency_us = 5;
> -
> -	/* Set D - MALL - SR enter and exit times adjusted for MALL */
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].valid = true;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us = pstate_latency_us;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.sr_exit_time_us = 2;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us = 4;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.wm_type = WATERMARKS_MALL;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_dcfclk = 0;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_dcfclk = 0xFFFF;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_uclk = min_uclk_mhz;
> -	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_uclk = 0xFFFF;
> +	DC_FP_START();
> +	dcn3_fpu_build_wm_range_table(&clk_mgr->base);
> +	DC_FP_END();
>   }
>   
>   void dcn3_init_clocks(struct clk_mgr *clk_mgr_base)
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c
> index a8db1306750e..c00f759fdded 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c
> @@ -29,7 +29,7 @@
>   #include "dcn20/dcn20_resource.h"
>   #include "dcn30/dcn30_resource.h"
>   
> -
> +#include "clk_mgr/dcn30/dcn30_smu11_driver_if.h"
>   #include "display_mode_vba_30.h"
>   #include "dcn30_fpu.h"
>   
> @@ -616,4 +616,65 @@ void dcn30_fpu_update_bw_bounding_box(struct dc *dc,
>   
>   }
>   
> +void dcn3_fpu_build_wm_range_table(struct clk_mgr *base)
> +{
> +	/* defaults */
> +	double pstate_latency_us = base->ctx->dc->dml.soc.dram_clock_change_latency_us;
> +	double sr_exit_time_us = base->ctx->dc->dml.soc.sr_exit_time_us;
> +	double sr_enter_plus_exit_time_us = base->ctx->dc->dml.soc.sr_enter_plus_exit_time_us;
> +	uint16_t min_uclk_mhz = base->bw_params->clk_table.entries[0].memclk_mhz;
>   
> +	dc_assert_fp_enabled();
> +
> +	/* Set A - Normal - default values*/
> +	base->bw_params->wm_table.nv_entries[WM_A].valid = true;
> +	base->bw_params->wm_table.nv_entries[WM_A].dml_input.pstate_latency_us = pstate_latency_us;
> +	base->bw_params->wm_table.nv_entries[WM_A].dml_input.sr_exit_time_us = sr_exit_time_us;
> +	base->bw_params->wm_table.nv_entries[WM_A].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
> +	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE;
> +	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_dcfclk = 0;
> +	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_dcfclk = 0xFFFF;
> +	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_uclk = min_uclk_mhz;
> +	base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_uclk = 0xFFFF;
> +
> +	/* Set B - Performance - higher minimum clocks */
> +//	base->bw_params->wm_table.nv_entries[WM_B].valid = true;
> +//	base->bw_params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us = pstate_latency_us;
> +//	base->bw_params->wm_table.nv_entries[WM_B].dml_input.sr_exit_time_us = sr_exit_time_us;
> +//	base->bw_params->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
> +//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE;
> +//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_dcfclk = TUNED VALUE;
> +//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_dcfclk = 0xFFFF;
> +//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_uclk = TUNED VALUE;
> +//	base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_uclk = 0xFFFF;
> +
> +	/* Set C - Dummy P-State - P-State latency set to "dummy p-state" value */
> +	base->bw_params->wm_table.nv_entries[WM_C].valid = true;
> +	base->bw_params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us = 0;
> +	base->bw_params->wm_table.nv_entries[WM_C].dml_input.sr_exit_time_us = sr_exit_time_us;
> +	base->bw_params->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us;
> +	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.wm_type = WATERMARKS_DUMMY_PSTATE;
> +	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_dcfclk = 0;
> +	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_dcfclk = 0xFFFF;
> +	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_uclk = min_uclk_mhz;
> +	base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_uclk = 0xFFFF;
> +	base->bw_params->dummy_pstate_table[0].dram_speed_mts = 1600;
> +	base->bw_params->dummy_pstate_table[0].dummy_pstate_latency_us = 38;
> +	base->bw_params->dummy_pstate_table[1].dram_speed_mts = 8000;
> +	base->bw_params->dummy_pstate_table[1].dummy_pstate_latency_us = 9;
> +	base->bw_params->dummy_pstate_table[2].dram_speed_mts = 10000;
> +	base->bw_params->dummy_pstate_table[2].dummy_pstate_latency_us = 8;
> +	base->bw_params->dummy_pstate_table[3].dram_speed_mts = 16000;
> +	base->bw_params->dummy_pstate_table[3].dummy_pstate_latency_us = 5;
> +
> +	/* Set D - MALL - SR enter and exit times adjusted for MALL */
> +	base->bw_params->wm_table.nv_entries[WM_D].valid = true;
> +	base->bw_params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us = pstate_latency_us;
> +	base->bw_params->wm_table.nv_entries[WM_D].dml_input.sr_exit_time_us = 2;
> +	base->bw_params->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us = 4;
> +	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.wm_type = WATERMARKS_MALL;
> +	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_dcfclk = 0;
> +	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_dcfclk = 0xFFFF;
> +	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_uclk = min_uclk_mhz;
> +	base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_uclk = 0xFFFF;
> +}
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h
> index dedfe7b5f173..c2024052a497 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h
> @@ -63,5 +63,6 @@ void dcn30_fpu_update_bw_bounding_box(struct dc *dc,
>   	unsigned int *dcfclk_mhz,
>   	unsigned int *dram_speed_mts);
>   
> +void dcn3_fpu_build_wm_range_table(struct clk_mgr *base);
>   
>   #endif /* __DCN30_FPU_H__*/

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/5] drm/amd/display: move FPU code from dcn301 clk mgr to DML folder
  2022-07-20 19:32 ` [PATCH 5/5] drm/amd/display: move FPU code from dcn301 " Melissa Wen
  2022-07-21 17:26   ` Maíra Canal
@ 2022-07-21 18:59   ` Rodrigo Siqueira Jordao
  1 sibling, 0 replies; 13+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-07-21 18:59 UTC (permalink / raw)
  To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel,
	linux-kernel



On 2022-07-20 15:32, Melissa Wen wrote:
> The -mno-gnu-attribute option in dcn301 clk mgr makefile hides a soft vs
> hard fp error for powerpc. After removing this flag, we can see some FPU
> code remains there:
> 
> gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
> hard float,
> drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn301/vg_clk_mgr.o
> uses soft float
> 
> Therefore, remove the -mno-gnu-attribute flag for dcn301/powerpc and
> move FPU-associated code to DML folder.
> 
> Signed-off-by: Melissa Wen <mwen@igalia.com>
> ---
>   .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  6 --
>   .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c    | 86 ++-----------------
>   .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h    |  3 +
>   .../amd/display/dc/dml/dcn301/dcn301_fpu.c    | 74 ++++++++++++++++
>   4 files changed, 84 insertions(+), 85 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> index 15b660a951a5..271d8e573181 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> @@ -123,12 +123,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30)
>   ###############################################################################
>   CLK_MGR_DCN301 = vg_clk_mgr.o dcn301_smu.o
>   
> -# prevent build errors regarding soft-float vs hard-float FP ABI tags
> -# this code is currently unused on ppc64, as it applies to VanGogh APUs only
> -ifdef CONFIG_PPC64
> -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn301/vg_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
> -endif
> -
>   AMD_DAL_CLK_MGR_DCN301 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn301/,$(CLK_MGR_DCN301))
>   
>   AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN301)
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> index f310b0d25a07..65f224af03c0 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> @@ -32,6 +32,10 @@
>   // For dcn20_update_clocks_update_dpp_dto
>   #include "dcn20/dcn20_clk_mgr.h"
>   
> +// For DML FPU code
> +#include "dml/dcn20/dcn20_fpu.h"
> +#include "dml/dcn301/dcn301_fpu.h"
> +
>   #include "vg_clk_mgr.h"
>   #include "dcn301_smu.h"
>   #include "reg_helper.h"
> @@ -526,81 +530,6 @@ static struct clk_bw_params vg_bw_params = {
>   
>   };
>   
> -static struct wm_table ddr4_wm_table = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 6.09,
> -			.sr_enter_plus_exit_time_us = 7.14,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -static struct wm_table lpddr5_wm_table = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -
>   static unsigned int find_dcfclk_for_voltage(const struct vg_dpm_clocks *clock_table,
>   		unsigned int voltage)
>   {
> @@ -670,10 +599,9 @@ static void vg_clk_mgr_helper_populate_bw_params(
>   		/*
>   		 * WM set D will be re-purposed for memory retraining
>   		 */
> -		bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
> -		bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
> -		bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING;
> -		bw_params->wm_table.entries[WM_D].valid = true;
> +		DC_FP_START();
> +		dcn21_clk_mgr_set_bw_params_wm_table(bw_params);
> +		DC_FP_END();
>   	}
>   
>   }
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
> index 7255477307f1..75884f572989 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
> @@ -29,6 +29,9 @@
>   
>   struct watermarks;
>   
> +extern struct wm_table ddr4_wm_table;
> +extern struct wm_table lpddr5_wm_table;
> +
>   struct smu_watermark_set {
>   	struct watermarks *wm_set;
>   	union large_integer mc_address;
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
> index e4863f0bf0f6..7ef66e511ec8 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
> @@ -214,6 +214,80 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_01_soc = {
>   	.urgent_latency_adjustment_fabric_clock_reference_mhz = 0,
>   };
>   
> +struct wm_table ddr4_wm_table = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 6.09,
> +			.sr_enter_plus_exit_time_us = 7.14,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +	}
> +};
> +
> +struct wm_table lpddr5_wm_table = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +	}
> +};
> +
>   static void calculate_wm_set_for_vlevel(int vlevel,
>   		struct wm_range_table_entry *table_entry,
>   		struct dcn_watermarks *wm_set,

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc
  2022-07-20 19:32 [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Melissa Wen
                   ` (4 preceding siblings ...)
  2022-07-20 19:32 ` [PATCH 5/5] drm/amd/display: move FPU code from dcn301 " Melissa Wen
@ 2022-07-21 19:07 ` Rodrigo Siqueira Jordao
  5 siblings, 0 replies; 13+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-07-21 19:07 UTC (permalink / raw)
  To: Melissa Wen, airlied, alexander.deucher, christian.koenig,
	daniel, harry.wentland, sunpeng.li, Xinhui.Pan
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel,
	linux-kernel



On 2022-07-20 15:32, Melissa Wen wrote:
> An initial report from Guenter[1] shows some soft-fp vs hard-fp error
> from DCN31 clk mgr for powerpc. I was not able to reproduce it
> cross-compiling with gcc-powerpc-linux-gnu and gcc-11.3, but thanks to
> Maíra tips, I can reproduce the issue using make.cross, as follows:
> 
> - wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> - chmod +x ~/bin/make.cross
> - mkdir build_dir
> - COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 ~/make.cross O=build_dir ARCH=powerpc SHELL=/bin/bash

Hi Melissa,

I didn't know about these steps, I was trying to reproduce this issue by 
using the standard cross compile package provided by my distro (Debian 
testing and ArchLinux), and as a result, I was never able to see the 
problem. Anyway, I can now reproduce this issue, thanks a lot.

> with a config file generate by allmodconfig
> 
> So, the first patch fix the issue reported by Guenter. The second is
> just a cleanup in dcn31_resource file to remove useless DC_FP_ wrapper.
> Finally, the last three patches I'm removing the -mno-gnu-attribute
> option, that was just hiding FPU-associated code in clk mgr files of
> dcn21/30/301, and moving them to DML folder. This series doesn't cover
> recent drivers dcn32/314.

I validated this series in our internal CI by running multiple IGT tests 
in numerous ASICs. Tomorrow we will also send some extra patches 
associated with this FPU effort; hopefully, after that, we will finally 
have all the FPU code under DML. Again, thanks a lot for your effort!

Thanks
Siqueira

> Thanks Guenter, Maíra, Siqueira and Alex for all inputs on this
> debugging process. Let me know your thoughts on this approach.
> 
> Melissa
> 
> [1] https://lore.kernel.org/amd-gfx/20220618232737.2036722-1-linux@roeck-us.net/>> 
> Melissa Wen (5):
>    drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family for powerpc
>    drm/amd/display: remove useless FPU protection wrapper from
>      dcn31_resource file
>    drm/amd/display: move FPU code on dcn21 clk_mgr
>    drm/amd/display: move FPU code from dcn30 clk mgr to DML folder
>    drm/amd/display: move FPU code from dcn301 clk mgr to DML folder
> 
>   .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  18 --
>   .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 234 +----------------
>   .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h |   7 +
>   .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c  |  63 +----
>   .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c    |  86 +------
>   .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h    |   3 +
>   .../drm/amd/display/dc/dcn31/dcn31_resource.c |  11 +-
>   .../amd/display/dc/dcn315/dcn315_resource.c   |   5 +-
>   .../amd/display/dc/dcn316/dcn316_resource.c   |   5 +-
>   .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  | 235 ++++++++++++++++++
>   .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h  |   2 +
>   .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.c  |  63 ++++-
>   .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.h  |   1 +
>   .../amd/display/dc/dml/dcn301/dcn301_fpu.c    |  74 ++++++
>   .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c  |  11 +
>   .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.h  |   3 +
>   16 files changed, 423 insertions(+), 398 deletions(-)
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-07-21 19:07 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-20 19:32 [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Melissa Wen
2022-07-20 19:32 ` [PATCH 1/5] drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family " Melissa Wen
2022-07-21 18:54   ` Rodrigo Siqueira Jordao
2022-07-20 19:32 ` [PATCH 2/5] drm/amd/display: remove useless FPU protection wrapper from dcn31_resource file Melissa Wen
2022-07-21 18:55   ` Rodrigo Siqueira Jordao
2022-07-20 19:32 ` [PATCH 3/5] drm/amd/display: move FPU code on dcn21 clk_mgr Melissa Wen
2022-07-21 18:57   ` Rodrigo Siqueira Jordao
2022-07-20 19:32 ` [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder Melissa Wen
2022-07-21 18:58   ` Rodrigo Siqueira Jordao
2022-07-20 19:32 ` [PATCH 5/5] drm/amd/display: move FPU code from dcn301 " Melissa Wen
2022-07-21 17:26   ` Maíra Canal
2022-07-21 18:59   ` Rodrigo Siqueira Jordao
2022-07-21 19:07 ` [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Rodrigo Siqueira Jordao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).