* [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc
@ 2022-07-20 19:32 Melissa Wen
From: Melissa Wen @ 2022-07-20 19:32 UTC
To: airlied, alexander.deucher, christian.koenig, daniel, harry.wentland, Rodrigo.Siqueira, sunpeng.li, Xinhui.Pan
Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen, amd-gfx, dri-devel, linux-kernel

An initial report from Guenter[1] shows a soft-fp vs hard-fp error from the
DCN31 clk mgr on powerpc. I was not able to reproduce it by cross-compiling
with gcc-powerpc-linux-gnu and gcc-11.3 but, thanks to Maíra's tips, I can
reproduce the issue using make.cross, as follows:

- wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
- chmod +x ~/bin/make.cross
- mkdir build_dir
- COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 ~/bin/make.cross O=build_dir ARCH=powerpc SHELL=/bin/bash

with a config file generated by allmodconfig.

So, the first patch fixes the issue reported by Guenter. The second is just a
cleanup in the dcn31_resource file to remove a useless DC_FP_ wrapper. Finally,
the last three patches remove the -mno-gnu-attribute option, which was just
hiding FPU-associated code in the clk mgr files of dcn21/30/301, and move that
code to the DML folder. This series doesn't cover the recent drivers dcn32/314.

Thanks Guenter, Maíra, Siqueira and Alex for all inputs on this debugging
process.

Let me know your thoughts on this approach.

Melissa

[1] https://lore.kernel.org/amd-gfx/20220618232737.2036722-1-linux@roeck-us.net/

Melissa Wen (5):
  drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family for powerpc
  drm/amd/display: remove useless FPU protection wrapper from
    dcn31_resource file
  drm/amd/display: move FPU code on dcn21 clk_mgr
  drm/amd/display: move FPU code from dcn30 clk mgr to DML folder
  drm/amd/display: move FPU code from dcn301 clk mgr to DML folder

 .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  18 --
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 234 +----------------
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h |   7 +
 .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c  |  63 +----
 .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c    |  86 +------
 .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h    |   3 +
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  11 +-
 .../amd/display/dc/dcn315/dcn315_resource.c   |   5 +-
 .../amd/display/dc/dcn316/dcn316_resource.c   |   5 +-
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  | 235 ++++++++++++++++++
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h  |   2 +
 .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.c  |  63 ++++-
 .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.h  |   1 +
 .../amd/display/dc/dml/dcn301/dcn301_fpu.c    |  74 ++++++
 .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c  |  11 +
 .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.h  |   3 +
 16 files changed, 423 insertions(+), 398 deletions(-)

-- 
2.35.1
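As a rough illustration of what the last three patches of the series do to
the clk mgr code, here is a minimal userspace sketch, not kernel code. Every
mock_* and MOCK_* name is a hypothetical stand-in; only the 4.977 retraining
latency and the idea of repurposing watermark set D for memory retraining are
taken from the dcn21 patch in this series. The point it tries to show is that
the double-typed table and the helper that writes to it live together, so the
caller never touches a floating-point value itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver's watermark types. */
enum mock_wm_inst { MOCK_WM_A, MOCK_WM_B, MOCK_WM_C, MOCK_WM_D, MOCK_WM_COUNT };
enum mock_wm_type { MOCK_WM_TYPE_PSTATE_CHG, MOCK_WM_TYPE_RETRAINING };

struct mock_wm_entry {
	enum mock_wm_inst wm_inst;
	enum mock_wm_type wm_type;
	double pstate_latency_us;	/* floating point: this is the "FPU" part */
	bool valid;
};

struct mock_wm_table {
	struct mock_wm_entry entries[MOCK_WM_COUNT];
};

/* Same value as LPDDR_MEM_RETRAIN_LATENCY in the dcn21 patch. */
#define MOCK_MEM_RETRAIN_LATENCY 4.977

/*
 * "DML side" helper: the only place that writes a double into the table.
 * In the driver, the equivalent helper would sit in the hard-float DML
 * folder and be called between DC_FP_START() and DC_FP_END().
 */
static void mock_set_retraining_entry(struct mock_wm_table *table)
{
	struct mock_wm_entry *entry = &table->entries[MOCK_WM_D];

	entry->pstate_latency_us = MOCK_MEM_RETRAIN_LATENCY;
	entry->wm_inst = MOCK_WM_D;
	entry->wm_type = MOCK_WM_TYPE_RETRAINING;
	entry->valid = true;
}
```

With this split, the file that calls mock_set_retraining_entry() only passes a
pointer around, which is the property the series relies on to drop the
per-file -mno-gnu-attribute workaround.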
* [PATCH 1/5] drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family for powerpc
@ 2022-07-20 19:32 Melissa Wen
From: Melissa Wen @ 2022-07-20 19:32 UTC
To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel
Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen, amd-gfx, dri-devel, linux-kernel

Move the remaining FPU code that caused a compilation error on powerpc to the
DML folder. This patch depends on [1] to prevent the error below:

/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o uses soft float
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o uses soft float
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o uses soft float
/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o

[1] https://lore.kernel.org/amd-gfx/20220716195144.342960-1-mwen@igalia.com/

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c   |  5 +++--
 .../gpu/drm/amd/display/dc/dcn315/dcn315_resource.c     |  5 +++--
 .../gpu/drm/amd/display/dc/dcn316/dcn316_resource.c     |  5 +++--
 drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c    | 11 +++++++++++
 drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h    |  3 +++
 5 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
index 178d40c0d70a..929b712cbada 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
@@ -1663,11 +1663,12 @@ int dcn31_populate_dml_pipes_from_context(
 		pipes[pipe_cnt].pipe.src.immediate_flip = true;
 		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
 		pipes[pipe_cnt].pipe.src.gpuvm = true;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
 		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
 		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
 		pipes[pipe_cnt].dout.dsc_input_bpc = 0;
+		DC_FP_START();
+		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
+		DC_FP_END();
 
 		if (dc->debug.dml_hostvm_override == DML_HOSTVM_NO_OVERRIDE)
 			pipes[pipe_cnt].pipe.src.hostvm = dc->res_pool->hubbub->riommu_active;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
index df2abd8fe2eb..1a5f5977f962 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
@@ -1658,11 +1658,12 @@ static int dcn315_populate_dml_pipes_from_context(
 
 		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
 		pipes[pipe_cnt].pipe.src.gpuvm = true;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
 		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
 		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
 		pipes[pipe_cnt].dout.dsc_input_bpc = 0;
+		DC_FP_START();
+		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
+		DC_FP_END();
 
 		if (pipes[pipe_cnt].dout.dsc_enable) {
 			switch (timing->display_color_depth) {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c b/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c
index 070fe10a004e..53dea466348f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c
@@ -1661,11 +1661,12 @@ static int dcn316_populate_dml_pipes_from_context(
 
 		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
 		pipes[pipe_cnt].pipe.src.gpuvm = true;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
-		pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
 		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
 		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
 		pipes[pipe_cnt].dout.dsc_input_bpc = 0;
+		DC_FP_START();
+		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
+		DC_FP_END();
 
 		if (pipes[pipe_cnt].dout.dsc_enable) {
 			switch (timing->display_color_depth) {
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
index facac3daeaca..e36cfa5985ea 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
@@ -435,8 +435,19 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_16_soc = {
 	.urgent_latency_adjustment_fabric_clock_reference_mhz = 0,
 };
 
+void dcn31_zero_pipe_dcc_fraction(display_e2e_pipe_params_st *pipes,
+				  int pipe_cnt)
+{
+	dc_assert_fp_enabled();
+
+	pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
+	pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
+}
+
 void dcn31_update_soc_for_wm_a(struct dc *dc, struct dc_state *context)
 {
+	dc_assert_fp_enabled();
+
 	if (dc->clk_mgr->bw_params->wm_table.entries[WM_A].valid) {
 		context->bw_ctx.dml.soc.dram_clock_change_latency_us = dc->clk_mgr->bw_params->wm_table.entries[WM_A].pstate_latency_us;
 		context->bw_ctx.dml.soc.sr_enter_plus_exit_time_us = dc->clk_mgr->bw_params->wm_table.entries[WM_A].sr_enter_plus_exit_time_us;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
index 0a10de80c1a4..4372f17b55d4 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
@@ -31,6 +31,9 @@
 #define DCN3_15_MIN_COMPBUF_SIZE_KB 128
 #define DCN3_16_DEFAULT_DET_SIZE 192
 
+void dcn31_zero_pipe_dcc_fraction(display_e2e_pipe_params_st *pipes,
+				  int pipe_cnt);
+
 void dcn31_update_soc_for_wm_a(struct dc *dc, struct dc_state *context);
 
 void dcn31_calculate_wm_and_dlg_fp(
-- 
2.35.1
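The calling convention used by the patch above (caller opens the FP region,
helper asserts it is open before touching double fields) can be sketched in
plain userspace C. This is only a mock under stated assumptions: mock_fp_start,
mock_fp_end, mock_fp_enabled and struct mock_pipe_src are hypothetical
stand-ins for DC_FP_START(), DC_FP_END(), dc_assert_fp_enabled() and the real
pipe parameter struct. The mock helper returns -1 instead of warning, so the
behaviour is easy to check outside the kernel.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins for the kernel's FP guard helpers. */
static int fpu_depth;	/* nesting depth of the guarded region */

static void mock_fp_start(void)
{
	fpu_depth++;
}

static void mock_fp_end(void)
{
	assert(fpu_depth > 0);	/* every end must match a start */
	fpu_depth--;
}

static int mock_fp_enabled(void)
{
	return fpu_depth > 0;
}

/* Simplified stand-in for the double-typed fields zeroed by the patch. */
struct mock_pipe_src {
	double dcc_fraction_of_zs_req_luma;
	double dcc_fraction_of_zs_req_chroma;
};

/*
 * "DML side": writes double fields, so it refuses to run unless the
 * caller has opened the guarded region first.
 */
static int mock_zero_pipe_dcc_fraction(struct mock_pipe_src *pipes, int pipe_cnt)
{
	if (!mock_fp_enabled())
		return -1;

	pipes[pipe_cnt].dcc_fraction_of_zs_req_luma = 0;
	pipes[pipe_cnt].dcc_fraction_of_zs_req_chroma = 0;
	return 0;
}

/*
 * "Resource side": contains no floating-point statement of its own,
 * it only brackets the call into the helper.
 */
static int mock_populate_pipe(struct mock_pipe_src *pipes, int pipe_cnt)
{
	int ret;

	mock_fp_start();
	ret = mock_zero_pipe_dcc_fraction(pipes, pipe_cnt);
	mock_fp_end();

	return ret;
}
```

Calling mock_zero_pipe_dcc_fraction() directly, outside the start/end pair,
fails in this mock; that is the kind of misuse the assertion added to the
dcn31_fpu.c helpers is meant to flag.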
* Re: [PATCH 1/5] drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family for powerpc
@ 2022-07-21 18:54 Rodrigo Siqueira Jordao
From: Rodrigo Siqueira Jordao @ 2022-07-21 18:54 UTC
To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel
Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel, linux-kernel

On 2022-07-20 15:32, Melissa Wen wrote:
> Move remaining FPU code to DML folder that caused compilation error for
> powerpc. This patch depends on [1] to prevent the error below:
> [...]

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
* [PATCH 2/5] drm/amd/display: remove useless FPU protection wrapper from dcn31_resource file
@ 2022-07-20 19:32 Melissa Wen
From: Melissa Wen @ 2022-07-20 19:32 UTC
To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel
Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen, amd-gfx, dri-devel, linux-kernel

Many lines of code in dcn31_resource_construct are wrapped by the DC_FP
macros to protect FPU operations; however, there is no FPU code in this
region. Therefore, just remove the wrapper for clarity.

Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
index 929b712cbada..6d25fcf865bf 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
@@ -1863,8 +1863,6 @@ static bool dcn31_resource_construct(
 	struct dc_context *ctx = dc->ctx;
 	struct irq_service_init_data init_data;
 
-	DC_FP_START();
-
 	ctx->dc_bios->regs = &bios_regs;
 
 	pool->base.res_cap = &res_cap_dcn31;
@@ -2175,13 +2173,9 @@ static bool dcn31_resource_construct(
 
 	dc->dcn_ip->max_num_dpp = dcn3_1_ip.max_num_dpp;
 
-	DC_FP_END();
-
 	return true;
 
 create_fail:
-
-	DC_FP_END();
 	dcn31_resource_destruct(pool);
 
 	return false;
-- 
2.35.1
* Re: [PATCH 2/5] drm/amd/display: remove useless FPU protection wrapper from dcn31_resource file
@ 2022-07-21 18:55 Rodrigo Siqueira Jordao
From: Rodrigo Siqueira Jordao @ 2022-07-21 18:55 UTC
To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel
Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel, linux-kernel

On 2022-07-20 15:32, Melissa Wen wrote:
> Many lines of code in dcn31_resource_construct are wrapped by DC_FP
> macro to protect FPU operations; however, there is no FPU in this
> region. Therefore, just remove the wrapper for clarity.
> [...]

Very nice catch!

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
* [PATCH 3/5] drm/amd/display: move FPU code on dcn21 clk_mgr
@ 2022-07-20 19:32 Melissa Wen
From: Melissa Wen @ 2022-07-20 19:32 UTC
To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel
Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen, amd-gfx, dri-devel, linux-kernel

The -mno-gnu-attribute option in dcn21 clk mgr makefile hides a soft vs
hard fp error for powerpc. After removing this flag, we can see some FPU
code remains there:

/gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn21/rn_clk_mgr.o uses soft float

Therefore, remove the -mno-gnu-attribute flag for dcn21/powerpc and move
FPU-associated code to DML folder.

Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |   6 -
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 234 +----------------
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h |   7 +
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  | 235 ++++++++++++++++++
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h  |   2 +
 5 files changed, 248 insertions(+), 236 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
index a48453612d10..66dc02c426e9 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
@@ -107,12 +107,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN201)
 ###############################################################################
 CLK_MGR_DCN21 = rn_clk_mgr.o rn_clk_mgr_vbios_smu.o
 
-# prevent build errors regarding soft-float vs hard-float FP ABI tags
-# this code is currently unused on ppc64, as it applies to Renoir APUs only
-ifdef CONFIG_PPC64
-CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
-endif
-
 AMD_DAL_CLK_MGR_DCN21 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn21/,$(CLK_MGR_DCN21))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21)
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
index cf1b5f354ae9..0202dc682682 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
@@ -26,10 +26,9 @@
 #include "dccg.h"
 #include "clk_mgr_internal.h"
 
-
 #include "dcn20/dcn20_clk_mgr.h"
 #include "rn_clk_mgr.h"
-
+#include "dml/dcn20/dcn20_fpu.h"
 
 #include "dce100/dce_clk_mgr.h"
 #include "rn_clk_mgr_vbios_smu.h"
@@ -45,7 +44,6 @@
 
 /* Constants */
 
-#define LPDDR_MEM_RETRAIN_LATENCY 4.977 /* Number obtained from LPDDR4 Training Counter Requirement doc */
 #define SMU_VER_55_51_0 0x373300 /* SMU Version that is able to set DISPCLK below 100MHz */
 
 /* Macros */
@@ -613,228 +611,6 @@ static struct clk_bw_params rn_bw_params = {
 };
 
 
-static struct wm_table ddr4_wm_table_gs = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 7.09,
-			.sr_enter_plus_exit_time_us = 8.14,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 10.12,
-			.sr_enter_plus_exit_time_us = 11.48,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 10.12,
-			.sr_enter_plus_exit_time_us = 11.48,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 10.12,
-			.sr_enter_plus_exit_time_us = 11.48,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table lpddr4_wm_table_gs = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 5.32,
-			.sr_enter_plus_exit_time_us = 6.38,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.82,
-			.sr_enter_plus_exit_time_us = 11.196,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.89,
-			.sr_enter_plus_exit_time_us = 11.24,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.748,
-			.sr_enter_plus_exit_time_us = 11.102,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table lpddr4_wm_table_with_disabled_ppt = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 8.32,
-			.sr_enter_plus_exit_time_us = 9.38,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.82,
-			.sr_enter_plus_exit_time_us = 11.196,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.89,
-			.sr_enter_plus_exit_time_us = 11.24,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.748,
-			.sr_enter_plus_exit_time_us = 11.102,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table ddr4_wm_table_rn = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 11.90,
-			.sr_enter_plus_exit_time_us = 12.80,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.18,
-			.sr_enter_plus_exit_time_us = 14.30,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.18,
-			.sr_enter_plus_exit_time_us = 14.30,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.18,
-			.sr_enter_plus_exit_time_us = 14.30,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table ddr4_1R_wm_table_rn = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.90,
-			.sr_enter_plus_exit_time_us = 14.80,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.90,
-			.sr_enter_plus_exit_time_us = 14.80,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.90,
-			.sr_enter_plus_exit_time_us = 14.80,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.72,
-			.sr_exit_time_us = 13.90,
-			.sr_enter_plus_exit_time_us = 14.80,
-			.valid = true,
-		},
-	}
-};
-
-static struct wm_table lpddr4_wm_table_rn = {
-	.entries = {
-		{
-			.wm_inst = WM_A,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 7.32,
-			.sr_enter_plus_exit_time_us = 8.38,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_B,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.82,
-			.sr_enter_plus_exit_time_us = 11.196,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_C,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.89,
-			.sr_enter_plus_exit_time_us = 11.24,
-			.valid = true,
-		},
-		{
-			.wm_inst = WM_D,
-			.wm_type = WM_TYPE_PSTATE_CHG,
-			.pstate_latency_us = 11.65333,
-			.sr_exit_time_us = 9.748,
-			.sr_enter_plus_exit_time_us = 11.102,
-			.valid = true,
-		},
-	}
-};
-
 static unsigned int find_socclk_for_voltage(struct dpm_clocks *clock_table, unsigned int voltage)
 {
 	int i;
@@ -914,12 +690,10 @@ static void rn_clk_mgr_helper_populate_bw_params(struct clk_bw_params *bw_params
 		/*
 		 * WM set D will be re-purposed for memory retraining
 		 */
-		bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
-		bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
-		bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING;
-		bw_params->wm_table.entries[WM_D].valid = true;
+		DC_FP_START();
+		dcn21_clk_mgr_set_bw_params_wm_table(bw_params);
+		DC_FP_END();
 	}
-
 }
 
 void rn_clk_mgr_construct(
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h
index e4322fa5475b..2e088c5171b2 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h
@@ -29,6 +29,13 @@
 #include "clk_mgr.h"
 #include "dm_pp_smu.h"
 
+extern struct wm_table ddr4_wm_table_gs;
+extern struct wm_table lpddr4_wm_table_gs;
+extern struct wm_table lpddr4_wm_table_with_disabled_ppt;
+extern struct wm_table ddr4_wm_table_rn;
+extern struct wm_table ddr4_1R_wm_table_rn;
+extern struct wm_table lpddr4_wm_table_rn;
+
 struct rn_clk_registers {
 	uint32_t CLK1_CLK0_CURRENT_CNT; /* DPREFCLK */
 };
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
index dc60b835e938..eeeae52fe6fc 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
@@ -42,6 +42,9 @@
 #define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
 #endif
 
+/* Constant */
+#define LPDDR_MEM_RETRAIN_LATENCY 4.977 /* Number obtained from LPDDR4 Training Counter Requirement doc */
+
 /**
  * DOC: DCN2x FPU manipulation Overview
  *
@@ -650,6 +653,228 @@ struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc = {
 	.num_states = 8
 };
 
+struct wm_table ddr4_wm_table_gs = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 7.09,
+			.sr_enter_plus_exit_time_us = 8.14,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 10.12,
+			.sr_enter_plus_exit_time_us = 11.48,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 10.12,
+			.sr_enter_plus_exit_time_us = 11.48,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 10.12,
+			.sr_enter_plus_exit_time_us = 11.48,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table lpddr4_wm_table_gs = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 5.32,
+			.sr_enter_plus_exit_time_us = 6.38,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.82,
+			.sr_enter_plus_exit_time_us = 11.196,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.89,
+			.sr_enter_plus_exit_time_us = 11.24,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.748,
+			.sr_enter_plus_exit_time_us = 11.102,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table lpddr4_wm_table_with_disabled_ppt = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 8.32,
+			.sr_enter_plus_exit_time_us = 9.38,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.82,
+			.sr_enter_plus_exit_time_us = 11.196,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.89,
+			.sr_enter_plus_exit_time_us = 11.24,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.748,
+			.sr_enter_plus_exit_time_us = 11.102,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table ddr4_wm_table_rn = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 11.90,
+			.sr_enter_plus_exit_time_us = 12.80,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.18,
+			.sr_enter_plus_exit_time_us = 14.30,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.18,
+			.sr_enter_plus_exit_time_us = 14.30,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.18,
+			.sr_enter_plus_exit_time_us = 14.30,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table ddr4_1R_wm_table_rn = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.90,
+			.sr_enter_plus_exit_time_us = 14.80,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.90,
+			.sr_enter_plus_exit_time_us = 14.80,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.90,
+			.sr_enter_plus_exit_time_us = 14.80,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.72,
+			.sr_exit_time_us = 13.90,
+			.sr_enter_plus_exit_time_us = 14.80,
+			.valid = true,
+		},
+	}
+};
+
+struct wm_table lpddr4_wm_table_rn = {
+	.entries = {
+		{
+			.wm_inst = WM_A,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 7.32,
+			.sr_enter_plus_exit_time_us = 8.38,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_B,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.82,
+			.sr_enter_plus_exit_time_us = 11.196,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_C,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.89,
+			.sr_enter_plus_exit_time_us = 11.24,
+			.valid = true,
+		},
+		{
+			.wm_inst = WM_D,
+			.wm_type = WM_TYPE_PSTATE_CHG,
+			.pstate_latency_us = 11.65333,
+			.sr_exit_time_us = 9.748,
+			.sr_enter_plus_exit_time_us = 11.102,
+			.valid = true,
+		},
+	}
+};
+
 void dcn20_populate_dml_writeback_from_context(struct dc *dc,
 	struct resource_context *res_ctx,
 	display_e2e_pipe_params_st *pipes)
@@ -2068,3 +2293,13 @@ void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params
 
 	dml_init_instance(&dc->dml, &dcn2_1_soc, &dcn2_1_ip, DML_PROJECT_DCN21);
 }
+
+void dcn21_clk_mgr_set_bw_params_wm_table(struct clk_bw_params *bw_params)
+{
+	dc_assert_fp_enabled();
+
+	bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
+	bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
+	bw_params->wm_table.entries[WM_D].wm_type =
WM_TYPE_RETRAINING; + bw_params->wm_table.entries[WM_D].valid = true; +} diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h index aa892193e485..a6e1ad0f38e9 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h @@ -82,4 +82,6 @@ bool dcn21_validate_bandwidth_fp(struct dc *dc, bool fast_validate); void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params); +void dcn21_clk_mgr_set_bw_params_wm_table(struct clk_bw_params *bw_params); + #endif /* __DCN20_FPU_H__ */ -- 2.35.1
* Re: [PATCH 3/5] drm/amd/display: move FPU code on dcn21 clk_mgr
From: Rodrigo Siqueira Jordao @ 2022-07-21 18:57 UTC
To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel
Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel, linux-kernel

On 2022-07-20 15:32, Melissa Wen wrote:
> The -mno-gnu-attribute option in dcn21 clk mgr makefile hides a soft vs
> hard fp error for powerpc. After removing this flag, we can see some FPU
> code remains there:
>
> /gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
> hard float,
> drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn21/rn_clk_mgr.o uses
> soft float
>
> Therefore, remove the -mno-gnu-attribute flag for dcn21/powerpc and move
> FPU-associated code to DML folder.
> > Signed-off-by: Melissa Wen <mwen@igalia.com> > --- > .../gpu/drm/amd/display/dc/clk_mgr/Makefile | 6 - > .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 234 +---------------- > .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h | 7 + > .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c | 235 ++++++++++++++++++ > .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h | 2 + > 5 files changed, 248 insertions(+), 236 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > index a48453612d10..66dc02c426e9 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > @@ -107,12 +107,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN201) > ############################################################################### > CLK_MGR_DCN21 = rn_clk_mgr.o rn_clk_mgr_vbios_smu.o > > -# prevent build errors regarding soft-float vs hard-float FP ABI tags > -# this code is currently unused on ppc64, as it applies to Renoir APUs only > -ifdef CONFIG_PPC64 > -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call cc-option,-mno-gnu-attribute) > -endif > - > AMD_DAL_CLK_MGR_DCN21 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn21/,$(CLK_MGR_DCN21)) > > AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21) > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c > index cf1b5f354ae9..0202dc682682 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c > @@ -26,10 +26,9 @@ > #include "dccg.h" > #include "clk_mgr_internal.h" > > - > #include "dcn20/dcn20_clk_mgr.h" > #include "rn_clk_mgr.h" > - > +#include "dml/dcn20/dcn20_fpu.h" > > #include "dce100/dce_clk_mgr.h" > #include "rn_clk_mgr_vbios_smu.h" > @@ -45,7 +44,6 @@ > > /* Constants */ > > -#define LPDDR_MEM_RETRAIN_LATENCY 4.977 /* Number obtained from LPDDR4 Training 
Counter Requirement doc */ > #define SMU_VER_55_51_0 0x373300 /* SMU Version that is able to set DISPCLK below 100MHz */ > > /* Macros */ > @@ -613,228 +611,6 @@ static struct clk_bw_params rn_bw_params = { > > }; > > -static struct wm_table ddr4_wm_table_gs = { > - .entries = { > - { > - .wm_inst = WM_A, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 7.09, > - .sr_enter_plus_exit_time_us = 8.14, > - .valid = true, > - }, > - { > - .wm_inst = WM_B, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 10.12, > - .sr_enter_plus_exit_time_us = 11.48, > - .valid = true, > - }, > - { > - .wm_inst = WM_C, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 10.12, > - .sr_enter_plus_exit_time_us = 11.48, > - .valid = true, > - }, > - { > - .wm_inst = WM_D, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 10.12, > - .sr_enter_plus_exit_time_us = 11.48, > - .valid = true, > - }, > - } > -}; > - > -static struct wm_table lpddr4_wm_table_gs = { > - .entries = { > - { > - .wm_inst = WM_A, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 5.32, > - .sr_enter_plus_exit_time_us = 6.38, > - .valid = true, > - }, > - { > - .wm_inst = WM_B, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 9.82, > - .sr_enter_plus_exit_time_us = 11.196, > - .valid = true, > - }, > - { > - .wm_inst = WM_C, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 9.89, > - .sr_enter_plus_exit_time_us = 11.24, > - .valid = true, > - }, > - { > - .wm_inst = WM_D, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 9.748, > - .sr_enter_plus_exit_time_us = 11.102, > - .valid = true, > - }, > - } > -}; > - > -static struct wm_table lpddr4_wm_table_with_disabled_ppt = { > - .entries = { > - 
{ > - .wm_inst = WM_A, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 8.32, > - .sr_enter_plus_exit_time_us = 9.38, > - .valid = true, > - }, > - { > - .wm_inst = WM_B, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 9.82, > - .sr_enter_plus_exit_time_us = 11.196, > - .valid = true, > - }, > - { > - .wm_inst = WM_C, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 9.89, > - .sr_enter_plus_exit_time_us = 11.24, > - .valid = true, > - }, > - { > - .wm_inst = WM_D, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 9.748, > - .sr_enter_plus_exit_time_us = 11.102, > - .valid = true, > - }, > - } > -}; > - > -static struct wm_table ddr4_wm_table_rn = { > - .entries = { > - { > - .wm_inst = WM_A, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 11.90, > - .sr_enter_plus_exit_time_us = 12.80, > - .valid = true, > - }, > - { > - .wm_inst = WM_B, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 13.18, > - .sr_enter_plus_exit_time_us = 14.30, > - .valid = true, > - }, > - { > - .wm_inst = WM_C, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 13.18, > - .sr_enter_plus_exit_time_us = 14.30, > - .valid = true, > - }, > - { > - .wm_inst = WM_D, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 13.18, > - .sr_enter_plus_exit_time_us = 14.30, > - .valid = true, > - }, > - } > -}; > - > -static struct wm_table ddr4_1R_wm_table_rn = { > - .entries = { > - { > - .wm_inst = WM_A, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 13.90, > - .sr_enter_plus_exit_time_us = 14.80, > - .valid = true, > - }, > - { > - .wm_inst = WM_B, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - 
.sr_exit_time_us = 13.90, > - .sr_enter_plus_exit_time_us = 14.80, > - .valid = true, > - }, > - { > - .wm_inst = WM_C, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 13.90, > - .sr_enter_plus_exit_time_us = 14.80, > - .valid = true, > - }, > - { > - .wm_inst = WM_D, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 13.90, > - .sr_enter_plus_exit_time_us = 14.80, > - .valid = true, > - }, > - } > -}; > - > -static struct wm_table lpddr4_wm_table_rn = { > - .entries = { > - { > - .wm_inst = WM_A, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 7.32, > - .sr_enter_plus_exit_time_us = 8.38, > - .valid = true, > - }, > - { > - .wm_inst = WM_B, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 9.82, > - .sr_enter_plus_exit_time_us = 11.196, > - .valid = true, > - }, > - { > - .wm_inst = WM_C, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 9.89, > - .sr_enter_plus_exit_time_us = 11.24, > - .valid = true, > - }, > - { > - .wm_inst = WM_D, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 9.748, > - .sr_enter_plus_exit_time_us = 11.102, > - .valid = true, > - }, > - } > -}; > - > static unsigned int find_socclk_for_voltage(struct dpm_clocks *clock_table, unsigned int voltage) > { > int i; > @@ -914,12 +690,10 @@ static void rn_clk_mgr_helper_populate_bw_params(struct clk_bw_params *bw_params > /* > * WM set D will be re-purposed for memory retraining > */ > - bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY; > - bw_params->wm_table.entries[WM_D].wm_inst = WM_D; > - bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING; > - bw_params->wm_table.entries[WM_D].valid = true; > + DC_FP_START(); > + dcn21_clk_mgr_set_bw_params_wm_table(bw_params); > + DC_FP_END(); > } > - > } > > void 
rn_clk_mgr_construct( > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h > index e4322fa5475b..2e088c5171b2 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h > @@ -29,6 +29,13 @@ > #include "clk_mgr.h" > #include "dm_pp_smu.h" > > +extern struct wm_table ddr4_wm_table_gs; > +extern struct wm_table lpddr4_wm_table_gs; > +extern struct wm_table lpddr4_wm_table_with_disabled_ppt; > +extern struct wm_table ddr4_wm_table_rn; > +extern struct wm_table ddr4_1R_wm_table_rn; > +extern struct wm_table lpddr4_wm_table_rn; > + > struct rn_clk_registers { > uint32_t CLK1_CLK0_CURRENT_CNT; /* DPREFCLK */ > }; > diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c > index dc60b835e938..eeeae52fe6fc 100644 > --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c > +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c > @@ -42,6 +42,9 @@ > #define MIN(X, Y) ((X) < (Y) ? 
(X) : (Y)) > #endif > > +/* Constant */ > +#define LPDDR_MEM_RETRAIN_LATENCY 4.977 /* Number obtained from LPDDR4 Training Counter Requirement doc */ > + > /** > * DOC: DCN2x FPU manipulation Overview > * > @@ -650,6 +653,228 @@ struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc = { > .num_states = 8 > }; > > +struct wm_table ddr4_wm_table_gs = { > + .entries = { > + { > + .wm_inst = WM_A, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 7.09, > + .sr_enter_plus_exit_time_us = 8.14, > + .valid = true, > + }, > + { > + .wm_inst = WM_B, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 10.12, > + .sr_enter_plus_exit_time_us = 11.48, > + .valid = true, > + }, > + { > + .wm_inst = WM_C, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 10.12, > + .sr_enter_plus_exit_time_us = 11.48, > + .valid = true, > + }, > + { > + .wm_inst = WM_D, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 10.12, > + .sr_enter_plus_exit_time_us = 11.48, > + .valid = true, > + }, > + } > +}; > + > +struct wm_table lpddr4_wm_table_gs = { > + .entries = { > + { > + .wm_inst = WM_A, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 5.32, > + .sr_enter_plus_exit_time_us = 6.38, > + .valid = true, > + }, > + { > + .wm_inst = WM_B, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 9.82, > + .sr_enter_plus_exit_time_us = 11.196, > + .valid = true, > + }, > + { > + .wm_inst = WM_C, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 9.89, > + .sr_enter_plus_exit_time_us = 11.24, > + .valid = true, > + }, > + { > + .wm_inst = WM_D, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 9.748, > + .sr_enter_plus_exit_time_us = 11.102, > + .valid = true, > + }, > + } > +}; > + > +struct 
wm_table lpddr4_wm_table_with_disabled_ppt = { > + .entries = { > + { > + .wm_inst = WM_A, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 8.32, > + .sr_enter_plus_exit_time_us = 9.38, > + .valid = true, > + }, > + { > + .wm_inst = WM_B, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 9.82, > + .sr_enter_plus_exit_time_us = 11.196, > + .valid = true, > + }, > + { > + .wm_inst = WM_C, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 9.89, > + .sr_enter_plus_exit_time_us = 11.24, > + .valid = true, > + }, > + { > + .wm_inst = WM_D, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 9.748, > + .sr_enter_plus_exit_time_us = 11.102, > + .valid = true, > + }, > + } > +}; > + > +struct wm_table ddr4_wm_table_rn = { > + .entries = { > + { > + .wm_inst = WM_A, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 11.90, > + .sr_enter_plus_exit_time_us = 12.80, > + .valid = true, > + }, > + { > + .wm_inst = WM_B, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 13.18, > + .sr_enter_plus_exit_time_us = 14.30, > + .valid = true, > + }, > + { > + .wm_inst = WM_C, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 13.18, > + .sr_enter_plus_exit_time_us = 14.30, > + .valid = true, > + }, > + { > + .wm_inst = WM_D, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 13.18, > + .sr_enter_plus_exit_time_us = 14.30, > + .valid = true, > + }, > + } > +}; > + > +struct wm_table ddr4_1R_wm_table_rn = { > + .entries = { > + { > + .wm_inst = WM_A, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 13.90, > + .sr_enter_plus_exit_time_us = 14.80, > + .valid = true, > + }, > + { > + .wm_inst = WM_B, > + .wm_type = 
WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 13.90, > + .sr_enter_plus_exit_time_us = 14.80, > + .valid = true, > + }, > + { > + .wm_inst = WM_C, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 13.90, > + .sr_enter_plus_exit_time_us = 14.80, > + .valid = true, > + }, > + { > + .wm_inst = WM_D, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 13.90, > + .sr_enter_plus_exit_time_us = 14.80, > + .valid = true, > + }, > + } > +}; > + > +struct wm_table lpddr4_wm_table_rn = { > + .entries = { > + { > + .wm_inst = WM_A, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 7.32, > + .sr_enter_plus_exit_time_us = 8.38, > + .valid = true, > + }, > + { > + .wm_inst = WM_B, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 9.82, > + .sr_enter_plus_exit_time_us = 11.196, > + .valid = true, > + }, > + { > + .wm_inst = WM_C, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 9.89, > + .sr_enter_plus_exit_time_us = 11.24, > + .valid = true, > + }, > + { > + .wm_inst = WM_D, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 9.748, > + .sr_enter_plus_exit_time_us = 11.102, > + .valid = true, > + }, > + } > +}; > + > void dcn20_populate_dml_writeback_from_context(struct dc *dc, > struct resource_context *res_ctx, > display_e2e_pipe_params_st *pipes) > @@ -2068,3 +2293,13 @@ void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params > > dml_init_instance(&dc->dml, &dcn2_1_soc, &dcn2_1_ip, DML_PROJECT_DCN21); > } > + > +void dcn21_clk_mgr_set_bw_params_wm_table(struct clk_bw_params *bw_params) > +{ > + dc_assert_fp_enabled(); > + > + bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY; > + bw_params->wm_table.entries[WM_D].wm_inst = WM_D; > + 
bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING; > + bw_params->wm_table.entries[WM_D].valid = true; > +} > diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h > index aa892193e485..a6e1ad0f38e9 100644 > --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h > +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h > @@ -82,4 +82,6 @@ bool dcn21_validate_bandwidth_fp(struct dc *dc, > bool fast_validate); > void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params); > > +void dcn21_clk_mgr_set_bw_params_wm_table(struct clk_bw_params *bw_params); > + > #endif /* __DCN20_FPU_H__ */

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
* [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder
From: Melissa Wen @ 2022-07-20 19:32 UTC
To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel
Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen, amd-gfx, dri-devel, linux-kernel

The -mno-gnu-attribute option in clk mgr makefile for dcn30 hides a soft vs
hard fp error for powerpc. After removing this flag, we can see some FPU
code remains there:

gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
hard float,
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.o uses
soft float

Therefore, remove the -mno-gnu-attribute flag for dcn30/powerpc and move
FPU-associated code to DML folder.
Signed-off-by: Melissa Wen <mwen@igalia.com> --- .../gpu/drm/amd/display/dc/clk_mgr/Makefile | 6 -- .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 63 ++----------------- .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.c | 63 ++++++++++++++++++- .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.h | 1 + 4 files changed, 68 insertions(+), 65 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile index 66dc02c426e9..15b660a951a5 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile @@ -115,12 +115,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21) ############################################################################### CLK_MGR_DCN30 = dcn30_clk_mgr.o dcn30_clk_mgr_smu_msg.o -# prevent build errors regarding soft-float vs hard-float FP ABI tags -# this code is currently unused on ppc64, as it applies to VanGogh APUs only -ifdef CONFIG_PPC64 -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn30/dcn30_clk_mgr.o := $(call cc-option,-mno-gnu-attribute) -endif - AMD_DAL_CLK_MGR_DCN30 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn30/,$(CLK_MGR_DCN30)) AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c index 914708cefc79..3ce0ee0d012f 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c @@ -29,6 +29,7 @@ #include "dcn20/dcn20_clk_mgr.h" #include "dce100/dce_clk_mgr.h" #include "dcn30/dcn30_clk_mgr.h" +#include "dml/dcn30/dcn30_fpu.h" #include "reg_helper.h" #include "core_types.h" #include "dm_helpers.h" @@ -97,65 +98,11 @@ static void dcn3_init_single_clock(struct clk_mgr_internal *clk_mgr, uint32_t cl } } -static noinline void dcn3_build_wm_range_table(struct clk_mgr_internal *clk_mgr) +static void dcn3_build_wm_range_table(struct clk_mgr_internal 
*clk_mgr) { - /* defaults */ - double pstate_latency_us = clk_mgr->base.ctx->dc->dml.soc.dram_clock_change_latency_us; - double sr_exit_time_us = clk_mgr->base.ctx->dc->dml.soc.sr_exit_time_us; - double sr_enter_plus_exit_time_us = clk_mgr->base.ctx->dc->dml.soc.sr_enter_plus_exit_time_us; - uint16_t min_uclk_mhz = clk_mgr->base.bw_params->clk_table.entries[0].memclk_mhz; - - /* Set A - Normal - default values*/ - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].valid = true; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.pstate_latency_us = pstate_latency_us; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.sr_exit_time_us = sr_exit_time_us; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_dcfclk = 0; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_dcfclk = 0xFFFF; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_uclk = min_uclk_mhz; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_uclk = 0xFFFF; - - /* Set B - Performance - higher minimum clocks */ -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].valid = true; -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us = pstate_latency_us; -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.sr_exit_time_us = sr_exit_time_us; -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE; -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_dcfclk = TUNED VALUE; -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_dcfclk = 0xFFFF; 
-// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_uclk = TUNED VALUE; -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_uclk = 0xFFFF; - - /* Set C - Dummy P-State - P-State latency set to "dummy p-state" value */ - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].valid = true; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us = 0; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.sr_exit_time_us = sr_exit_time_us; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.wm_type = WATERMARKS_DUMMY_PSTATE; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_dcfclk = 0; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_dcfclk = 0xFFFF; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_uclk = min_uclk_mhz; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_uclk = 0xFFFF; - clk_mgr->base.bw_params->dummy_pstate_table[0].dram_speed_mts = 1600; - clk_mgr->base.bw_params->dummy_pstate_table[0].dummy_pstate_latency_us = 38; - clk_mgr->base.bw_params->dummy_pstate_table[1].dram_speed_mts = 8000; - clk_mgr->base.bw_params->dummy_pstate_table[1].dummy_pstate_latency_us = 9; - clk_mgr->base.bw_params->dummy_pstate_table[2].dram_speed_mts = 10000; - clk_mgr->base.bw_params->dummy_pstate_table[2].dummy_pstate_latency_us = 8; - clk_mgr->base.bw_params->dummy_pstate_table[3].dram_speed_mts = 16000; - clk_mgr->base.bw_params->dummy_pstate_table[3].dummy_pstate_latency_us = 5; - - /* Set D - MALL - SR enter and exit times adjusted for MALL */ - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].valid = true; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us = pstate_latency_us; - 
clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.sr_exit_time_us = 2; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us = 4; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.wm_type = WATERMARKS_MALL; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_dcfclk = 0; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_dcfclk = 0xFFFF; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_uclk = min_uclk_mhz; - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_uclk = 0xFFFF; + DC_FP_START(); + dcn3_fpu_build_wm_range_table(&clk_mgr->base); + DC_FP_END(); } void dcn3_init_clocks(struct clk_mgr *clk_mgr_base) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c index a8db1306750e..c00f759fdded 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c @@ -29,7 +29,7 @@ #include "dcn20/dcn20_resource.h" #include "dcn30/dcn30_resource.h" - +#include "clk_mgr/dcn30/dcn30_smu11_driver_if.h" #include "display_mode_vba_30.h" #include "dcn30_fpu.h" @@ -616,4 +616,65 @@ void dcn30_fpu_update_bw_bounding_box(struct dc *dc, } +void dcn3_fpu_build_wm_range_table(struct clk_mgr *base) +{ + /* defaults */ + double pstate_latency_us = base->ctx->dc->dml.soc.dram_clock_change_latency_us; + double sr_exit_time_us = base->ctx->dc->dml.soc.sr_exit_time_us; + double sr_enter_plus_exit_time_us = base->ctx->dc->dml.soc.sr_enter_plus_exit_time_us; + uint16_t min_uclk_mhz = base->bw_params->clk_table.entries[0].memclk_mhz; + dc_assert_fp_enabled(); + + /* Set A - Normal - default values*/ + base->bw_params->wm_table.nv_entries[WM_A].valid = true; + base->bw_params->wm_table.nv_entries[WM_A].dml_input.pstate_latency_us = pstate_latency_us; + base->bw_params->wm_table.nv_entries[WM_A].dml_input.sr_exit_time_us = 
sr_exit_time_us; + base->bw_params->wm_table.nv_entries[WM_A].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE; + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_dcfclk = 0; + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_dcfclk = 0xFFFF; + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_uclk = min_uclk_mhz; + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_uclk = 0xFFFF; + + /* Set B - Performance - higher minimum clocks */ +// base->bw_params->wm_table.nv_entries[WM_B].valid = true; +// base->bw_params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us = pstate_latency_us; +// base->bw_params->wm_table.nv_entries[WM_B].dml_input.sr_exit_time_us = sr_exit_time_us; +// base->bw_params->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE; +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_dcfclk = TUNED VALUE; +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_dcfclk = 0xFFFF; +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_uclk = TUNED VALUE; +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_uclk = 0xFFFF; + + /* Set C - Dummy P-State - P-State latency set to "dummy p-state" value */ + base->bw_params->wm_table.nv_entries[WM_C].valid = true; + base->bw_params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us = 0; + base->bw_params->wm_table.nv_entries[WM_C].dml_input.sr_exit_time_us = sr_exit_time_us; + base->bw_params->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; + base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.wm_type = WATERMARKS_DUMMY_PSTATE; + base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_dcfclk = 0; + 
base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_dcfclk = 0xFFFF; + base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_uclk = min_uclk_mhz; + base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_uclk = 0xFFFF; + base->bw_params->dummy_pstate_table[0].dram_speed_mts = 1600; + base->bw_params->dummy_pstate_table[0].dummy_pstate_latency_us = 38; + base->bw_params->dummy_pstate_table[1].dram_speed_mts = 8000; + base->bw_params->dummy_pstate_table[1].dummy_pstate_latency_us = 9; + base->bw_params->dummy_pstate_table[2].dram_speed_mts = 10000; + base->bw_params->dummy_pstate_table[2].dummy_pstate_latency_us = 8; + base->bw_params->dummy_pstate_table[3].dram_speed_mts = 16000; + base->bw_params->dummy_pstate_table[3].dummy_pstate_latency_us = 5; + + /* Set D - MALL - SR enter and exit times adjusted for MALL */ + base->bw_params->wm_table.nv_entries[WM_D].valid = true; + base->bw_params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us = pstate_latency_us; + base->bw_params->wm_table.nv_entries[WM_D].dml_input.sr_exit_time_us = 2; + base->bw_params->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us = 4; + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.wm_type = WATERMARKS_MALL; + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_dcfclk = 0; + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_dcfclk = 0xFFFF; + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_uclk = min_uclk_mhz; + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_uclk = 0xFFFF; +} diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h index dedfe7b5f173..c2024052a497 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h @@ -63,5 +63,6 @@ void dcn30_fpu_update_bw_bounding_box(struct dc *dc, unsigned int *dcfclk_mhz, unsigned int *dram_speed_mts); +void 
dcn3_fpu_build_wm_range_table(struct clk_mgr *base); #endif /* __DCN30_FPU_H__*/ -- 2.35.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder 2022-07-20 19:32 ` [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder Melissa Wen @ 2022-07-21 18:58 ` Rodrigo Siqueira Jordao 0 siblings, 0 replies; 13+ messages in thread From: Rodrigo Siqueira Jordao @ 2022-07-21 18:58 UTC (permalink / raw) To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel, linux-kernel On 2022-07-20 15:32, Melissa Wen wrote: > The -mno-gnu-attribute option in clk mgr makefile for dcn30 hides a soft > vs hard fp error for powerpc. After removing this flag, we can see some > FPU code remains there: > > gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses > hard float, > drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.o > uses soft float > > Therefore, remove the -mno-gnu-attribute flag for dcn30/powerpc and move > FPU-associated code to DML folder. 
> > Signed-off-by: Melissa Wen <mwen@igalia.com> > --- > .../gpu/drm/amd/display/dc/clk_mgr/Makefile | 6 -- > .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 63 ++----------------- > .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.c | 63 ++++++++++++++++++- > .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.h | 1 + > 4 files changed, 68 insertions(+), 65 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > index 66dc02c426e9..15b660a951a5 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > @@ -115,12 +115,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21) > ############################################################################### > CLK_MGR_DCN30 = dcn30_clk_mgr.o dcn30_clk_mgr_smu_msg.o > > -# prevent build errors regarding soft-float vs hard-float FP ABI tags > -# this code is currently unused on ppc64, as it applies to VanGogh APUs only > -ifdef CONFIG_PPC64 > -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn30/dcn30_clk_mgr.o := $(call cc-option,-mno-gnu-attribute) > -endif > - > AMD_DAL_CLK_MGR_DCN30 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn30/,$(CLK_MGR_DCN30)) > > AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30) > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c > index 914708cefc79..3ce0ee0d012f 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c > @@ -29,6 +29,7 @@ > #include "dcn20/dcn20_clk_mgr.h" > #include "dce100/dce_clk_mgr.h" > #include "dcn30/dcn30_clk_mgr.h" > +#include "dml/dcn30/dcn30_fpu.h" > #include "reg_helper.h" > #include "core_types.h" > #include "dm_helpers.h" > @@ -97,65 +98,11 @@ static void dcn3_init_single_clock(struct clk_mgr_internal *clk_mgr, uint32_t cl > } > } > > -static noinline void dcn3_build_wm_range_table(struct 
clk_mgr_internal *clk_mgr) > +static void dcn3_build_wm_range_table(struct clk_mgr_internal *clk_mgr) > { > - /* defaults */ > - double pstate_latency_us = clk_mgr->base.ctx->dc->dml.soc.dram_clock_change_latency_us; > - double sr_exit_time_us = clk_mgr->base.ctx->dc->dml.soc.sr_exit_time_us; > - double sr_enter_plus_exit_time_us = clk_mgr->base.ctx->dc->dml.soc.sr_enter_plus_exit_time_us; > - uint16_t min_uclk_mhz = clk_mgr->base.bw_params->clk_table.entries[0].memclk_mhz; > - > - /* Set A - Normal - default values*/ > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].valid = true; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.pstate_latency_us = pstate_latency_us; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.sr_exit_time_us = sr_exit_time_us; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_dcfclk = 0; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_dcfclk = 0xFFFF; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_uclk = min_uclk_mhz; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_uclk = 0xFFFF; > - > - /* Set B - Performance - higher minimum clocks */ > -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].valid = true; > -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us = pstate_latency_us; > -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.sr_exit_time_us = sr_exit_time_us; > -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; > -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE; > -// 
clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_dcfclk = TUNED VALUE; > -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_dcfclk = 0xFFFF; > -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_uclk = TUNED VALUE; > -// clk_mgr->base.bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_uclk = 0xFFFF; > - > - /* Set C - Dummy P-State - P-State latency set to "dummy p-state" value */ > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].valid = true; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us = 0; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.sr_exit_time_us = sr_exit_time_us; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.wm_type = WATERMARKS_DUMMY_PSTATE; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_dcfclk = 0; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_dcfclk = 0xFFFF; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_uclk = min_uclk_mhz; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_uclk = 0xFFFF; > - clk_mgr->base.bw_params->dummy_pstate_table[0].dram_speed_mts = 1600; > - clk_mgr->base.bw_params->dummy_pstate_table[0].dummy_pstate_latency_us = 38; > - clk_mgr->base.bw_params->dummy_pstate_table[1].dram_speed_mts = 8000; > - clk_mgr->base.bw_params->dummy_pstate_table[1].dummy_pstate_latency_us = 9; > - clk_mgr->base.bw_params->dummy_pstate_table[2].dram_speed_mts = 10000; > - clk_mgr->base.bw_params->dummy_pstate_table[2].dummy_pstate_latency_us = 8; > - clk_mgr->base.bw_params->dummy_pstate_table[3].dram_speed_mts = 16000; > - clk_mgr->base.bw_params->dummy_pstate_table[3].dummy_pstate_latency_us = 5; > - > - /* Set D - MALL - SR enter and exit times adjusted for MALL */ > - 
clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].valid = true; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us = pstate_latency_us; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.sr_exit_time_us = 2; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us = 4; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.wm_type = WATERMARKS_MALL; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_dcfclk = 0; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_dcfclk = 0xFFFF; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_uclk = min_uclk_mhz; > - clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_uclk = 0xFFFF; > + DC_FP_START(); > + dcn3_fpu_build_wm_range_table(&clk_mgr->base); > + DC_FP_END(); > } > > void dcn3_init_clocks(struct clk_mgr *clk_mgr_base) > diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c > index a8db1306750e..c00f759fdded 100644 > --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c > +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.c > @@ -29,7 +29,7 @@ > #include "dcn20/dcn20_resource.h" > #include "dcn30/dcn30_resource.h" > > - > +#include "clk_mgr/dcn30/dcn30_smu11_driver_if.h" > #include "display_mode_vba_30.h" > #include "dcn30_fpu.h" > > @@ -616,4 +616,65 @@ void dcn30_fpu_update_bw_bounding_box(struct dc *dc, > > } > > +void dcn3_fpu_build_wm_range_table(struct clk_mgr *base) > +{ > + /* defaults */ > + double pstate_latency_us = base->ctx->dc->dml.soc.dram_clock_change_latency_us; > + double sr_exit_time_us = base->ctx->dc->dml.soc.sr_exit_time_us; > + double sr_enter_plus_exit_time_us = base->ctx->dc->dml.soc.sr_enter_plus_exit_time_us; > + uint16_t min_uclk_mhz = base->bw_params->clk_table.entries[0].memclk_mhz; > > + dc_assert_fp_enabled(); > + > + /* Set A - 
Normal - default values*/ > + base->bw_params->wm_table.nv_entries[WM_A].valid = true; > + base->bw_params->wm_table.nv_entries[WM_A].dml_input.pstate_latency_us = pstate_latency_us; > + base->bw_params->wm_table.nv_entries[WM_A].dml_input.sr_exit_time_us = sr_exit_time_us; > + base->bw_params->wm_table.nv_entries[WM_A].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; > + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE; > + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_dcfclk = 0; > + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_dcfclk = 0xFFFF; > + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.min_uclk = min_uclk_mhz; > + base->bw_params->wm_table.nv_entries[WM_A].pmfw_breakdown.max_uclk = 0xFFFF; > + > + /* Set B - Performance - higher minimum clocks */ > +// base->bw_params->wm_table.nv_entries[WM_B].valid = true; > +// base->bw_params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us = pstate_latency_us; > +// base->bw_params->wm_table.nv_entries[WM_B].dml_input.sr_exit_time_us = sr_exit_time_us; > +// base->bw_params->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; > +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.wm_type = WATERMARKS_CLOCK_RANGE; > +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_dcfclk = TUNED VALUE; > +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_dcfclk = 0xFFFF; > +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.min_uclk = TUNED VALUE; > +// base->bw_params->wm_table.nv_entries[WM_B].pmfw_breakdown.max_uclk = 0xFFFF; > + > + /* Set C - Dummy P-State - P-State latency set to "dummy p-state" value */ > + base->bw_params->wm_table.nv_entries[WM_C].valid = true; > + base->bw_params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us = 0; > + base->bw_params->wm_table.nv_entries[WM_C].dml_input.sr_exit_time_us = 
sr_exit_time_us; > + base->bw_params->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us = sr_enter_plus_exit_time_us; > + base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.wm_type = WATERMARKS_DUMMY_PSTATE; > + base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_dcfclk = 0; > + base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_dcfclk = 0xFFFF; > + base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.min_uclk = min_uclk_mhz; > + base->bw_params->wm_table.nv_entries[WM_C].pmfw_breakdown.max_uclk = 0xFFFF; > + base->bw_params->dummy_pstate_table[0].dram_speed_mts = 1600; > + base->bw_params->dummy_pstate_table[0].dummy_pstate_latency_us = 38; > + base->bw_params->dummy_pstate_table[1].dram_speed_mts = 8000; > + base->bw_params->dummy_pstate_table[1].dummy_pstate_latency_us = 9; > + base->bw_params->dummy_pstate_table[2].dram_speed_mts = 10000; > + base->bw_params->dummy_pstate_table[2].dummy_pstate_latency_us = 8; > + base->bw_params->dummy_pstate_table[3].dram_speed_mts = 16000; > + base->bw_params->dummy_pstate_table[3].dummy_pstate_latency_us = 5; > + > + /* Set D - MALL - SR enter and exit times adjusted for MALL */ > + base->bw_params->wm_table.nv_entries[WM_D].valid = true; > + base->bw_params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us = pstate_latency_us; > + base->bw_params->wm_table.nv_entries[WM_D].dml_input.sr_exit_time_us = 2; > + base->bw_params->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us = 4; > + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.wm_type = WATERMARKS_MALL; > + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_dcfclk = 0; > + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_dcfclk = 0xFFFF; > + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.min_uclk = min_uclk_mhz; > + base->bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_uclk = 0xFFFF; > +} > diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h 
b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h > index dedfe7b5f173..c2024052a497 100644 > --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h > +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/dcn30_fpu.h > @@ -63,5 +63,6 @@ void dcn30_fpu_update_bw_bounding_box(struct dc *dc, > unsigned int *dcfclk_mhz, > unsigned int *dram_speed_mts); > > +void dcn3_fpu_build_wm_range_table(struct clk_mgr *base); > > #endif /* __DCN30_FPU_H__*/ Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> ^ permalink raw reply [flat|nested] 13+ messages in thread
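For readers following the thread, the shape of the change in patch 4/5 can be sketched outside the kernel. This is a minimal userspace illustration under stated assumptions, not the amdgpu implementation. Every sketch_ name is hypothetical. The depth counter only stands in for whatever state DC_FP_START()/DC_FP_END() and dc_assert_fp_enabled() really track. Only the Set D constants (2, 4, 0xFFFF) are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state behind DC_FP_START()/DC_FP_END(). */
static int sketch_fp_depth;

static void sketch_fp_start(void) { sketch_fp_depth++; }

static void sketch_fp_end(void)
{
	assert(sketch_fp_depth > 0);
	sketch_fp_depth--;
}

static bool sketch_fp_enabled(void) { return sketch_fp_depth > 0; }

/* Reduced, hypothetical version of one watermark entry. */
struct sketch_wm_entry {
	bool valid;
	double pstate_latency_us;
	double sr_exit_time_us;
	double sr_enter_plus_exit_time_us;
	uint16_t min_uclk;
	uint16_t max_uclk;
};

struct sketch_clk_mgr {
	double dram_clock_change_latency_us;
	uint16_t min_uclk_mhz;
	struct sketch_wm_entry wm_d;
};

/*
 * "DML side": the only function that reads or writes doubles. It refuses
 * to run outside an FP section, mirroring the role dc_assert_fp_enabled()
 * plays in the patch.
 */
static void sketch_fpu_build_wm_range_table(struct sketch_clk_mgr *base)
{
	assert(sketch_fp_enabled());

	base->wm_d.valid = true;
	base->wm_d.pstate_latency_us = base->dram_clock_change_latency_us;
	base->wm_d.sr_exit_time_us = 2;
	base->wm_d.sr_enter_plus_exit_time_us = 4;
	base->wm_d.min_uclk = base->min_uclk_mhz;
	base->wm_d.max_uclk = 0xFFFF;
}

/*
 * "clk_mgr side": passes a pointer through and brackets the call, so this
 * function itself contains no floating-point operations.
 */
static void sketch_build_wm_range_table(struct sketch_clk_mgr *clk_mgr)
{
	sketch_fp_start();
	sketch_fpu_build_wm_range_table(clk_mgr);
	sketch_fp_end();
}
```

The point of the split is that the caller never touches a double, so its object file can be built without FPU support while the callee lives in a file that is built with it.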
* [PATCH 5/5] drm/amd/display: move FPU code from dcn301 clk mgr to DML folder 2022-07-20 19:32 [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Melissa Wen ` (3 preceding siblings ...) 2022-07-20 19:32 ` [PATCH 4/5] drm/amd/display: move FPU code from dcn30 clk mgr to DML folder Melissa Wen @ 2022-07-20 19:32 ` Melissa Wen 2022-07-21 17:26 ` Maíra Canal 2022-07-21 18:59 ` Rodrigo Siqueira Jordao 2022-07-21 19:07 ` [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Rodrigo Siqueira Jordao 5 siblings, 2 replies; 13+ messages in thread From: Melissa Wen @ 2022-07-20 19:32 UTC (permalink / raw) To: harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel Cc: Guenter Roeck, Maíra Canal, kernel-dev, Melissa Wen, amd-gfx, dri-devel, linux-kernel The -mno-gnu-attribute option in dcn301 clk mgr makefile hides a soft vs hard fp error for powerpc. After removing this flag, we can see some FPU code remains there: gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn301/vg_clk_mgr.o uses soft float Therefore, remove the -mno-gnu-attribute flag for dcn301/powerpc and move FPU-associated code to DML folder. 
Signed-off-by: Melissa Wen <mwen@igalia.com> --- .../gpu/drm/amd/display/dc/clk_mgr/Makefile | 6 -- .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c | 86 ++----------------- .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h | 3 + .../amd/display/dc/dml/dcn301/dcn301_fpu.c | 74 ++++++++++++++++ 4 files changed, 84 insertions(+), 85 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile index 15b660a951a5..271d8e573181 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile @@ -123,12 +123,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30) ############################################################################### CLK_MGR_DCN301 = vg_clk_mgr.o dcn301_smu.o -# prevent build errors regarding soft-float vs hard-float FP ABI tags -# this code is currently unused on ppc64, as it applies to VanGogh APUs only -ifdef CONFIG_PPC64 -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn301/vg_clk_mgr.o := $(call cc-option,-mno-gnu-attribute) -endif - AMD_DAL_CLK_MGR_DCN301 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn301/,$(CLK_MGR_DCN301)) AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN301) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c index f310b0d25a07..65f224af03c0 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c @@ -32,6 +32,10 @@ // For dcn20_update_clocks_update_dpp_dto #include "dcn20/dcn20_clk_mgr.h" +// For DML FPU code +#include "dml/dcn20/dcn20_fpu.h" +#include "dml/dcn301/dcn301_fpu.h" + #include "vg_clk_mgr.h" #include "dcn301_smu.h" #include "reg_helper.h" @@ -526,81 +530,6 @@ static struct clk_bw_params vg_bw_params = { }; -static struct wm_table ddr4_wm_table = { - .entries = { - { - .wm_inst = WM_A, - .wm_type = WM_TYPE_PSTATE_CHG, - .pstate_latency_us = 11.72, - .sr_exit_time_us = 6.09, - 
.sr_enter_plus_exit_time_us = 7.14, - .valid = true, - }, - { - .wm_inst = WM_B, - .wm_type = WM_TYPE_PSTATE_CHG, - .pstate_latency_us = 11.72, - .sr_exit_time_us = 10.12, - .sr_enter_plus_exit_time_us = 11.48, - .valid = true, - }, - { - .wm_inst = WM_C, - .wm_type = WM_TYPE_PSTATE_CHG, - .pstate_latency_us = 11.72, - .sr_exit_time_us = 10.12, - .sr_enter_plus_exit_time_us = 11.48, - .valid = true, - }, - { - .wm_inst = WM_D, - .wm_type = WM_TYPE_PSTATE_CHG, - .pstate_latency_us = 11.72, - .sr_exit_time_us = 10.12, - .sr_enter_plus_exit_time_us = 11.48, - .valid = true, - }, - } -}; - -static struct wm_table lpddr5_wm_table = { - .entries = { - { - .wm_inst = WM_A, - .wm_type = WM_TYPE_PSTATE_CHG, - .pstate_latency_us = 11.65333, - .sr_exit_time_us = 13.5, - .sr_enter_plus_exit_time_us = 16.5, - .valid = true, - }, - { - .wm_inst = WM_B, - .wm_type = WM_TYPE_PSTATE_CHG, - .pstate_latency_us = 11.65333, - .sr_exit_time_us = 13.5, - .sr_enter_plus_exit_time_us = 16.5, - .valid = true, - }, - { - .wm_inst = WM_C, - .wm_type = WM_TYPE_PSTATE_CHG, - .pstate_latency_us = 11.65333, - .sr_exit_time_us = 13.5, - .sr_enter_plus_exit_time_us = 16.5, - .valid = true, - }, - { - .wm_inst = WM_D, - .wm_type = WM_TYPE_PSTATE_CHG, - .pstate_latency_us = 11.65333, - .sr_exit_time_us = 13.5, - .sr_enter_plus_exit_time_us = 16.5, - .valid = true, - }, - } -}; - - static unsigned int find_dcfclk_for_voltage(const struct vg_dpm_clocks *clock_table, unsigned int voltage) { @@ -670,10 +599,9 @@ static void vg_clk_mgr_helper_populate_bw_params( /* * WM set D will be re-purposed for memory retraining */ - bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY; - bw_params->wm_table.entries[WM_D].wm_inst = WM_D; - bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING; - bw_params->wm_table.entries[WM_D].valid = true; + DC_FP_START(); + dcn21_clk_mgr_set_bw_params_wm_table(bw_params); + DC_FP_END(); } } diff --git 
a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h index 7255477307f1..75884f572989 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h @@ -29,6 +29,9 @@ struct watermarks; +extern struct wm_table ddr4_wm_table; +extern struct wm_table lpddr5_wm_table; + struct smu_watermark_set { struct watermarks *wm_set; union large_integer mc_address; diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c index e4863f0bf0f6..7ef66e511ec8 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c @@ -214,6 +214,80 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_01_soc = { .urgent_latency_adjustment_fabric_clock_reference_mhz = 0, }; +struct wm_table ddr4_wm_table = { + .entries = { + { + .wm_inst = WM_A, + .wm_type = WM_TYPE_PSTATE_CHG, + .pstate_latency_us = 11.72, + .sr_exit_time_us = 6.09, + .sr_enter_plus_exit_time_us = 7.14, + .valid = true, + }, + { + .wm_inst = WM_B, + .wm_type = WM_TYPE_PSTATE_CHG, + .pstate_latency_us = 11.72, + .sr_exit_time_us = 10.12, + .sr_enter_plus_exit_time_us = 11.48, + .valid = true, + }, + { + .wm_inst = WM_C, + .wm_type = WM_TYPE_PSTATE_CHG, + .pstate_latency_us = 11.72, + .sr_exit_time_us = 10.12, + .sr_enter_plus_exit_time_us = 11.48, + .valid = true, + }, + { + .wm_inst = WM_D, + .wm_type = WM_TYPE_PSTATE_CHG, + .pstate_latency_us = 11.72, + .sr_exit_time_us = 10.12, + .sr_enter_plus_exit_time_us = 11.48, + .valid = true, + }, + } +}; + +struct wm_table lpddr5_wm_table = { + .entries = { + { + .wm_inst = WM_A, + .wm_type = WM_TYPE_PSTATE_CHG, + .pstate_latency_us = 11.65333, + .sr_exit_time_us = 13.5, + .sr_enter_plus_exit_time_us = 16.5, + .valid = true, + }, + { + .wm_inst = WM_B, + .wm_type = WM_TYPE_PSTATE_CHG, + .pstate_latency_us = 11.65333, + 
.sr_exit_time_us = 13.5, + .sr_enter_plus_exit_time_us = 16.5, + .valid = true, + }, + { + .wm_inst = WM_C, + .wm_type = WM_TYPE_PSTATE_CHG, + .pstate_latency_us = 11.65333, + .sr_exit_time_us = 13.5, + .sr_enter_plus_exit_time_us = 16.5, + .valid = true, + }, + { + .wm_inst = WM_D, + .wm_type = WM_TYPE_PSTATE_CHG, + .pstate_latency_us = 11.65333, + .sr_exit_time_us = 13.5, + .sr_enter_plus_exit_time_us = 16.5, + .valid = true, + }, + } +}; + static void calculate_wm_set_for_vlevel(int vlevel, struct wm_range_table_entry *table_entry, struct dcn_watermarks *wm_set, -- 2.35.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 5/5] drm/amd/display: move FPU code from dcn301 clk mgr to DML folder 2022-07-20 19:32 ` [PATCH 5/5] drm/amd/display: move FPU code from dcn301 " Melissa Wen @ 2022-07-21 17:26 ` Maíra Canal 2022-07-21 18:59 ` Rodrigo Siqueira Jordao 1 sibling, 0 replies; 13+ messages in thread From: Maíra Canal @ 2022-07-21 17:26 UTC (permalink / raw) To: Melissa Wen, harry.wentland, sunpeng.li, Rodrigo.Siqueira, alexander.deucher, christian.koenig, Xinhui.Pan, airlied, daniel Cc: Guenter Roeck, kernel-dev, amd-gfx, dri-devel, linux-kernel Hi Melissa, On 7/20/22 16:32, Melissa Wen wrote: > The -mno-gnu-attribute option in dcn301 clk mgr makefile hides a soft vs > hard fp error for powerpc. After removing this flag, we can see some FPU > code remains there: > > gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld: > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses > hard float, > drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn301/vg_clk_mgr.o > uses soft float > > Therefore, remove the -mno-gnu-attribute flag for dcn301/powerpc and > move FPU-associated code to DML folder. 
> > Signed-off-by: Melissa Wen <mwen@igalia.com> > --- > .../gpu/drm/amd/display/dc/clk_mgr/Makefile | 6 -- > .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c | 86 ++----------------- > .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h | 3 + > .../amd/display/dc/dml/dcn301/dcn301_fpu.c | 74 ++++++++++++++++ > 4 files changed, 84 insertions(+), 85 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > index 15b660a951a5..271d8e573181 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > @@ -123,12 +123,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30) > ############################################################################### > CLK_MGR_DCN301 = vg_clk_mgr.o dcn301_smu.o > > -# prevent build errors regarding soft-float vs hard-float FP ABI tags > -# this code is currently unused on ppc64, as it applies to VanGogh APUs only > -ifdef CONFIG_PPC64 > -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn301/vg_clk_mgr.o := $(call cc-option,-mno-gnu-attribute) > -endif > - > AMD_DAL_CLK_MGR_DCN301 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn301/,$(CLK_MGR_DCN301)) > > AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN301) > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c > index f310b0d25a07..65f224af03c0 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c > @@ -32,6 +32,10 @@ > // For dcn20_update_clocks_update_dpp_dto > #include "dcn20/dcn20_clk_mgr.h" > > +// For DML FPU code > +#include "dml/dcn20/dcn20_fpu.h" > +#include "dml/dcn301/dcn301_fpu.h" > + I guess the "dml/dcn301/dcn301_fpu.h" header is not needed, as you only use dcn21_clk_mgr_set_bw_params_wm_table and the structs are in the source file.
Besides that, to the whole series: Reviewed-by: Maíra Canal <mairacanal@riseup.net> Best Regards, - Maíra Canal > #include "vg_clk_mgr.h" > #include "dcn301_smu.h" > #include "reg_helper.h" > @@ -526,81 +530,6 @@ static struct clk_bw_params vg_bw_params = { > > }; > > -static struct wm_table ddr4_wm_table = { > - .entries = { > - { > - .wm_inst = WM_A, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 6.09, > - .sr_enter_plus_exit_time_us = 7.14, > - .valid = true, > - }, > - { > - .wm_inst = WM_B, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 10.12, > - .sr_enter_plus_exit_time_us = 11.48, > - .valid = true, > - }, > - { > - .wm_inst = WM_C, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 10.12, > - .sr_enter_plus_exit_time_us = 11.48, > - .valid = true, > - }, > - { > - .wm_inst = WM_D, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.72, > - .sr_exit_time_us = 10.12, > - .sr_enter_plus_exit_time_us = 11.48, > - .valid = true, > - }, > - } > -}; > - > -static struct wm_table lpddr5_wm_table = { > - .entries = { > - { > - .wm_inst = WM_A, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 13.5, > - .sr_enter_plus_exit_time_us = 16.5, > - .valid = true, > - }, > - { > - .wm_inst = WM_B, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 13.5, > - .sr_enter_plus_exit_time_us = 16.5, > - .valid = true, > - }, > - { > - .wm_inst = WM_C, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 13.5, > - .sr_enter_plus_exit_time_us = 16.5, > - .valid = true, > - }, > - { > - .wm_inst = WM_D, > - .wm_type = WM_TYPE_PSTATE_CHG, > - .pstate_latency_us = 11.65333, > - .sr_exit_time_us = 13.5, > - .sr_enter_plus_exit_time_us = 16.5, > - .valid = true, > - }, > - } > -}; > - > - > static unsigned int 
find_dcfclk_for_voltage(const struct vg_dpm_clocks *clock_table, > unsigned int voltage) > { > @@ -670,10 +599,9 @@ static void vg_clk_mgr_helper_populate_bw_params( > /* > * WM set D will be re-purposed for memory retraining > */ > - bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY; > - bw_params->wm_table.entries[WM_D].wm_inst = WM_D; > - bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING; > - bw_params->wm_table.entries[WM_D].valid = true; > + DC_FP_START(); > + dcn21_clk_mgr_set_bw_params_wm_table(bw_params); > + DC_FP_END(); > } > > } > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h > index 7255477307f1..75884f572989 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h > @@ -29,6 +29,9 @@ > > struct watermarks; > > +extern struct wm_table ddr4_wm_table; > +extern struct wm_table lpddr5_wm_table; > + > struct smu_watermark_set { > struct watermarks *wm_set; > union large_integer mc_address; > diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c > index e4863f0bf0f6..7ef66e511ec8 100644 > --- a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c > +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c > @@ -214,6 +214,80 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_01_soc = { > .urgent_latency_adjustment_fabric_clock_reference_mhz = 0, > }; > > +struct wm_table ddr4_wm_table = { > + .entries = { > + { > + .wm_inst = WM_A, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 6.09, > + .sr_enter_plus_exit_time_us = 7.14, > + .valid = true, > + }, > + { > + .wm_inst = WM_B, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 10.12, > + .sr_enter_plus_exit_time_us = 11.48, > + .valid = true, > + }, > 
+ { > + .wm_inst = WM_C, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 10.12, > + .sr_enter_plus_exit_time_us = 11.48, > + .valid = true, > + }, > + { > + .wm_inst = WM_D, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.72, > + .sr_exit_time_us = 10.12, > + .sr_enter_plus_exit_time_us = 11.48, > + .valid = true, > + }, > + } > +}; > + > +struct wm_table lpddr5_wm_table = { > + .entries = { > + { > + .wm_inst = WM_A, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 13.5, > + .sr_enter_plus_exit_time_us = 16.5, > + .valid = true, > + }, > + { > + .wm_inst = WM_B, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 13.5, > + .sr_enter_plus_exit_time_us = 16.5, > + .valid = true, > + }, > + { > + .wm_inst = WM_C, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 13.5, > + .sr_enter_plus_exit_time_us = 16.5, > + .valid = true, > + }, > + { > + .wm_inst = WM_D, > + .wm_type = WM_TYPE_PSTATE_CHG, > + .pstate_latency_us = 11.65333, > + .sr_exit_time_us = 13.5, > + .sr_enter_plus_exit_time_us = 16.5, > + .valid = true, > + }, > + } > +}; > + > static void calculate_wm_set_for_vlevel(int vlevel, > struct wm_range_table_entry *table_entry, > struct dcn_watermarks *wm_set, ^ permalink raw reply [flat|nested] 13+ messages in thread
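Patch 5/5 takes a slightly different route from 4/5: instead of wrapping a function, it moves two tables whose initializers are double constants into the DML file and exposes them through extern declarations. A minimal sketch of that arrangement follows, collapsed into one translation unit so it compiles on its own. All sketch_ names are hypothetical, and selecting a table by memory type is an assumption made for illustration, since the diff does not show where the tables are consumed. The numeric values are copied from the diff.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_wm_inst {
	SKETCH_WM_A,
	SKETCH_WM_B,
	SKETCH_WM_C,
	SKETCH_WM_D,
	SKETCH_WM_SET_COUNT
};

struct sketch_wm_entry {
	enum sketch_wm_inst wm_inst;
	double pstate_latency_us;
	double sr_exit_time_us;
	double sr_enter_plus_exit_time_us;
	bool valid;
};

struct sketch_wm_table {
	struct sketch_wm_entry entries[SKETCH_WM_SET_COUNT];
};

/* What a shared header would carry: declarations only, no double constants. */
extern struct sketch_wm_table sketch_ddr4_wm_table;
extern struct sketch_wm_table sketch_lpddr5_wm_table;

/* What the FPU-enabled file would carry: the definitions themselves. */
struct sketch_wm_table sketch_ddr4_wm_table = {
	.entries = {
		{ .wm_inst = SKETCH_WM_A, .pstate_latency_us = 11.72,
		  .sr_exit_time_us = 6.09, .sr_enter_plus_exit_time_us = 7.14, .valid = true },
		{ .wm_inst = SKETCH_WM_B, .pstate_latency_us = 11.72,
		  .sr_exit_time_us = 10.12, .sr_enter_plus_exit_time_us = 11.48, .valid = true },
		{ .wm_inst = SKETCH_WM_C, .pstate_latency_us = 11.72,
		  .sr_exit_time_us = 10.12, .sr_enter_plus_exit_time_us = 11.48, .valid = true },
		{ .wm_inst = SKETCH_WM_D, .pstate_latency_us = 11.72,
		  .sr_exit_time_us = 10.12, .sr_enter_plus_exit_time_us = 11.48, .valid = true },
	}
};

struct sketch_wm_table sketch_lpddr5_wm_table = {
	.entries = {
		{ .wm_inst = SKETCH_WM_A, .pstate_latency_us = 11.65333,
		  .sr_exit_time_us = 13.5, .sr_enter_plus_exit_time_us = 16.5, .valid = true },
		{ .wm_inst = SKETCH_WM_B, .pstate_latency_us = 11.65333,
		  .sr_exit_time_us = 13.5, .sr_enter_plus_exit_time_us = 16.5, .valid = true },
		{ .wm_inst = SKETCH_WM_C, .pstate_latency_us = 11.65333,
		  .sr_exit_time_us = 13.5, .sr_enter_plus_exit_time_us = 16.5, .valid = true },
		{ .wm_inst = SKETCH_WM_D, .pstate_latency_us = 11.65333,
		  .sr_exit_time_us = 13.5, .sr_enter_plus_exit_time_us = 16.5, .valid = true },
	}
};

/*
 * Consumer side: handles pointers only and never reads or writes a double.
 * Choosing by memory type is an assumption, not something the diff shows.
 */
static struct sketch_wm_table *sketch_pick_wm_table(bool is_lpddr5)
{
	return is_lpddr5 ? &sketch_lpddr5_wm_table : &sketch_ddr4_wm_table;
}

/* Counts valid entries using only the boolean field. */
static size_t sketch_count_valid(const struct sketch_wm_table *table)
{
	size_t i, n = 0;

	for (i = 0; i < SKETCH_WM_SET_COUNT; i++)
		if (table->entries[i].valid)
			n++;
	return n;
}
```

In the real tree the two halves would sit in separate files, which is what lets the file on the consumer side be compiled without any floating-point constants in it.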
* Re: [PATCH 5/5] drm/amd/display: move FPU code from dcn301 clk mgr to DML folder
  2022-07-20 19:32 ` [PATCH 5/5] drm/amd/display: move FPU code from dcn301 " Melissa Wen
  2022-07-21 17:26   ` Maíra Canal
@ 2022-07-21 18:59   ` Rodrigo Siqueira Jordao
  1 sibling, 0 replies; 13+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-07-21 18:59 UTC
  To: Melissa Wen, harry.wentland, sunpeng.li, alexander.deucher,
	christian.koenig, Xinhui.Pan, airlied, daniel
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel, linux-kernel

On 2022-07-20 15:32, Melissa Wen wrote:
> The -mno-gnu-attribute option in dcn301 clk mgr makefile hides a soft vs
> hard fp error for powerpc. After removing this flag, we can see some FPU
> code remains there:
>
> gcc-11.3.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses
> hard float,
> drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn301/vg_clk_mgr.o
> uses soft float
>
> Therefore, remove the -mno-gnu-attribute flag for dcn301/powerpc and
> move FPU-associated code to DML folder.
>
> Signed-off-by: Melissa Wen <mwen@igalia.com>
> ---
>  .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  6 --
>  .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c    | 86 ++-----------------
>  .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h    |  3 +
>  .../amd/display/dc/dml/dcn301/dcn301_fpu.c    | 74 ++++++++++++++++
>  4 files changed, 84 insertions(+), 85 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> index 15b660a951a5..271d8e573181 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> @@ -123,12 +123,6 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN30)
>  ###############################################################################
>  CLK_MGR_DCN301 = vg_clk_mgr.o dcn301_smu.o
>
> -# prevent build errors regarding soft-float vs hard-float FP ABI tags
> -# this code is currently unused on ppc64, as it applies to VanGogh APUs only
> -ifdef CONFIG_PPC64
> -CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn301/vg_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
> -endif
> -
>  AMD_DAL_CLK_MGR_DCN301 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn301/,$(CLK_MGR_DCN301))
>
>  AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN301)
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> index f310b0d25a07..65f224af03c0 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> @@ -32,6 +32,10 @@
>  // For dcn20_update_clocks_update_dpp_dto
>  #include "dcn20/dcn20_clk_mgr.h"
>
> +// For DML FPU code
> +#include "dml/dcn20/dcn20_fpu.h"
> +#include "dml/dcn301/dcn301_fpu.h"
> +
>  #include "vg_clk_mgr.h"
>  #include "dcn301_smu.h"
>  #include "reg_helper.h"
> @@ -526,81 +530,6 @@ static struct clk_bw_params vg_bw_params = {
>
>  };
>
> -static struct wm_table ddr4_wm_table = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 6.09,
> -			.sr_enter_plus_exit_time_us = 7.14,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.72,
> -			.sr_exit_time_us = 10.12,
> -			.sr_enter_plus_exit_time_us = 11.48,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -static struct wm_table lpddr5_wm_table = {
> -	.entries = {
> -		{
> -			.wm_inst = WM_A,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_B,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_C,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -		{
> -			.wm_inst = WM_D,
> -			.wm_type = WM_TYPE_PSTATE_CHG,
> -			.pstate_latency_us = 11.65333,
> -			.sr_exit_time_us = 13.5,
> -			.sr_enter_plus_exit_time_us = 16.5,
> -			.valid = true,
> -		},
> -	}
> -};
> -
> -
>  static unsigned int find_dcfclk_for_voltage(const struct vg_dpm_clocks *clock_table,
>  		unsigned int voltage)
>  {
> @@ -670,10 +599,9 @@ static void vg_clk_mgr_helper_populate_bw_params(
>  		/*
>  		 * WM set D will be re-purposed for memory retraining
>  		 */
> -		bw_params->wm_table.entries[WM_D].pstate_latency_us = LPDDR_MEM_RETRAIN_LATENCY;
> -		bw_params->wm_table.entries[WM_D].wm_inst = WM_D;
> -		bw_params->wm_table.entries[WM_D].wm_type = WM_TYPE_RETRAINING;
> -		bw_params->wm_table.entries[WM_D].valid = true;
> +		DC_FP_START();
> +		dcn21_clk_mgr_set_bw_params_wm_table(bw_params);
> +		DC_FP_END();
>  	}
>
>  }
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
> index 7255477307f1..75884f572989 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.h
> @@ -29,6 +29,9 @@
>
>  struct watermarks;
>
> +extern struct wm_table ddr4_wm_table;
> +extern struct wm_table lpddr5_wm_table;
> +
>  struct smu_watermark_set {
>  	struct watermarks *wm_set;
>  	union large_integer mc_address;
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
> index e4863f0bf0f6..7ef66e511ec8 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c
> @@ -214,6 +214,80 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_01_soc = {
>  	.urgent_latency_adjustment_fabric_clock_reference_mhz = 0,
>  };
>
> +struct wm_table ddr4_wm_table = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 6.09,
> +			.sr_enter_plus_exit_time_us = 7.14,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.72,
> +			.sr_exit_time_us = 10.12,
> +			.sr_enter_plus_exit_time_us = 11.48,
> +			.valid = true,
> +		},
> +	}
> +};
> +
> +struct wm_table lpddr5_wm_table = {
> +	.entries = {
> +		{
> +			.wm_inst = WM_A,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_B,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_C,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +		{
> +			.wm_inst = WM_D,
> +			.wm_type = WM_TYPE_PSTATE_CHG,
> +			.pstate_latency_us = 11.65333,
> +			.sr_exit_time_us = 13.5,
> +			.sr_enter_plus_exit_time_us = 16.5,
> +			.valid = true,
> +		},
> +	}
> +};
> +
>  static void calculate_wm_set_for_vlevel(int vlevel,
>  		struct wm_range_table_entry *table_entry,
>  		struct dcn_watermarks *wm_set,

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
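
The call-site change in vg_clk_mgr_helper_populate_bw_params() can be pictured with a small user-space mock. This is only a sketch under stated assumptions, not the driver code: DC_FP_START()/DC_FP_END() are replaced here by a depth counter instead of the kernel's FPU save/restore, the structs are trimmed to the fields the hunk touches, MOCK_RETRAIN_LATENCY_US is a placeholder for LPDDR_MEM_RETRAIN_LATENCY, and mock_set_wm_d_for_retraining() is a hypothetical stand-in for dcn21_clk_mgr_set_bw_params_wm_table(), whose body is assumed to be the four assignments the hunk removes.

```c
#include <assert.h>
#include <stdbool.h>

/* Trimmed stand-ins for the driver types; only fields used by the hunk. */
enum wm_set { WM_A, WM_B, WM_C, WM_D, WM_SET_COUNT };
enum wm_type { WM_TYPE_PSTATE_CHG, WM_TYPE_RETRAINING };

struct wm_entry {
	enum wm_set wm_inst;
	enum wm_type wm_type;
	double pstate_latency_us;
	bool valid;
};

struct wm_table {
	struct wm_entry entries[WM_SET_COUNT];
};

/* Placeholder value, not taken from the driver. */
#define MOCK_RETRAIN_LATENCY_US 5.0

/*
 * Mock FPU guard: a nesting counter. The real DC_FP_START()/DC_FP_END()
 * enable and release kernel FPU use; this only tracks whether we are
 * inside a guarded region.
 */
static int fpu_depth;
#define DC_FP_START() do { fpu_depth++; } while (0)
#define DC_FP_END()   do { fpu_depth--; } while (0)

/*
 * "DML side" (hypothetical): the only function that writes a
 * floating-point field. In the real tree this kind of code is built
 * with hard-float flags, so it must live in the DML folder.
 */
static void mock_set_wm_d_for_retraining(struct wm_table *table)
{
	assert(fpu_depth > 0); /* callers must hold the FPU guard */

	table->entries[WM_D].pstate_latency_us = MOCK_RETRAIN_LATENCY_US;
	table->entries[WM_D].wm_inst = WM_D;
	table->entries[WM_D].wm_type = WM_TYPE_RETRAINING;
	table->entries[WM_D].valid = true;
}

/*
 * "clk_mgr side" (hypothetical): contains no floating-point code of its
 * own, only the guarded call into the FPU unit.
 */
static void mock_populate_bw_params(struct wm_table *table)
{
	DC_FP_START();
	mock_set_wm_d_for_retraining(table);
	DC_FP_END();
}
```

Calling mock_populate_bw_params() on a zeroed struct wm_table leaves entry WM_D marked valid with type WM_TYPE_RETRAINING, and the guard counter back at zero. The point of the split is that the caller's object file no longer contains floating-point operations, which is what lets the patch drop the -mno-gnu-attribute workaround.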
* Re: [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc
  2022-07-20 19:32 [PATCH 0/5] drm/amd/display: FPU cleanup in clk_mgr files for powerpc Melissa Wen
                   ` (4 preceding siblings ...)
  2022-07-20 19:32 ` [PATCH 5/5] drm/amd/display: move FPU code from dcn301 " Melissa Wen
@ 2022-07-21 19:07 ` Rodrigo Siqueira Jordao
  5 siblings, 0 replies; 13+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-07-21 19:07 UTC
  To: Melissa Wen, airlied, alexander.deucher, christian.koenig, daniel,
	harry.wentland, sunpeng.li, Xinhui.Pan
  Cc: Guenter Roeck, Maíra Canal, kernel-dev, amd-gfx, dri-devel, linux-kernel

On 2022-07-20 15:32, Melissa Wen wrote:
> An initial report from Guenter[1] shows some soft-fp vs hard-fp error
> from DCN31 clk mgr for powerpc. I was not able to reproduce it
> cross-compiling with gcc-powerpc-linux-gnu and gcc-11.3, but thanks to
> Maíra tips, I can reproduce the issue using make.cross, as follows:
>
> - wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> - chmod +x ~/bin/make.cross
> - mkdir build_dir
> - COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 ~/make.cross O=build_dir ARCH=powerpc SHELL=/bin/bash

Hi Melissa,

I didn't know about these steps. I was trying to reproduce this issue
with the standard cross-compile packages provided by my distros (Debian
testing and Arch Linux) and, as a result, I was never able to see the
problem. Anyway, I can now reproduce this issue, thanks a lot.

> with a config file generate by allmodconfig
>
> So, the first patch fix the issue reported by Guenter. The second is
> just a cleanup in dcn31_resource file to remove useless DC_FP_ wrapper.
> Finally, the last three patches I'm removing the -mno-gnu-attribute
> option, that was just hiding FPU-associated code in clk mgr files of
> dcn21/30/301, and moving them to DML folder. This series doesn't cover
> recent drivers dcn32/314.

I validated this series in our internal CI by running multiple IGT tests
on numerous ASICs. Tomorrow we will also send some extra patches
associated with this FPU effort; hopefully, after that, we will finally
have all the FPU code under DML.

Again, thanks a lot for your effort!

Thanks
Siqueira

> Thanks Guenter, Maíra, Siqueira and Alex for all inputs on this
> debugging process. Let me know your thoughts on this approach.
>
> Melissa
>
> [1] https://lore.kernel.org/amd-gfx/20220618232737.2036722-1-linux@roeck-us.net/
>
> Melissa Wen (5):
>   drm/amd/display: fix soft-fp vs hard-fp on DCN 3.1 family for powerpc
>   drm/amd/display: remove useless FPU protection wrapper from
>     dcn31_resource file
>   drm/amd/display: move FPU code on dcn21 clk_mgr
>   drm/amd/display: move FPU code from dcn30 clk mgr to DML folder
>   drm/amd/display: move FPU code from dcn301 clk mgr to DML folder
>
>  .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  18 --
>  .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 234 +----------------
>  .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.h |   7 +
>  .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c  |  63 +----
>  .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c    |  86 +------
>  .../display/dc/clk_mgr/dcn301/vg_clk_mgr.h    |   3 +
>  .../drm/amd/display/dc/dcn31/dcn31_resource.c |  11 +-
>  .../amd/display/dc/dcn315/dcn315_resource.c   |   5 +-
>  .../amd/display/dc/dcn316/dcn316_resource.c   |   5 +-
>  .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  | 235 ++++++++++++++++++
>  .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h  |   2 +
>  .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.c  |  63 ++++-
>  .../drm/amd/display/dc/dml/dcn30/dcn30_fpu.h  |   1 +
>  .../amd/display/dc/dml/dcn301/dcn301_fpu.c    |  74 ++++++
>  .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c  |  11 +
>  .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.h  |   3 +
>  16 files changed, 423 insertions(+), 398 deletions(-)
>