All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
@ 2022-03-29  8:29 CHANDAN VURDIGERE NATARAJ
  2022-03-29 15:11 ` Deucher, Alexander
  2022-03-29 16:46 ` Paul Menzel
  0 siblings, 2 replies; 6+ messages in thread
From: CHANDAN VURDIGERE NATARAJ @ 2022-03-29  8:29 UTC (permalink / raw)
  To: Harry Wentland, Leo Li, Alex Deucher; +Cc: CHANDAN VURDIGERE NATARAJ, amd-gfx

[WHY]
Below general protection fault observed when WebGL Aquarium is run for
longer duration. If drm debug logs are enabled and set to 0x1f then the
issue is observed within 10 minutes of run.

[  100.717056] general protection fault, probably for non-canonical address 0x2d33302d32323032: 0000 [#1] PREEMPT SMP NOPTI
[  100.727921] CPU: 3 PID: 1906 Comm: DrmThread Tainted: G        W         5.15.30 #12 d726c6a2d6ebe5cf9223931cbca6892f916fe18b
[  100.754419] RIP: 0010:CalculateSwathWidth+0x1f7/0x44f
[  100.767109] Code: 00 00 00 f2 42 0f 11 04 f0 48 8b 85 88 00 00 00 f2 42 0f 10 04 f0 48 8b 85 98 00 00 00 f2 42 0f 11 04 f0 48 8b 45 10 0f 57 c0 <f3> 42 0f 2a 04 b0 0f 57 c9 f3 43 0f 2a 0c b4 e8 8c e2 f3 ff 48 8b
[  100.781269] RSP: 0018:ffffa9230079eeb0 EFLAGS: 00010246
[  100.812528] RAX: 2d33302d32323032 RBX: 0000000000000500 RCX: 0000000000000000
[  100.819656] RDX: 0000000000000001 RSI: ffff99deb712c49c RDI: 0000000000000000
[  100.826781] RBP: ffffa9230079ef50 R08: ffff99deb712460c R09: ffff99deb712462c
[  100.833907] R10: ffff99deb7124940 R11: ffff99deb7124d70 R12: ffff99deb712ae44
[  100.841033] R13: 0000000000000001 R14: 0000000000000000 R15: ffffa9230079f0a0
[  100.848159] FS:  00007af121212640(0000) GS:ffff99deba780000(0000) knlGS:0000000000000000
[  100.856240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.861980] CR2: 0000209000fe1000 CR3: 000000011b18c000 CR4: 0000000000350ee0
[  100.869106] Call Trace:
[  100.871555]  <TASK>
[  100.873655]  ? asm_sysvec_reschedule_ipi+0x12/0x20
[  100.878449]  CalculateSwathAndDETConfiguration+0x1a3/0x6dd
[  100.883937]  dml31_ModeSupportAndSystemConfigurationFull+0x2ce4/0x76da
[  100.890467]  ? kallsyms_lookup_buildid+0xc8/0x163
[  100.895173]  ? kallsyms_lookup_buildid+0xc8/0x163
[  100.899874]  ? __sprint_symbol+0x80/0x135
[  100.903883]  ? dm_update_plane_state+0x3f9/0x4d2
[  100.908500]  ? symbol_string+0xb7/0xde
[  100.912250]  ? number+0x145/0x29b
[  100.915566]  ? vsnprintf+0x341/0x5ff
[  100.919141]  ? desc_read_finalized_seq+0x39/0x87
[  100.923755]  ? update_load_avg+0x1b9/0x607
[  100.927849]  ? compute_mst_dsc_configs_for_state+0x7d/0xd5b
[  100.933416]  ? fetch_pipe_params+0xa4d/0xd0c
[  100.937686]  ? dc_fpu_end+0x3d/0xa8
[  100.941175]  dml_get_voltage_level+0x16b/0x180
[  100.945619]  dcn30_internal_validate_bw+0x10e/0x89b
[  100.950495]  ? dcn31_validate_bandwidth+0x68/0x1fc
[  100.955285]  ? resource_build_scaling_params+0x98b/0xb8c
[  100.960595]  ? dcn31_validate_bandwidth+0x68/0x1fc
[  100.965384]  dcn31_validate_bandwidth+0x9a/0x1fc
[  100.970001]  dc_validate_global_state+0x238/0x295
[  100.974703]  amdgpu_dm_atomic_check+0x9c1/0xbce
[  100.979235]  ? _printk+0x59/0x73
[  100.982467]  drm_atomic_check_only+0x403/0x78b
[  100.986912]  drm_mode_atomic_ioctl+0x49b/0x546
[  100.991358]  ? drm_ioctl+0x1c1/0x3b3
[  100.994936]  ? drm_atomic_set_property+0x92a/0x92a
[  100.999725]  drm_ioctl_kernel+0xdc/0x149
[  101.003648]  drm_ioctl+0x27f/0x3b3
[  101.007051]  ? drm_atomic_set_property+0x92a/0x92a
[  101.011842]  amdgpu_drm_ioctl+0x49/0x7d
[  101.015679]  __se_sys_ioctl+0x7c/0xb8
[  101.015685]  do_syscall_64+0x5f/0xb8
[  101.015690]  ? __irq_exit_rcu+0x34/0x96

[HOW]
It calles populate_dml_pipes which uses doubles to initialize.
Adding FPU protection avoids context switch and probable loss of vba context
as there is potential contention while drm debug logs are enabled.

Signed-off-by: CHANDAN VURDIGERE NATARAJ <chandan.vurdigerenataraj@amd.com>

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
index 826970f2bd0a..f27262417abe 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
@@ -1750,7 +1750,9 @@ bool dcn31_validate_bandwidth(struct dc *dc,
 
 	BW_VAL_TRACE_COUNT();
 
+	DC_FP_START();
 	out = dcn30_internal_validate_bw(dc, context, pipes, &pipe_cnt, &vlevel, fast_validate);
+	DC_FP_END();
 
 	// Disable fast_validate to set min dcfclk in alculate_wm_and_dlg
 	if (pipe_cnt == 0)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
  2022-03-29  8:29 [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw CHANDAN VURDIGERE NATARAJ
@ 2022-03-29 15:11 ` Deucher, Alexander
  2022-03-29 16:46 ` Paul Menzel
  1 sibling, 0 replies; 6+ messages in thread
From: Deucher, Alexander @ 2022-03-29 15:11 UTC (permalink / raw)
  To: VURDIGERENATARAJ, CHANDAN, Wentland,  Harry, Li, Sun peng (Leo); +Cc: amd-gfx

[-- Attachment #1: Type: text/plain, Size: 4832 bytes --]

[Public]

Acked-by: Alex Deucher <alexander.deucher@amd.com>
________________________________
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of CHANDAN VURDIGERE NATARAJ <chandan.vurdigerenataraj@amd.com>
Sent: Tuesday, March 29, 2022 4:29 AM
To: Wentland, Harry <Harry.Wentland@amd.com>; Li, Sun peng (Leo) <Sunpeng.Li@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>
Cc: VURDIGERENATARAJ, CHANDAN <CHANDAN.VURDIGERENATARAJ@amd.com>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw

[WHY]
Below general protection fault observed when WebGL Aquarium is run for
longer duration. If drm debug logs are enabled and set to 0x1f then the
issue is observed within 10 minutes of run.

[  100.717056] general protection fault, probably for non-canonical address 0x2d33302d32323032: 0000 [#1] PREEMPT SMP NOPTI
[  100.727921] CPU: 3 PID: 1906 Comm: DrmThread Tainted: G        W         5.15.30 #12 d726c6a2d6ebe5cf9223931cbca6892f916fe18b
[  100.754419] RIP: 0010:CalculateSwathWidth+0x1f7/0x44f
[  100.767109] Code: 00 00 00 f2 42 0f 11 04 f0 48 8b 85 88 00 00 00 f2 42 0f 10 04 f0 48 8b 85 98 00 00 00 f2 42 0f 11 04 f0 48 8b 45 10 0f 57 c0 <f3> 42 0f 2a 04 b0 0f 57 c9 f3 43 0f 2a 0c b4 e8 8c e2 f3 ff 48 8b
[  100.781269] RSP: 0018:ffffa9230079eeb0 EFLAGS: 00010246
[  100.812528] RAX: 2d33302d32323032 RBX: 0000000000000500 RCX: 0000000000000000
[  100.819656] RDX: 0000000000000001 RSI: ffff99deb712c49c RDI: 0000000000000000
[  100.826781] RBP: ffffa9230079ef50 R08: ffff99deb712460c R09: ffff99deb712462c
[  100.833907] R10: ffff99deb7124940 R11: ffff99deb7124d70 R12: ffff99deb712ae44
[  100.841033] R13: 0000000000000001 R14: 0000000000000000 R15: ffffa9230079f0a0
[  100.848159] FS:  00007af121212640(0000) GS:ffff99deba780000(0000) knlGS:0000000000000000
[  100.856240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.861980] CR2: 0000209000fe1000 CR3: 000000011b18c000 CR4: 0000000000350ee0
[  100.869106] Call Trace:
[  100.871555]  <TASK>
[  100.873655]  ? asm_sysvec_reschedule_ipi+0x12/0x20
[  100.878449]  CalculateSwathAndDETConfiguration+0x1a3/0x6dd
[  100.883937]  dml31_ModeSupportAndSystemConfigurationFull+0x2ce4/0x76da
[  100.890467]  ? kallsyms_lookup_buildid+0xc8/0x163
[  100.895173]  ? kallsyms_lookup_buildid+0xc8/0x163
[  100.899874]  ? __sprint_symbol+0x80/0x135
[  100.903883]  ? dm_update_plane_state+0x3f9/0x4d2
[  100.908500]  ? symbol_string+0xb7/0xde
[  100.912250]  ? number+0x145/0x29b
[  100.915566]  ? vsnprintf+0x341/0x5ff
[  100.919141]  ? desc_read_finalized_seq+0x39/0x87
[  100.923755]  ? update_load_avg+0x1b9/0x607
[  100.927849]  ? compute_mst_dsc_configs_for_state+0x7d/0xd5b
[  100.933416]  ? fetch_pipe_params+0xa4d/0xd0c
[  100.937686]  ? dc_fpu_end+0x3d/0xa8
[  100.941175]  dml_get_voltage_level+0x16b/0x180
[  100.945619]  dcn30_internal_validate_bw+0x10e/0x89b
[  100.950495]  ? dcn31_validate_bandwidth+0x68/0x1fc
[  100.955285]  ? resource_build_scaling_params+0x98b/0xb8c
[  100.960595]  ? dcn31_validate_bandwidth+0x68/0x1fc
[  100.965384]  dcn31_validate_bandwidth+0x9a/0x1fc
[  100.970001]  dc_validate_global_state+0x238/0x295
[  100.974703]  amdgpu_dm_atomic_check+0x9c1/0xbce
[  100.979235]  ? _printk+0x59/0x73
[  100.982467]  drm_atomic_check_only+0x403/0x78b
[  100.986912]  drm_mode_atomic_ioctl+0x49b/0x546
[  100.991358]  ? drm_ioctl+0x1c1/0x3b3
[  100.994936]  ? drm_atomic_set_property+0x92a/0x92a
[  100.999725]  drm_ioctl_kernel+0xdc/0x149
[  101.003648]  drm_ioctl+0x27f/0x3b3
[  101.007051]  ? drm_atomic_set_property+0x92a/0x92a
[  101.011842]  amdgpu_drm_ioctl+0x49/0x7d
[  101.015679]  __se_sys_ioctl+0x7c/0xb8
[  101.015685]  do_syscall_64+0x5f/0xb8
[  101.015690]  ? __irq_exit_rcu+0x34/0x96

[HOW]
It calles populate_dml_pipes which uses doubles to initialize.
Adding FPU protection avoids context switch and probable loss of vba context
as there is potential contention while drm debug logs are enabled.

Signed-off-by: CHANDAN VURDIGERE NATARAJ <chandan.vurdigerenataraj@amd.com>

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
index 826970f2bd0a..f27262417abe 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
@@ -1750,7 +1750,9 @@ bool dcn31_validate_bandwidth(struct dc *dc,

         BW_VAL_TRACE_COUNT();

+       DC_FP_START();
         out = dcn30_internal_validate_bw(dc, context, pipes, &pipe_cnt, &vlevel, fast_validate);
+       DC_FP_END();

         // Disable fast_validate to set min dcfclk in alculate_wm_and_dlg
         if (pipe_cnt == 0)
--
2.25.1


[-- Attachment #2: Type: text/html, Size: 6806 bytes --]

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
  2022-03-29  8:29 [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw CHANDAN VURDIGERE NATARAJ
  2022-03-29 15:11 ` Deucher, Alexander
@ 2022-03-29 16:46 ` Paul Menzel
  2022-03-30  2:33   ` VURDIGERENATARAJ, CHANDAN
  1 sibling, 1 reply; 6+ messages in thread
From: Paul Menzel @ 2022-03-29 16:46 UTC (permalink / raw)
  To: CHANDAN VURDIGERE NATARAJ; +Cc: Leo Li, Alex Deucher, Harry Wentland, amd-gfx

Dear Chandan,


Am 29.03.22 um 10:29 schrieb CHANDAN VURDIGERE NATARAJ:

Is it common to spell your name all uppercase? If not, please use 
Chandan nVurdigere Nataraj.

> [WHY]

The [] already emphasize the word, so Why could be used.

> Below general protection fault observed when WebGL Aquarium is run for
> longer duration. If drm debug logs are enabled and set to 0x1f then the

In what browser and what version?

> issue is observed within 10 minutes of run.

Where you able to reproduce it without drm debug logs?

> [  100.717056] general protection fault, probably for non-canonical address 0x2d33302d32323032: 0000 [#1] PREEMPT SMP NOPTI
> [  100.727921] CPU: 3 PID: 1906 Comm: DrmThread Tainted: G        W         5.15.30 #12 d726c6a2d6ebe5cf9223931cbca6892f916fe18b
> [  100.754419] RIP: 0010:CalculateSwathWidth+0x1f7/0x44f
> [  100.767109] Code: 00 00 00 f2 42 0f 11 04 f0 48 8b 85 88 00 00 00 f2 42 0f 10 04 f0 48 8b 85 98 00 00 00 f2 42 0f 11 04 f0 48 8b 45 10 0f 57 c0 <f3> 42 0f 2a 04 b0 0f 57 c9 f3 43 0f 2a 0c b4 e8 8c e2 f3 ff 48 8b
> [  100.781269] RSP: 0018:ffffa9230079eeb0 EFLAGS: 00010246
> [  100.812528] RAX: 2d33302d32323032 RBX: 0000000000000500 RCX: 0000000000000000
> [  100.819656] RDX: 0000000000000001 RSI: ffff99deb712c49c RDI: 0000000000000000
> [  100.826781] RBP: ffffa9230079ef50 R08: ffff99deb712460c R09: ffff99deb712462c
> [  100.833907] R10: ffff99deb7124940 R11: ffff99deb7124d70 R12: ffff99deb712ae44
> [  100.841033] R13: 0000000000000001 R14: 0000000000000000 R15: ffffa9230079f0a0
> [  100.848159] FS:  00007af121212640(0000) GS:ffff99deba780000(0000) knlGS:0000000000000000
> [  100.856240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  100.861980] CR2: 0000209000fe1000 CR3: 000000011b18c000 CR4: 0000000000350ee0
> [  100.869106] Call Trace:
> [  100.871555]  <TASK>
> [  100.873655]  ? asm_sysvec_reschedule_ipi+0x12/0x20
> [  100.878449]  CalculateSwathAndDETConfiguration+0x1a3/0x6dd
> [  100.883937]  dml31_ModeSupportAndSystemConfigurationFull+0x2ce4/0x76da
> [  100.890467]  ? kallsyms_lookup_buildid+0xc8/0x163
> [  100.895173]  ? kallsyms_lookup_buildid+0xc8/0x163
> [  100.899874]  ? __sprint_symbol+0x80/0x135
> [  100.903883]  ? dm_update_plane_state+0x3f9/0x4d2
> [  100.908500]  ? symbol_string+0xb7/0xde
> [  100.912250]  ? number+0x145/0x29b
> [  100.915566]  ? vsnprintf+0x341/0x5ff
> [  100.919141]  ? desc_read_finalized_seq+0x39/0x87
> [  100.923755]  ? update_load_avg+0x1b9/0x607
> [  100.927849]  ? compute_mst_dsc_configs_for_state+0x7d/0xd5b
> [  100.933416]  ? fetch_pipe_params+0xa4d/0xd0c
> [  100.937686]  ? dc_fpu_end+0x3d/0xa8
> [  100.941175]  dml_get_voltage_level+0x16b/0x180
> [  100.945619]  dcn30_internal_validate_bw+0x10e/0x89b
> [  100.950495]  ? dcn31_validate_bandwidth+0x68/0x1fc
> [  100.955285]  ? resource_build_scaling_params+0x98b/0xb8c
> [  100.960595]  ? dcn31_validate_bandwidth+0x68/0x1fc
> [  100.965384]  dcn31_validate_bandwidth+0x9a/0x1fc
> [  100.970001]  dc_validate_global_state+0x238/0x295
> [  100.974703]  amdgpu_dm_atomic_check+0x9c1/0xbce
> [  100.979235]  ? _printk+0x59/0x73
> [  100.982467]  drm_atomic_check_only+0x403/0x78b
> [  100.986912]  drm_mode_atomic_ioctl+0x49b/0x546
> [  100.991358]  ? drm_ioctl+0x1c1/0x3b3
> [  100.994936]  ? drm_atomic_set_property+0x92a/0x92a
> [  100.999725]  drm_ioctl_kernel+0xdc/0x149
> [  101.003648]  drm_ioctl+0x27f/0x3b3
> [  101.007051]  ? drm_atomic_set_property+0x92a/0x92a
> [  101.011842]  amdgpu_drm_ioctl+0x49/0x7d
> [  101.015679]  __se_sys_ioctl+0x7c/0xb8
> [  101.015685]  do_syscall_64+0x5f/0xb8
> [  101.015690]  ? __irq_exit_rcu+0x34/0x96
> 
> [HOW]
> It calles populate_dml_pipes which uses doubles to initialize.

calls

Excuse my ignorance. So using doubles causes a context switch?

> Adding FPU protection avoids context switch and probable loss of vba context
> as there is potential contention while drm debug logs are enabled.
> 
> Signed-off-by: CHANDAN VURDIGERE NATARAJ <chandan.vurdigerenataraj@amd.com>
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> index 826970f2bd0a..f27262417abe 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> @@ -1750,7 +1750,9 @@ bool dcn31_validate_bandwidth(struct dc *dc,
>   
>   	BW_VAL_TRACE_COUNT();
>   
> +	DC_FP_START();
>   	out = dcn30_internal_validate_bw(dc, context, pipes, &pipe_cnt, &vlevel, fast_validate);
> +	DC_FP_END();
>   
>   	// Disable fast_validate to set min dcfclk in alculate_wm_and_dlg
>   	if (pipe_cnt == 0)


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
  2022-03-29 16:46 ` Paul Menzel
@ 2022-03-30  2:33   ` VURDIGERENATARAJ, CHANDAN
  2022-03-30 13:57     ` Melissa Wen
  0 siblings, 1 reply; 6+ messages in thread
From: VURDIGERENATARAJ, CHANDAN @ 2022-03-30  2:33 UTC (permalink / raw)
  To: Paul Menzel
  Cc: Li, Sun peng (Leo), Deucher, Alexander, Wentland, Harry, amd-gfx

Hi Paul,

>Am 29.03.22 um 10:29 schrieb CHANDAN VURDIGERE NATARAJ:
>
>Is it common to spell your name all uppercase? If not, please use Chandan nVurdigere Nataraj.
>
>> [WHY]
>
>The [] already emphasize the word, so Why could be used.
>
>> Below general protection fault observed when WebGL Aquarium is run for 
>> longer duration. If drm debug logs are enabled and set to 0x1f then 
>> the
>
>In what browser and what version?
The issue was observed on ChromiumOS and Chromium Browser version 100.0.4877.0
>
>> issue is observed within 10 minutes of run.
>
>Where you able to reproduce it without drm debug logs?
Yes. It took 34 hours to reproduce without drm debug logs. Using drm debug logs was a faster way to reproduce the issue.
>
>> [  100.717056] general protection fault, probably for non-canonical address 0x2d33302d32323032: 0000 [#1] PREEMPT SMP NOPTI
>> [  100.727921] CPU: 3 PID: 1906 Comm: DrmThread Tainted: G        W         5.15.30 #12 d726c6a2d6ebe5cf9223931cbca6892f916fe18b
>> [  100.754419] RIP: 0010:CalculateSwathWidth+0x1f7/0x44f
>> [  100.767109] Code: 00 00 00 f2 42 0f 11 04 f0 48 8b 85 88 00 00 00 
>> f2 42 0f 10 04 f0 48 8b 85 98 00 00 00 f2 42 0f 11 04 f0 48 8b 45 10 
>> 0f 57 c0 <f3> 42 0f 2a 04 b0 0f 57 c9 f3 43 0f 2a 0c b4 e8 8c e2 f3 ff 
>> 48 8b [  100.781269] RSP: 0018:ffffa9230079eeb0 EFLAGS: 00010246 [  
>> 100.812528] RAX: 2d33302d32323032 RBX: 0000000000000500 RCX: 
>> 0000000000000000 [  100.819656] RDX: 0000000000000001 RSI: 
>> ffff99deb712c49c RDI: 0000000000000000 [  100.826781] RBP: 
>> ffffa9230079ef50 R08: ffff99deb712460c R09: ffff99deb712462c [  
>> 100.833907] R10: ffff99deb7124940 R11: ffff99deb7124d70 R12: 
>> ffff99deb712ae44 [  100.841033] R13: 0000000000000001 R14: 
>> 0000000000000000 R15: ffffa9230079f0a0 [  100.848159] FS:  00007af121212640(0000) GS:ffff99deba780000(0000) knlGS:0000000000000000 [  100.856240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [  100.861980] CR2: 0000209000fe1000 CR3: >000000011b18c000 CR4: 0000000000350ee0 [  100.869106] Call Trace:
>> [  100.871555]  <TASK>
>> [  100.873655]  ? asm_sysvec_reschedule_ipi+0x12/0x20
>> [  100.878449]  CalculateSwathAndDETConfiguration+0x1a3/0x6dd
>> [  100.883937]  
>> dml31_ModeSupportAndSystemConfigurationFull+0x2ce4/0x76da
>> [  100.890467]  ? kallsyms_lookup_buildid+0xc8/0x163
>> [  100.895173]  ? kallsyms_lookup_buildid+0xc8/0x163
>> [  100.899874]  ? __sprint_symbol+0x80/0x135 [  100.903883]  ? 
>> dm_update_plane_state+0x3f9/0x4d2 [  100.908500]  ? 
>> symbol_string+0xb7/0xde [  100.912250]  ? number+0x145/0x29b [  
>> 100.915566]  ? vsnprintf+0x341/0x5ff [  100.919141]  ? 
>> desc_read_finalized_seq+0x39/0x87 [  100.923755]  ? 
>> update_load_avg+0x1b9/0x607 [  100.927849]  ? 
>> compute_mst_dsc_configs_for_state+0x7d/0xd5b
>> [  100.933416]  ? fetch_pipe_params+0xa4d/0xd0c [  100.937686]  ? 
>> dc_fpu_end+0x3d/0xa8 [  100.941175]  dml_get_voltage_level+0x16b/0x180 
>> [  100.945619]  dcn30_internal_validate_bw+0x10e/0x89b
>> [  100.950495]  ? dcn31_validate_bandwidth+0x68/0x1fc
>> [  100.955285]  ? resource_build_scaling_params+0x98b/0xb8c
>> [  100.960595]  ? dcn31_validate_bandwidth+0x68/0x1fc
>> [  100.965384]  dcn31_validate_bandwidth+0x9a/0x1fc
>> [  100.970001]  dc_validate_global_state+0x238/0x295
>> [  100.974703]  amdgpu_dm_atomic_check+0x9c1/0xbce
>> [  100.979235]  ? _printk+0x59/0x73
>> [  100.982467]  drm_atomic_check_only+0x403/0x78b [  100.986912]  
>> drm_mode_atomic_ioctl+0x49b/0x546 [  100.991358]  ? 
>> drm_ioctl+0x1c1/0x3b3 [  100.994936]  ? 
>> drm_atomic_set_property+0x92a/0x92a
>> [  100.999725]  drm_ioctl_kernel+0xdc/0x149 [  101.003648]  
>> drm_ioctl+0x27f/0x3b3 [  101.007051]  ? 
>> drm_atomic_set_property+0x92a/0x92a
>> [  101.011842]  amdgpu_drm_ioctl+0x49/0x7d [  101.015679]  
>> __se_sys_ioctl+0x7c/0xb8 [  101.015685]  do_syscall_64+0x5f/0xb8 [  
>> 101.015690]  ? __irq_exit_rcu+0x34/0x96
>> 
>> [HOW]
>> It calles populate_dml_pipes which uses doubles to initialize.
>
>calls
>
>Excuse my ignorance. So using doubles causes a context switch?
If we don’t add FPU protection then context switch can happen. DC_FP_START would in-turn call preempt_disable.

>> Adding FPU protection avoids context switch and probable loss of vba 
>> context as there is potential contention while drm debug logs are enabled.
>> 
>> Signed-off-by: CHANDAN VURDIGERE NATARAJ 
>> <chandan.vurdigerenataraj@amd.com>
>> 
>> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c 
>> b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
>> index 826970f2bd0a..f27262417abe 100644
>> --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
>> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
>> @@ -1750,7 +1750,9 @@ bool dcn31_validate_bandwidth(struct dc *dc,
>>   
>>   	BW_VAL_TRACE_COUNT();
>>   
>> +	DC_FP_START();
>>   	out = dcn30_internal_validate_bw(dc, context, pipes, &pipe_cnt, 
>> &vlevel, fast_validate);
>> +	DC_FP_END();
>>   
>>   	// Disable fast_validate to set min dcfclk in alculate_wm_and_dlg
>>   	if (pipe_cnt == 0)
>
>
>Kind regards,
>
>Paul

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
  2022-03-30  2:33   ` VURDIGERENATARAJ, CHANDAN
@ 2022-03-30 13:57     ` Melissa Wen
  2022-03-31 13:18       ` Rodrigo Siqueira Jordao
  0 siblings, 1 reply; 6+ messages in thread
From: Melissa Wen @ 2022-03-30 13:57 UTC (permalink / raw)
  To: VURDIGERENATARAJ, CHANDAN
  Cc: Paul Menzel, Li, Sun peng (Leo),
	Rodrigo Siqueira, amd-gfx, Deucher, Alexander, Wentland, Harry

[-- Attachment #1: Type: text/plain, Size: 5792 bytes --]

On 03/30, VURDIGERENATARAJ, CHANDAN wrote:
> Hi Paul,
> 
> >Am 29.03.22 um 10:29 schrieb CHANDAN VURDIGERE NATARAJ:
> >
> >Is it common to spell your name all uppercase? If not, please use Chandan nVurdigere Nataraj.
> >
> >> [WHY]
> >
> >The [] already emphasize the word, so Why could be used.
> >
> >> Below general protection fault observed when WebGL Aquarium is run for 
> >> longer duration. If drm debug logs are enabled and set to 0x1f then 
> >> the
> >
> >In what browser and what version?
> The issue was observed on ChromiumOS and Chromium Browser version 100.0.4877.0
> >
> >> issue is observed within 10 minutes of run.
> >
> >Where you able to reproduce it without drm debug logs?
> Yes. It took 34 hours to reproduce without drm debug logs. Using drm debug logs was a faster way to reproduce the issue.
> >
> >> [  100.717056] general protection fault, probably for non-canonical address 0x2d33302d32323032: 0000 [#1] PREEMPT SMP NOPTI
> >> [  100.727921] CPU: 3 PID: 1906 Comm: DrmThread Tainted: G        W         5.15.30 #12 d726c6a2d6ebe5cf9223931cbca6892f916fe18b
> >> [  100.754419] RIP: 0010:CalculateSwathWidth+0x1f7/0x44f
> >> [  100.767109] Code: 00 00 00 f2 42 0f 11 04 f0 48 8b 85 88 00 00 00 
> >> f2 42 0f 10 04 f0 48 8b 85 98 00 00 00 f2 42 0f 11 04 f0 48 8b 45 10 
> >> 0f 57 c0 <f3> 42 0f 2a 04 b0 0f 57 c9 f3 43 0f 2a 0c b4 e8 8c e2 f3 ff 
> >> 48 8b [  100.781269] RSP: 0018:ffffa9230079eeb0 EFLAGS: 00010246 [  
> >> 100.812528] RAX: 2d33302d32323032 RBX: 0000000000000500 RCX: 
> >> 0000000000000000 [  100.819656] RDX: 0000000000000001 RSI: 
> >> ffff99deb712c49c RDI: 0000000000000000 [  100.826781] RBP: 
> >> ffffa9230079ef50 R08: ffff99deb712460c R09: ffff99deb712462c [  
> >> 100.833907] R10: ffff99deb7124940 R11: ffff99deb7124d70 R12: 
> >> ffff99deb712ae44 [  100.841033] R13: 0000000000000001 R14: 
> >> 0000000000000000 R15: ffffa9230079f0a0 [  100.848159] FS:  00007af121212640(0000) GS:ffff99deba780000(0000) knlGS:0000000000000000 [  100.856240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [  100.861980] CR2: 0000209000fe1000 CR3: >000000011b18c000 CR4: 0000000000350ee0 [  100.869106] Call Trace:
> >> [  100.871555]  <TASK>
> >> [  100.873655]  ? asm_sysvec_reschedule_ipi+0x12/0x20
> >> [  100.878449]  CalculateSwathAndDETConfiguration+0x1a3/0x6dd
> >> [  100.883937]  
> >> dml31_ModeSupportAndSystemConfigurationFull+0x2ce4/0x76da
> >> [  100.890467]  ? kallsyms_lookup_buildid+0xc8/0x163
> >> [  100.895173]  ? kallsyms_lookup_buildid+0xc8/0x163
> >> [  100.899874]  ? __sprint_symbol+0x80/0x135 [  100.903883]  ? 
> >> dm_update_plane_state+0x3f9/0x4d2 [  100.908500]  ? 
> >> symbol_string+0xb7/0xde [  100.912250]  ? number+0x145/0x29b [  
> >> 100.915566]  ? vsnprintf+0x341/0x5ff [  100.919141]  ? 
> >> desc_read_finalized_seq+0x39/0x87 [  100.923755]  ? 
> >> update_load_avg+0x1b9/0x607 [  100.927849]  ? 
> >> compute_mst_dsc_configs_for_state+0x7d/0xd5b
> >> [  100.933416]  ? fetch_pipe_params+0xa4d/0xd0c [  100.937686]  ? 
> >> dc_fpu_end+0x3d/0xa8 [  100.941175]  dml_get_voltage_level+0x16b/0x180 
> >> [  100.945619]  dcn30_internal_validate_bw+0x10e/0x89b
> >> [  100.950495]  ? dcn31_validate_bandwidth+0x68/0x1fc
> >> [  100.955285]  ? resource_build_scaling_params+0x98b/0xb8c
> >> [  100.960595]  ? dcn31_validate_bandwidth+0x68/0x1fc
> >> [  100.965384]  dcn31_validate_bandwidth+0x9a/0x1fc
> >> [  100.970001]  dc_validate_global_state+0x238/0x295
> >> [  100.974703]  amdgpu_dm_atomic_check+0x9c1/0xbce
> >> [  100.979235]  ? _printk+0x59/0x73
> >> [  100.982467]  drm_atomic_check_only+0x403/0x78b [  100.986912]  
> >> drm_mode_atomic_ioctl+0x49b/0x546 [  100.991358]  ? 
> >> drm_ioctl+0x1c1/0x3b3 [  100.994936]  ? 
> >> drm_atomic_set_property+0x92a/0x92a
> >> [  100.999725]  drm_ioctl_kernel+0xdc/0x149 [  101.003648]  
> >> drm_ioctl+0x27f/0x3b3 [  101.007051]  ? 
> >> drm_atomic_set_property+0x92a/0x92a
> >> [  101.011842]  amdgpu_drm_ioctl+0x49/0x7d [  101.015679]  
> >> __se_sys_ioctl+0x7c/0xb8 [  101.015685]  do_syscall_64+0x5f/0xb8 [  
> >> 101.015690]  ? __irq_exit_rcu+0x34/0x96
> >> 
> >> [HOW]
> >> It calles populate_dml_pipes which uses doubles to initialize.
> >
> >calls
> >
> >Excuse my ignorance. So using doubles causes a context switch?
> If we don’t add FPU protection then context switch can happen. DC_FP_START would in-turn call preempt_disable.
> 
> >> Adding FPU protection avoids context switch and probable loss of vba 
> >> context as there is potential contention while drm debug logs are enabled.
> >> 
> >> Signed-off-by: CHANDAN VURDIGERE NATARAJ 
> >> <chandan.vurdigerenataraj@amd.com>
> >> 
> >> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c 
> >> b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> >> index 826970f2bd0a..f27262417abe 100644
> >> --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> >> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
> >> @@ -1750,7 +1750,9 @@ bool dcn31_validate_bandwidth(struct dc *dc,
> >>   
> >>   	BW_VAL_TRACE_COUNT();
> >>   
> >> +	DC_FP_START();
> >>   	out = dcn30_internal_validate_bw(dc, context, pipes, &pipe_cnt, 
> >> &vlevel, fast_validate);
> >> +	DC_FP_END();
> >>   
> >>   	// Disable fast_validate to set min dcfclk in alculate_wm_and_dlg
> >>   	if (pipe_cnt == 0)
Acked-by: Melissa Wen <mwen@igalia.com>

In fact, I revisited the code and realized the FPU protection is
missing for two other dcn30 functions called by dcn31:
- dcn30_populate_dml_writeback_from_context()
- dcn30_set_mcif_arb_params()

I'll send a patch addressing this shortly.

CC'ing: Siqueira

Best regards,

Melissa
> >
> >
> >Kind regards,
> >
> >Paul

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
  2022-03-30 13:57     ` Melissa Wen
@ 2022-03-31 13:18       ` Rodrigo Siqueira Jordao
  0 siblings, 0 replies; 6+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-03-31 13:18 UTC (permalink / raw)
  To: Melissa Wen, VURDIGERENATARAJ, CHANDAN
  Cc: Paul Menzel, Li, Sun peng (Leo),
	Rodrigo Siqueira, amd-gfx, Deucher, Alexander, Wentland, Harry



On 2022-03-30 09:57, Melissa Wen wrote:
> On 03/30, VURDIGERENATARAJ, CHANDAN wrote:
>> Hi Paul,
>>
>>> Am 29.03.22 um 10:29 schrieb CHANDAN VURDIGERE NATARAJ:
>>>
>>> Is it common to spell your name all uppercase? If not, please use Chandan nVurdigere Nataraj.
>>>
>>>> [WHY]
>>>
>>> The [] already emphasize the word, so Why could be used.
>>>
>>>> Below general protection fault observed when WebGL Aquarium is run for
>>>> longer duration. If drm debug logs are enabled and set to 0x1f then
>>>> the
>>>
>>> In what browser and what version?
>> The issue was observed on ChromiumOS and Chromium Browser version 100.0.4877.0
>>>
>>>> issue is observed within 10 minutes of run.
>>>
>>> Where you able to reproduce it without drm debug logs?
>> Yes. It took 34 hours to reproduce without drm debug logs. Using drm debug logs was a faster way to reproduce the issue.
>>>
>>>> [  100.717056] general protection fault, probably for non-canonical address 0x2d33302d32323032: 0000 [#1] PREEMPT SMP NOPTI
>>>> [  100.727921] CPU: 3 PID: 1906 Comm: DrmThread Tainted: G        W         5.15.30 #12 d726c6a2d6ebe5cf9223931cbca6892f916fe18b
>>>> [  100.754419] RIP: 0010:CalculateSwathWidth+0x1f7/0x44f
>>>> [  100.767109] Code: 00 00 00 f2 42 0f 11 04 f0 48 8b 85 88 00 00 00
>>>> f2 42 0f 10 04 f0 48 8b 85 98 00 00 00 f2 42 0f 11 04 f0 48 8b 45 10
>>>> 0f 57 c0 <f3> 42 0f 2a 04 b0 0f 57 c9 f3 43 0f 2a 0c b4 e8 8c e2 f3 ff
>>>> 48 8b [  100.781269] RSP: 0018:ffffa9230079eeb0 EFLAGS: 00010246 [
>>>> 100.812528] RAX: 2d33302d32323032 RBX: 0000000000000500 RCX:
>>>> 0000000000000000 [  100.819656] RDX: 0000000000000001 RSI:
>>>> ffff99deb712c49c RDI: 0000000000000000 [  100.826781] RBP:
>>>> ffffa9230079ef50 R08: ffff99deb712460c R09: ffff99deb712462c [
>>>> 100.833907] R10: ffff99deb7124940 R11: ffff99deb7124d70 R12:
>>>> ffff99deb712ae44 [  100.841033] R13: 0000000000000001 R14:
>>>> 0000000000000000 R15: ffffa9230079f0a0 [  100.848159] FS:  00007af121212640(0000) GS:ffff99deba780000(0000) knlGS:0000000000000000 [  100.856240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [  100.861980] CR2: 0000209000fe1000 CR3: >000000011b18c000 CR4: 0000000000350ee0 [  100.869106] Call Trace:
>>>> [  100.871555]  <TASK>
>>>> [  100.873655]  ? asm_sysvec_reschedule_ipi+0x12/0x20
>>>> [  100.878449]  CalculateSwathAndDETConfiguration+0x1a3/0x6dd
>>>> [  100.883937]
>>>> dml31_ModeSupportAndSystemConfigurationFull+0x2ce4/0x76da
>>>> [  100.890467]  ? kallsyms_lookup_buildid+0xc8/0x163
>>>> [  100.895173]  ? kallsyms_lookup_buildid+0xc8/0x163
>>>> [  100.899874]  ? __sprint_symbol+0x80/0x135 [  100.903883]  ?
>>>> dm_update_plane_state+0x3f9/0x4d2 [  100.908500]  ?
>>>> symbol_string+0xb7/0xde [  100.912250]  ? number+0x145/0x29b [
>>>> 100.915566]  ? vsnprintf+0x341/0x5ff [  100.919141]  ?
>>>> desc_read_finalized_seq+0x39/0x87 [  100.923755]  ?
>>>> update_load_avg+0x1b9/0x607 [  100.927849]  ?
>>>> compute_mst_dsc_configs_for_state+0x7d/0xd5b
>>>> [  100.933416]  ? fetch_pipe_params+0xa4d/0xd0c [  100.937686]  ?
>>>> dc_fpu_end+0x3d/0xa8 [  100.941175]  dml_get_voltage_level+0x16b/0x180
>>>> [  100.945619]  dcn30_internal_validate_bw+0x10e/0x89b
>>>> [  100.950495]  ? dcn31_validate_bandwidth+0x68/0x1fc
>>>> [  100.955285]  ? resource_build_scaling_params+0x98b/0xb8c
>>>> [  100.960595]  ? dcn31_validate_bandwidth+0x68/0x1fc
>>>> [  100.965384]  dcn31_validate_bandwidth+0x9a/0x1fc
>>>> [  100.970001]  dc_validate_global_state+0x238/0x295
>>>> [  100.974703]  amdgpu_dm_atomic_check+0x9c1/0xbce
>>>> [  100.979235]  ? _printk+0x59/0x73
>>>> [  100.982467]  drm_atomic_check_only+0x403/0x78b [  100.986912]
>>>> drm_mode_atomic_ioctl+0x49b/0x546 [  100.991358]  ?
>>>> drm_ioctl+0x1c1/0x3b3 [  100.994936]  ?
>>>> drm_atomic_set_property+0x92a/0x92a
>>>> [  100.999725]  drm_ioctl_kernel+0xdc/0x149 [  101.003648]
>>>> drm_ioctl+0x27f/0x3b3 [  101.007051]  ?
>>>> drm_atomic_set_property+0x92a/0x92a
>>>> [  101.011842]  amdgpu_drm_ioctl+0x49/0x7d [  101.015679]
>>>> __se_sys_ioctl+0x7c/0xb8 [  101.015685]  do_syscall_64+0x5f/0xb8 [
>>>> 101.015690]  ? __irq_exit_rcu+0x34/0x96
>>>>
>>>> [HOW]
>>>> It calles populate_dml_pipes which uses doubles to initialize.
>>>
>>> calls
>>>
>>> Excuse my ignorance. So using doubles causes a context switch?
>> If we don’t add FPU protection then context switch can happen. DC_FP_START would in-turn call preempt_disable.
>>
>>>> Adding FPU protection avoids context switch and probable loss of vba
>>>> context as there is potential contention while drm debug logs are enabled.
>>>>
>>>> Signed-off-by: CHANDAN VURDIGERE NATARAJ
>>>> <chandan.vurdigerenataraj@amd.com>
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
>>>> b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
>>>> index 826970f2bd0a..f27262417abe 100644
>>>> --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
>>>> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
>>>> @@ -1750,7 +1750,9 @@ bool dcn31_validate_bandwidth(struct dc *dc,
>>>>    
>>>>    	BW_VAL_TRACE_COUNT();
>>>>    
>>>> +	DC_FP_START();
>>>>    	out = dcn30_internal_validate_bw(dc, context, pipes, &pipe_cnt,
>>>> &vlevel, fast_validate);
>>>> +	DC_FP_END();
>>>>    
>>>>    	// Disable fast_validate to set min dcfclk in alculate_wm_and_dlg

Hi Chandan,

This change lgtm, however, you need to rebase this patch with the latest 
code from amd-staging-drm-next... with that change:

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>

>>>>    	if (pipe_cnt == 0)
> Acked-by: Melissa Wen <mwen@igalia.com>
> 
> In fact, I revisited the code and realized the FPU protection is
> missing for two other dcn30 functions called by dcn31:
> - dcn30_populate_dml_writeback_from_context()
> - dcn30_set_mcif_arb_params()
> 
> I'll send a patch addressing this shortly.
> 
> CC'ing: Siqueira
> 
> Best regards,
> 
> Melissa
>>>
>>>
>>> Kind regards,
>>>
>>> Paul


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-03-31 13:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-29  8:29 [PATCH] drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw CHANDAN VURDIGERE NATARAJ
2022-03-29 15:11 ` Deucher, Alexander
2022-03-29 16:46 ` Paul Menzel
2022-03-30  2:33   ` VURDIGERENATARAJ, CHANDAN
2022-03-30 13:57     ` Melissa Wen
2022-03-31 13:18       ` Rodrigo Siqueira Jordao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.