dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 6.1 1/9] drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init"
@ 2023-05-11 19:39 Sasha Levin
  2023-05-11 19:39 ` [PATCH AUTOSEL 6.1 4/9] drm/amd/display: Do not set drr on pipe commit Sasha Levin
  2023-05-11 19:39 ` [PATCH AUTOSEL 6.1 9/9] drm/amdgpu: Use the default reset when loading or reloading the driver Sasha Levin
  0 siblings, 2 replies; 5+ messages in thread
From: Sasha Levin @ 2023-05-11 19:39 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, andrey.grodzovsky, lijo.lazar, Chong Li, dri-devel,
	Amaranath.Somalapuram, Bokun.Zhang, JingWen.Chen2, Xinhui.Pan,
	amd-gfx, YiPeng.Chai, mario.limonciello, Alex Deucher,
	christian.koenig, Hawking.Zhang

From: Chong Li <chongli2@amd.com>

[ Upstream commit 38eecbe086a4e52f54b2bbda8feba65d44addbef ]

[WHY]
 Function "amdgpu_irq_update()" called by "amdgpu_device_ip_late_init()" is an atomic context.
 We shouldn't access registers through KIQ since "msleep()" may be called in "amdgpu_kiq_rreg()".

[HOW]
 Move function "amdgpu_virt_release_full_gpu()" after function "amdgpu_device_ip_late_init()",
 to ensure that registers be accessed through RLCG instead of KIQ.

Call Trace:
  <TASK>
  show_stack+0x52/0x69
  dump_stack_lvl+0x49/0x6d
  dump_stack+0x10/0x18
  __schedule_bug.cold+0x4f/0x6b
  __schedule+0x473/0x5d0
  ? __wake_up_klogd.part.0+0x40/0x70
  ? vprintk_emit+0xbe/0x1f0
  schedule+0x68/0x110
  schedule_timeout+0x87/0x160
  ? timer_migration_handler+0xa0/0xa0
  msleep+0x2d/0x50
  amdgpu_kiq_rreg+0x18d/0x1f0 [amdgpu]
  amdgpu_device_rreg.part.0+0x59/0xd0 [amdgpu]
  amdgpu_device_rreg+0x3a/0x50 [amdgpu]
  amdgpu_sriov_rreg+0x3c/0xb0 [amdgpu]
  gfx_v10_0_set_gfx_eop_interrupt_state.constprop.0+0x16c/0x190 [amdgpu]
  gfx_v10_0_set_eop_interrupt_state+0xa5/0xb0 [amdgpu]
  amdgpu_irq_update+0x53/0x80 [amdgpu]
  amdgpu_irq_get+0x7c/0xb0 [amdgpu]
  amdgpu_fence_driver_hw_init+0x58/0x90 [amdgpu]
  amdgpu_device_init.cold+0x16b7/0x2022 [amdgpu]

Signed-off-by: Chong Li <chongli2@amd.com>
Reviewed-by: JingWen.Chen2@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 32 ++++++++++++----------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 9df5dcedaf3e2..494e8ce52af22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2511,8 +2511,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 	amdgpu_fru_get_product_info(adev);
 
 init_failed:
-	if (amdgpu_sriov_vf(adev))
-		amdgpu_virt_release_full_gpu(adev, true);
 
 	return r;
 }
@@ -3837,18 +3835,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 
 	r = amdgpu_device_ip_init(adev);
 	if (r) {
-		/* failed in exclusive mode due to timeout */
-		if (amdgpu_sriov_vf(adev) &&
-		    !amdgpu_sriov_runtime(adev) &&
-		    amdgpu_virt_mmio_blocked(adev) &&
-		    !amdgpu_virt_wait_reset(adev)) {
-			dev_err(adev->dev, "VF exclusive mode timeout\n");
-			/* Don't send request since VF is inactive. */
-			adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME;
-			adev->virt.ops = NULL;
-			r = -EAGAIN;
-			goto release_ras_con;
-		}
 		dev_err(adev->dev, "amdgpu_device_ip_init failed\n");
 		amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_AMDGPU_INIT_FAIL, 0, 0);
 		goto release_ras_con;
@@ -3920,8 +3906,10 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 				   msecs_to_jiffies(AMDGPU_RESUME_MS));
 	}
 
-	if (amdgpu_sriov_vf(adev))
+	if (amdgpu_sriov_vf(adev)) {
+		amdgpu_virt_release_full_gpu(adev, true);
 		flush_delayed_work(&adev->delayed_init_work);
+	}
 
 	r = sysfs_create_files(&adev->dev->kobj, amdgpu_dev_attributes);
 	if (r)
@@ -3958,6 +3946,20 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	return 0;
 
 release_ras_con:
+	if (amdgpu_sriov_vf(adev))
+		amdgpu_virt_release_full_gpu(adev, true);
+
+	/* failed in exclusive mode due to timeout */
+	if (amdgpu_sriov_vf(adev) &&
+		!amdgpu_sriov_runtime(adev) &&
+		amdgpu_virt_mmio_blocked(adev) &&
+		!amdgpu_virt_wait_reset(adev)) {
+		dev_err(adev->dev, "VF exclusive mode timeout\n");
+		/* Don't send request since VF is inactive. */
+		adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME;
+		adev->virt.ops = NULL;
+		r = -EAGAIN;
+	}
 	amdgpu_release_ras_context(adev);
 
 failed:
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH AUTOSEL 6.1 4/9] drm/amd/display: Do not set drr on pipe commit
  2023-05-11 19:39 [PATCH AUTOSEL 6.1 1/9] drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init" Sasha Levin
@ 2023-05-11 19:39 ` Sasha Levin
  2023-05-15 13:04   ` Michel Dänzer
  2023-05-11 19:39 ` [PATCH AUTOSEL 6.1 9/9] drm/amdgpu: Use the default reset when loading or reloading the driver Sasha Levin
  1 sibling, 1 reply; 5+ messages in thread
From: Sasha Levin @ 2023-05-11 19:39 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: felipe.clark, wenjing.liu, dri-devel, Jun.Lei, Sasha Levin,
	jiapeng.chong, Rodrigo Siqueira, amd-gfx, aurabindo.pillai,
	Alvin.Lee2, sunpeng.li, mwen, Daniel Wheeler, Dillon.Varone,
	Wesley Chalmers, qingqing.zhuo, Xinhui.Pan, Alex Deucher,
	christian.koenig

From: Wesley Chalmers <Wesley.Chalmers@amd.com>

[ Upstream commit 474f01015ffdb74e01c2eb3584a2822c64e7b2be ]

[WHY]
Writing to DRR registers such as OTG_V_TOTAL_MIN on the same frame as a
pipe commit can cause underflow.

[HOW]
Move DMUB p-state delegate into optimze_bandwidth; enabling FAMS sets
optimized_required.

This change expects that Freesync requests are blocked when
optimized_required is true.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 6 ++++++
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 7 +++++++
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index f348bc15a9256..54408320ac73d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -2017,6 +2017,12 @@ void dcn20_optimize_bandwidth(
 	if (hubbub->funcs->program_compbuf_size)
 		hubbub->funcs->program_compbuf_size(hubbub, context->bw_ctx.bw.dcn.compbuf_size_kb, true);
 
+	if (context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching) {
+		dc_dmub_srv_p_state_delegate(dc,
+			true, context);
+		context->bw_ctx.bw.dcn.clk.p_state_change_support = true;
+	}
+
 	dc->clk_mgr->funcs->update_clocks(
 			dc->clk_mgr,
 			context,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index c20e9f76f0213..54f71da717357 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -986,11 +986,18 @@ void dcn30_set_disp_pattern_generator(const struct dc *dc,
 void dcn30_prepare_bandwidth(struct dc *dc,
  	struct dc_state *context)
 {
+	if (context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching) {
+		dc->optimized_required = true;
+		context->bw_ctx.bw.dcn.clk.p_state_change_support = false;
+	}
+
 	if (dc->clk_mgr->dc_mode_softmax_enabled)
 		if (dc->clk_mgr->clks.dramclk_khz <= dc->clk_mgr->bw_params->dc_mode_softmax_memclk * 1000 &&
 				context->bw_ctx.bw.dcn.clk.dramclk_khz > dc->clk_mgr->bw_params->dc_mode_softmax_memclk * 1000)
 			dc->clk_mgr->funcs->set_max_memclk(dc->clk_mgr, dc->clk_mgr->bw_params->clk_table.entries[dc->clk_mgr->bw_params->clk_table.num_entries - 1].memclk_mhz);
 
 	dcn20_prepare_bandwidth(dc, context);
+
+	dc_dmub_srv_p_state_delegate(dc, false, context);
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH AUTOSEL 6.1 9/9] drm/amdgpu: Use the default reset when loading or reloading the driver
  2023-05-11 19:39 [PATCH AUTOSEL 6.1 1/9] drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init" Sasha Levin
  2023-05-11 19:39 ` [PATCH AUTOSEL 6.1 4/9] drm/amd/display: Do not set drr on pipe commit Sasha Levin
@ 2023-05-11 19:39 ` Sasha Levin
  1 sibling, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2023-05-11 19:39 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, andrey.grodzovsky, lijo.lazar,
	Amaranath.Somalapuram, lyndonli, dri-devel, Xinhui.Pan, amd-gfx,
	Yunxiang Li, YiPeng.Chai, mario.limonciello, Feifei Xu,
	Alex Deucher, Bokun.Zhang, Kenneth Feng, christian.koenig,
	Hawking.Zhang

From: lyndonli <Lyndon.Li@amd.com>

[ Upstream commit 4eea7fb980dc44545a32eec92e2662053b34cd9d ]

Below call trace and errors are observed when reloading
amdgpu driver with the module parameter reset_method=3.

It should do a default reset when loading or reloading the
driver, regardless of the module parameter reset_method.

v2: add comments inside and modify commit messages.

[  +2.180243] [drm] psp gfx command ID_LOAD_TOC(0x20) failed
and response status is (0x0)
[  +0.000011] [drm:psp_hw_start [amdgpu]] *ERROR* Failed to load toc
[  +0.000890] [drm:psp_hw_start [amdgpu]] *ERROR* PSP tmr init failed!
[  +0.020683] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to
clear memory with ring turned off.
[  +0.000003] RIP: 0010:amdgpu_bo_release_notify+0x1ef/0x210 [amdgpu]
[  +0.000004] Call Trace:
[  +0.000003]  <TASK>
[  +0.000008]  ttm_bo_release+0x2c4/0x330 [amdttm]
[  +0.000026]  amdttm_bo_put+0x3c/0x70 [amdttm]
[  +0.000020]  amdgpu_bo_free_kernel+0xe6/0x140 [amdgpu]
[  +0.000728]  psp_v11_0_ring_destroy+0x34/0x60 [amdgpu]
[  +0.000826]  psp_hw_init+0xe7/0x2f0 [amdgpu]
[  +0.000813]  amdgpu_device_fw_loading+0x1ad/0x2d0 [amdgpu]
[  +0.000731]  amdgpu_device_init.cold+0x108e/0x2002 [amdgpu]
[  +0.001071]  ? do_pci_enable_device+0xe1/0x110
[  +0.000011]  amdgpu_driver_load_kms+0x1a/0x160 [amdgpu]
[  +0.000729]  amdgpu_pci_probe+0x179/0x3a0 [amdgpu]

Signed-off-by: lyndonli <Lyndon.Li@amd.com>
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 494e8ce52af22..0a92fb8abf0e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	int r, i;
 	bool px = false;
 	u32 max_MBps;
+	int tmp;
 
 	adev->shutdown = false;
 	adev->flags = flags;
@@ -3775,7 +3776,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 				}
 			}
 		} else {
+			tmp = amdgpu_reset_method;
+			/* It should do a default reset when loading or reloading the driver,
+			 * regardless of the module parameter reset_method.
+			 */
+			amdgpu_reset_method = AMD_RESET_METHOD_NONE;
 			r = amdgpu_asic_reset(adev);
+			amdgpu_reset_method = tmp;
 			if (r) {
 				dev_err(adev->dev, "asic reset on init failed\n");
 				goto failed;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH AUTOSEL 6.1 4/9] drm/amd/display: Do not set drr on pipe commit
  2023-05-11 19:39 ` [PATCH AUTOSEL 6.1 4/9] drm/amd/display: Do not set drr on pipe commit Sasha Levin
@ 2023-05-15 13:04   ` Michel Dänzer
  2023-06-01  9:53     ` Sasha Levin
  0 siblings, 1 reply; 5+ messages in thread
From: Michel Dänzer @ 2023-05-15 13:04 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable
  Cc: felipe.clark, wenjing.liu, dri-devel, Jun.Lei, jiapeng.chong,
	Rodrigo Siqueira, amd-gfx, aurabindo.pillai, Alvin.Lee2,
	sunpeng.li, mwen, Daniel Wheeler, Dillon.Varone, Wesley Chalmers,
	qingqing.zhuo, Xinhui.Pan, Alex Deucher, christian.koenig

On 5/11/23 21:39, Sasha Levin wrote:
> From: Wesley Chalmers <Wesley.Chalmers@amd.com>
> 
> [ Upstream commit 474f01015ffdb74e01c2eb3584a2822c64e7b2be ]
> 
> [WHY]
> Writing to DRR registers such as OTG_V_TOTAL_MIN on the same frame as a
> pipe commit can cause underflow.
> 
> [HOW]
> Move DMUB p-state delegate into optimze_bandwidth; enabling FAMS sets
> optimized_required.
> 
> This change expects that Freesync requests are blocked when
> optimized_required is true.

This change caused a regression, see https://patchwork.freedesktop.org/patch/532240/?series=116487&rev=1#comment_972234 / 9deeb132-a317-7419-e9da-cbc0a379c0eb@daenzer.net .


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH AUTOSEL 6.1 4/9] drm/amd/display: Do not set drr on pipe commit
  2023-05-15 13:04   ` Michel Dänzer
@ 2023-06-01  9:53     ` Sasha Levin
  0 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2023-06-01  9:53 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: felipe.clark, wenjing.liu, dri-devel, Jun.Lei, jiapeng.chong,
	Rodrigo Siqueira, amd-gfx, aurabindo.pillai, Alvin.Lee2,
	sunpeng.li, mwen, Daniel Wheeler, Dillon.Varone, Wesley Chalmers,
	qingqing.zhuo, Xinhui.Pan, linux-kernel, stable, Alex Deucher,
	christian.koenig

On Mon, May 15, 2023 at 03:04:43PM +0200, Michel Dänzer wrote:
>On 5/11/23 21:39, Sasha Levin wrote:
>> From: Wesley Chalmers <Wesley.Chalmers@amd.com>
>>
>> [ Upstream commit 474f01015ffdb74e01c2eb3584a2822c64e7b2be ]
>>
>> [WHY]
>> Writing to DRR registers such as OTG_V_TOTAL_MIN on the same frame as a
>> pipe commit can cause underflow.
>>
>> [HOW]
>> Move DMUB p-state delegate into optimze_bandwidth; enabling FAMS sets
>> optimized_required.
>>
>> This change expects that Freesync requests are blocked when
>> optimized_required is true.
>
>This change caused a regression, see https://patchwork.freedesktop.org/patch/532240/?series=116487&rev=1#comment_972234 / 9deeb132-a317-7419-e9da-cbc0a379c0eb@daenzer.net .

Dropped, thanks!

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-06-01  9:53 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-11 19:39 [PATCH AUTOSEL 6.1 1/9] drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init" Sasha Levin
2023-05-11 19:39 ` [PATCH AUTOSEL 6.1 4/9] drm/amd/display: Do not set drr on pipe commit Sasha Levin
2023-05-15 13:04   ` Michel Dänzer
2023-06-01  9:53     ` Sasha Levin
2023-05-11 19:39 ` [PATCH AUTOSEL 6.1 9/9] drm/amdgpu: Use the default reset when loading or reloading the driver Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).