* [PATCH AUTOSEL 6.3 01/11] drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init"
@ 2023-05-11 19:37 Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 04/11] drm/amd/display: Do not set drr on pipe commit Sasha Levin
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Sasha Levin @ 2023-05-11 19:37 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, andrey.grodzovsky, lijo.lazar, Chong Li, dri-devel,
Amaranath.Somalapuram, Bokun.Zhang, JingWen.Chen2, Xinhui.Pan,
amd-gfx, YiPeng.Chai, mario.limonciello, Alex Deucher,
christian.koenig, Hawking.Zhang
From: Chong Li <chongli2@amd.com>
[ Upstream commit 38eecbe086a4e52f54b2bbda8feba65d44addbef ]
[WHY]
Function "amdgpu_irq_update()" called by "amdgpu_device_ip_late_init()" is an atomic context.
We shouldn't access registers through KIQ since "msleep()" may be called in "amdgpu_kiq_rreg()".
[HOW]
Move function "amdgpu_virt_release_full_gpu()" after function "amdgpu_device_ip_late_init()",
to ensure that registers be accessed through RLCG instead of KIQ.
Call Trace:
<TASK>
show_stack+0x52/0x69
dump_stack_lvl+0x49/0x6d
dump_stack+0x10/0x18
__schedule_bug.cold+0x4f/0x6b
__schedule+0x473/0x5d0
? __wake_up_klogd.part.0+0x40/0x70
? vprintk_emit+0xbe/0x1f0
schedule+0x68/0x110
schedule_timeout+0x87/0x160
? timer_migration_handler+0xa0/0xa0
msleep+0x2d/0x50
amdgpu_kiq_rreg+0x18d/0x1f0 [amdgpu]
amdgpu_device_rreg.part.0+0x59/0xd0 [amdgpu]
amdgpu_device_rreg+0x3a/0x50 [amdgpu]
amdgpu_sriov_rreg+0x3c/0xb0 [amdgpu]
gfx_v10_0_set_gfx_eop_interrupt_state.constprop.0+0x16c/0x190 [amdgpu]
gfx_v10_0_set_eop_interrupt_state+0xa5/0xb0 [amdgpu]
amdgpu_irq_update+0x53/0x80 [amdgpu]
amdgpu_irq_get+0x7c/0xb0 [amdgpu]
amdgpu_fence_driver_hw_init+0x58/0x90 [amdgpu]
amdgpu_device_init.cold+0x16b7/0x2022 [amdgpu]
Signed-off-by: Chong Li <chongli2@amd.com>
Reviewed-by: JingWen.Chen2@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 32 ++++++++++++----------
1 file changed, 17 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 3d98fc2ad36b0..7543683b2583f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2522,8 +2522,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
amdgpu_fru_get_product_info(adev);
init_failed:
- if (amdgpu_sriov_vf(adev))
- amdgpu_virt_release_full_gpu(adev, true);
return r;
}
@@ -3840,18 +3838,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
r = amdgpu_device_ip_init(adev);
if (r) {
- /* failed in exclusive mode due to timeout */
- if (amdgpu_sriov_vf(adev) &&
- !amdgpu_sriov_runtime(adev) &&
- amdgpu_virt_mmio_blocked(adev) &&
- !amdgpu_virt_wait_reset(adev)) {
- dev_err(adev->dev, "VF exclusive mode timeout\n");
- /* Don't send request since VF is inactive. */
- adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME;
- adev->virt.ops = NULL;
- r = -EAGAIN;
- goto release_ras_con;
- }
dev_err(adev->dev, "amdgpu_device_ip_init failed\n");
amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_AMDGPU_INIT_FAIL, 0, 0);
goto release_ras_con;
@@ -3923,8 +3909,10 @@ int amdgpu_device_init(struct amdgpu_device *adev,
msecs_to_jiffies(AMDGPU_RESUME_MS));
}
- if (amdgpu_sriov_vf(adev))
+ if (amdgpu_sriov_vf(adev)) {
+ amdgpu_virt_release_full_gpu(adev, true);
flush_delayed_work(&adev->delayed_init_work);
+ }
r = sysfs_create_files(&adev->dev->kobj, amdgpu_dev_attributes);
if (r)
@@ -3961,6 +3949,20 @@ int amdgpu_device_init(struct amdgpu_device *adev,
return 0;
release_ras_con:
+ if (amdgpu_sriov_vf(adev))
+ amdgpu_virt_release_full_gpu(adev, true);
+
+ /* failed in exclusive mode due to timeout */
+ if (amdgpu_sriov_vf(adev) &&
+ !amdgpu_sriov_runtime(adev) &&
+ amdgpu_virt_mmio_blocked(adev) &&
+ !amdgpu_virt_wait_reset(adev)) {
+ dev_err(adev->dev, "VF exclusive mode timeout\n");
+ /* Don't send request since VF is inactive. */
+ adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME;
+ adev->virt.ops = NULL;
+ r = -EAGAIN;
+ }
amdgpu_release_ras_context(adev);
failed:
--
2.39.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH AUTOSEL 6.3 04/11] drm/amd/display: Do not set drr on pipe commit
2023-05-11 19:37 [PATCH AUTOSEL 6.3 01/11] drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init" Sasha Levin
@ 2023-05-11 19:37 ` Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 05/11] drm/amd/display: fix memleak in aconnector->timing_requested Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 11/11] drm/amdgpu: Use the default reset when loading or reloading the driver Sasha Levin
2 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2023-05-11 19:37 UTC (permalink / raw)
To: linux-kernel, stable
Cc: felipe.clark, wenjing.liu, dri-devel, Jun.Lei, Sasha Levin,
jiapeng.chong, Rodrigo Siqueira, amd-gfx, aurabindo.pillai,
Alvin.Lee2, sunpeng.li, mwen, Daniel Wheeler, Dillon.Varone,
Wesley Chalmers, qingqing.zhuo, Xinhui.Pan, Alex Deucher,
christian.koenig
From: Wesley Chalmers <Wesley.Chalmers@amd.com>
[ Upstream commit 474f01015ffdb74e01c2eb3584a2822c64e7b2be ]
[WHY]
Writing to DRR registers such as OTG_V_TOTAL_MIN on the same frame as a
pipe commit can cause underflow.
[HOW]
Move DMUB p-state delegate into optimze_bandwidth; enabling FAMS sets
optimized_required.
This change expects that Freesync requests are blocked when
optimized_required is true.
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 6 ++++++
drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 7 +++++++
2 files changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index b83873a3a534a..7f2e9a300fed9 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -2092,6 +2092,12 @@ void dcn20_optimize_bandwidth(
if (hubbub->funcs->program_compbuf_size)
hubbub->funcs->program_compbuf_size(hubbub, context->bw_ctx.bw.dcn.compbuf_size_kb, true);
+ if (context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching) {
+ dc_dmub_srv_p_state_delegate(dc,
+ true, context);
+ context->bw_ctx.bw.dcn.clk.p_state_change_support = true;
+ }
+
dc->clk_mgr->funcs->update_clocks(
dc->clk_mgr,
context,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index df787fcf8e86e..7f4fe6f8f0214 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -992,11 +992,18 @@ void dcn30_set_disp_pattern_generator(const struct dc *dc,
void dcn30_prepare_bandwidth(struct dc *dc,
struct dc_state *context)
{
+ if (context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching) {
+ dc->optimized_required = true;
+ context->bw_ctx.bw.dcn.clk.p_state_change_support = false;
+ }
+
if (dc->clk_mgr->dc_mode_softmax_enabled)
if (dc->clk_mgr->clks.dramclk_khz <= dc->clk_mgr->bw_params->dc_mode_softmax_memclk * 1000 &&
context->bw_ctx.bw.dcn.clk.dramclk_khz > dc->clk_mgr->bw_params->dc_mode_softmax_memclk * 1000)
dc->clk_mgr->funcs->set_max_memclk(dc->clk_mgr, dc->clk_mgr->bw_params->clk_table.entries[dc->clk_mgr->bw_params->clk_table.num_entries - 1].memclk_mhz);
dcn20_prepare_bandwidth(dc, context);
+
+ dc_dmub_srv_p_state_delegate(dc, false, context);
}
--
2.39.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH AUTOSEL 6.3 05/11] drm/amd/display: fix memleak in aconnector->timing_requested
2023-05-11 19:37 [PATCH AUTOSEL 6.3 01/11] drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init" Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 04/11] drm/amd/display: Do not set drr on pipe commit Sasha Levin
@ 2023-05-11 19:37 ` Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 11/11] drm/amdgpu: Use the default reset when loading or reloading the driver Sasha Levin
2 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2023-05-11 19:37 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, stylon.wang, sunpeng.li, Qingqing Zhuo, Xinhui.Pan,
Rodrigo.Siqueira, amd-gfx, hdegoede, Daniel Wheeler,
aurabindo.pillai, Hersen Wu, dri-devel, Alex Deucher,
christian.koenig
From: Hersen Wu <hersenxs.wu@amd.com>
[ Upstream commit 025ce392b5f213696ca0af3e07735d0fae020694 ]
[Why]
when amdgpu_dm_update_connector_after_detect is called
two times successively with valid sink, memory allocated of
aconnector->timing_requested for the first call is not free.
this causes memeleak.
[How]
allocate memory only when aconnector->timing_requested
is null.
Reviewed-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a01fd41643fc2..5dced13f4ae8d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3094,9 +3094,12 @@ void amdgpu_dm_update_connector_after_detect(
aconnector->edid);
}
- aconnector->timing_requested = kzalloc(sizeof(struct dc_crtc_timing), GFP_KERNEL);
- if (!aconnector->timing_requested)
- dm_error("%s: failed to create aconnector->requested_timing\n", __func__);
+ if (!aconnector->timing_requested) {
+ aconnector->timing_requested =
+ kzalloc(sizeof(struct dc_crtc_timing), GFP_KERNEL);
+ if (!aconnector->timing_requested)
+ dm_error("failed to create aconnector->requested_timing\n");
+ }
drm_connector_update_edid_property(connector, aconnector->edid);
amdgpu_dm_update_freesync_caps(connector, aconnector->edid);
--
2.39.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH AUTOSEL 6.3 11/11] drm/amdgpu: Use the default reset when loading or reloading the driver
2023-05-11 19:37 [PATCH AUTOSEL 6.3 01/11] drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init" Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 04/11] drm/amd/display: Do not set drr on pipe commit Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 05/11] drm/amd/display: fix memleak in aconnector->timing_requested Sasha Levin
@ 2023-05-11 19:37 ` Sasha Levin
2 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2023-05-11 19:37 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, andrey.grodzovsky, lijo.lazar,
Amaranath.Somalapuram, lyndonli, dri-devel, Xinhui.Pan, amd-gfx,
Yunxiang Li, YiPeng.Chai, mario.limonciello, Feifei Xu,
Alex Deucher, Bokun.Zhang, Kenneth Feng, christian.koenig,
Hawking.Zhang
From: lyndonli <Lyndon.Li@amd.com>
[ Upstream commit 4eea7fb980dc44545a32eec92e2662053b34cd9d ]
Below call trace and errors are observed when reloading
amdgpu driver with the module parameter reset_method=3.
It should do a default reset when loading or reloading the
driver, regardless of the module parameter reset_method.
v2: add comments inside and modify commit messages.
[ +2.180243] [drm] psp gfx command ID_LOAD_TOC(0x20) failed
and response status is (0x0)
[ +0.000011] [drm:psp_hw_start [amdgpu]] *ERROR* Failed to load toc
[ +0.000890] [drm:psp_hw_start [amdgpu]] *ERROR* PSP tmr init failed!
[ +0.020683] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to
clear memory with ring turned off.
[ +0.000003] RIP: 0010:amdgpu_bo_release_notify+0x1ef/0x210 [amdgpu]
[ +0.000004] Call Trace:
[ +0.000003] <TASK>
[ +0.000008] ttm_bo_release+0x2c4/0x330 [amdttm]
[ +0.000026] amdttm_bo_put+0x3c/0x70 [amdttm]
[ +0.000020] amdgpu_bo_free_kernel+0xe6/0x140 [amdgpu]
[ +0.000728] psp_v11_0_ring_destroy+0x34/0x60 [amdgpu]
[ +0.000826] psp_hw_init+0xe7/0x2f0 [amdgpu]
[ +0.000813] amdgpu_device_fw_loading+0x1ad/0x2d0 [amdgpu]
[ +0.000731] amdgpu_device_init.cold+0x108e/0x2002 [amdgpu]
[ +0.001071] ? do_pci_enable_device+0xe1/0x110
[ +0.000011] amdgpu_driver_load_kms+0x1a/0x160 [amdgpu]
[ +0.000729] amdgpu_pci_probe+0x179/0x3a0 [amdgpu]
Signed-off-by: lyndonli <Lyndon.Li@amd.com>
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7543683b2583f..b1a2d4b0918fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3557,6 +3557,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
int r, i;
bool px = false;
u32 max_MBps;
+ int tmp;
adev->shutdown = false;
adev->flags = flags;
@@ -3778,7 +3779,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
}
}
} else {
+ tmp = amdgpu_reset_method;
+ /* It should do a default reset when loading or reloading the driver,
+ * regardless of the module parameter reset_method.
+ */
+ amdgpu_reset_method = AMD_RESET_METHOD_NONE;
r = amdgpu_asic_reset(adev);
+ amdgpu_reset_method = tmp;
if (r) {
dev_err(adev->dev, "asic reset on init failed\n");
goto failed;
--
2.39.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-05-11 19:38 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-11 19:37 [PATCH AUTOSEL 6.3 01/11] drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init" Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 04/11] drm/amd/display: Do not set drr on pipe commit Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 05/11] drm/amd/display: fix memleak in aconnector->timing_requested Sasha Levin
2023-05-11 19:37 ` [PATCH AUTOSEL 6.3 11/11] drm/amdgpu: Use the default reset when loading or reloading the driver Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).