dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 5.8 04/12] drm/amdgpu: prevent double kfree ttm->sg
       [not found] <20201005144501.2527477-1-sashal@kernel.org>
@ 2020-10-05 14:44 ` Sasha Levin
  2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 08/12] drm/amd/pm: Removed fixed clock in auto mode DPM Sasha Levin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2020-10-05 14:44 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Philip Yang, Felix Kuehling, amd-gfx, dri-devel,
	Alex Deucher, Christian König

From: Philip Yang <Philip.Yang@amd.com>

[ Upstream commit 1d0e16ac1a9e800598dcfa5b6bc53b704a103390 ]

Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace:

[  420.932812] kernel BUG at
/build/linux-do9eLF/linux-4.15.0/mm/slub.c:295!
[  420.934182] invalid opcode: 0000 [#1] SMP NOPTI
[  420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE
[  420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS
1.5.4 07/09/2020
[  420.952887] RIP: 0010:__slab_free+0x180/0x2d0
[  420.954419] RSP: 0018:ffffbe426291fa60 EFLAGS: 00010246
[  420.955963] RAX: ffff9e29263e9c30 RBX: ffff9e29263e9c30 RCX:
000000018100004b
[  420.957512] RDX: ffff9e29263e9c30 RSI: fffff3d33e98fa40 RDI:
ffff9e297e407a80
[  420.959055] RBP: ffffbe426291fb00 R08: 0000000000000001 R09:
ffffffffc0d39ade
[  420.960587] R10: ffffbe426291fb20 R11: ffff9e49ffdd4000 R12:
ffff9e297e407a80
[  420.962105] R13: fffff3d33e98fa40 R14: ffff9e29263e9c30 R15:
ffff9e2954464fd8
[  420.963611] FS:  00007fa2ea097780(0000) GS:ffff9e297e840000(0000)
knlGS:0000000000000000
[  420.965144] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  420.966663] CR2: 00007f16bfffefb8 CR3: 0000001ff0c62000 CR4:
0000000000340ee0
[  420.968193] Call Trace:
[  420.969703]  ? __page_cache_release+0x3c/0x220
[  420.971294]  ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.972789]  kfree+0x168/0x180
[  420.974353]  ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu]
[  420.975850]  ? kfree+0x168/0x180
[  420.977403]  amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.978888]  ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm]
[  420.980357]  ttm_tt_destroy.part.11+0x4f/0x60 [amdttm]
[  420.981814]  ttm_tt_destroy+0x13/0x20 [amdttm]
[  420.983273]  ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm]
[  420.984725]  ttm_bo_release+0x1c9/0x360 [amdttm]
[  420.986167]  amdttm_bo_put+0x24/0x30 [amdttm]
[  420.987663]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[  420.989165]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10
[amdgpu]
[  420.990666]  kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu]

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index e59c01a83dace..9a3267f06376f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1052,6 +1052,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 
 release_sg:
 	kfree(ttm->sg);
+	ttm->sg = NULL;
 	return r;
 }
 
-- 
2.25.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH AUTOSEL 5.8 08/12] drm/amd/pm: Removed fixed clock in auto mode DPM
       [not found] <20201005144501.2527477-1-sashal@kernel.org>
  2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 04/12] drm/amdgpu: prevent double kfree ttm->sg Sasha Levin
@ 2020-10-05 14:44 ` Sasha Levin
  2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 09/12] drm/amd/display: fix return value check for hdcp_work Sasha Levin
  2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 10/12] drm/vmwgfx: Fix error handling in get_node Sasha Levin
  3 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2020-10-05 14:44 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Evan Quan, dri-devel, amd-gfx, Alex Deucher,
	Sudheesh Mavila

From: Sudheesh Mavila <sudheesh.mavila@amd.com>

[ Upstream commit 97cf32996c46d9935cc133d910a75fb687dd6144 ]

SMU10_UMD_PSTATE_PEAK_FCLK value should not be used to set the DPM.

Suggested-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
index 9ee8cf8267c88..43f7adff6cb74 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
@@ -563,6 +563,8 @@ static int smu10_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
 	struct smu10_hwmgr *data = hwmgr->backend;
 	uint32_t min_sclk = hwmgr->display_config->min_core_set_clock;
 	uint32_t min_mclk = hwmgr->display_config->min_mem_set_clock/100;
+	uint32_t index_fclk = data->clock_vol_info.vdd_dep_on_fclk->count - 1;
+	uint32_t index_socclk = data->clock_vol_info.vdd_dep_on_socclk->count - 1;
 
 	if (hwmgr->smu_version < 0x1E3700) {
 		pr_info("smu firmware version too old, can not set dpm level\n");
@@ -676,13 +678,13 @@ static int smu10_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
 		smum_send_msg_to_smc_with_parameter(hwmgr,
 						PPSMC_MSG_SetHardMinFclkByFreq,
 						hwmgr->display_config->num_display > 3 ?
-						SMU10_UMD_PSTATE_PEAK_FCLK :
+						data->clock_vol_info.vdd_dep_on_fclk->entries[0].clk :
 						min_mclk,
 						NULL);
 
 		smum_send_msg_to_smc_with_parameter(hwmgr,
 						PPSMC_MSG_SetHardMinSocclkByFreq,
-						SMU10_UMD_PSTATE_MIN_SOCCLK,
+						data->clock_vol_info.vdd_dep_on_socclk->entries[0].clk,
 						NULL);
 		smum_send_msg_to_smc_with_parameter(hwmgr,
 						PPSMC_MSG_SetHardMinVcn,
@@ -695,11 +697,11 @@ static int smu10_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
 						NULL);
 		smum_send_msg_to_smc_with_parameter(hwmgr,
 						PPSMC_MSG_SetSoftMaxFclkByFreq,
-						SMU10_UMD_PSTATE_PEAK_FCLK,
+						data->clock_vol_info.vdd_dep_on_fclk->entries[index_fclk].clk,
 						NULL);
 		smum_send_msg_to_smc_with_parameter(hwmgr,
 						PPSMC_MSG_SetSoftMaxSocclkByFreq,
-						SMU10_UMD_PSTATE_PEAK_SOCCLK,
+						data->clock_vol_info.vdd_dep_on_socclk->entries[index_socclk].clk,
 						NULL);
 		smum_send_msg_to_smc_with_parameter(hwmgr,
 						PPSMC_MSG_SetSoftMaxVcn,
-- 
2.25.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH AUTOSEL 5.8 09/12] drm/amd/display: fix return value check for hdcp_work
       [not found] <20201005144501.2527477-1-sashal@kernel.org>
  2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 04/12] drm/amdgpu: prevent double kfree ttm->sg Sasha Levin
  2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 08/12] drm/amd/pm: Removed fixed clock in auto mode DPM Sasha Levin
@ 2020-10-05 14:44 ` Sasha Levin
  2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 10/12] drm/vmwgfx: Fix error handling in get_node Sasha Levin
  3 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2020-10-05 14:44 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Feifei Xu, dri-devel, amd-gfx, Alex Deucher, Flora Cui

From: Flora Cui <flora.cui@amd.com>

[ Upstream commit 898c7302f4de1d91065e80fc46552b3ec70894ff ]

max_caps might be 0, thus hdcp_work might be ZERO_SIZE_PTR

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c
index 949d10ef83040..6dd1f3f8d9903 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c
@@ -568,7 +568,7 @@ struct hdcp_workqueue *hdcp_create_workqueue(struct amdgpu_device *adev, struct
 	int i = 0;
 
 	hdcp_work = kcalloc(max_caps, sizeof(*hdcp_work), GFP_KERNEL);
-	if (hdcp_work == NULL)
+	if (ZERO_OR_NULL_PTR(hdcp_work))
 		return NULL;
 
 	hdcp_work->srm = kcalloc(PSP_HDCP_SRM_FIRST_GEN_MAX_SIZE, sizeof(*hdcp_work->srm), GFP_KERNEL);
-- 
2.25.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH AUTOSEL 5.8 10/12] drm/vmwgfx: Fix error handling in get_node
       [not found] <20201005144501.2527477-1-sashal@kernel.org>
                   ` (2 preceding siblings ...)
  2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 09/12] drm/amd/display: fix return value check for hdcp_work Sasha Levin
@ 2020-10-05 14:44 ` Sasha Levin
  2020-10-05 16:03   ` Roland Scheidegger
  3 siblings, 1 reply; 5+ messages in thread
From: Sasha Levin @ 2020-10-05 14:44 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Martin Krastev, Roland Scheidegger, dri-devel

From: Zack Rusin <zackr@vmware.com>

[ Upstream commit f54c4442893b8dfbd3aff8e903c54dfff1aef990 ]

ttm_mem_type_manager_func.get_node was changed to return -ENOSPC
instead of setting the node pointer to NULL. Unfortunately
vmwgfx still had two places where it was explicitly converting
-ENOSPC to 0 causing regressions. This fixes those spots by
allowing -ENOSPC to be returned. That seems to fix recent
regressions with vmwgfx.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Sigend-off-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c | 2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_thp.c           | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
index 7da752ca1c34b..b93c558dd86e0 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
@@ -57,7 +57,7 @@ static int vmw_gmrid_man_get_node(struct ttm_mem_type_manager *man,
 
 	id = ida_alloc_max(&gman->gmr_ida, gman->max_gmr_ids - 1, GFP_KERNEL);
 	if (id < 0)
-		return (id != -ENOMEM ? 0 : id);
+		return id;
 
 	spin_lock(&gman->lock);
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_thp.c b/drivers/gpu/drm/vmwgfx/vmwgfx_thp.c
index b7c816ba71663..c8b9335bccd8d 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_thp.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_thp.c
@@ -95,7 +95,7 @@ static int vmw_thp_get_node(struct ttm_mem_type_manager *man,
 		mem->start = node->start;
 	}
 
-	return 0;
+	return ret;
 }
 
 
-- 
2.25.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH AUTOSEL 5.8 10/12] drm/vmwgfx: Fix error handling in get_node
  2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 10/12] drm/vmwgfx: Fix error handling in get_node Sasha Levin
@ 2020-10-05 16:03   ` Roland Scheidegger
  0 siblings, 0 replies; 5+ messages in thread
From: Roland Scheidegger @ 2020-10-05 16:03 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable
  Cc: Martin Krastev, Roland Scheidegger, dri-devel

Not entirely sure how the patches for autosel were selected, but this
patch is no good for 5.8, since the patch which introduced the breakage
in the first place is only in 5.9 (in particular it was
58e4d686d456c3e356439ae160ff4a0728940b8e, drm/ttm: cleanup
ttm_mem_type_manager_func.get_node interface v3), and at least I don't
see that one being backported to 5.8.

Roland

Am 05.10.20 um 16:44 schrieb Sasha Levin:
> From: Zack Rusin <zackr@vmware.com>
> 
> [ Upstream commit f54c4442893b8dfbd3aff8e903c54dfff1aef990 ]
> 
> ttm_mem_type_manager_func.get_node was changed to return -ENOSPC
> instead of setting the node pointer to NULL. Unfortunately
> vmwgfx still had two places where it was explicitly converting
> -ENOSPC to 0 causing regressions. This fixes those spots by
> allowing -ENOSPC to be returned. That seems to fix recent
> regressions with vmwgfx.
> 
> Signed-off-by: Zack Rusin <zackr@vmware.com>
> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
> Reviewed-by: Martin Krastev <krastevm@vmware.com>
> Sigend-off-by: Roland Scheidegger <sroland@vmware.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c | 2 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_thp.c           | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
> index 7da752ca1c34b..b93c558dd86e0 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
> @@ -57,7 +57,7 @@ static int vmw_gmrid_man_get_node(struct ttm_mem_type_manager *man,
>  
>  	id = ida_alloc_max(&gman->gmr_ida, gman->max_gmr_ids - 1, GFP_KERNEL);
>  	if (id < 0)
> -		return (id != -ENOMEM ? 0 : id);
> +		return id;
>  
>  	spin_lock(&gman->lock);
>  
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_thp.c b/drivers/gpu/drm/vmwgfx/vmwgfx_thp.c
> index b7c816ba71663..c8b9335bccd8d 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_thp.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_thp.c
> @@ -95,7 +95,7 @@ static int vmw_thp_get_node(struct ttm_mem_type_manager *man,
>  		mem->start = node->start;
>  	}
>  
> -	return 0;
> +	return ret;
>  }
>  
>  
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-10-05 16:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20201005144501.2527477-1-sashal@kernel.org>
2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 04/12] drm/amdgpu: prevent double kfree ttm->sg Sasha Levin
2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 08/12] drm/amd/pm: Removed fixed clock in auto mode DPM Sasha Levin
2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 09/12] drm/amd/display: fix return value check for hdcp_work Sasha Levin
2020-10-05 14:44 ` [PATCH AUTOSEL 5.8 10/12] drm/vmwgfx: Fix error handling in get_node Sasha Levin
2020-10-05 16:03   ` Roland Scheidegger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).