stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: YuBiao Wang <YuBiao.Wang@amd.com>,
	Luben Tuikov <luben.tuikov@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Sasha Levin <sashal@kernel.org>,
	christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@gmail.com,
	daniel@ffwll.ch, andrey.grodzovsky@amd.com, Jiadong.Zhu@amd.com,
	gpiccoli@igalia.com, Likun.Gao@amd.com, Jack.Xiao@amd.com,
	amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.2 23/25] drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
Date: Fri, 31 Mar 2023 21:41:21 -0400	[thread overview]
Message-ID: <20230401014126.3356410-23-sashal@kernel.org> (raw)
In-Reply-To: <20230401014126.3356410-1-sashal@kernel.org>

From: YuBiao Wang <YuBiao.Wang@amd.com>

[ Upstream commit 033c56474acf567a450f8bafca50e0b610f2b716 ]

[Why]
For engines not supporting soft reset, i.e. VCN, there will be a failed
ib test before mode 1 reset during asic reset. The fences in this case
are never signaled and next time when we try to free the sa_bo, kernel
will hang.

[How]
During pre_asic_reset, driver will clear job fences and afterwards the
fences' refcount will be reduced to 1. For drm_sched_jobs it will be
released in job_free_cb, and for non-sched jobs like ib_test, it's meant
to be released in sa_bo_free but only when the fences are signaled. So
we have to force signal the non_sched bad job's fence during
pre_asic_reset or the clear is not complete.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index faff4a3f96e6e..f52d0ba91a770 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -678,6 +678,15 @@ void amdgpu_fence_driver_clear_job_fences(struct amdgpu_ring *ring)
 		ptr = &ring->fence_drv.fences[i];
 		old = rcu_dereference_protected(*ptr, 1);
 		if (old && old->ops == &amdgpu_job_fence_ops) {
+			struct amdgpu_job *job;
+
+			/* For non-scheduler bad job, i.e. failed ib test, we need to signal
+			 * it right here or we won't be able to track them in fence_drv
+			 * and they will remain unsignaled during sa_bo free.
+			 */
+			job = container_of(old, struct amdgpu_job, hw_fence);
+			if (!job->base.s_fence && !dma_fence_is_signaled(old))
+				dma_fence_signal(old);
 			RCU_INIT_POINTER(*ptr, NULL);
 			dma_fence_put(old);
 		}
-- 
2.39.2


  parent reply	other threads:[~2023-04-01  1:44 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-01  1:40 [PATCH AUTOSEL 6.2 01/25] ARM: 9290/1: uaccess: Fix KASAN false-positives Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 02/25] ARM: dts: qcom: apq8026-lg-lenok: add missing reserved memory Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 03/25] arm64: dts: qcom: sa8540p-ride: correct name of remoteproc_nsp0 firmware Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 04/25] power: supply: rk817: Fix unsigned comparison with less than zero Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 05/25] power: supply: cros_usbpd: reclassify "default case!" as debug Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 06/25] power: supply: axp288_fuel_gauge: Added check for negative values Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 07/25] selftests/bpf: Fix progs/find_vma_fail1.c build error Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 08/25] wifi: mwifiex: mark OF related data as maybe unused Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 09/25] i2c: imx-lpi2c: clean rx/tx buffers upon new message Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 10/25] i2c: hisi: Avoid redundant interrupts Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 11/25] efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 12/25] block: ublk_drv: mark device as LIVE before adding disk Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 13/25] ACPI: video: Add backlight=native DMI quirk for Acer Aspire 3830TG Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 14/25] drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 15/25] hwmon: (peci/cputemp) Fix miscalculated DTS for SKX Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 16/25] hwmon: (xgene) Fix ioremap and memremap leak Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 17/25] verify_pefile: relax wrapper length check Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 18/25] asymmetric_keys: log on fatal failures in PE/pkcs7 Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 19/25] nvme: send Identify with CNS 06h only to I/O controllers Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 20/25] wifi: iwlwifi: mvm: fix mvmtxq->stopped handling Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 21/25] wifi: iwlwifi: mvm: protect TXQ list manipulation Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 22/25] drm/amdgpu: add mes resume when do gfx post soft reset Sasha Levin
2023-04-01  1:41 ` Sasha Levin [this message]
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 24/25] drm/amdgpu/gfx: set cg flags to enter/exit safe mode Sasha Levin
2023-04-01  1:41 ` [PATCH AUTOSEL 6.2 25/25] ACPI: resource: Add Medion S17413 to IRQ override quirk Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230401014126.3356410-23-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=Jack.Xiao@amd.com \
    --cc=Jiadong.Zhu@amd.com \
    --cc=Likun.Gao@amd.com \
    --cc=Xinhui.Pan@amd.com \
    --cc=YuBiao.Wang@amd.com \
    --cc=airlied@gmail.com \
    --cc=alexander.deucher@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=andrey.grodzovsky@amd.com \
    --cc=christian.koenig@amd.com \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gpiccoli@igalia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luben.tuikov@amd.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).