All of lore.kernel.org
 help / color / mirror / Atom feed
From: Akhil P Oommen <quic_akhilpo@quicinc.com>
To: freedreno <freedreno@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>,
	<linux-arm-msm@vger.kernel.org>, Rob Clark <robdclark@gmail.com>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	"Dmitry Baryshkov" <dmitry.baryshkov@linaro.org>
Cc: Matthias Kaehlcke <mka@chromium.org>,
	Jonathan Marek <jonathan@marek.ca>,
	Jordan Crouse <jordan@cosmicpenguin.net>,
	Douglas Anderson <dianders@chromium.org>,
	Akhil P Oommen <quic_akhilpo@quicinc.com>,
	"Abhinav Kumar" <quic_abhinavk@quicinc.com>,
	Chia-I Wu <olvaffe@gmail.com>, "Daniel Vetter" <daniel@ffwll.ch>,
	David Airlie <airlied@linux.ie>, Sean Paul <sean@poorly.run>,
	<linux-kernel@vger.kernel.org>
Subject: [PATCH v4 4/7] drm/msm: Fix cx collapse issue during recovery
Date: Wed, 17 Aug 2022 20:44:17 +0530	[thread overview]
Message-ID: <20220817204224.v4.4.I4ac27a0b34ea796ce0f938bb509e257516bc6f57@changeid> (raw)
In-Reply-To: <1660749261-7602-1-git-send-email-quic_akhilpo@quicinc.com>

There are some hardware logic under CX domain. For a successful
recovery, we should ensure cx headswitch collapses to ensure all the
stale states are cleard out. This is especially true to for a6xx family
where we can GMU co-processor.

Currently, cx doesn't collapse due to a devlink between gpu and its
smmu. So the *struct gpu device* needs to be runtime suspended to ensure
that the iommu driver removes its vote on cx gdsc.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
---

Changes in v4:
- Keep active_submit lock across the suspend & resume (Rob)
- Clear gpu->active_submits to silence a WARN() during runpm suspend (Rob)

Changes in v3:
- Simplied the pm refcount drop since we have just a single refcount now
for all active submits

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 32 +++++++++++++++++++++++++++++---
 drivers/gpu/drm/msm/msm_gpu.c         |  4 +---
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 42ed9a3..0c8f19e 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1193,7 +1193,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
-	int i;
+	int i, active_submits;
 
 	adreno_dump_info(gpu);
 
@@ -1210,8 +1210,34 @@ static void a6xx_recover(struct msm_gpu *gpu)
 	 */
 	gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);
 
-	gpu->funcs->pm_suspend(gpu);
-	gpu->funcs->pm_resume(gpu);
+	pm_runtime_dont_use_autosuspend(&gpu->pdev->dev);
+
+	/* active_submit won't change until we make a submission */
+	mutex_lock(&gpu->active_lock);
+	active_submits = gpu->active_submits;
+
+	/*
+	 * Temporarily clear active_submits count to silence a WARN() in the
+	 * runtime suspend cb
+	 */
+	gpu->active_submits = 0;
+
+	/* Drop the rpm refcount from active submits */
+	if (active_submits)
+		pm_runtime_put(&gpu->pdev->dev);
+
+	/* And the final one from recover worker */
+	pm_runtime_put_sync(&gpu->pdev->dev);
+
+	pm_runtime_use_autosuspend(&gpu->pdev->dev);
+
+	if (active_submits)
+		pm_runtime_get(&gpu->pdev->dev);
+
+	pm_runtime_get_sync(&gpu->pdev->dev);
+
+	gpu->active_submits = active_submits;
+	mutex_unlock(&gpu->active_lock);
 
 	msm_gpu_hw_init(gpu);
 }
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 1945efb..07e55a6 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -426,9 +426,7 @@ static void recover_worker(struct kthread_work *work)
 		/* retire completed submits, plus the one that hung: */
 		retire_submits(gpu);
 
-		pm_runtime_get_sync(&gpu->pdev->dev);
 		gpu->funcs->recover(gpu);
-		pm_runtime_put_sync(&gpu->pdev->dev);
 
 		/*
 		 * Replay all remaining submits starting with highest priority
@@ -445,7 +443,7 @@ static void recover_worker(struct kthread_work *work)
 		}
 	}
 
-	pm_runtime_put_sync(&gpu->pdev->dev);
+	pm_runtime_put(&gpu->pdev->dev);
 
 	mutex_unlock(&gpu->lock);
 
-- 
2.7.4


WARNING: multiple messages have this Message-ID (diff)
From: Akhil P Oommen <quic_akhilpo@quicinc.com>
To: freedreno <freedreno@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>,
	<linux-arm-msm@vger.kernel.org>, Rob Clark <robdclark@gmail.com>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	"Dmitry Baryshkov" <dmitry.baryshkov@linaro.org>
Cc: Jonathan Marek <jonathan@marek.ca>,
	Akhil P Oommen <quic_akhilpo@quicinc.com>,
	linux-kernel@vger.kernel.org,
	Abhinav Kumar <quic_abhinavk@quicinc.com>,
	Douglas Anderson <dianders@chromium.org>,
	David Airlie <airlied@linux.ie>,
	Matthias Kaehlcke <mka@chromium.org>,
	Jordan Crouse <jordan@cosmicpenguin.net>,
	Sean Paul <sean@poorly.run>
Subject: [PATCH v4 4/7] drm/msm: Fix cx collapse issue during recovery
Date: Wed, 17 Aug 2022 20:44:17 +0530	[thread overview]
Message-ID: <20220817204224.v4.4.I4ac27a0b34ea796ce0f938bb509e257516bc6f57@changeid> (raw)
In-Reply-To: <1660749261-7602-1-git-send-email-quic_akhilpo@quicinc.com>

There are some hardware logic under CX domain. For a successful
recovery, we should ensure cx headswitch collapses to ensure all the
stale states are cleard out. This is especially true to for a6xx family
where we can GMU co-processor.

Currently, cx doesn't collapse due to a devlink between gpu and its
smmu. So the *struct gpu device* needs to be runtime suspended to ensure
that the iommu driver removes its vote on cx gdsc.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
---

Changes in v4:
- Keep active_submit lock across the suspend & resume (Rob)
- Clear gpu->active_submits to silence a WARN() during runpm suspend (Rob)

Changes in v3:
- Simplied the pm refcount drop since we have just a single refcount now
for all active submits

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 32 +++++++++++++++++++++++++++++---
 drivers/gpu/drm/msm/msm_gpu.c         |  4 +---
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 42ed9a3..0c8f19e 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1193,7 +1193,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
-	int i;
+	int i, active_submits;
 
 	adreno_dump_info(gpu);
 
@@ -1210,8 +1210,34 @@ static void a6xx_recover(struct msm_gpu *gpu)
 	 */
 	gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);
 
-	gpu->funcs->pm_suspend(gpu);
-	gpu->funcs->pm_resume(gpu);
+	pm_runtime_dont_use_autosuspend(&gpu->pdev->dev);
+
+	/* active_submit won't change until we make a submission */
+	mutex_lock(&gpu->active_lock);
+	active_submits = gpu->active_submits;
+
+	/*
+	 * Temporarily clear active_submits count to silence a WARN() in the
+	 * runtime suspend cb
+	 */
+	gpu->active_submits = 0;
+
+	/* Drop the rpm refcount from active submits */
+	if (active_submits)
+		pm_runtime_put(&gpu->pdev->dev);
+
+	/* And the final one from recover worker */
+	pm_runtime_put_sync(&gpu->pdev->dev);
+
+	pm_runtime_use_autosuspend(&gpu->pdev->dev);
+
+	if (active_submits)
+		pm_runtime_get(&gpu->pdev->dev);
+
+	pm_runtime_get_sync(&gpu->pdev->dev);
+
+	gpu->active_submits = active_submits;
+	mutex_unlock(&gpu->active_lock);
 
 	msm_gpu_hw_init(gpu);
 }
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 1945efb..07e55a6 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -426,9 +426,7 @@ static void recover_worker(struct kthread_work *work)
 		/* retire completed submits, plus the one that hung: */
 		retire_submits(gpu);
 
-		pm_runtime_get_sync(&gpu->pdev->dev);
 		gpu->funcs->recover(gpu);
-		pm_runtime_put_sync(&gpu->pdev->dev);
 
 		/*
 		 * Replay all remaining submits starting with highest priority
@@ -445,7 +443,7 @@ static void recover_worker(struct kthread_work *work)
 		}
 	}
 
-	pm_runtime_put_sync(&gpu->pdev->dev);
+	pm_runtime_put(&gpu->pdev->dev);
 
 	mutex_unlock(&gpu->lock);
 
-- 
2.7.4


  parent reply	other threads:[~2022-08-18  4:31 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-17 15:14 [PATCH v4 0/7] Improve GPU Recovery Akhil P Oommen
2022-08-17 15:14 ` Akhil P Oommen
2022-08-17 15:14 ` [PATCH v4 1/7] drm/msm: Remove unnecessary pm_runtime_get/put Akhil P Oommen
2022-08-17 15:14   ` Akhil P Oommen
2022-08-17 15:14 ` [PATCH v4 2/7] drm/msm: Take single rpm refcount on behalf of all submits Akhil P Oommen
2022-08-17 15:14   ` Akhil P Oommen
2022-08-17 15:14 ` [PATCH v4 3/7] drm/msm: Correct pm_runtime votes in recover worker Akhil P Oommen
2022-08-17 15:14   ` Akhil P Oommen
2022-08-17 15:14 ` Akhil P Oommen [this message]
2022-08-17 15:14   ` [PATCH v4 4/7] drm/msm: Fix cx collapse issue during recovery Akhil P Oommen
2022-08-17 15:14 ` [PATCH v4 5/7] drm/msm/a6xx: Ensure CX collapse during gpu recovery Akhil P Oommen
2022-08-17 15:14   ` Akhil P Oommen
2022-08-18  8:54   ` Philipp Zabel
2022-08-18  8:54     ` Philipp Zabel
2022-08-17 15:14 ` [PATCH v4 6/7] drm/msm/a6xx: Improve gpu recovery sequence Akhil P Oommen
2022-08-17 15:14   ` Akhil P Oommen
2022-08-17 15:14 ` [PATCH v4 7/7] drm/msm/a6xx: Handle GMU prepare-slumber hfi failure Akhil P Oommen
2022-08-17 15:14   ` Akhil P Oommen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220817204224.v4.4.I4ac27a0b34ea796ce0f938bb509e257516bc6f57@changeid \
    --to=quic_akhilpo@quicinc.com \
    --cc=airlied@linux.ie \
    --cc=bjorn.andersson@linaro.org \
    --cc=daniel@ffwll.ch \
    --cc=dianders@chromium.org \
    --cc=dmitry.baryshkov@linaro.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=freedreno@lists.freedesktop.org \
    --cc=jonathan@marek.ca \
    --cc=jordan@cosmicpenguin.net \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mka@chromium.org \
    --cc=olvaffe@gmail.com \
    --cc=quic_abhinavk@quicinc.com \
    --cc=robdclark@gmail.com \
    --cc=sean@poorly.run \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.