All of lore.kernel.org
 help / color / mirror / Atom feed
From: Douglas Anderson <dianders@chromium.org>
To: Rob Clark <robdclark@gmail.com>,
	Abhinav Kumar <quic_abhinavk@quicinc.com>,
	Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Douglas Anderson <dianders@chromium.org>,
	Akhil P Oommen <quic_akhilpo@quicinc.com>,
	Chia-I Wu <olvaffe@gmail.com>, Daniel Vetter <daniel@ffwll.ch>,
	David Airlie <airlied@gmail.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Konrad Dybcio <konrad.dybcio@somainline.org>,
	Sean Paul <sean@poorly.run>,
	dri-devel@lists.freedesktop.org, freedreno@lists.freedesktop.org,
	linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] drm/msm/a6xx: Make GPU destroy a bit safer
Date: Thu,  2 Feb 2023 10:48:43 -0800	[thread overview]
Message-ID: <20230202104822.1.I0e49003bf4dd1dead9be4a29dbee41f3b1236e48@changeid> (raw)

If, for whatever reason, we're trying process adreno_runtime_resume()
at the same time that a6xx_destroy() is running then things can go
boom. Specifically adreno_runtime_resume() will eventually call
a6xx_pm_resume() and that may try to resume the gmu.

Let's grab the GMU lock as we're destroying the GMU. That will solve
the race because a6xx_pm_resume() grabs the same lock. That makes the
access of `gmu->initialized` in a6xx_gmu_resume() safe.

We'll also return an error code in a6xx_gmu_resume() if we see that
`gmu->initialized` was false. If this happens we'll bail out of the
rest of a6xx_pm_resume(), which is good because the rest of that
function is also not good to do if we're racing with a6xx_destroy().

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This doesn't _really_ matter for upstream, but downstream in ChromeOS
we have a GPU inputboost patch. That inputboost patch was related to
adreno_runtime_resume() getting called at the same time that
a6xx_destroy() was running. This was seen at bootup when the panel
failed to probe.

Despite the fact that this isn't truly fixing any bugs upstream, it
still seems like a general improvement for the GPU driver.

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index f3c9600221d4..7f5bc73b2040 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -974,7 +974,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 	int status, ret;
 
 	if (WARN(!gmu->initialized, "The GMU is not set up yet\n"))
-		return 0;
+		return -EINVAL;
 
 	gmu->hung = false;
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index aae60cbd9164..6faea5049f76 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1746,7 +1746,9 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 
 	a6xx_llc_slices_destroy(a6xx_gpu);
 
+	mutex_lock(&a6xx_gpu->gmu.lock);
 	a6xx_gmu_remove(a6xx_gpu);
+	mutex_unlock(&a6xx_gpu->gmu.lock);
 
 	adreno_gpu_cleanup(adreno_gpu);
 
-- 
2.39.1.519.gcb327c4b5f-goog


WARNING: multiple messages have this Message-ID (diff)
From: Douglas Anderson <dianders@chromium.org>
To: Rob Clark <robdclark@gmail.com>,
	Abhinav Kumar <quic_abhinavk@quicinc.com>,
	Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: freedreno@lists.freedesktop.org,
	Akhil P Oommen <quic_akhilpo@quicinc.com>,
	Sean Paul <sean@poorly.run>,
	Konrad Dybcio <konrad.dybcio@somainline.org>,
	Douglas Anderson <dianders@chromium.org>,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	linux-arm-msm@vger.kernel.org
Subject: [PATCH] drm/msm/a6xx: Make GPU destroy a bit safer
Date: Thu,  2 Feb 2023 10:48:43 -0800	[thread overview]
Message-ID: <20230202104822.1.I0e49003bf4dd1dead9be4a29dbee41f3b1236e48@changeid> (raw)

If, for whatever reason, we're trying process adreno_runtime_resume()
at the same time that a6xx_destroy() is running then things can go
boom. Specifically adreno_runtime_resume() will eventually call
a6xx_pm_resume() and that may try to resume the gmu.

Let's grab the GMU lock as we're destroying the GMU. That will solve
the race because a6xx_pm_resume() grabs the same lock. That makes the
access of `gmu->initialized` in a6xx_gmu_resume() safe.

We'll also return an error code in a6xx_gmu_resume() if we see that
`gmu->initialized` was false. If this happens we'll bail out of the
rest of a6xx_pm_resume(), which is good because the rest of that
function is also not good to do if we're racing with a6xx_destroy().

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This doesn't _really_ matter for upstream, but downstream in ChromeOS
we have a GPU inputboost patch. That inputboost patch was related to
adreno_runtime_resume() getting called at the same time that
a6xx_destroy() was running. This was seen at bootup when the panel
failed to probe.

Despite the fact that this isn't truly fixing any bugs upstream, it
still seems like a general improvement for the GPU driver.

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index f3c9600221d4..7f5bc73b2040 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -974,7 +974,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 	int status, ret;
 
 	if (WARN(!gmu->initialized, "The GMU is not set up yet\n"))
-		return 0;
+		return -EINVAL;
 
 	gmu->hung = false;
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index aae60cbd9164..6faea5049f76 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1746,7 +1746,9 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 
 	a6xx_llc_slices_destroy(a6xx_gpu);
 
+	mutex_lock(&a6xx_gpu->gmu.lock);
 	a6xx_gmu_remove(a6xx_gpu);
+	mutex_unlock(&a6xx_gpu->gmu.lock);
 
 	adreno_gpu_cleanup(adreno_gpu);
 
-- 
2.39.1.519.gcb327c4b5f-goog


             reply	other threads:[~2023-02-02 18:49 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-02 18:48 Douglas Anderson [this message]
2023-02-02 18:48 ` [PATCH] drm/msm/a6xx: Make GPU destroy a bit safer Douglas Anderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230202104822.1.I0e49003bf4dd1dead9be4a29dbee41f3b1236e48@changeid \
    --to=dianders@chromium.org \
    --cc=airlied@gmail.com \
    --cc=daniel@ffwll.ch \
    --cc=dmitry.baryshkov@linaro.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=freedreno@lists.freedesktop.org \
    --cc=geert@linux-m68k.org \
    --cc=konrad.dybcio@somainline.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=olvaffe@gmail.com \
    --cc=quic_abhinavk@quicinc.com \
    --cc=quic_akhilpo@quicinc.com \
    --cc=robdclark@gmail.com \
    --cc=sean@poorly.run \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.