From: Douglas Anderson <dianders@chromium.org> To: Rob Clark <robdclark@gmail.com>, Abhinav Kumar <quic_abhinavk@quicinc.com>, Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Douglas Anderson <dianders@chromium.org>, Akhil P Oommen <quic_akhilpo@quicinc.com>, Chia-I Wu <olvaffe@gmail.com>, Daniel Vetter <daniel@ffwll.ch>, David Airlie <airlied@gmail.com>, Geert Uytterhoeven <geert@linux-m68k.org>, Konrad Dybcio <konrad.dybcio@somainline.org>, Sean Paul <sean@poorly.run>, dri-devel@lists.freedesktop.org, freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] drm/msm/a6xx: Make GPU destroy a bit safer Date: Thu, 2 Feb 2023 10:48:43 -0800 [thread overview] Message-ID: <20230202104822.1.I0e49003bf4dd1dead9be4a29dbee41f3b1236e48@changeid> (raw) If, for whatever reason, we're trying process adreno_runtime_resume() at the same time that a6xx_destroy() is running then things can go boom. Specifically adreno_runtime_resume() will eventually call a6xx_pm_resume() and that may try to resume the gmu. Let's grab the GMU lock as we're destroying the GMU. That will solve the race because a6xx_pm_resume() grabs the same lock. That makes the access of `gmu->initialized` in a6xx_gmu_resume() safe. We'll also return an error code in a6xx_gmu_resume() if we see that `gmu->initialized` was false. If this happens we'll bail out of the rest of a6xx_pm_resume(), which is good because the rest of that function is also not good to do if we're racing with a6xx_destroy(). Signed-off-by: Douglas Anderson <dianders@chromium.org> --- This doesn't _really_ matter for upstream, but downstream in ChromeOS we have a GPU inputboost patch. That inputboost patch was related to adreno_runtime_resume() getting called at the same time that a6xx_destroy() was running. This was seen at bootup when the panel failed to probe. Despite the fact that this isn't truly fixing any bugs upstream, it still seems like a general improvement for the GPU driver. drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index f3c9600221d4..7f5bc73b2040 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -974,7 +974,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) int status, ret; if (WARN(!gmu->initialized, "The GMU is not set up yet\n")) - return 0; + return -EINVAL; gmu->hung = false; diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index aae60cbd9164..6faea5049f76 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1746,7 +1746,9 @@ static void a6xx_destroy(struct msm_gpu *gpu) a6xx_llc_slices_destroy(a6xx_gpu); + mutex_lock(&a6xx_gpu->gmu.lock); a6xx_gmu_remove(a6xx_gpu); + mutex_unlock(&a6xx_gpu->gmu.lock); adreno_gpu_cleanup(adreno_gpu); -- 2.39.1.519.gcb327c4b5f-goog
WARNING: multiple messages have this Message-ID (diff)
From: Douglas Anderson <dianders@chromium.org> To: Rob Clark <robdclark@gmail.com>, Abhinav Kumar <quic_abhinavk@quicinc.com>, Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: freedreno@lists.freedesktop.org, Akhil P Oommen <quic_akhilpo@quicinc.com>, Sean Paul <sean@poorly.run>, Konrad Dybcio <konrad.dybcio@somainline.org>, Douglas Anderson <dianders@chromium.org>, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Geert Uytterhoeven <geert@linux-m68k.org>, linux-arm-msm@vger.kernel.org Subject: [PATCH] drm/msm/a6xx: Make GPU destroy a bit safer Date: Thu, 2 Feb 2023 10:48:43 -0800 [thread overview] Message-ID: <20230202104822.1.I0e49003bf4dd1dead9be4a29dbee41f3b1236e48@changeid> (raw) If, for whatever reason, we're trying process adreno_runtime_resume() at the same time that a6xx_destroy() is running then things can go boom. Specifically adreno_runtime_resume() will eventually call a6xx_pm_resume() and that may try to resume the gmu. Let's grab the GMU lock as we're destroying the GMU. That will solve the race because a6xx_pm_resume() grabs the same lock. That makes the access of `gmu->initialized` in a6xx_gmu_resume() safe. We'll also return an error code in a6xx_gmu_resume() if we see that `gmu->initialized` was false. If this happens we'll bail out of the rest of a6xx_pm_resume(), which is good because the rest of that function is also not good to do if we're racing with a6xx_destroy(). Signed-off-by: Douglas Anderson <dianders@chromium.org> --- This doesn't _really_ matter for upstream, but downstream in ChromeOS we have a GPU inputboost patch. That inputboost patch was related to adreno_runtime_resume() getting called at the same time that a6xx_destroy() was running. This was seen at bootup when the panel failed to probe. Despite the fact that this isn't truly fixing any bugs upstream, it still seems like a general improvement for the GPU driver. drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index f3c9600221d4..7f5bc73b2040 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -974,7 +974,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) int status, ret; if (WARN(!gmu->initialized, "The GMU is not set up yet\n")) - return 0; + return -EINVAL; gmu->hung = false; diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index aae60cbd9164..6faea5049f76 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1746,7 +1746,9 @@ static void a6xx_destroy(struct msm_gpu *gpu) a6xx_llc_slices_destroy(a6xx_gpu); + mutex_lock(&a6xx_gpu->gmu.lock); a6xx_gmu_remove(a6xx_gpu); + mutex_unlock(&a6xx_gpu->gmu.lock); adreno_gpu_cleanup(adreno_gpu); -- 2.39.1.519.gcb327c4b5f-goog
next reply other threads:[~2023-02-02 18:49 UTC|newest] Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-02-02 18:48 Douglas Anderson [this message] 2023-02-02 18:48 ` [PATCH] drm/msm/a6xx: Make GPU destroy a bit safer Douglas Anderson
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20230202104822.1.I0e49003bf4dd1dead9be4a29dbee41f3b1236e48@changeid \ --to=dianders@chromium.org \ --cc=airlied@gmail.com \ --cc=daniel@ffwll.ch \ --cc=dmitry.baryshkov@linaro.org \ --cc=dri-devel@lists.freedesktop.org \ --cc=freedreno@lists.freedesktop.org \ --cc=geert@linux-m68k.org \ --cc=konrad.dybcio@somainline.org \ --cc=linux-arm-msm@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=olvaffe@gmail.com \ --cc=quic_abhinavk@quicinc.com \ --cc=quic_akhilpo@quicinc.com \ --cc=robdclark@gmail.com \ --cc=sean@poorly.run \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.