From: Rob Clark <robdclark@gmail.com> To: dri-devel@lists.freedesktop.org Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, "Rob Clark" <robdclark@chromium.org>, "Rob Clark" <robdclark@gmail.com>, "Sean Paul" <sean@poorly.run>, "Abhinav Kumar" <quic_abhinavk@quicinc.com>, "David Airlie" <airlied@linux.ie>, "Daniel Vetter" <daniel@ffwll.ch>, "Jordan Crouse" <jordan@cosmicpenguin.net>, "Christian König" <christian.koenig@amd.com>, "Dmitry Baryshkov" <dmitry.baryshkov@linaro.org>, "Douglas Anderson" <dianders@chromium.org>, "Jonathan Marek" <jonathan@marek.ca>, "Akhil P Oommen" <quic_akhilpo@quicinc.com>, "Sai Prakash Ranjan" <saiprakash.ranjan@codeaurora.org>, "Yangtao Li" <tiny.windzz@gmail.com>, "Konrad Dybcio" <konrad.dybcio@somainline.org>, linux-kernel@vger.kernel.org (open list) Subject: [PATCH] drm/msm/a6xx: Skip crashdumper state if GPU needs_hw_init Date: Thu, 9 Dec 2021 11:31:13 -0800 [thread overview] Message-ID: <20211209193118.1163248-1-robdclark@gmail.com> (raw) From: Rob Clark <robdclark@chromium.org> I am seeing some crash logs which imply that we are trying to use crashdumper hw to read back GPU state when the GPU isn't initialized. This doesn't go well (for example, GPU could be in 32b address mode and ignoring the upper bits of buffer that it is trying to dump state to). I'm not *quite* sure how we get into this state in the first place, but lets not make a bad situation worse by triggering iova fault crashes. While we're at it, also add the information about whether the GPU is initialized to the devcore dump to make this easier to see in the logs (which makes the WARN_ON() redundant and even harmful because it fills up the small bit of dmesg we get with the crash report). Signed-off-by: Rob Clark <robdclark@chromium.org> --- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 9 ++++++++- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 1 - 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c index bdd0059a81ff..55f443328d8e 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c @@ -49,6 +49,8 @@ struct a6xx_gpu_state { s32 hfi_queue_history[2][HFI_HISTORY_SZ]; struct list_head objs; + + bool gpu_initialized; }; static inline int CRASHDUMP_WRITE(u64 *in, u32 reg, u32 val) @@ -1001,7 +1003,8 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu) * write out GPU state, so we need to skip this when the SMMU is * stalled in response to an iova fault */ - if (!stalled && !a6xx_crashdumper_init(gpu, &_dumper)) { + if (!stalled && !gpu->needs_hw_init && + !a6xx_crashdumper_init(gpu, &_dumper)) { dumper = &_dumper; } @@ -1018,6 +1021,8 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu) if (snapshot_debugbus) a6xx_get_debugbus(gpu, a6xx_state); + a6xx_state->gpu_initialized = !gpu->needs_hw_init; + return &a6xx_state->base; } @@ -1246,6 +1251,8 @@ void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state, if (IS_ERR_OR_NULL(state)) return; + drm_printf(p, "gpu-initialized: %d\n", a6xx_state->gpu_initialized); + adreno_show(gpu, state, p); drm_puts(p, "gmu-log:\n"); diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 47cb40bdbd43..f33cfa4ef1c8 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -504,7 +504,6 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state) struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); int i, count = 0; - WARN_ON(gpu->needs_hw_init); WARN_ON(!mutex_is_locked(&gpu->lock)); kref_init(&state->ref); -- 2.33.1
WARNING: multiple messages have this Message-ID (diff)
From: Rob Clark <robdclark@gmail.com> To: dri-devel@lists.freedesktop.org Cc: "Rob Clark" <robdclark@chromium.org>, "Sai Prakash Ranjan" <saiprakash.ranjan@codeaurora.org>, "Douglas Anderson" <dianders@chromium.org>, "Jonathan Marek" <jonathan@marek.ca>, "Akhil P Oommen" <quic_akhilpo@quicinc.com>, "David Airlie" <airlied@linux.ie>, linux-arm-msm@vger.kernel.org, "Yangtao Li" <tiny.windzz@gmail.com>, "Konrad Dybcio" <konrad.dybcio@somainline.org>, "Abhinav Kumar" <quic_abhinavk@quicinc.com>, "Jordan Crouse" <jordan@cosmicpenguin.net>, "Sean Paul" <sean@poorly.run>, "Dmitry Baryshkov" <dmitry.baryshkov@linaro.org>, freedreno@lists.freedesktop.org, "Christian König" <christian.koenig@amd.com>, "open list" <linux-kernel@vger.kernel.org> Subject: [PATCH] drm/msm/a6xx: Skip crashdumper state if GPU needs_hw_init Date: Thu, 9 Dec 2021 11:31:13 -0800 [thread overview] Message-ID: <20211209193118.1163248-1-robdclark@gmail.com> (raw) From: Rob Clark <robdclark@chromium.org> I am seeing some crash logs which imply that we are trying to use crashdumper hw to read back GPU state when the GPU isn't initialized. This doesn't go well (for example, GPU could be in 32b address mode and ignoring the upper bits of buffer that it is trying to dump state to). I'm not *quite* sure how we get into this state in the first place, but lets not make a bad situation worse by triggering iova fault crashes. While we're at it, also add the information about whether the GPU is initialized to the devcore dump to make this easier to see in the logs (which makes the WARN_ON() redundant and even harmful because it fills up the small bit of dmesg we get with the crash report). Signed-off-by: Rob Clark <robdclark@chromium.org> --- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 9 ++++++++- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 1 - 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c index bdd0059a81ff..55f443328d8e 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c @@ -49,6 +49,8 @@ struct a6xx_gpu_state { s32 hfi_queue_history[2][HFI_HISTORY_SZ]; struct list_head objs; + + bool gpu_initialized; }; static inline int CRASHDUMP_WRITE(u64 *in, u32 reg, u32 val) @@ -1001,7 +1003,8 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu) * write out GPU state, so we need to skip this when the SMMU is * stalled in response to an iova fault */ - if (!stalled && !a6xx_crashdumper_init(gpu, &_dumper)) { + if (!stalled && !gpu->needs_hw_init && + !a6xx_crashdumper_init(gpu, &_dumper)) { dumper = &_dumper; } @@ -1018,6 +1021,8 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu) if (snapshot_debugbus) a6xx_get_debugbus(gpu, a6xx_state); + a6xx_state->gpu_initialized = !gpu->needs_hw_init; + return &a6xx_state->base; } @@ -1246,6 +1251,8 @@ void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state, if (IS_ERR_OR_NULL(state)) return; + drm_printf(p, "gpu-initialized: %d\n", a6xx_state->gpu_initialized); + adreno_show(gpu, state, p); drm_puts(p, "gmu-log:\n"); diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 47cb40bdbd43..f33cfa4ef1c8 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -504,7 +504,6 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state) struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); int i, count = 0; - WARN_ON(gpu->needs_hw_init); WARN_ON(!mutex_is_locked(&gpu->lock)); kref_init(&state->ref); -- 2.33.1
next reply other threads:[~2021-12-09 19:26 UTC|newest] Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-12-09 19:31 Rob Clark [this message] 2021-12-09 19:31 ` [PATCH] drm/msm/a6xx: Skip crashdumper state if GPU needs_hw_init Rob Clark
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20211209193118.1163248-1-robdclark@gmail.com \ --to=robdclark@gmail.com \ --cc=airlied@linux.ie \ --cc=christian.koenig@amd.com \ --cc=daniel@ffwll.ch \ --cc=dianders@chromium.org \ --cc=dmitry.baryshkov@linaro.org \ --cc=dri-devel@lists.freedesktop.org \ --cc=freedreno@lists.freedesktop.org \ --cc=jonathan@marek.ca \ --cc=jordan@cosmicpenguin.net \ --cc=konrad.dybcio@somainline.org \ --cc=linux-arm-msm@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=quic_abhinavk@quicinc.com \ --cc=quic_akhilpo@quicinc.com \ --cc=robdclark@chromium.org \ --cc=saiprakash.ranjan@codeaurora.org \ --cc=sean@poorly.run \ --cc=tiny.windzz@gmail.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.