From: Akhil P Oommen <quic_akhilpo@quicinc.com>
To: Rajendra Nayak <quic_rjendra@quicinc.com>,
	Stephen Boyd <swboyd@chromium.org>,
	Doug Anderson <dianders@chromium.org>,
	Taniya Das <quic_tdas@quicinc.com>
Cc: <devicetree@vger.kernel.org>, Jonathan Marek <jonathan@marek.ca>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	Andy Gross <agross@kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	"Bjorn Andersson" <bjorn.andersson@linaro.org>,
	Rob Herring <robh+dt@kernel.org>, Rob Clark <robdclark@gmail.com>,
	Matthias Kaehlcke <mka@chromium.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>,
	Jordan Crouse <jordan@cosmicpenguin.net>,
	freedreno <freedreno@lists.freedesktop.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [Freedreno] [PATCH v2 5/7] arm64: dts: qcom: sc7280: Update gpu register list
Date: Wed, 20 Jul 2022 11:34:01 +0530
Message-ID: <698d3279-6a02-9b1e-a3bd-627b6afbc57e@quicinc.com>
In-Reply-To: <b6ab023b-601d-1df2-b04b-af5961b73bea@quicinc.com>

On 7/19/2022 3:26 PM, Rajendra Nayak wrote:
>
>
> On 7/19/2022 12:49 PM, Stephen Boyd wrote:
>> Quoting Akhil P Oommen (2022-07-18 23:37:16)
>>> On 7/19/2022 11:19 AM, Stephen Boyd wrote:
>>>> Quoting Akhil P Oommen (2022-07-18 21:07:05)
>>>>> On 7/14/2022 11:10 AM, Akhil P Oommen wrote:
>>>>>> IIUC, qcom gdsc driver doesn't ensure hardware is collapsed
>>>>>> since they are vote-able switches. Ideally, we should ensure
>>>>>> that the hw has collapsed for gpu recovery because there could
>>>>>> be transient votes from other subsystems like hypervisor using
>>>>>> their vote register.
>>>>>>
>>>>>> I am not sure how complex the plumbing to gpucc driver would be
>>>>>> to allow gpu driver to check hw status. OTOH, with this patch,
>>>>>> gpu driver does a read operation on a gpucc register which is
>>>>>> in always-on domain. That means we don't need to vote any
>>>>>> resource to access this register.
>>
>> Reading between the lines here, you're saying that you have to read the
>> gdsc register to make sure that the gdsc is in some state? Can you
>> clarify exactly what you're doing? And how do you know that something
>> else in the kernel can't cause the register to change after it is read?
>> It certainly seems like we can't be certain because there is voting
>> involved.

From gpu driver, cx_gdscr.bit[31] (power off status) register can be
polled to ensure that it *collapsed at least once*. We don't need to
care if something turns ON gdsc after that.

>
> yes, this looks like the best case effort to get the gpu to recover, but
> the kernel driver really has no control to make sure this condition can
> always be met (because it depends on other entities like hyp,
> trustzone etc right?)
> Why not just put a worst case polling delay?

I didn't get you entirely. Where do you mean to keep the polling delay?

>
>>
>>>>>>
>>>>>> Stephen/Rajendra/Taniya, any suggestion?
>>>> Why can't you assert a gpu reset signal with the reset APIs? This
>>>> series seems to jump through a bunch of hoops to get the gdsc and
>>>> power domain to "reset" when I don't know why any of that is
>>>> necessary. Can't we simply assert a reset to the hardware after
>>>> recovery completes so the device is back into a good known POR
>>>> (power on reset) state?
>>> That is because there is no register interface to reset GPU CX domain.
>>> The recommended sequence from HW design folks is to collapse both cx
>>> and gx gdsc to properly reset gpu/gmu.
>>>
>>
>> Ok. One knee jerk reaction is to treat the gdsc as a reset then and
>> possibly mux that request along with any power domain on/off so that if
>> the reset is requested and the power domain is off nothing happens.
>> Otherwise if the power domain is on then it manually sequences and
>> controls the two gdscs so that the GPU is reset and then restores the
>> enable state of the power domain.

It would be fatal to asynchronously pull the plug on CX gdsc forcefully
because there might be another gpu/smmu driver thread accessing
registers in cx domain.

-Akhil.