From: Daniel Vetter <daniel@ffwll.ch> To: "Christian König" <christian.koenig@amd.com> Cc: Rob Clark <robdclark@gmail.com>, Rob Clark <robdclark@chromium.org>, Jonathan Marek <jonathan@marek.ca>, David Airlie <airlied@linux.ie>, freedreno <freedreno@lists.freedesktop.org>, Vladimir Lypak <vladimir.lypak@gmail.com>, Abhinav Kumar <quic_abhinavk@quicinc.com>, dri-devel <dri-devel@lists.freedesktop.org>, Bjorn Andersson <bjorn.andersson@linaro.org>, Akhil P Oommen <quic_akhilpo@quicinc.com>, linux-arm-msm <linux-arm-msm@vger.kernel.org>, Sean Paul <sean@poorly.run>, open list <linux-kernel@vger.kernel.org>, AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Subject: Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend Date: Thu, 17 Mar 2022 18:29:29 +0100 [thread overview] Message-ID: <YjNv+csmFyjNYc5b@phenom.ffwll.local> (raw) In-Reply-To: <915537e2-ac5b-ab0e-3697-2b16a9ec8f91@amd.com> On Thu, Mar 17, 2022 at 05:44:57PM +0100, Christian König wrote: > Am 17.03.22 um 17:18 schrieb Rob Clark: > > On Thu, Mar 17, 2022 at 9:04 AM Christian König > > <christian.koenig@amd.com> wrote: > > > Am 17.03.22 um 16:10 schrieb Rob Clark: > > > > [SNIP] > > > > userspace frozen != kthread frozen .. that is what this patch is > > > > trying to address, so we aren't racing between shutting down the hw > > > > and the scheduler shoveling more jobs at us. > > > Well exactly that's the problem. The scheduler is supposed to shoveling > > > more jobs at us until it is empty. > > > > > > Thinking more about it we will then keep some dma_fence instance > > > unsignaled and that is and extremely bad idea since it can lead to > > > deadlocks during suspend. > > Hmm, perhaps that is true if you need to migrate things out of vram? > > It is at least not a problem when vram is not involved. > > No, it's much wider than that. > > See what can happen is that the memory management shrinkers want to wait for > a dma_fence during suspend. > > And if you stop the scheduler they will just wait forever. > > What you need to do instead is to drain the scheduler, e.g. call > drm_sched_entity_flush() with a proper timeout for each entity you have > created. Yeah I think properly flushing the scheduler and stopping it and cutting all drivers over to that sounds like the right approach. Generally suspend shouldn't be such a critical path that this will hurt us, all the other io queues get flushed too afaik. Resume is the thing that needs to go real fast. So a patch set to move all drivers that open code the kthread_park to the right scheduler function sounds like the right idea here to me. -Daniel > > Regards, > Christian. > > > > > > So this patch here is an absolute clear NAK from my side. If amdgpu is > > > doing something similar that is a severe bug and needs to be addressed > > > somehow. > > I think amdgpu's use of kthread_park is not related to suspend, but > > didn't look too closely. > > > > And perhaps the solution for this problem is more complex in the case > > of amdgpu, I'm not super familiar with the constraints there. But I > > think it is a fine solution for integrated GPUs. > > > > BR, > > -R > > > > > Regards, > > > Christian. > > > > > > > BR, > > > > -R > > > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
WARNING: multiple messages have this Message-ID (diff)
From: Daniel Vetter <daniel@ffwll.ch> To: "Christian König" <christian.koenig@amd.com> Cc: Rob Clark <robdclark@chromium.org>, Jonathan Marek <jonathan@marek.ca>, David Airlie <airlied@linux.ie>, linux-arm-msm <linux-arm-msm@vger.kernel.org>, Vladimir Lypak <vladimir.lypak@gmail.com>, Abhinav Kumar <quic_abhinavk@quicinc.com>, dri-devel <dri-devel@lists.freedesktop.org>, Bjorn Andersson <bjorn.andersson@linaro.org>, Akhil P Oommen <quic_akhilpo@quicinc.com>, Sean Paul <sean@poorly.run>, freedreno <freedreno@lists.freedesktop.org>, open list <linux-kernel@vger.kernel.org>, AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Subject: Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend Date: Thu, 17 Mar 2022 18:29:29 +0100 [thread overview] Message-ID: <YjNv+csmFyjNYc5b@phenom.ffwll.local> (raw) In-Reply-To: <915537e2-ac5b-ab0e-3697-2b16a9ec8f91@amd.com> On Thu, Mar 17, 2022 at 05:44:57PM +0100, Christian König wrote: > Am 17.03.22 um 17:18 schrieb Rob Clark: > > On Thu, Mar 17, 2022 at 9:04 AM Christian König > > <christian.koenig@amd.com> wrote: > > > Am 17.03.22 um 16:10 schrieb Rob Clark: > > > > [SNIP] > > > > userspace frozen != kthread frozen .. that is what this patch is > > > > trying to address, so we aren't racing between shutting down the hw > > > > and the scheduler shoveling more jobs at us. > > > Well exactly that's the problem. The scheduler is supposed to shoveling > > > more jobs at us until it is empty. > > > > > > Thinking more about it we will then keep some dma_fence instance > > > unsignaled and that is and extremely bad idea since it can lead to > > > deadlocks during suspend. > > Hmm, perhaps that is true if you need to migrate things out of vram? > > It is at least not a problem when vram is not involved. > > No, it's much wider than that. > > See what can happen is that the memory management shrinkers want to wait for > a dma_fence during suspend. > > And if you stop the scheduler they will just wait forever. > > What you need to do instead is to drain the scheduler, e.g. call > drm_sched_entity_flush() with a proper timeout for each entity you have > created. Yeah I think properly flushing the scheduler and stopping it and cutting all drivers over to that sounds like the right approach. Generally suspend shouldn't be such a critical path that this will hurt us, all the other io queues get flushed too afaik. Resume is the thing that needs to go real fast. So a patch set to move all drivers that open code the kthread_park to the right scheduler function sounds like the right idea here to me. -Daniel > > Regards, > Christian. > > > > > > So this patch here is an absolute clear NAK from my side. If amdgpu is > > > doing something similar that is a severe bug and needs to be addressed > > > somehow. > > I think amdgpu's use of kthread_park is not related to suspend, but > > didn't look too closely. > > > > And perhaps the solution for this problem is more complex in the case > > of amdgpu, I'm not super familiar with the constraints there. But I > > think it is a fine solution for integrated GPUs. > > > > BR, > > -R > > > > > Regards, > > > Christian. > > > > > > > BR, > > > > -R > > > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
next prev parent reply other threads:[~2022-03-17 17:29 UTC|newest] Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-03-10 23:46 [PATCH 0/3] drm/msm/gpu: More system suspend fixes Rob Clark 2022-03-10 23:46 ` Rob Clark 2022-03-10 23:46 ` [PATCH 1/3] drm/msm/gpu: Rename runtime suspend/resume functions Rob Clark 2022-03-10 23:46 ` Rob Clark 2022-03-11 9:26 ` AngeloGioacchino Del Regno 2022-03-11 9:26 ` AngeloGioacchino Del Regno 2022-03-10 23:46 ` [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend Rob Clark 2022-03-10 23:46 ` Rob Clark 2022-03-17 9:59 ` Daniel Vetter 2022-03-17 9:59 ` Daniel Vetter 2022-03-17 10:06 ` Christian König 2022-03-17 14:58 ` Matthew Brost 2022-03-17 14:58 ` Matthew Brost 2022-03-17 15:10 ` Rob Clark 2022-03-17 15:10 ` Rob Clark 2022-03-17 16:04 ` Christian König 2022-03-17 16:04 ` Christian König 2022-03-17 16:18 ` Rob Clark 2022-03-17 16:18 ` Rob Clark 2022-03-17 16:44 ` Christian König 2022-03-17 16:44 ` Christian König 2022-03-17 17:29 ` Daniel Vetter [this message] 2022-03-17 17:29 ` Daniel Vetter 2022-03-17 17:35 ` Rob Clark 2022-03-17 17:35 ` Rob Clark 2022-03-17 18:10 ` Andrey Grodzovsky 2022-03-17 18:10 ` Andrey Grodzovsky 2022-03-17 18:25 ` Rob Clark 2022-03-17 18:25 ` Rob Clark 2022-03-17 19:49 ` Andrey Grodzovsky 2022-03-17 19:49 ` Andrey Grodzovsky 2022-03-17 20:35 ` Rob Clark 2022-03-17 20:35 ` Rob Clark 2022-03-18 16:04 ` Andrey Grodzovsky 2022-03-18 16:04 ` Andrey Grodzovsky 2022-03-18 16:20 ` Rob Clark 2022-03-18 16:20 ` Rob Clark 2022-03-18 16:27 ` Andrey Grodzovsky 2022-03-18 16:27 ` Andrey Grodzovsky 2022-03-18 17:22 ` Rob Clark 2022-03-18 17:22 ` Rob Clark 2022-03-18 20:14 ` Andrey Grodzovsky 2022-03-18 20:14 ` Andrey Grodzovsky 2022-03-17 17:46 ` Andrey Grodzovsky 2022-03-17 17:46 ` Andrey Grodzovsky 2022-03-10 23:46 ` [PATCH 3/3] drm/msm/gpu: Remove mutex from wait_event condition Rob Clark 2022-03-10 23:46 ` Rob Clark 2022-03-17 20:45 ` [Freedreno] " Akhil P Oommen 2022-03-17 20:45 ` Akhil P Oommen 2022-03-17 21:07 ` Rob Clark 2022-03-17 21:07 ` Rob Clark 2022-03-18 2:55 ` Hillf Danton
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=YjNv+csmFyjNYc5b@phenom.ffwll.local \ --to=daniel@ffwll.ch \ --cc=airlied@linux.ie \ --cc=angelogioacchino.delregno@collabora.com \ --cc=bjorn.andersson@linaro.org \ --cc=christian.koenig@amd.com \ --cc=dri-devel@lists.freedesktop.org \ --cc=freedreno@lists.freedesktop.org \ --cc=jonathan@marek.ca \ --cc=linux-arm-msm@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=quic_abhinavk@quicinc.com \ --cc=quic_akhilpo@quicinc.com \ --cc=robdclark@chromium.org \ --cc=robdclark@gmail.com \ --cc=sean@poorly.run \ --cc=vladimir.lypak@gmail.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.