From: Daniel Vetter <daniel@ffwll.ch>
To: Rob Clark <robdclark@gmail.com>,
	Andrey Grodzovsky <andrey.grodzovsky@amd.com>,
	Christian Koenig <christian.koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org, freedreno@lists.freedesktop.org,
	linux-arm-msm@vger.kernel.org, Rob Clark <robdclark@chromium.org>,
	Sean Paul <sean@poorly.run>,
	Abhinav Kumar <quic_abhinavk@quicinc.com>,
	David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>,
	Akhil P Oommen <quic_akhilpo@quicinc.com>,
	Jonathan Marek <jonathan@marek.ca>,
	AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	Vladimir Lypak <vladimir.lypak@gmail.com>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
Date: Thu, 17 Mar 2022 10:59:05 +0100
Message-ID: <YjMGac4Hnjmg1wE8@phenom.ffwll.local>
In-Reply-To: <20220310234611.424743-3-robdclark@gmail.com>

On Thu, Mar 10, 2022 at 03:46:05PM -0800, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> In the system suspend path, we don't want to be racing with the
> scheduler kthreads pushing additional queued up jobs to the hw
> queue (ringbuffer).  So park them first.  While we are at it,
> move the wait for active jobs to complete into the new system-
> suspend path.
> 
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 68 ++++++++++++++++++++--
>  1 file changed, 64 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 8859834b51b8..0440a98988fc 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu)
>  static int adreno_runtime_suspend(struct device *dev)
>  {
>  	struct msm_gpu *gpu = dev_to_gpu(dev);
> -	int remaining;
> +
> +	/*
> +	 * We should be holding a runpm ref, which will prevent
> +	 * runtime suspend.  In the system suspend path, we've
> +	 * already waited for active jobs to complete.
> +	 */
> +	WARN_ON_ONCE(gpu->active_submits);
> +
> +	return gpu->funcs->pm_suspend(gpu);
> +}
> +
> +static void suspend_scheduler(struct msm_gpu *gpu)
> +{
> +	int i;
> +
> +	/*
> +	 * Shut down the scheduler before we force suspend, so that
> +	 * suspend isn't racing with scheduler kthread feeding us
> +	 * more work.
> +	 *
> +	 * Note, we just want to park the thread, and let any jobs
> +	 * that are already on the hw queue complete normally, as
> +	 * opposed to the drm_sched_stop() path used for handling
> +	 * faulting/timed-out jobs.  We can't really cancel any jobs
> +	 * already on the hw queue without racing with the GPU.
> +	 */
> +	for (i = 0; i < gpu->nr_rings; i++) {
> +		struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> +		kthread_park(sched->thread);

Shouldn't we have some proper interfaces for this? Also I'm kinda
wondering how other drivers do this; it feels like we should have a
standard way.
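
Something like the below is roughly what I mean; purely a hypothetical
sketch, these helpers don't exist in drm_sched today and the names are
made up:

void drm_sched_suspend(struct drm_gpu_scheduler *sched)
{
	/*
	 * Stop the kthread that feeds jobs to the hw ring.  Jobs
	 * already on the ring keep running and retire normally,
	 * unlike the drm_sched_stop() path used for faulting or
	 * timed-out jobs.
	 */
	kthread_park(sched->thread);
}

void drm_sched_resume(struct drm_gpu_scheduler *sched)
{
	kthread_unpark(sched->thread);
}

Drivers would then call these instead of poking at sched->thread
directly, and there'd be one obvious place to fix things up if the
scheduler internals ever change.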

Finally, not flushing out all in-flight requests sounds like a bad idea
for system suspend/resume, since that's also the hibernation path, and
it would mean your shrinker/page reclaim stops working, at least in
full generality. That's not good for hibernation.
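
Concretely, for hibernation you'd probably want to drain everything
rather than park with work still queued, along these lines (untested
sketch reusing the wait from this patch; a real version would also
need to check that the scheduler queues themselves are empty, which
I'm hand-waving here):

static int adreno_flush_all_jobs(struct msm_gpu *gpu)
{
	int remaining;

	/*
	 * Userspace is frozen by the time the system suspend
	 * callbacks run, so no new submits can show up.  Leaving
	 * the scheduler kthreads running lets them push out
	 * whatever is still queued, and we just wait for the hw
	 * ring to go idle.
	 */
	remaining = wait_event_timeout(gpu->retire_event,
				       active_submits(gpu) == 0,
				       msecs_to_jiffies(1000));

	return remaining ? 0 : -EBUSY;
}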

Adding Christian and Andrey.
-Daniel

> +	}
> +}
> +
> +static void resume_scheduler(struct msm_gpu *gpu)
> +{
> +	int i;
> +
> +	for (i = 0; i < gpu->nr_rings; i++) {
> +		struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> +		kthread_unpark(sched->thread);
> +	}
> +}
> +
> +static int adreno_system_suspend(struct device *dev)
> +{
> +	struct msm_gpu *gpu = dev_to_gpu(dev);
> +	int remaining, ret;
> +
> +	suspend_scheduler(gpu);
>  
>  	remaining = wait_event_timeout(gpu->retire_event,
>  				       active_submits(gpu) == 0,
>  				       msecs_to_jiffies(1000));
>  	if (remaining == 0) {
>  		dev_err(dev, "Timeout waiting for GPU to suspend\n");
> -		return -EBUSY;
> +		ret = -EBUSY;
> +		goto out;
>  	}
>  
> -	return gpu->funcs->pm_suspend(gpu);
> +	ret = pm_runtime_force_suspend(dev);
> +out:
> +	if (ret)
> +		resume_scheduler(gpu);
> +
> +	return ret;
>  }
> +
> +static int adreno_system_resume(struct device *dev)
> +{
> +	resume_scheduler(dev_to_gpu(dev));
> +	return pm_runtime_force_resume(dev);
> +}
> +
>  #endif
>  
>  static const struct dev_pm_ops adreno_pm_ops = {
> -	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
> +	SET_SYSTEM_SLEEP_PM_OPS(adreno_system_suspend, adreno_system_resume)
>  	SET_RUNTIME_PM_OPS(adreno_runtime_suspend, adreno_runtime_resume, NULL)
>  };
>  
> -- 
> 2.35.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
