From: dinghao.liu@zju.edu.cn
To: "Steven Price" <steven.price@arm.com>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>,
	David Airlie <airlied@linux.ie>,
	kjlu@umn.edu, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org,
	Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Subject: Re: Re: [PATCH] drm/panfrost: fix runtime pm imbalance on error
Date: Thu, 21 May 2020 15:00:14 +0800 (GMT+08:00)
Message-ID: <1986c141.ba6f5.172360851d6.Coremail.dinghao.liu@zju.edu.cn>
In-Reply-To: <73a1dc37-f862-f908-4c9f-64e256283857@arm.com>

Hi Steve,

There are two bail-out points in panfrost_job_hw_submit(): one is the
error path taken when pm_runtime_get_sync() fails, the other is the
error path taken when the WARN_ON() in the if statement triggers. The
PM imbalance addressed by this patch is between these two paths: only
one of them drops the usage count. Since the function returns void, I
think the caller of panfrost_job_hw_submit() cannot tell which path was
taken and so cannot correct the imbalance from outside this function.
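
To make the two bail-out points concrete, here is a minimal userspace
sketch. It is not the driver code: struct model_dev, model_get_sync(),
model_put() and model_hw_submit() are hypothetical names, and the
warn_condition flag simply stands in for whatever the WARN_ON() tests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of this is the kernel's definition. */
struct model_dev {
	int usage_count;     /* stands in for the runtime PM usage counter */
	bool resume_fails;   /* force the modelled get_sync to fail */
	bool warn_condition; /* force the modelled WARN_ON() to trigger */
};

/* Like pm_runtime_get_sync(): the counter is bumped even on failure. */
static int model_get_sync(struct model_dev *dev)
{
	dev->usage_count++;
	return dev->resume_fails ? -1 : 0;
}

static void model_put(struct model_dev *dev)
{
	assert(dev->usage_count > 0); /* catch an unpaired put in the model */
	dev->usage_count--;
}

/* Mirrors only the shape of the two bail-out points discussed above. */
static void model_hw_submit(struct model_dev *dev)
{
	if (model_get_sync(dev) < 0)
		return;         /* bail-out 1: the reference is kept */

	if (dev->warn_condition) {
		model_put(dev); /* bail-out 2: the reference is dropped */
		return;
	}
	/* normal path: the reference is held until the job finishes */
}
```

In this model a failed get_sync leaves usage_count at 1, while the
WARN_ON() bail-out leaves it at 0, and the void return gives the caller
no way to tell the two apart.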

panfrost_job_timedout() calls pm_runtime_put_noidle() for every job it
finds, but every job is added to pfdev->jobs just before
panfrost_job_hw_submit() is called, whichever path it then takes.
Therefore I think the imbalance still exists. However, I'm not sure
whether we should add a pm_runtime_put on the error path after
pm_runtime_get_sync(), or remove the pm_runtime_put on the error path
after WARN_ON().
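
The interaction with the timeout handler can be sketched the same way.
This is again only a userspace model under stated assumptions: all
model_* names and MODEL_NUM_SLOTS are hypothetical, and the counter is
deliberately allowed to go negative so that an unpaired put is visible.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NUM_SLOTS 3 /* hypothetical stand-in for NUM_JOB_SLOTS */

enum model_failure { FAIL_GET_SYNC, FAIL_WARN_ON };

struct model_dev {
	int usage_count;            /* may go negative in this model */
	bool jobs[MODEL_NUM_SLOTS]; /* true: a job is recorded in the slot */
};

/* Submit a job that bails out early in the given way. */
static void model_submit(struct model_dev *dev, int slot,
			 enum model_failure how)
{
	assert(slot >= 0 && slot < MODEL_NUM_SLOTS);
	dev->jobs[slot] = true; /* recorded before the hardware submit */

	dev->usage_count++;     /* get_sync increments even on failure */
	if (how == FAIL_GET_SYNC)
		return;         /* no put on this path */

	dev->usage_count--;     /* put on the WARN_ON() path */
}

/* Like the loop in panfrost_job_timedout(): one put per recorded job. */
static void model_timedout(struct model_dev *dev)
{
	for (int i = 0; i < MODEL_NUM_SLOTS; i++) {
		if (dev->jobs[i]) {
			dev->usage_count--;
			dev->jobs[i] = false;
		}
	}
}

/* Net counter value after one failed submit followed by a timeout. */
static int model_net_count(enum model_failure how)
{
	struct model_dev dev = { 0 };

	model_submit(&dev, 0, how);
	model_timedout(&dev);
	return dev.usage_count;
}
```

Here model_net_count(FAIL_GET_SYNC) is 0 and
model_net_count(FAIL_WARN_ON) is -1, so the two paths still differ by
one reference after the timeout handler has run. The model says nothing
about suspend/resume state, only about the count.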

As for the panfrost_devfreq_record_busy() problem, this may be a
separate bug that needs an independent patch to fix.
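
For reference, the busy/idle mismatch can be modelled as below. This is
a rough userspace sketch, assuming a simple busy counter as described
in your mail; struct model_devfreq and the model_* functions are
hypothetical and do not reflect the real devfreq bookkeeping.

```c
#include <stdbool.h>

/* Hypothetical model of a busy/idle counter. */
struct model_devfreq {
	int busy_count;
};

static void model_record_busy(struct model_devfreq *df)
{
	df->busy_count++;
}

static void model_record_idle(struct model_devfreq *df)
{
	df->busy_count--;
}

/*
 * One job whose submit bails out early, followed by a timeout.
 * busy_first == false: busy is recorded only after the bail-out points,
 *                      so it is never reached for this job.
 * busy_first == true:  busy is recorded at the top of the submit
 *                      function, as in the untested patch quoted below.
 */
static int model_failed_job_then_timeout(bool busy_first)
{
	struct model_devfreq df = { 0 };

	if (busy_first)
		model_record_busy(&df);

	/* submit bails out here; a later record_busy is skipped */

	model_record_idle(&df); /* done by the timeout handler */
	return df.busy_count;
}
```

With busy_first false the model ends at -1, which is the underflow you
describe; with busy_first true it ends at 0.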

Regards,
Dinghao


> On 20/05/2020 12:05, Dinghao Liu wrote:
> > pm_runtime_get_sync() increments the runtime PM usage counter even
> > if the call returns an error code. Thus a pairing decrement is needed
> > on the error handling path to keep the counter balanced.
> > 
> > Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
> 
> Actually I think we have the opposite problem. To be honest we don't 
> handle this situation very well. By the time panfrost_job_hw_submit() is 
> called the job has already been added to the pfdev->jobs array, so it's 
> considered submitted even if it never actually lands on the hardware. So 
> in the case of this function bailing out early we will then (eventually) 
> hit a timeout and trigger a GPU reset.
> 
> panfrost_job_timedout() iterates through the pfdev->jobs array and calls 
> pm_runtime_put_noidle() for each job it finds. So there's no imbalance 
> here that I can see.
> 
> Have you actually observed the situation where pm_runtime_get_sync() 
> returns a failure?
> 
> HOWEVER, it appears that by bailing out early the call to 
> panfrost_devfreq_record_busy() is never made, which as far as I can see 
> means that there may be an extra call to panfrost_devfreq_record_idle() 
> when the jobs have timed out. Which could underflow the counter.
> 
> But equally looking at panfrost_job_timedout(), we only call 
> panfrost_devfreq_record_idle() *once* even though multiple jobs might be 
> processed.
> 
> There's a completely untested patch below which in theory should fix that...
> 
> Steve
> 
> ----8<---
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 7914b1570841..f9519afca29d 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -145,6 +145,8 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>   	u64 jc_head = job->jc;
>   	int ret;
> 
> +	panfrost_devfreq_record_busy(pfdev);
> +
>   	ret = pm_runtime_get_sync(pfdev->dev);
>   	if (ret < 0)
>   		return;
> @@ -155,7 +157,6 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>   	}
> 
>   	cfg = panfrost_mmu_as_get(pfdev, &job->file_priv->mmu);
> -	panfrost_devfreq_record_busy(pfdev);
> 
>   	job_write(pfdev, JS_HEAD_NEXT_LO(js), jc_head & 0xFFFFFFFF);
>   	job_write(pfdev, JS_HEAD_NEXT_HI(js), jc_head >> 32);
> @@ -410,12 +411,12 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>   	for (i = 0; i < NUM_JOB_SLOTS; i++) {
>   		if (pfdev->jobs[i]) {
>   			pm_runtime_put_noidle(pfdev->dev);
> +			panfrost_devfreq_record_idle(pfdev);
>   			pfdev->jobs[i] = NULL;
>   		}
>   	}
>   	spin_unlock_irqrestore(&pfdev->js->job_lock, flags);
> 
> -	panfrost_devfreq_record_idle(pfdev);
>   	panfrost_device_reset(pfdev);
> 
>   	for (i = 0; i < NUM_JOB_SLOTS; i++)
