From: Chema Casanova <jmcasanova@igalia.com>
To: Yukimasa Sugizaki <ysugi@idein.jp>, dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Subject: Re: [PATCH 1/3] drm/v3d: Don't resubmit guilty CSD jobs
Date: Thu, 4 Feb 2021 14:54:11 +0100
Message-ID: <c934402e-efe7-8e7a-0182-5ffd2d05a4e8@igalia.com>
In-Reply-To: <20200903164821.2879-2-i.can.speak.c.and.basic@gmail.com>

I've tested the patch and confirmed that it applies cleanly on top of drm-next.

I've also confirmed that the timeout happens with the test case
described by the developer:

https://github.com/raspberrypi/linux/pull/3816#issuecomment-682251862

Bearing in mind that this is my first review of a patch on the v3d
kernel side, I think this patch is fine.

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>

On 3/9/20 18:48, Yukimasa Sugizaki wrote:
> From: Yukimasa Sugizaki <ysugi@idein.jp>
>
> The current code does not check for the timeout error set by
> drm_sched_resubmit_jobs(), which results in an infinite GPU reset loop
> once a timeout occurs:
>
> [  178.799106] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
> [  178.807836] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
> [  179.839132] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
> [  179.847865] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
> [  180.879146] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
> [  180.887925] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
> [  181.919188] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
> [  181.928002] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
> ...
>
> This commit adds the check for timeout as in v3d_{bin,render}_job_run():
>
> [   66.408962] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
> [   66.417734] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
> [   66.428296] [drm] Skipping CSD job resubmission due to previous error (-125)
>
> Here -125 is -ECANCELED. Users currently have no way to check whether
> the timeout has occurred other than inspecting dmesg.
>
> Signed-off-by: Yukimasa Sugizaki <ysugi@idein.jp>
> ---
>   drivers/gpu/drm/v3d/v3d_sched.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 0747614a78f0..001216f22017 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -226,6 +226,17 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
>   	struct dma_fence *fence;
>   	int i;
>
> +	/* This error is set to -ECANCELED by drm_sched_resubmit_jobs() if this
> +	 * job timed out more than sched_job->sched->hang_limit times.
> +	 */
> +	int error = sched_job->s_fence->finished.error;
> +
> +	if (unlikely(error < 0)) {
> +		DRM_WARN("Skipping CSD job resubmission due to previous error (%d)\n",
> +			 error);
> +		return ERR_PTR(error);
> +	}
> +
>   	v3d->csd_job = job;
>
>   	v3d_invalidate_caches(v3d);
> --
> 2.7.4
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>