dri-devel.lists.freedesktop.org archive mirror
* [PATCH] drm/etnaviv: always start/stop scheduler in timeout processing
@ 2020-08-24 11:02 Lucas Stach
  2020-08-24 11:54 ` Russell King - ARM Linux admin
  2020-08-24 14:11 ` Fabio Estevam
  0 siblings, 2 replies; 4+ messages in thread
From: Lucas Stach @ 2020-08-24 11:02 UTC
  To: etnaviv; +Cc: patchwork-lst, kernel, dri-devel, Russell King

The drm scheduler currently expects that the stop/start sequence is always
executed in the timeout handling, as the job at the head of the hardware
execution list is always removed from the ring mirror before the driver
function is called and only inserted back into the list when starting the
scheduler.

This adds some unnecessary overhead if the timeout handler determines
that the GPU is still executing jobs normally and just wishes to extend
the timeout, but a better solution requires a major rearchitecture of the
scheduler, which is not suitable as a fix.

Fixes: 135517d3565b drm/scheduler: Avoid accessing freed bad job.)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
---
 drivers/gpu/drm/etnaviv/etnaviv_sched.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 4e3e95dce6d8..cd46c882269c 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -89,12 +89,15 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 	u32 dma_addr;
 	int change;
 
+	/* block scheduler */
+	drm_sched_stop(&gpu->sched, sched_job);
+
 	/*
 	 * If the GPU managed to complete this jobs fence, the timout is
 	 * spurious. Bail out.
 	 */
 	if (dma_fence_is_signaled(submit->out_fence))
-		return;
+		goto out_no_timeout;
 
 	/*
 	 * If the GPU is still making forward progress on the front-end (which
@@ -105,12 +108,9 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 	change = dma_addr - gpu->hangcheck_dma_addr;
 	if (change < 0 || change > 16) {
 		gpu->hangcheck_dma_addr = dma_addr;
-		return;
+		goto out_no_timeout;
 	}
 
-	/* block scheduler */
-	drm_sched_stop(&gpu->sched, sched_job);
-
 	if(sched_job)
 		drm_sched_increase_karma(sched_job);
 
@@ -120,6 +120,7 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 
 	drm_sched_resubmit_jobs(&gpu->sched);
 
+out_no_timeout:
 	/* restart scheduler after GPU is usable again */
 	drm_sched_start(&gpu->sched, true);
 }
-- 
2.20.1
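
The control flow that results from the patch above can be illustrated outside the kernel with a small mock. This is only a sketch under stated assumptions: `mock_sched`, `mock_gpu`, `mock_sched_stop()`, `mock_sched_start()` and `mock_timedout_job()` are hypothetical stand-ins invented for illustration, not the drm scheduler or etnaviv API, and the actual GPU recovery is reduced to a counter. It shows that every exit path from the handler, including the two "no real timeout" paths, now runs a matched stop/start pair.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the scheduler: only tracks stop/start calls. */
struct mock_sched {
	bool stopped;
	int stop_calls;
	int start_calls;
};

/* Hypothetical stand-in for the GPU state consulted by the handler. */
struct mock_gpu {
	struct mock_sched sched;
	bool fence_signaled;         /* models dma_fence_is_signaled() */
	uint32_t dma_addr;           /* models the current front-end DMA address */
	uint32_t hangcheck_dma_addr; /* address seen at the previous timeout */
	int resets;                  /* number of simulated GPU recoveries */
};

static void mock_sched_stop(struct mock_sched *s)
{
	s->stopped = true;
	s->stop_calls++;
}

static void mock_sched_start(struct mock_sched *s)
{
	s->stopped = false;
	s->start_calls++;
}

/* Mirrors the shape of the patched handler: stop first, always start last. */
static void mock_timedout_job(struct mock_gpu *gpu)
{
	int change;

	/* block scheduler before any early-exit check */
	mock_sched_stop(&gpu->sched);

	/* spurious timeout: the job already completed */
	if (gpu->fence_signaled)
		goto out_no_timeout;

	/* front-end still making progress: remember position, extend timeout */
	change = (int)(gpu->dma_addr - gpu->hangcheck_dma_addr);
	if (change < 0 || change > 16) {
		gpu->hangcheck_dma_addr = gpu->dma_addr;
		goto out_no_timeout;
	}

	/* real hang: recovery is modeled as a counter only */
	gpu->resets++;

out_no_timeout:
	/* restart scheduler on every path */
	mock_sched_start(&gpu->sched);
}
```

Calling `mock_timedout_job()` with a signaled fence, with a moving DMA address, or with a stalled DMA address leaves `stop_calls` equal to `start_calls` in each case; only the stalled case increments `resets`.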

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [PATCH] drm/etnaviv: always start/stop scheduler in timeout processing
  2020-08-24 11:02 [PATCH] drm/etnaviv: always start/stop scheduler in timeout processing Lucas Stach
@ 2020-08-24 11:54 ` Russell King - ARM Linux admin
  2020-08-24 14:11 ` Fabio Estevam
  1 sibling, 0 replies; 4+ messages in thread
From: Russell King - ARM Linux admin @ 2020-08-24 11:54 UTC
  To: Lucas Stach; +Cc: kernel, etnaviv, dri-devel, patchwork-lst

On Mon, Aug 24, 2020 at 01:02:48PM +0200, Lucas Stach wrote:
> The drm scheduler currently expects that the stop/start sequence is always
> executed in the timeout handling, as the job at the head of the hardware
> execution list is always removed from the ring mirror before the driver
> function is called and only inserted back into the list when starting the
> scheduler.
> 
> This adds some unnecessary overhead if the timeout handler determines
> that the GPU is still executing jobs normally and just wishes to extend
> the timeout, but a better solution requires a major rearchitecture of the
> scheduler, which is not suitable as a fix.
> 
> Fixes: 135517d3565b drm/scheduler: Avoid accessing freed bad job.)
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

From a brief test, this seems to fix the problem, thanks.

Tested-by: Russell King <rmk+kernel@armlinux.org.uk>

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

* Re: [PATCH] drm/etnaviv: always start/stop scheduler in timeout processing
  2020-08-24 11:02 [PATCH] drm/etnaviv: always start/stop scheduler in timeout processing Lucas Stach
  2020-08-24 11:54 ` Russell King - ARM Linux admin
@ 2020-08-24 14:11 ` Fabio Estevam
  2020-08-25  8:44   ` Lucas Stach
  1 sibling, 1 reply; 4+ messages in thread
From: Fabio Estevam @ 2020-08-24 14:11 UTC
  To: Lucas Stach
  Cc: The etnaviv authors, DRI mailing list, patchwork-lst,
	Sascha Hauer, Russell King

Hi Lucas,

On Mon, Aug 24, 2020 at 8:02 AM Lucas Stach <l.stach@pengutronix.de> wrote:
>
> The drm scheduler currently expects that the stop/start sequence is always
> executed in the timeout handling, as the job at the head of the hardware
> execution list is always removed from the ring mirror before the driver
> function is called and only inserted back into the list when starting the
> scheduler.
>
> This adds some unnecessary overhead if the timeout handler determines
> that the GPU is still executing jobs normally and just wishes to extend
> the timeout, but a better solution requires a major rearchitecture of the
> scheduler, which is not suitable as a fix.
>
> Fixes: 135517d3565b drm/scheduler: Avoid accessing freed bad job.)

Just a nit: the correct syntax for the Fixes line is:

Fixes: 135517d3565b ("drm/scheduler: Avoid accessing freed bad job.")

* Re: [PATCH] drm/etnaviv: always start/stop scheduler in timeout processing
  2020-08-24 14:11 ` Fabio Estevam
@ 2020-08-25  8:44   ` Lucas Stach
  0 siblings, 0 replies; 4+ messages in thread
From: Lucas Stach @ 2020-08-25  8:44 UTC
  To: Fabio Estevam
  Cc: The etnaviv authors, DRI mailing list, patchwork-lst,
	Sascha Hauer, Russell King

Hi all,

On Monday, 24.08.2020, at 11:11 -0300, Fabio Estevam wrote:
> Hi Lucas,
> 
> On Mon, Aug 24, 2020 at 8:02 AM Lucas Stach <l.stach@pengutronix.de> wrote:
> > The drm scheduler currently expects that the stop/start sequence is always
> > executed in the timeout handling, as the job at the head of the hardware
> > execution list is always removed from the ring mirror before the driver
> > function is called and only inserted back into the list when starting the
> > scheduler.
> > 
> > This adds some unnecessary overhead if the timeout handler determines
> > that the GPU is still executing jobs normally and just wishes to extend
> > the timeout, but a better solution requires a major rearchitecture of the
> > scheduler, which is not suitable as a fix.
> > 
> > Fixes: 135517d3565b drm/scheduler: Avoid accessing freed bad job.)
> 
> Just a nit: the correct syntax for the Fixes line is:
> 
> Fixes: 135517d3565b ("drm/scheduler: Avoid accessing freed bad job.")

I've added this patch with the above fixed and Russell's T-b to my
etnaviv/fixes branch.

Regards,
Lucas

