dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/etnaviv: consider completed fence seqno in hang check
@ 2021-12-22  0:17 Lucas Stach
  2021-12-22  6:56 ` Christian Gmeiner
  0 siblings, 1 reply; 2+ messages in thread
From: Lucas Stach @ 2021-12-22  0:17 UTC (permalink / raw)
  To: etnaviv; +Cc: Joerg Albert, dri-devel, Russell King

Some GPU heavy test programs manage to trigger the hangcheck quite often.
If there are no other GPU users in the system and the test program
exhibits a very regular structure in the commandstreams that are being
submitted, we can end up with two distinct submits managing to trigger
the hangcheck with the FE in a very similar address range. This leads
the hangcheck to believe that the GPU is stuck, while in reality the GPU
is already busy working on a different job. To avoid those spurious
GPU resets, also remember and consider the last completed fence seqno
in the hang check.

Reported-by: Joerg Albert <joerg.albert@iav.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
---
 drivers/gpu/drm/etnaviv/etnaviv_gpu.h   | 1 +
 drivers/gpu/drm/etnaviv/etnaviv_sched.c | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
index 1c75c8ed5bce..85eddd492774 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
@@ -130,6 +130,7 @@ struct etnaviv_gpu {
 
 	/* hang detection */
 	u32 hangcheck_dma_addr;
+	u32 hangcheck_fence;
 
 	void __iomem *mmio;
 	int irq;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 180bb633d5c5..58f593b278c1 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -107,8 +107,10 @@ static enum drm_gpu_sched_stat etnaviv_sched_timedout_job(struct drm_sched_job
 	 */
 	dma_addr = gpu_read(gpu, VIVS_FE_DMA_ADDRESS);
 	change = dma_addr - gpu->hangcheck_dma_addr;
-	if (change < 0 || change > 16) {
+	if (gpu->completed_fence != gpu->hangcheck_fence ||
+	    change < 0 || change > 16) {
 		gpu->hangcheck_dma_addr = dma_addr;
+		gpu->hangcheck_fence = gpu->completed_fence;
 		goto out_no_timeout;
 	}
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] drm/etnaviv: consider completed fence seqno in hang check
  2021-12-22  0:17 [PATCH] drm/etnaviv: consider completed fence seqno in hang check Lucas Stach
@ 2021-12-22  6:56 ` Christian Gmeiner
  0 siblings, 0 replies; 2+ messages in thread
From: Christian Gmeiner @ 2021-12-22  6:56 UTC (permalink / raw)
  To: Lucas Stach
  Cc: Joerg Albert, The etnaviv authors, DRI mailing list, Russell King

Am Mi., 22. Dez. 2021 um 01:17 Uhr schrieb Lucas Stach <l.stach@pengutronix.de>:
>
> Some GPU heavy test programs manage to trigger the hangcheck quite often.
> If there are no other GPU users in the system and the test program
> exhibits a very regular structure in the commandstreams that are being
> submitted, we can end up with two distinct submits managing to trigger
> the hangcheck with the FE in a very similar address range. This leads
> the hangcheck to believe that the GPU is stuck, while in reality the GPU
> is already busy working on a different job. To avoid those spurious
> GPU resets, also remember and consider the last completed fence seqno
> in the hang check.
>
> Reported-by: Joerg Albert <joerg.albert@iav.de>
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-12-22  6:57 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-22  0:17 [PATCH] drm/etnaviv: consider completed fence seqno in hang check Lucas Stach
2021-12-22  6:56 ` Christian Gmeiner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).