All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rob Clark <robdclark@gmail.com>
To: dri-devel@lists.freedesktop.org
Cc: Rob Clark <robdclark@chromium.org>,
	Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>,
	Douglas Anderson <dianders@chromium.org>,
	Jonathan Marek <jonathan@marek.ca>,
	David Airlie <airlied@linux.ie>,
	linux-arm-msm@vger.kernel.org, Yangtao Li <tiny.windzz@gmail.com>,
	Konrad Dybcio <konrad.dybcio@somainline.org>,
	Akhil P Oommen <akhilpo@codeaurora.org>,
	Jordan Crouse <jordan@cosmicpenguin.net>,
	Sean Paul <sean@poorly.run>,
	AngeloGioacchino Del Regno
	<angelogioacchino.delregno@somainline.org>,
	Dmitry Baryshkov <dmitry.baryshkov@linaro.org>,
	Marijn Suijten <marijn.suijten@somainline.org>,
	freedreno@lists.freedesktop.org,
	Sharat Masetty <smasetty@codeaurora.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: [PATCH 5/5] drm/msm: Add debugfs to disable hw err handling
Date: Tue,  9 Nov 2021 10:11:05 -0800	[thread overview]
Message-ID: <20211109181117.591148-6-robdclark@gmail.com> (raw)
In-Reply-To: <20211109181117.591148-1-robdclark@gmail.com>

From: Rob Clark <robdclark@chromium.org>

Add a debugfs interface to ignore hw error irqs, in order to force
fallback to sw hangcheck mechanism.  Because the hw error detection is
pretty good on newer gens, we need this for igt tests to test the sw
hang detection.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 6 ++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 ++++
 drivers/gpu/drm/msm/msm_debugfs.c     | 3 +++
 drivers/gpu/drm/msm/msm_drv.h         | 9 +++++++++
 4 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 6163990a4d09..ec8e043c9d38 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1252,6 +1252,7 @@ static void a5xx_fault_detect_irq(struct msm_gpu *gpu)
 
 static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
 {
+	struct msm_drm_private *priv = gpu->dev->dev_private;
 	u32 status = gpu_read(gpu, REG_A5XX_RBBM_INT_0_STATUS);
 
 	/*
@@ -1261,6 +1262,11 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
 	gpu_write(gpu, REG_A5XX_RBBM_INT_CLEAR_CMD,
 		status & ~A5XX_RBBM_INT_0_MASK_RBBM_AHB_ERROR);
 
+	if (priv->disable_err_irq) {
+		status &= A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS |
+			  A5XX_RBBM_INT_0_MASK_CP_SW;
+	}
+
 	/* Pass status to a5xx_rbbm_err_irq because we've already cleared it */
 	if (status & RBBM_ERROR_MASK)
 		a5xx_rbbm_err_irq(gpu, status);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 3d2da81cb2c9..8a2af3a27e33 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1373,10 +1373,14 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
 
 static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
 {
+	struct msm_drm_private *priv = gpu->dev->dev_private;
 	u32 status = gpu_read(gpu, REG_A6XX_RBBM_INT_0_STATUS);
 
 	gpu_write(gpu, REG_A6XX_RBBM_INT_CLEAR_CMD, status);
 
+	if (priv->disable_err_irq)
+		status &= A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS;
+
 	if (status & A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT)
 		a6xx_fault_detect_irq(gpu);
 
diff --git a/drivers/gpu/drm/msm/msm_debugfs.c b/drivers/gpu/drm/msm/msm_debugfs.c
index 6a99e8b5d25d..956b1efc3721 100644
--- a/drivers/gpu/drm/msm/msm_debugfs.c
+++ b/drivers/gpu/drm/msm/msm_debugfs.c
@@ -242,6 +242,9 @@ void msm_debugfs_init(struct drm_minor *minor)
 	debugfs_create_u32("hangcheck_period_ms", 0600, minor->debugfs_root,
 		&priv->hangcheck_period);
 
+	debugfs_create_bool("disable_err_irq", 0600, minor->debugfs_root,
+		&priv->disable_err_irq);
+
 	debugfs_create_file("shrink", S_IRWXU, minor->debugfs_root,
 		dev, &shrink_fops);
 
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 2943c21d9aac..a8da7a7efb84 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -246,6 +246,15 @@ struct msm_drm_private {
 
 	/* For hang detection, in ms */
 	unsigned int hangcheck_period;
+
+	/**
+	 * disable_err_irq:
+	 *
+	 * Disable handling of GPU hw error interrupts, to force fallback to
+	 * sw hangcheck timer.  Written (via debugfs) by igt tests to test
+	 * the sw hangcheck mechanism.
+	 */
+	bool disable_err_irq;
 };
 
 struct msm_format {
-- 
2.31.1


WARNING: multiple messages have this Message-ID (diff)
From: Rob Clark <robdclark@gmail.com>
To: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org,
	Jordan Crouse <jordan@cosmicpenguin.net>,
	Akhil P Oommen <akhilpo@codeaurora.org>,
	Rob Clark <robdclark@chromium.org>,
	Rob Clark <robdclark@gmail.com>, Sean Paul <sean@poorly.run>,
	David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>,
	AngeloGioacchino Del Regno 
	<angelogioacchino.delregno@somainline.org>,
	Konrad Dybcio <konrad.dybcio@somainline.org>,
	Yangtao Li <tiny.windzz@gmail.com>,
	Marijn Suijten <marijn.suijten@somainline.org>,
	Jonathan Marek <jonathan@marek.ca>,
	Dmitry Baryshkov <dmitry.baryshkov@linaro.org>,
	Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>,
	Sharat Masetty <smasetty@codeaurora.org>,
	Douglas Anderson <dianders@chromium.org>,
	linux-kernel@vger.kernel.org (open list)
Subject: [PATCH 5/5] drm/msm: Add debugfs to disable hw err handling
Date: Tue,  9 Nov 2021 10:11:05 -0800	[thread overview]
Message-ID: <20211109181117.591148-6-robdclark@gmail.com> (raw)
In-Reply-To: <20211109181117.591148-1-robdclark@gmail.com>

From: Rob Clark <robdclark@chromium.org>

Add a debugfs interface to ignore hw error irqs, in order to force
fallback to sw hangcheck mechanism.  Because the hw error detection is
pretty good on newer gens, we need this for igt tests to test the sw
hang detection.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 6 ++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 ++++
 drivers/gpu/drm/msm/msm_debugfs.c     | 3 +++
 drivers/gpu/drm/msm/msm_drv.h         | 9 +++++++++
 4 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 6163990a4d09..ec8e043c9d38 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1252,6 +1252,7 @@ static void a5xx_fault_detect_irq(struct msm_gpu *gpu)
 
 static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
 {
+	struct msm_drm_private *priv = gpu->dev->dev_private;
 	u32 status = gpu_read(gpu, REG_A5XX_RBBM_INT_0_STATUS);
 
 	/*
@@ -1261,6 +1262,11 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
 	gpu_write(gpu, REG_A5XX_RBBM_INT_CLEAR_CMD,
 		status & ~A5XX_RBBM_INT_0_MASK_RBBM_AHB_ERROR);
 
+	if (priv->disable_err_irq) {
+		status &= A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS |
+			  A5XX_RBBM_INT_0_MASK_CP_SW;
+	}
+
 	/* Pass status to a5xx_rbbm_err_irq because we've already cleared it */
 	if (status & RBBM_ERROR_MASK)
 		a5xx_rbbm_err_irq(gpu, status);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 3d2da81cb2c9..8a2af3a27e33 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1373,10 +1373,14 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
 
 static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
 {
+	struct msm_drm_private *priv = gpu->dev->dev_private;
 	u32 status = gpu_read(gpu, REG_A6XX_RBBM_INT_0_STATUS);
 
 	gpu_write(gpu, REG_A6XX_RBBM_INT_CLEAR_CMD, status);
 
+	if (priv->disable_err_irq)
+		status &= A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS;
+
 	if (status & A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT)
 		a6xx_fault_detect_irq(gpu);
 
diff --git a/drivers/gpu/drm/msm/msm_debugfs.c b/drivers/gpu/drm/msm/msm_debugfs.c
index 6a99e8b5d25d..956b1efc3721 100644
--- a/drivers/gpu/drm/msm/msm_debugfs.c
+++ b/drivers/gpu/drm/msm/msm_debugfs.c
@@ -242,6 +242,9 @@ void msm_debugfs_init(struct drm_minor *minor)
 	debugfs_create_u32("hangcheck_period_ms", 0600, minor->debugfs_root,
 		&priv->hangcheck_period);
 
+	debugfs_create_bool("disable_err_irq", 0600, minor->debugfs_root,
+		&priv->disable_err_irq);
+
 	debugfs_create_file("shrink", S_IRWXU, minor->debugfs_root,
 		dev, &shrink_fops);
 
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 2943c21d9aac..a8da7a7efb84 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -246,6 +246,15 @@ struct msm_drm_private {
 
 	/* For hang detection, in ms */
 	unsigned int hangcheck_period;
+
+	/**
+	 * disable_err_irq:
+	 *
+	 * Disable handling of GPU hw error interrupts, to force fallback to
+	 * sw hangcheck timer.  Written (via debugfs) by igt tests to test
+	 * the sw hangcheck mechanism.
+	 */
+	bool disable_err_irq;
 };
 
 struct msm_format {
-- 
2.31.1


  parent reply	other threads:[~2021-11-09 18:06 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-09 18:11 [PATCH 0/5] drm/msm: Cleanup and drm/sched tdr prep Rob Clark
2021-11-09 18:11 ` Rob Clark
2021-11-09 18:11 ` [PATCH 1/5] drm/msm: Remove unnecessary struct_mutex Rob Clark
2021-11-09 18:11   ` Rob Clark
2021-11-09 18:11 ` [PATCH 2/5] drm/msm: Drop priv->lastctx Rob Clark
2021-11-09 18:11   ` Rob Clark
2021-11-11 16:44   ` Akhil P Oommen
2021-11-11 16:44     ` Akhil P Oommen
2021-11-09 18:11 ` [PATCH 3/5] drm/msm: Remove struct_mutex usage Rob Clark
2021-11-09 18:11   ` Rob Clark
2021-11-09 18:11 ` [PATCH 4/5] drm/msm: Handle fence rollover Rob Clark
2021-11-09 18:11   ` Rob Clark
2021-11-11 16:58   ` Akhil P Oommen
2021-11-11 16:58     ` Akhil P Oommen
2021-11-09 18:11 ` Rob Clark [this message]
2021-11-09 18:11   ` [PATCH 5/5] drm/msm: Add debugfs to disable hw err handling Rob Clark
2021-11-11 17:14   ` Akhil P Oommen
2021-11-11 17:14     ` Akhil P Oommen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211109181117.591148-6-robdclark@gmail.com \
    --to=robdclark@gmail.com \
    --cc=airlied@linux.ie \
    --cc=akhilpo@codeaurora.org \
    --cc=angelogioacchino.delregno@somainline.org \
    --cc=dianders@chromium.org \
    --cc=dmitry.baryshkov@linaro.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=freedreno@lists.freedesktop.org \
    --cc=jonathan@marek.ca \
    --cc=jordan@cosmicpenguin.net \
    --cc=konrad.dybcio@somainline.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marijn.suijten@somainline.org \
    --cc=robdclark@chromium.org \
    --cc=saiprakash.ranjan@codeaurora.org \
    --cc=sean@poorly.run \
    --cc=smasetty@codeaurora.org \
    --cc=tiny.windzz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.