From: Andi Shyti <andi.shyti@linux.intel.com>
To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	Matt Roper <matthew.d.roper@intel.com>
Cc: Andi Shyti <andi.shyti@kernel.org>,
	Mika Kuoppala <mika.kuoppala@linux.intel.com>,
	Stuart Summers <stuart.summers@intel.com>,
	Andrzej Hajda <andrzej.hajda@intel.com>,
	Andi Shyti <andi.shyti@linux.intel.com>
Subject: [PATCH v3 2/2] drm/i915: Check for unreliable MMIO during forcewake
Date: Mon, 27 Mar 2023 21:55:47 +0200
Message-ID: <20230327195547.356584-3-andi.shyti@linux.intel.com>
In-Reply-To: <20230327195547.356584-1-andi.shyti@linux.intel.com>

From: Matt Roper <matthew.d.roper@intel.com>

Although we now sanity-check MMIO access during driver load to make sure
the MMIO BAR isn't returning 0xFFFFFFFF for every read, there have been a
few cases where (temporarily?) unreliable MMIO access has happened after
GPU resets or power events.  We'll often notice this on our next GT
register access, since forcewake handling will fail.  Change the handling
slightly so that, when this happens, we print a more meaningful message
clarifying that the problem is the MMIO access, not forcewake
specifically.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 14ec45e6facfa..796ebfe6c5507 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -177,12 +177,19 @@ wait_ack_set(const struct intel_uncore_forcewake_domain *d,
 static inline void
 fw_domain_wait_ack_clear(const struct intel_uncore_forcewake_domain *d)
 {
-	if (wait_ack_clear(d, FORCEWAKE_KERNEL)) {
+	if (!wait_ack_clear(d, FORCEWAKE_KERNEL))
+		return;
+
+	if (fw_ack(d) == ~0)
+		drm_err(&d->uncore->i915->drm,
+			"%s: MMIO unreliable (forcewake register returns 0xFFFFFFFF)!\n",
+			intel_uncore_forcewake_domain_to_str(d->id));
+	else
 		drm_err(&d->uncore->i915->drm,
 			"%s: timed out waiting for forcewake ack to clear.\n",
 			intel_uncore_forcewake_domain_to_str(d->id));
-		add_taint_for_CI(d->uncore->i915, TAINT_WARN); /* CI now unreliable */
-	}
+
+	add_taint_for_CI(d->uncore->i915, TAINT_WARN); /* CI now unreliable */
 }
 
 enum ack_type {
-- 
2.39.2
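
For readers who want to see the classification logic outside the driver,
here is a minimal userspace sketch of the same idea: poll until the ack
bit clears, and on timeout re-read the register once to decide whether
the failure looks like broken MMIO (all ones) or a genuine forcewake
timeout. Everything in it is hypothetical and exists only for
illustration: struct fake_fw_domain, fake_wait_ack_clear(), the
FAKE_FORCEWAKE_KERNEL bit and the three canned readers are not i915
symbols, and the poll count stands in for the driver's real timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ack bit; stands in for FORCEWAKE_KERNEL in the patch. */
#define FAKE_FORCEWAKE_KERNEL 0x1u

/* Hypothetical stand-in for a forcewake domain, not the i915 struct. */
struct fake_fw_domain {
	/* Models fw_ack(): returns the current ack register value. */
	uint32_t (*read_ack)(const struct fake_fw_domain *d);
	/* Models the timeout as a bounded number of polls. */
	unsigned int max_polls;
};

enum fw_clear_result {
	FW_CLEAR_OK,          /* ack bit cleared in time */
	FW_CLEAR_TIMEOUT,     /* ack bit stuck, register value plausible */
	FW_CLEAR_MMIO_BROKEN, /* register reads back as all ones */
};

static enum fw_clear_result
fake_wait_ack_clear(const struct fake_fw_domain *d)
{
	unsigned int i;

	assert(d != NULL && d->read_ack != NULL);

	for (i = 0; i < d->max_polls; i++) {
		if (!(d->read_ack(d) & FAKE_FORCEWAKE_KERNEL))
			return FW_CLEAR_OK;
	}

	/*
	 * Timed out. Re-read once to classify the failure: an all-ones
	 * value suggests the read itself failed, not the handshake.
	 */
	if (d->read_ack(d) == UINT32_MAX)
		return FW_CLEAR_MMIO_BROKEN;

	return FW_CLEAR_TIMEOUT;
}

/* Three canned register behaviours for demonstration. */
static uint32_t read_cleared(const struct fake_fw_domain *d)
{
	(void)d;
	return 0x0u;
}

static uint32_t read_stuck(const struct fake_fw_domain *d)
{
	(void)d;
	return FAKE_FORCEWAKE_KERNEL;
}

static uint32_t read_all_ones(const struct fake_fw_domain *d)
{
	(void)d;
	return UINT32_MAX;
}
```

Returning an enum keeps the sketch easy to check. The patch itself
returns nothing: it picks one of two drm_err() messages and then taints
for CI in both failure cases.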


