From: Matt Roper <matthew.d.roper@intel.com>
To: intel-gfx@lists.freedesktop.org
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Subject: [Intel-gfx] [PATCH 2/2] drm/i915: Try to detect sudden loss of MMIO access
Date: Fri, 12 Feb 2021 13:19:25 -0800
Message-ID: <20210212211925.3418280-2-matthew.d.roper@intel.com>
In-Reply-To: <20210212211925.3418280-1-matthew.d.roper@intel.com>

In rare circumstances, bugs in PCI programming, a broken BIOS, or failing
hardware can cause the CPU to lose access to the MMIO BAR on dgfx
platforms.  This is a pretty catastrophic failure since all register
reads come back with values of 0xFFFFFFFF.  Let's check for this special
case while doing our usual checks for unclaimed registers.  The FPGA_DBG
register we use for those checks on modern platforms has some unused
bits that always read back as 0 when things are behaving properly, so
we can use them as canaries to detect when MMIO itself has suddenly
broken and print a more informative error message in the logs.

v2: Let the detection function still return 'true' if we've lost our
    MMIO access.  We'll still get an extra false positive message about
    an unclaimed register access, but we'll still honor the 'mmio_debug'
    limit and not spam the log.  (Lucas)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 5098f95d71b0..661b50191f2b 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -465,6 +465,22 @@ fpga_check_for_unclaimed_mmio(struct intel_uncore *uncore)
 	if (likely(!(dbg & FPGA_DBG_RM_NOCLAIM)))
 		return false;
 
+	/*
+	 * Bugs in PCI programming (or failing hardware) can occasionally cause
+	 * us to lose access to the MMIO BAR.  When this happens, register
+	 * reads will come back with 0xFFFFFFFF for every register and things
+	 * go bad very quickly.  Let's try to detect that special case and at
+	 * least try to print a more informative message about what has
+	 * happened.
+	 *
+	 * During normal operation the FPGA_DBG register has several unused
+	 * bits that will always read back as 0's so we can use them as canaries
+	 * to recognize when MMIO accesses are just busted.
+	 */
+	if (unlikely(dbg == ~0))
+		drm_err(&uncore->i915->drm,
+			"Lost access to MMIO BAR; all registers now read back as 0xFFFFFFFF!\n");
+
 	__raw_uncore_write32(uncore, FPGA_DBG, FPGA_DBG_RM_NOCLAIM);
 
 	return true;
-- 
2.25.4
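
For readers who want to try the canary idea outside the driver, the
check in the hunk above can be sketched as a small standalone
classifier.  This is only an illustration under stated assumptions: the
DEMO_* names and demo_classify_dbg() are hypothetical and not part of
i915, the bit position used for the "unclaimed" flag is assumed for the
example, and the real patch logs an error and still returns 'true'
rather than returning a separate status.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for the "unclaimed access" flag (illustration only). */
#define DEMO_DBG_NOCLAIM (UINT32_C(1) << 31)

/* Hypothetical result type; the real patch only logs and returns true. */
enum demo_mmio_status {
	DEMO_MMIO_OK,        /* flag clear: nothing to report */
	DEMO_MMIO_UNCLAIMED, /* flag set, canary bits still 0 */
	DEMO_MMIO_LOST,      /* every bit set: BAR access is likely gone */
};

/* Classify one raw 32-bit read of the debug register. */
static enum demo_mmio_status demo_classify_dbg(uint32_t dbg)
{
	/* Fast path, as in the patch: flag not set means no problem. */
	if (!(dbg & DEMO_DBG_NOCLAIM))
		return DEMO_MMIO_OK;

	/*
	 * Unused bits read back as 0 on working hardware, so an all-ones
	 * value means the read itself failed rather than the flag being set.
	 */
	if (dbg == UINT32_MAX)
		return DEMO_MMIO_LOST;

	return DEMO_MMIO_UNCLAIMED;
}
```

Note that the all-ones test only runs after the flag test, which
mirrors the ordering in the patch: a dead BAR also has the flag bit
set, so the common "flag clear" path pays no extra cost.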
