From: Andi Shyti <andi.shyti@linux.intel.com>
To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	Matt Roper <matthew.d.roper@intel.com>
Cc: Andi Shyti <andi.shyti@kernel.org>,
	Mika Kuoppala <mika.kuoppala@linux.intel.com>,
	Stuart Summers <stuart.summers@intel.com>,
	Andrzej Hajda <andrzej.hajda@intel.com>,
	Andi Shyti <andi.shyti@linux.intel.com>
Subject: [PATCH v3 2/2] drm/i915: Check for unreliable MMIO during forcewake
Date: Mon, 27 Mar 2023 21:55:47 +0200
Message-ID: <20230327195547.356584-3-andi.shyti@linux.intel.com>
In-Reply-To: <20230327195547.356584-1-andi.shyti@linux.intel.com>

From: Matt Roper <matthew.d.roper@intel.com>

Although we now sanitycheck MMIO access during driver load to make
sure the MMIO BAR isn't returning all 0xFFFFFFFF, there have been a
few cases where (temporarily?) unreliable MMIO access has happened
after GPU resets or power events. We'll often notice this on our next
GT register access since forcewake handling will fail; let's change
our handling slightly so that when this happens we print a more
meaningful message clarifying that the problem is the MMIO access, not
forcewake specifically.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 14ec45e6facfa..796ebfe6c5507 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -177,12 +177,19 @@ wait_ack_set(const struct intel_uncore_forcewake_domain *d,
 static inline void
 fw_domain_wait_ack_clear(const struct intel_uncore_forcewake_domain *d)
 {
-	if (wait_ack_clear(d, FORCEWAKE_KERNEL)) {
+	if (!wait_ack_clear(d, FORCEWAKE_KERNEL))
+		return;
+
+	if (fw_ack(d) == ~0)
+		drm_err(&d->uncore->i915->drm,
+			"%s: MMIO unreliable (forcewake register returns 0xFFFFFFFF)!\n",
+			intel_uncore_forcewake_domain_to_str(d->id));
+	else
 		drm_err(&d->uncore->i915->drm,
 			"%s: timed out waiting for forcewake ack to clear.\n",
 			intel_uncore_forcewake_domain_to_str(d->id));
-		add_taint_for_CI(d->uncore->i915, TAINT_WARN); /* CI now unreliable */
-	}
+
+	add_taint_for_CI(d->uncore->i915, TAINT_WARN); /* CI now unreliable */
 }
 
 enum ack_type {
-- 
2.39.2
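
For readers outside the driver, the decision the patch makes after a failed ack wait can be sketched in isolation. This is only an illustration under stated assumptions: the enum and both function names below are hypothetical and do not exist in i915, and the sketch takes the last value read from the ack register as a plain argument instead of calling fw_ack(). It assumes, as the commit message does, that an all-ones read indicates the MMIO access itself is failing.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; none of these exist in i915. */
enum ack_failure {
	ACK_FAILURE_TIMEOUT,          /* register readable, ack bit never cleared */
	ACK_FAILURE_MMIO_UNRELIABLE,  /* read came back as all ones */
};

/*
 * Classify a failed ack wait from the value last read from the ack register.
 * Assumption: an all-ones read means the MMIO access itself failed.
 */
static enum ack_failure classify_ack_failure(uint32_t ack_value)
{
	if (ack_value == UINT32_MAX)
		return ACK_FAILURE_MMIO_UNRELIABLE;

	return ACK_FAILURE_TIMEOUT;
}

/* Pick the log text matching the two messages printed by the patch. */
static const char *ack_failure_message(enum ack_failure failure)
{
	switch (failure) {
	case ACK_FAILURE_MMIO_UNRELIABLE:
		return "MMIO unreliable (forcewake register returns 0xFFFFFFFF)!";
	case ACK_FAILURE_TIMEOUT:
	default:
		return "timed out waiting for forcewake ack to clear.";
	}
}
```

In the patch itself both branches still end in add_taint_for_CI(); only the message changes, so the sketch deliberately leaves the taint step out.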
Thread overview: 8+ messages
2023-03-27 19:55 [PATCH v3 0/2] Report MMIO communication problems more clearly Andi Shyti
2023-03-27 19:55 ` [Intel-gfx] " Andi Shyti
2023-03-27 19:55 ` [PATCH v3 1/2] drm/i915: Sanitycheck MMIO access early in driver load Andi Shyti
2023-03-27 19:55 ` [Intel-gfx] " Andi Shyti
2023-03-27 19:55 ` Andi Shyti [this message]
2023-03-27 19:55 ` [Intel-gfx] [PATCH v3 2/2] drm/i915: Check for unreliable MMIO during forcewake Andi Shyti
2023-03-27 23:04 ` [Intel-gfx] ✓ Fi.CI.BAT: success for Report MMIO communication problems more clearly (rev3) Patchwork
2023-03-28 6:52 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork