From: Andi Shyti <andi.shyti@kernel.org>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: Andi Shyti <andi.shyti@kernel.org>,
intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Sanitycheck MMIO access early in driver load
Date: Tue, 21 Mar 2023 23:43:53 +0100
Message-ID: <20230321224353.h6l2gwv3iuac6vd2@intel.intel>
In-Reply-To: <20230321215527.GQ4085390@mdroper-desk1.amr.corp.intel.com>

Hi Matt,

> > We occasionally see the PCI device in a non-accessible state at the
> > point the driver is loaded. When this happens, all BAR accesses will
> > read back as 0xFFFFFFFF. Rather than reading registers and
> > misinterpreting their (invalid) values, let's specifically check for
> > 0xFFFFFFFF in a register that cannot have that value to see if the
> > device is accessible.
> >
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_uncore.c | 35 +++++++++++++++++++++++++++++
> > 1 file changed, 35 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > index e1e1f34490c8e..0b69081d6d285 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -2602,11 +2602,46 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)
> >  	return 0;
> >  }
> >
> > +static int sanity_check_mmio_access(struct intel_uncore *uncore)
> > +{
> > +	struct drm_i915_private *i915 = uncore->i915;
> > +	int ret;
> > +
> > +	if (GRAPHICS_VER(i915) < 8)
> > +		return 0;
> > +
> > +	/*
> > +	 * Sanitycheck that MMIO access to the device is working properly. If
> > +	 * the CPU is unable to communicate with a PCI device, BAR reads will
> > +	 * return 0xFFFFFFFF. Let's make sure the device isn't in this state
> > +	 * before we start trying to access registers.
> > +	 *
> > +	 * We use the primary GT's forcewake register as our guinea pig since
> > +	 * it's been around since HSW and it's a masked register so the upper
> > +	 * 16 bits can never read back as 1's if device access is operating
> > +	 * properly.
> > +	 *
> > +	 * If MMIO isn't working, we'll wait up to 2 seconds to see if it
> > +	 * recovers, then give up.
> > +	 */
> > +	ret = intel_wait_for_register_fw(uncore, FORCEWAKE_MT, 0, 0, 2000000);
>
> It looks like you lost the check for 0xFFFFFFFF specifically. In fact,
> with a mask/value of 0, isn't this always going to pass immediately?

uh... yes... sorry, I got confused and lost track of the goal of
the patch.

In that case I don't see how intel_wait_for_register_fw() can be
used for a '!=' check.
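
For reference, a minimal sketch of what an open-coded '!=' poll could
look like, written as plain userspace C so it compiles on its own.
intel_wait_for_register_fw() waits for (reg & mask) == value, so a
mask/value of 0 matches on the very first read, and no single
mask/value pair can express "anything except 0xFFFFFFFF". All names
below (masked_match, wait_for_mmio_alive, fake_dev, fake_read) are
hypothetical and only for illustration; this is not i915 code, and a
real version would bound the loop by time and delay between reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register-read callback; stands in for a raw MMIO read. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/* The predicate a (mask, value) style wait helper evaluates on each poll. */
int masked_match(uint32_t reg, uint32_t mask, uint32_t value)
{
	return (reg & mask) == value;
}

/*
 * Poll until the register reads back as something other than 0xFFFFFFFF.
 * Returns 0 as soon as one read succeeds, -1 if all max_polls reads
 * came back as all-ones.
 */
int wait_for_mmio_alive(reg_read_fn read_reg, void *ctx, unsigned int max_polls)
{
	unsigned int i;

	assert(read_reg != NULL);

	for (i = 0; i < max_polls; i++) {
		if (read_reg(ctx) != UINT32_MAX)
			return 0;
	}

	return -1;
}

/* Simulated device: all-ones for the first dead_reads reads, then live_value. */
struct fake_dev {
	unsigned int dead_reads;
	uint32_t live_value;
};

uint32_t fake_read(void *ctx)
{
	struct fake_dev *dev = ctx;

	if (dev->dead_reads > 0) {
		dev->dead_reads--;
		return UINT32_MAX;
	}

	return dev->live_value;
}
```

With a fake_dev that returns 0xFFFFFFFF three times and then 0x00010001,
wait_for_mmio_alive(fake_read, &dev, 10) returns 0; with one that never
recovers it returns -1. masked_match(0xFFFFFFFF, 0, 0) is true, which is
the "always passes" case pointed out above.
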
Please ignore this v2.

Thanks, and sorry again,
Andi

> We don't know what the value of this register will be (there may or may
> not be some bits set), but we need to make sure that it isn't 0xFFFFFFFF
> because that means we're not even truly accessing the register, just
> hitting a PCI BAR read failure.
>
>
> Matt
>
> > +	if (ret == -ETIMEDOUT) {
> > +		drm_err(&i915->drm, "Device is non-operational; MMIO access returns 0xFFFFFFFF!\n");
> > +		return -EIO;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  int intel_uncore_init_mmio(struct intel_uncore *uncore)
> >  {
> >  	struct drm_i915_private *i915 = uncore->i915;
> >  	int ret;
> >
> > +	ret = sanity_check_mmio_access(uncore);
> > +	if (ret)
> > +		return ret;
> > +
> >  	/*
> >  	 * The boot firmware initializes local memory and assesses its health.
> >  	 * If memory training fails, the punit will have been instructed to
> > --
> > 2.39.2
> >
>
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation