From: Imre Deak <imre.deak@intel.com>
To: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/2] drm/i915/bxt: work around HW coherency issue when accessing GPU seqno
Date: Wed, 10 Jun 2015 18:16:20 +0300	[thread overview]
Message-ID: <1433949380.25216.84.camel@intel.com> (raw)
In-Reply-To: <20150610150043.GQ5176@intel.com>

On ke, 2015-06-10 at 18:00 +0300, Ville Syrjälä wrote:
> On Wed, Jun 10, 2015 at 05:55:24PM +0300, Imre Deak wrote:
> > On ke, 2015-06-10 at 15:21 +0100, Chris Wilson wrote:
> > > On Wed, Jun 10, 2015 at 05:07:46PM +0300, Imre Deak wrote:
> > > > On ti, 2015-06-09 at 11:21 +0300, Jani Nikula wrote:
> > > > > On Mon, 08 Jun 2015, Imre Deak <imre.deak@intel.com> wrote:
> > > > > > By running igt/store_dword_loop_render on BXT we can hit a coherency
> > > > > > problem where the seqno written at GPU command completion time is not
> > > > > > seen by the CPU. This results in __i915_wait_request seeing the stale
> > > > > > seqno and not completing the request (not considering the lost
> > > > > > interrupt/GPU reset mechanism). I also verified that this isn't a case
> > > > > > of a lost interrupt, or that the command didn't complete somehow: when
> > > > > > > the coherency issue occurred I read the seqno via an uncached GTT mapping
> > > > > > too. While the cached version of the seqno still showed the stale value
> > > > > > the one read via the uncached mapping was the correct one.
> > > > > >
> > > > > > Work around this issue by clflushing the corresponding CPU cacheline
> > > > > > following any store of the seqno and preceding any reading of it. When
> > > > > > reading it do this only when the caller expects a coherent view.
> > > > > >
> > > > > > Testcase: igt/store_dword_loop_render
> > > > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/intel_lrc.c        | 17 +++++++++++++++++
> > > > > >  drivers/gpu/drm/i915/intel_ringbuffer.h |  7 +++++++
> > > > > >  2 files changed, 24 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > > > > > index 9f5485d..88bc5525 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > > > > > @@ -1288,12 +1288,29 @@ static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
> > > > > >  
> > > > > >  static u32 gen8_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
> > > > > >  {
> > > > > > +	/*
> > > > > > +	 * On BXT-A1 there is a coherency issue whereby the MI_STORE_DATA_IMM
> > > > > > +	 * storing the completed request's seqno occasionally doesn't
> > > > > > +	 * invalidate the CPU cache. Work around this by clflushing the
> > > > > > +	 * corresponding cacheline whenever the caller wants the coherency to
> > > > > > +	 * be guaranteed. Note that this cacheline is known to be
> > > > > > +	 * clean at this point, since we only write it in gen8_set_seqno(),
> > > > > > +	 * where we also do a clflush after the write. So this clflush in
> > > > > > +	 * practice becomes an invalidate operation.
> > > 
> > > Did you compare and contrast with the gen6+ w/a? A clflush may just work
> > > out quicker considering that the posting read would involve a spinlock
> > > and fw dance.
> > 
> > Actually, I did, but only saw that it only works, didn't benchmark it.
> > I'd also think that clflush would be faster, since it's only a cache
> > invalidate at this point. But I will compare the two things now.
> 
> If an mmio read fixes it then it doesn't feel like a snoop problem after
> all.

Ok, I retract what I just said. I tried it just now with the patch below
and still see the problem. I must have been remembering the testcase
where I created a separate GTT mapping for the status page and read the
seqno through that. Sorry for the confusion.

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9f5485d..36e5fd6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1288,6 +1288,21 @@ static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
 
 static u32 gen8_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
 {
 {
+	if (!lazy_coherency) {
+		struct drm_i915_private *dev_priv = ring->dev->dev_private;
+		POSTING_READ(RING_ACTHD(ring->mmio_base));
+	}
+
 	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
 }
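
For comparison with the posting-read attempt above, the clflush approach
described in the original commit message (flush after every seqno store,
flush before a read only when the caller needs a coherent view) can be
modelled in user space roughly as below. This is only a sketch under
stated assumptions, not driver code: status_page, HWS_SEQNO_INDEX,
flush_cacheline(), model_set_seqno() and model_get_seqno() are
hypothetical names, the slot index is arbitrary, and on targets without
SSE2 the flush is a no-op stand-in.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>                    /* _mm_clflush() */
#define flush_cacheline(p) _mm_clflush(p)
#else
#define flush_cacheline(p) ((void)(p))    /* assumption: no-op elsewhere */
#endif

/* Hypothetical slot; stands in for I915_GEM_HWS_INDEX, value is arbitrary. */
#define HWS_SEQNO_INDEX 0x10

/* Hypothetical stand-in for the cached CPU mapping of the status page. */
static volatile uint32_t status_page[1024];

/* Write side: store the seqno, then flush so the cacheline is left clean. */
static void model_set_seqno(uint32_t seqno)
{
	status_page[HWS_SEQNO_INDEX] = seqno;
	flush_cacheline((const void *)&status_page[HWS_SEQNO_INDEX]);
}

/*
 * Read side: when the caller wants a coherent view, flush first. Because
 * the line is clean after model_set_seqno(), the flush only drops the
 * cached copy, so the following load refetches it from memory.
 */
static uint32_t model_get_seqno(bool lazy_coherency)
{
	if (!lazy_coherency)
		flush_cacheline((const void *)&status_page[HWS_SEQNO_INDEX]);

	return status_page[HWS_SEQNO_INDEX];
}
```

Callers that only poll opportunistically would pass lazy_coherency = true
and skip the flush; a waiter deciding whether a request has completed
would pass false.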
 


