All of lore.kernel.org
 help / color / mirror / Atom feed
From: Imre Deak <imre.deak@intel.com>
To: intel-gfx@lists.freedesktop.org
Cc: Jani Nikula <jani.nikula@intel.com>,
	Mika Kuoppala <mika.kuoppala@intel.com>
Subject: [PATCH v3 1/2] drm/i915/bxt: work around HW coherency issue when accessing GPU seqno
Date: Fri, 14 Aug 2015 18:35:27 +0300	[thread overview]
Message-ID: <1439566527-28692-1-git-send-email-imre.deak@intel.com> (raw)
In-Reply-To: <1439555937-8016-2-git-send-email-imre.deak@intel.com>

By running igt/store_dword_loop_render on BXT we can hit a coherency
problem where the seqno written at GPU command completion time is not
seen by the CPU. This results in __i915_wait_request seeing the stale
seqno and not completing the request (not considering the lost
interrupt/GPU reset mechanism). I also verified that this isn't a case
of a lost interrupt, or that the command didn't complete somehow: when
the coherency issue occured I read the seqno via an uncached GTT mapping
too. While the cached version of the seqno still showed the stale value
the one read via the uncached mapping was the correct one.

Work around this issue by clflushing the corresponding CPU cacheline
following any store of the seqno and preceding any reading of it. When
reading it do this only when the caller expects a coherent view.

v2:
- fix using the proper logical && instead of a bitwise & (Jani, Mika)
- limit the workaround to A stepping, on later steppings this HW issue
  is fixed
v3:
- use a separate get_seqno/set_seqno vfunc (Chris)

Testcase: igt/store_dword_loop_render
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 64 ++++++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  7 ++++
 2 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 138964a..1269d57 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1699,6 +1699,34 @@ static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno)
 	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
 }
 
+static u32 bxt_a_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+{
+
+	/*
+	 * On BXT A steppings there is a HW coherency issue whereby the
+	 * MI_STORE_DATA_IMM storing the completed request's seqno
+	 * occasionally doesn't invalidate the CPU cache. Work around this by
+	 * clflushing the corresponding cacheline whenever the caller wants
+	 * the coherency to be guaranteed. Note that this cacheline is known
+	 * to be clean at this point, since we only write it in
+	 * bxt_a_set_seqno(), where we also do a clflush after the write. So
+	 * this clflush in practice becomes an invalidate operation.
+	 */
+
+	if (!lazy_coherency)
+		intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
+
+	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
+}
+
+static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno)
+{
+	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
+
+	/* See bxt_a_get_seqno() explaining the reason for the clflush. */
+	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
+}
+
 static int gen8_emit_request(struct drm_i915_gem_request *request)
 {
 	struct intel_ringbuffer *ringbuf = request->ringbuf;
@@ -1868,8 +1896,13 @@ static int logical_render_ring_init(struct drm_device *dev)
 		ring->init_hw = gen8_init_render_ring;
 	ring->init_context = gen8_init_rcs_context;
 	ring->cleanup = intel_fini_pipe_control;
-	ring->get_seqno = gen8_get_seqno;
-	ring->set_seqno = gen8_set_seqno;
+	if (IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0) {
+		ring->get_seqno = bxt_a_get_seqno;
+		ring->set_seqno = bxt_a_set_seqno;
+	} else {
+		ring->get_seqno = gen8_get_seqno;
+		ring->set_seqno = gen8_set_seqno;
+	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush_render;
 	ring->irq_get = gen8_logical_ring_get_irq;
@@ -1915,8 +1948,13 @@ static int logical_bsd_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
-	ring->get_seqno = gen8_get_seqno;
-	ring->set_seqno = gen8_set_seqno;
+	if (IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0) {
+		ring->get_seqno = bxt_a_get_seqno;
+		ring->set_seqno = bxt_a_set_seqno;
+	} else {
+		ring->get_seqno = gen8_get_seqno;
+		ring->set_seqno = gen8_set_seqno;
+	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
 	ring->irq_get = gen8_logical_ring_get_irq;
@@ -1965,8 +2003,13 @@ static int logical_blt_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
-	ring->get_seqno = gen8_get_seqno;
-	ring->set_seqno = gen8_set_seqno;
+	if (IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0) {
+		ring->get_seqno = bxt_a_get_seqno;
+		ring->set_seqno = bxt_a_set_seqno;
+	} else {
+		ring->get_seqno = gen8_get_seqno;
+		ring->set_seqno = gen8_set_seqno;
+	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
 	ring->irq_get = gen8_logical_ring_get_irq;
@@ -1990,8 +2033,13 @@ static int logical_vebox_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
-	ring->get_seqno = gen8_get_seqno;
-	ring->set_seqno = gen8_set_seqno;
+	if (IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0) {
+		ring->get_seqno = bxt_a_get_seqno;
+		ring->set_seqno = bxt_a_set_seqno;
+	} else {
+		ring->get_seqno = gen8_get_seqno;
+		ring->set_seqno = gen8_set_seqno;
+	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
 	ring->irq_get = gen8_logical_ring_get_irq;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2e85fda..95b0b4b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -377,6 +377,13 @@ intel_ring_sync_index(struct intel_engine_cs *ring,
 	return idx;
 }
 
+static inline void
+intel_flush_status_page(struct intel_engine_cs *ring, int reg)
+{
+	drm_clflush_virt_range(&ring->status_page.page_addr[reg],
+			       sizeof(uint32_t));
+}
+
 static inline u32
 intel_read_status_page(struct intel_engine_cs *ring,
 		       int reg)
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2015-08-14 15:35 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-14 12:38 [PATCH v2 0/2] drm/i915/bxt: work around HW coherency issue Imre Deak
2015-08-14 12:38 ` [PATCH v2 1/2] drm/i915/bxt: work around HW coherency issue when accessing GPU seqno Imre Deak
2015-08-14 13:12   ` Chris Wilson
2015-08-14 13:31     ` Imre Deak
2015-08-14 15:35   ` Imre Deak [this message]
2015-08-14 16:21     ` [PATCH v3 " Chris Wilson
2015-08-14 12:38 ` [PATCH v2 2/2] drm/i915/bxt: work around HW coherency issue for cached GEM mappings Imre Deak
2015-08-14 13:11   ` Chris Wilson
2015-08-14 13:29     ` Imre Deak
2015-08-14 15:43   ` [PATCH v3 2/2] drm/i915/bxt: don't allow cached GEM mappings on A stepping Imre Deak
2015-08-14 16:18     ` Chris Wilson
2015-08-26  7:13       ` Daniel Vetter
2015-08-16 11:45   ` [PATCH v2 2/2] drm/i915/bxt: work around HW coherency issue for cached GEM mappings shuang.he
2015-08-14 12:49 ` [PATCH v2 0/2] drm/i915/bxt: work around HW coherency issue Chris Wilson
2015-08-14 13:26   ` Imre Deak
2015-08-14 13:31     ` Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1439566527-28692-1-git-send-email-imre.deak@intel.com \
    --to=imre.deak@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jani.nikula@intel.com \
    --cc=mika.kuoppala@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.