All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/4] drm/i915: outstanding_lazy_request is a u32
@ 2012-01-25 13:03 Daniel Vetter
  2012-01-25 13:03 ` [PATCH 2/4] drm/i915: fixup seqno allocation logic for lazy_request Daniel Vetter
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Daniel Vetter @ 2012-01-25 13:03 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

So don't assign it false, that's just confusing ... No functional
change here.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index eb98a7f..8f01c3d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1681,7 +1681,7 @@ i915_add_request(struct intel_ring_buffer *ring,
 		spin_unlock(&file_priv->mm.lock);
 	}
 
-	ring->outstanding_lazy_request = false;
+	ring->outstanding_lazy_request = 0;
 
 	if (!dev_priv->mm.suspended) {
 		if (i915_enable_hangcheck) {
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/4] drm/i915: fixup seqno allocation logic for lazy_request
  2012-01-25 13:03 [PATCH 1/4] drm/i915: outstanding_lazy_request is a u32 Daniel Vetter
@ 2012-01-25 13:03 ` Daniel Vetter
  2012-01-25 14:17   ` Chris Wilson
  2012-01-25 13:03 ` [PATCH 3/4] drm/i915: adjust ring->add_request to no longer pass back the seqno Daniel Vetter
  2012-01-25 13:04 ` [PATCH 4/4] drm/i915: enable forcewake voodoo also for gen6 Daniel Vetter
  2 siblings, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2012-01-25 13:03 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

Currently we reserve seqnos only when we emit the request to the ring
(by bumping dev_priv->next_seqno), but start using it much earlier for
ring->oustanding_lazy_request. When 2 threads compete for the gpu and
run on two different rings (e.g. ddx on blitter vs. compositor)
hilarity ensued, especially when we get constantly interrupted while
reserving buffers.

Breakage seems to have been introduced in

commit 6f392d548658a17600da7faaf8a5df25ee5f01f6
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Aug 7 11:01:22 2010 +0100

    drm/i915: Use a common seqno for all rings.

This patch fixes up the seqno reservation logic by moving it into
i915_gem_next_request_seqno. The ring->add_request functions now
superflously still return the new seqno through a pointer, that will
be refactored in the next patch.

v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it
clear that we only have one seqno counter for all rings. Suggested by
Chris Wilson.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h         |    7 +------
 drivers/gpu/drm/i915/i915_gem.c         |   23 +++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |   24 ++++--------------------
 3 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 32737a3..2f102ad 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1159,12 +1159,7 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 	return (int32_t)(seq1 - seq2) >= 0;
 }
 
-static inline u32
-i915_gem_next_request_seqno(struct intel_ring_buffer *ring)
-{
-	drm_i915_private_t *dev_priv = ring->dev->dev_private;
-	return ring->outstanding_lazy_request = dev_priv->next_seqno;
-}
+u32 i915_gem_next_request_seqno(struct intel_ring_buffer *ring);
 
 int __must_check i915_gem_object_get_fence(struct drm_i915_gem_object *obj,
 					   struct intel_ring_buffer *pipelined);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8f01c3d..dc8e45f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1647,6 +1647,28 @@ i915_gem_process_flushing_list(struct intel_ring_buffer *ring,
 	}
 }
 
+static u32
+i915_gem_get_seqno(struct drm_device *dev)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	u32 seqno = dev_priv->next_seqno;
+
+	/* reserve 0 for non-seqno */
+	if (++dev_priv->next_seqno == 0)
+		dev_priv->next_seqno = 1;
+
+	return seqno;
+}
+
+u32
+i915_gem_next_request_seqno(struct intel_ring_buffer *ring)
+{
+	if (ring->outstanding_lazy_request == 0)
+		ring->outstanding_lazy_request = i915_gem_get_seqno(ring->dev);
+
+	return ring->outstanding_lazy_request;
+}
+
 int
 i915_add_request(struct intel_ring_buffer *ring,
 		 struct drm_file *file,
@@ -1658,6 +1680,7 @@ i915_add_request(struct intel_ring_buffer *ring,
 	int ret;
 
 	BUG_ON(request == NULL);
+	seqno = i915_gem_next_request_seqno(ring);
 
 	ret = ring->add_request(ring, &seqno);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 1ab842c..bc0d791 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -52,20 +52,6 @@ static inline int ring_space(struct intel_ring_buffer *ring)
 	return space;
 }
 
-static u32 i915_gem_get_seqno(struct drm_device *dev)
-{
-	drm_i915_private_t *dev_priv = dev->dev_private;
-	u32 seqno;
-
-	seqno = dev_priv->next_seqno;
-
-	/* reserve 0 for non-seqno */
-	if (++dev_priv->next_seqno == 0)
-		dev_priv->next_seqno = 1;
-
-	return seqno;
-}
-
 static int
 render_ring_flush(struct intel_ring_buffer *ring,
 		  u32	invalidate_domains,
@@ -467,7 +453,7 @@ gen6_add_request(struct intel_ring_buffer *ring,
 	mbox1_reg = ring->signal_mbox[0];
 	mbox2_reg = ring->signal_mbox[1];
 
-	*seqno = i915_gem_get_seqno(ring->dev);
+	*seqno = ring->outstanding_lazy_request;
 
 	update_mboxes(ring, *seqno, mbox1_reg);
 	update_mboxes(ring, *seqno, mbox2_reg);
@@ -565,8 +551,7 @@ static int
 pc_render_add_request(struct intel_ring_buffer *ring,
 		      u32 *result)
 {
-	struct drm_device *dev = ring->dev;
-	u32 seqno = i915_gem_get_seqno(dev);
+	u32 seqno = ring->outstanding_lazy_request;
 	struct pipe_control *pc = ring->private;
 	u32 scratch_addr = pc->gtt_offset + 128;
 	int ret;
@@ -617,8 +602,7 @@ static int
 render_ring_add_request(struct intel_ring_buffer *ring,
 			u32 *result)
 {
-	struct drm_device *dev = ring->dev;
-	u32 seqno = i915_gem_get_seqno(dev);
+	u32 seqno = ring->outstanding_lazy_request;
 	int ret;
 
 	ret = intel_ring_begin(ring, 4);
@@ -792,7 +776,7 @@ ring_add_request(struct intel_ring_buffer *ring,
 	if (ret)
 		return ret;
 
-	seqno = i915_gem_get_seqno(ring->dev);
+	seqno = ring->outstanding_lazy_request;
 
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/4] drm/i915: adjust ring->add_request to no longer pass back the seqno
  2012-01-25 13:03 [PATCH 1/4] drm/i915: outstanding_lazy_request is a u32 Daniel Vetter
  2012-01-25 13:03 ` [PATCH 2/4] drm/i915: fixup seqno allocation logic for lazy_request Daniel Vetter
@ 2012-01-25 13:03 ` Daniel Vetter
  2012-01-25 15:33   ` [PATCH] " Daniel Vetter
  2012-01-25 13:04 ` [PATCH 4/4] drm/i915: enable forcewake voodoo also for gen6 Daniel Vetter
  2 siblings, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2012-01-25 13:03 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

The seqno is now allocated ahead of calling ring->add_request, so this
is just unncessery and can be removed.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c         |    2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |   24 +++++++-----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    2 +-
 3 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dc8e45f..a72100c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1682,7 +1682,7 @@ i915_add_request(struct intel_ring_buffer *ring,
 	BUG_ON(request == NULL);
 	seqno = i915_gem_next_request_seqno(ring);
 
-	ret = ring->add_request(ring, &seqno);
+	ret = ring->add_request(ring, seqno);
 	if (ret)
 	    return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index bc0d791..117004f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -440,7 +440,7 @@ update_mboxes(struct intel_ring_buffer *ring,
  */
 static int
 gen6_add_request(struct intel_ring_buffer *ring,
-		 u32 *seqno)
+		 u32 seqno)
 {
 	u32 mbox1_reg;
 	u32 mbox2_reg;
@@ -453,13 +453,11 @@ gen6_add_request(struct intel_ring_buffer *ring,
 	mbox1_reg = ring->signal_mbox[0];
 	mbox2_reg = ring->signal_mbox[1];
 
-	*seqno = ring->outstanding_lazy_request;
-
-	update_mboxes(ring, *seqno, mbox1_reg);
-	update_mboxes(ring, *seqno, mbox2_reg);
+	update_mboxes(ring, seqno, mbox1_reg);
+	update_mboxes(ring, seqno, mbox2_reg);
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
-	intel_ring_emit(ring, *seqno);
+	intel_ring_emit(ring, seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	intel_ring_advance(ring);
 
@@ -549,9 +547,8 @@ do {									\
 
 static int
 pc_render_add_request(struct intel_ring_buffer *ring,
-		      u32 *result)
+		      u32 seqno)
 {
-	u32 seqno = ring->outstanding_lazy_request;
 	struct pipe_control *pc = ring->private;
 	u32 scratch_addr = pc->gtt_offset + 128;
 	int ret;
@@ -594,15 +591,13 @@ pc_render_add_request(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
-	*result = seqno;
 	return 0;
 }
 
 static int
 render_ring_add_request(struct intel_ring_buffer *ring,
-			u32 *result)
+			u32 seqno)
 {
-	u32 seqno = ring->outstanding_lazy_request;
 	int ret;
 
 	ret = intel_ring_begin(ring, 4);
@@ -615,7 +610,6 @@ render_ring_add_request(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	intel_ring_advance(ring);
 
-	*result = seqno;
 	return 0;
 }
 
@@ -767,24 +761,20 @@ bsd_ring_flush(struct intel_ring_buffer *ring,
 
 static int
 ring_add_request(struct intel_ring_buffer *ring,
-		 u32 *result)
+		 u32 seqno)
 {
-	u32 seqno;
 	int ret;
 
 	ret = intel_ring_begin(ring, 4);
 	if (ret)
 		return ret;
 
-	seqno = ring->outstanding_lazy_request;
-
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	intel_ring_advance(ring);
 
-	*result = seqno;
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 68281c9..1638f32 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -70,7 +70,7 @@ struct  intel_ring_buffer {
 				  u32	invalidate_domains,
 				  u32	flush_domains);
 	int		(*add_request)(struct intel_ring_buffer *ring,
-				       u32 *seqno);
+				       u32 seqno);
 	u32		(*get_seqno)(struct intel_ring_buffer *ring);
 	int		(*dispatch_execbuffer)(struct intel_ring_buffer *ring,
 					       u32 offset, u32 length);
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/4] drm/i915: enable forcewake voodoo also for gen6
  2012-01-25 13:03 [PATCH 1/4] drm/i915: outstanding_lazy_request is a u32 Daniel Vetter
  2012-01-25 13:03 ` [PATCH 2/4] drm/i915: fixup seqno allocation logic for lazy_request Daniel Vetter
  2012-01-25 13:03 ` [PATCH 3/4] drm/i915: adjust ring->add_request to no longer pass back the seqno Daniel Vetter
@ 2012-01-25 13:04 ` Daniel Vetter
  2012-01-30 22:33   ` Daniel Vetter
  2012-02-08 10:00   ` Chris Wilson
  2 siblings, 2 replies; 12+ messages in thread
From: Daniel Vetter @ 2012-01-25 13:04 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

We still have reports of missed irqs even on Sandybridge with the
HWSTAM workaround in place. Testing by the bug reporter gets rid of
them with the forcewake voodoo and no HWSTAM writes.

Because I've slightly botched the rebasing I've left out the ACTHD
readback which is also required to get IVB working. Seems to still
work on the tester's machine, so I think we should go with the more
minmal approach on SNB. Especially since I've only found weak evidence
for holding forcewake while waiting for an interrupt to arrive, but
none for the ACTHD readback.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_irq.c         |   12 ------------
 drivers/gpu/drm/i915/intel_ringbuffer.c |    6 ++----
 2 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 5bd4361..b1c902f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1752,18 +1752,6 @@ static void ironlake_irq_preinstall(struct drm_device *dev)
 
 	I915_WRITE(HWSTAM, 0xeffe);
 
-	if (IS_GEN6(dev)) {
-		/* Workaround stalls observed on Sandy Bridge GPUs by
-		 * making the blitter command streamer generate a
-		 * write to the Hardware Status Page for
-		 * MI_USER_INTERRUPT.  This appears to serialize the
-		 * previous seqno write out before the interrupt
-		 * happens.
-		 */
-		I915_WRITE(GEN6_BLITTER_HWSTAM, ~GEN6_BLITTER_USER_INTERRUPT);
-		I915_WRITE(GEN6_BSD_HWSTAM, ~GEN6_BSD_USER_INTERRUPT);
-	}
-
 	/* XXX hotplug from PCH */
 
 	I915_WRITE(DEIMR, 0xffffffff);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 117004f..1fc8232 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -790,8 +790,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring, u32 gflag, u32 rflag)
 	/* It looks like we need to prevent the gt from suspending while waiting
 	 * for an notifiy irq, otherwise irqs seem to get lost on at least the
 	 * blt/bsd rings on ivb. */
-	if (IS_GEN7(dev))
-		gen6_gt_force_wake_get(dev_priv);
+	gen6_gt_force_wake_get(dev_priv);
 
 	spin_lock(&ring->irq_lock);
 	if (ring->irq_refcount++ == 0) {
@@ -818,8 +817,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring, u32 gflag, u32 rflag)
 	}
 	spin_unlock(&ring->irq_lock);
 
-	if (IS_GEN7(dev))
-		gen6_gt_force_wake_put(dev_priv);
+	gen6_gt_force_wake_put(dev_priv);
 }
 
 static bool
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] drm/i915: fixup seqno allocation logic for lazy_request
  2012-01-25 13:03 ` [PATCH 2/4] drm/i915: fixup seqno allocation logic for lazy_request Daniel Vetter
@ 2012-01-25 14:17   ` Chris Wilson
  2012-01-25 15:32     ` [PATCH] " Daniel Vetter
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2012-01-25 14:17 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

On Wed, 25 Jan 2012 14:03:58 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Currently we reserve seqnos only when we emit the request to the ring
> (by bumping dev_priv->next_seqno), but start using it much earlier for
> ring->oustanding_lazy_request. When 2 threads compete for the gpu and
> run on two different rings (e.g. ddx on blitter vs. compositor)
> hilarity ensued, especially when we get constantly interrupted while
> reserving buffers.
> 
> Breakage seems to have been introduced in
> 
> commit 6f392d548658a17600da7faaf8a5df25ee5f01f6
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Sat Aug 7 11:01:22 2010 +0100
> 
>     drm/i915: Use a common seqno for all rings.
> 
> This patch fixes up the seqno reservation logic by moving it into
> i915_gem_next_request_seqno. The ring->add_request functions now
> superflously still return the new seqno through a pointer, that will
> be refactored in the next patch.
> 
> v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it
> clear that we only have one seqno counter for all rings. Suggested by
> Chris Wilson.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
> Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  static int
>  render_ring_flush(struct intel_ring_buffer *ring,
>  		  u32	invalidate_domains,
> @@ -467,7 +453,7 @@ gen6_add_request(struct intel_ring_buffer *ring,
>  	mbox1_reg = ring->signal_mbox[0];
>  	mbox2_reg = ring->signal_mbox[1];
>  
> -	*seqno = i915_gem_get_seqno(ring->dev);
> +	*seqno = ring->outstanding_lazy_request;

In discussing this patch with Daniel, I made the mistake of reading that
as i915_gem_get_next_request_seqno() instead of get_seqno(). I'd suggest
the patch makes that change and hide the ugly ring->o_l_r. Then since we
do i915_gem_get_next_request_seqno() both here and in the caller, it
becomes much clearer that we are able to remove it.

Daniel, apologies for the confusion!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] drm/i915: fixup seqno allocation logic for lazy_request
  2012-01-25 14:17   ` Chris Wilson
@ 2012-01-25 15:32     ` Daniel Vetter
  2012-01-25 15:46       ` Chris Wilson
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2012-01-25 15:32 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

Currently we reserve seqnos only when we emit the request to the ring
(by bumping dev_priv->next_seqno), but start using it much earlier for
ring->oustanding_lazy_request. When 2 threads compete for the gpu and
run on two different rings (e.g. ddx on blitter vs. compositor)
hilarity ensued, especially when we get constantly interrupted while
reserving buffers.

Breakage seems to have been introduced in

commit 6f392d548658a17600da7faaf8a5df25ee5f01f6
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Aug 7 11:01:22 2010 +0100

    drm/i915: Use a common seqno for all rings.

This patch fixes up the seqno reservation logic by moving it into
i915_gem_next_request_seqno. The ring->add_request functions now
superflously still return the new seqno through a pointer, that will
be refactored in the next patch.

Note that with this change we now unconditionally allocate a seqno,
even when ->add_request might fail because the rings are full and the
gpu died. But this does not open up a new can of worms because we can
already leave behind an outstanding_request_seqno if e.g. the caller
gets interrupted with a signal while stalling for the gpu in the
eviciton paths. And with the bugfix we only ever have one seqno
allocated per ring (and only that ring), so there are no ordering
issues with multiple outstanding seqnos on the same ring.

v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it
clear that we only have one seqno counter for all rings. Suggested by
Chris Wilson.

v3: As suggested by Chris Wilson use i915_gem_next_request_seqno
instead of ring->oustanding_lazy_request to make the follow-up
refactoring more clearly correct. Also improve the commit message
with issues discussed on irc.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h         |    7 +------
 drivers/gpu/drm/i915/i915_gem.c         |   23 +++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |   24 ++++--------------------
 3 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 32737a3..2f102ad 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1159,12 +1159,7 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 	return (int32_t)(seq1 - seq2) >= 0;
 }
 
-static inline u32
-i915_gem_next_request_seqno(struct intel_ring_buffer *ring)
-{
-	drm_i915_private_t *dev_priv = ring->dev->dev_private;
-	return ring->outstanding_lazy_request = dev_priv->next_seqno;
-}
+u32 i915_gem_next_request_seqno(struct intel_ring_buffer *ring);
 
 int __must_check i915_gem_object_get_fence(struct drm_i915_gem_object *obj,
 					   struct intel_ring_buffer *pipelined);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8f01c3d..dc8e45f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1647,6 +1647,28 @@ i915_gem_process_flushing_list(struct intel_ring_buffer *ring,
 	}
 }
 
+static u32
+i915_gem_get_seqno(struct drm_device *dev)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	u32 seqno = dev_priv->next_seqno;
+
+	/* reserve 0 for non-seqno */
+	if (++dev_priv->next_seqno == 0)
+		dev_priv->next_seqno = 1;
+
+	return seqno;
+}
+
+u32
+i915_gem_next_request_seqno(struct intel_ring_buffer *ring)
+{
+	if (ring->outstanding_lazy_request == 0)
+		ring->outstanding_lazy_request = i915_gem_get_seqno(ring->dev);
+
+	return ring->outstanding_lazy_request;
+}
+
 int
 i915_add_request(struct intel_ring_buffer *ring,
 		 struct drm_file *file,
@@ -1658,6 +1680,7 @@ i915_add_request(struct intel_ring_buffer *ring,
 	int ret;
 
 	BUG_ON(request == NULL);
+	seqno = i915_gem_next_request_seqno(ring);
 
 	ret = ring->add_request(ring, &seqno);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 1ab842c..7a107c9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -52,20 +52,6 @@ static inline int ring_space(struct intel_ring_buffer *ring)
 	return space;
 }
 
-static u32 i915_gem_get_seqno(struct drm_device *dev)
-{
-	drm_i915_private_t *dev_priv = dev->dev_private;
-	u32 seqno;
-
-	seqno = dev_priv->next_seqno;
-
-	/* reserve 0 for non-seqno */
-	if (++dev_priv->next_seqno == 0)
-		dev_priv->next_seqno = 1;
-
-	return seqno;
-}
-
 static int
 render_ring_flush(struct intel_ring_buffer *ring,
 		  u32	invalidate_domains,
@@ -467,7 +453,7 @@ gen6_add_request(struct intel_ring_buffer *ring,
 	mbox1_reg = ring->signal_mbox[0];
 	mbox2_reg = ring->signal_mbox[1];
 
-	*seqno = i915_gem_get_seqno(ring->dev);
+	*seqno = i915_gem_next_request_seqno(ring);
 
 	update_mboxes(ring, *seqno, mbox1_reg);
 	update_mboxes(ring, *seqno, mbox2_reg);
@@ -565,8 +551,7 @@ static int
 pc_render_add_request(struct intel_ring_buffer *ring,
 		      u32 *result)
 {
-	struct drm_device *dev = ring->dev;
-	u32 seqno = i915_gem_get_seqno(dev);
+	u32 seqno = i915_gem_next_request_seqno(ring);
 	struct pipe_control *pc = ring->private;
 	u32 scratch_addr = pc->gtt_offset + 128;
 	int ret;
@@ -617,8 +602,7 @@ static int
 render_ring_add_request(struct intel_ring_buffer *ring,
 			u32 *result)
 {
-	struct drm_device *dev = ring->dev;
-	u32 seqno = i915_gem_get_seqno(dev);
+	u32 seqno = i915_gem_next_request_seqno(ring);
 	int ret;
 
 	ret = intel_ring_begin(ring, 4);
@@ -792,7 +776,7 @@ ring_add_request(struct intel_ring_buffer *ring,
 	if (ret)
 		return ret;
 
-	seqno = i915_gem_get_seqno(ring->dev);
+	seqno = i915_gem_next_request_seqno(ring);
 
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH] drm/i915: adjust ring->add_request to no longer pass back the seqno
  2012-01-25 13:03 ` [PATCH 3/4] drm/i915: adjust ring->add_request to no longer pass back the seqno Daniel Vetter
@ 2012-01-25 15:33   ` Daniel Vetter
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Vetter @ 2012-01-25 15:33 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

The seqno is now allocated ahead of calling ring->add_request, so this
is just unncessery and can be removed.

v2: Rebased on top of the changed fix patch.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c         |    2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |   24 +++++++-----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    2 +-
 3 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dc8e45f..a72100c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1682,7 +1682,7 @@ i915_add_request(struct intel_ring_buffer *ring,
 	BUG_ON(request == NULL);
 	seqno = i915_gem_next_request_seqno(ring);
 
-	ret = ring->add_request(ring, &seqno);
+	ret = ring->add_request(ring, seqno);
 	if (ret)
 	    return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 7a107c9..117004f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -440,7 +440,7 @@ update_mboxes(struct intel_ring_buffer *ring,
  */
 static int
 gen6_add_request(struct intel_ring_buffer *ring,
-		 u32 *seqno)
+		 u32 seqno)
 {
 	u32 mbox1_reg;
 	u32 mbox2_reg;
@@ -453,13 +453,11 @@ gen6_add_request(struct intel_ring_buffer *ring,
 	mbox1_reg = ring->signal_mbox[0];
 	mbox2_reg = ring->signal_mbox[1];
 
-	*seqno = i915_gem_next_request_seqno(ring);
-
-	update_mboxes(ring, *seqno, mbox1_reg);
-	update_mboxes(ring, *seqno, mbox2_reg);
+	update_mboxes(ring, seqno, mbox1_reg);
+	update_mboxes(ring, seqno, mbox2_reg);
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
-	intel_ring_emit(ring, *seqno);
+	intel_ring_emit(ring, seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	intel_ring_advance(ring);
 
@@ -549,9 +547,8 @@ do {									\
 
 static int
 pc_render_add_request(struct intel_ring_buffer *ring,
-		      u32 *result)
+		      u32 seqno)
 {
-	u32 seqno = i915_gem_next_request_seqno(ring);
 	struct pipe_control *pc = ring->private;
 	u32 scratch_addr = pc->gtt_offset + 128;
 	int ret;
@@ -594,15 +591,13 @@ pc_render_add_request(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
-	*result = seqno;
 	return 0;
 }
 
 static int
 render_ring_add_request(struct intel_ring_buffer *ring,
-			u32 *result)
+			u32 seqno)
 {
-	u32 seqno = i915_gem_next_request_seqno(ring);
 	int ret;
 
 	ret = intel_ring_begin(ring, 4);
@@ -615,7 +610,6 @@ render_ring_add_request(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	intel_ring_advance(ring);
 
-	*result = seqno;
 	return 0;
 }
 
@@ -767,24 +761,20 @@ bsd_ring_flush(struct intel_ring_buffer *ring,
 
 static int
 ring_add_request(struct intel_ring_buffer *ring,
-		 u32 *result)
+		 u32 seqno)
 {
-	u32 seqno;
 	int ret;
 
 	ret = intel_ring_begin(ring, 4);
 	if (ret)
 		return ret;
 
-	seqno = i915_gem_next_request_seqno(ring);
-
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	intel_ring_advance(ring);
 
-	*result = seqno;
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 68281c9..1638f32 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -70,7 +70,7 @@ struct  intel_ring_buffer {
 				  u32	invalidate_domains,
 				  u32	flush_domains);
 	int		(*add_request)(struct intel_ring_buffer *ring,
-				       u32 *seqno);
+				       u32 seqno);
 	u32		(*get_seqno)(struct intel_ring_buffer *ring);
 	int		(*dispatch_execbuffer)(struct intel_ring_buffer *ring,
 					       u32 offset, u32 length);
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] drm/i915: fixup seqno allocation logic for lazy_request
  2012-01-25 15:32     ` [PATCH] " Daniel Vetter
@ 2012-01-25 15:46       ` Chris Wilson
  2012-01-26 13:55         ` Daniel Vetter
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2012-01-25 15:46 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

On Wed, 25 Jan 2012 16:32:49 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Currently we reserve seqnos only when we emit the request to the ring
> (by bumping dev_priv->next_seqno), but start using it much earlier for
> ring->oustanding_lazy_request. When 2 threads compete for the gpu and
> run on two different rings (e.g. ddx on blitter vs. compositor)
> hilarity ensued, especially when we get constantly interrupted while
> reserving buffers.
> 
> Breakage seems to have been introduced in
> 
> commit 6f392d548658a17600da7faaf8a5df25ee5f01f6
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Sat Aug 7 11:01:22 2010 +0100
> 
>     drm/i915: Use a common seqno for all rings.
> 
> This patch fixes up the seqno reservation logic by moving it into
> i915_gem_next_request_seqno. The ring->add_request functions now
> superflously still return the new seqno through a pointer, that will
> be refactored in the next patch.
> 
> Note that with this change we now unconditionally allocate a seqno,
> even when ->add_request might fail because the rings are full and the
> gpu died. But this does not open up a new can of worms because we can
> already leave behind an outstanding_request_seqno if e.g. the caller
> gets interrupted with a signal while stalling for the gpu in the
> eviciton paths. And with the bugfix we only ever have one seqno
> allocated per ring (and only that ring), so there are no ordering
> issues with multiple outstanding seqnos on the same ring.
> 
> v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it
> clear that we only have one seqno counter for all rings. Suggested by
> Chris Wilson.
> 
> v3: As suggested by Chris Wilson use i915_gem_next_request_seqno
> instead of ring->oustanding_lazy_request to make the follow-up
> refactoring more clearly correct. Also improve the commit message
> with issues discussed on irc.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
> Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

I'm not completely sold on the extra i915_gem_get_next_request_seqno()
in i915_add_request(). Daniel describes it as paranoia, I think of it as
muddling the real bugfix with a bit of extra confusion. Nevertheless, it
does fix the issue where the seqno emitted by the ring is at odds with
the seqno assigned to the buffers associated with that request, and that
is clearly a good thing. I haven't quite managed to join the dots and
create a scenario whereby the ring never advances its seqno to one more
advanced than assigned to a buffer (outside of pathological wraparound)
and so I don't see how by itself we would have confused the GPU as the
ring state itself would always be internally consistent.

Anyway,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

The actual refactoring patch are also ok, though I'd like for Daniel
to scope out who owns the seqno vs the request, especially in the light
of no-more-domains...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] drm/i915: fixup seqno allocation logic for lazy_request
  2012-01-25 15:46       ` Chris Wilson
@ 2012-01-26 13:55         ` Daniel Vetter
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Vetter @ 2012-01-26 13:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, Intel Graphics Development

On Wed, Jan 25, 2012 at 03:46:51PM +0000, Chris Wilson wrote:
> The actual refactoring patch are also ok, though I'd like for Daniel
> to scope out who owns the seqno vs the request, especially in the light
> of no-more-domains...

I've thought a bit more about this and one great upside of seqnos over a
simple pointer to the request is that we'd need to properly manage
references to the latter. I.e. when we free a request we would need to
ensure that anyone still referencing it is actually gone, and the current
code is unfortunately quite far away from that :(

I think we should keep this refactor idea in mind and look at it again
after no-more-domains and after intel_ringbuffer.c has been simplified.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/4] drm/i915: enable forcewake voodoo also for gen6
  2012-01-25 13:04 ` [PATCH 4/4] drm/i915: enable forcewake voodoo also for gen6 Daniel Vetter
@ 2012-01-30 22:33   ` Daniel Vetter
  2012-02-08 10:00   ` Chris Wilson
  1 sibling, 0 replies; 12+ messages in thread
From: Daniel Vetter @ 2012-01-30 22:33 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

On Wed, Jan 25, 2012 at 02:04:00PM +0100, Daniel Vetter wrote:
> We still have reports of missed irqs even on Sandybridge with the
> HWSTAM workaround in place. Testing by the bug reporter gets rid of
> them with the forcewake voodoo and no HWSTAM writes.
> 
> Because I've slightly botched the rebasing I've left out the ACTHD
> readback which is also required to get IVB working. Seems to still
> work on the tester's machine, so I think we should go with the more
> minmal approach on SNB. Especially since I've only found weak evidence
> for holding forcewake while waiting for an interrupt to arrive, but
> none for the ACTHD readback.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
> Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Another bug reporter just confirmed that this patch gets rid of 'missed
IRQ?' quasi-hangs on snb. Unfortunately there's still another bug left to
squash, which hangs the gpu ...

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45332
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/4] drm/i915: enable forcewake voodoo also for gen6
  2012-01-25 13:04 ` [PATCH 4/4] drm/i915: enable forcewake voodoo also for gen6 Daniel Vetter
  2012-01-30 22:33   ` Daniel Vetter
@ 2012-02-08 10:00   ` Chris Wilson
  2012-02-13 10:00     ` Daniel Vetter
  1 sibling, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2012-02-08 10:00 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

On Wed, 25 Jan 2012 14:04:00 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> We still have reports of missed irqs even on Sandybridge with the
> HWSTAM workaround in place. Testing by the bug reporter gets rid of
> them with the forcewake voodoo and no HWSTAM writes.
> 
> Because I've slightly botched the rebasing I've left out the ACTHD
> readback which is also required to get IVB working. Seems to still
> work on the tester's machine, so I think we should go with the more
> minmal approach on SNB. Especially since I've only found weak evidence
> for holding forcewake while waiting for an interrupt to arrive, but
> none for the ACTHD readback.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
> Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Whilst this patch doesn't directly enable the ACTHD w/a, I have observed
that with a very light load (drawing a single large transformed texture)
reading back the ACTHD (along with the GT forcewake dance) becomes the
predominant consumer of CPU time for the system. (The rate-limiting step
is still the GPU, it just irked me to see the kernel consume more CPU time
than X.)

This workaround appears more successful than the last, and doesn't
appear to break anything else, so

Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/4] drm/i915: enable forcewake voodoo also for gen6
  2012-02-08 10:00   ` Chris Wilson
@ 2012-02-13 10:00     ` Daniel Vetter
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Vetter @ 2012-02-13 10:00 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, Intel Graphics Development

On Wed, Feb 08, 2012 at 10:00:54AM +0000, Chris Wilson wrote:
> On Wed, 25 Jan 2012 14:04:00 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > We still have reports of missed irqs even on Sandybridge with the
> > HWSTAM workaround in place. Testing by the bug reporter gets rid of
> > them with the forcewake voodoo and no HWSTAM writes.
> > 
> > Because I've slightly botched the rebasing I've left out the ACTHD
> > readback which is also required to get IVB working. Seems to still
> > work on the tester's machine, so I think we should go with the more
> > minmal approach on SNB. Especially since I've only found weak evidence
> > for holding forcewake while waiting for an interrupt to arrive, but
> > none for the ACTHD readback.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
> > Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
> > Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> Whilst this patch doesn't directly enable the ACTHD w/a, I have observed
> that with a very light load (drawing a single large transformed texture)
> reading back the ACTHD (along with the GT forcewake dance) becomes the
> predominant consumer of CPU time for the system. (The rate-limiting step
> is still the GPU, it just irked me to see the kernel consume more CPU time
> than X.)
> 
> This workaround appears more successful than the last, and doesn't
> appear to break anything else, so
> 
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>

I've picked up patches 1,2 & 4 of this series, thanks for taking a look at
them. I'll postpone 3 until we do the seqno/request refactoring for real,
atm there's too much stuff in this area outstanding and we need to judge
this one in the context of the real thing.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-02-13  9:59 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-25 13:03 [PATCH 1/4] drm/i915: outstanding_lazy_request is a u32 Daniel Vetter
2012-01-25 13:03 ` [PATCH 2/4] drm/i915: fixup seqno allocation logic for lazy_request Daniel Vetter
2012-01-25 14:17   ` Chris Wilson
2012-01-25 15:32     ` [PATCH] " Daniel Vetter
2012-01-25 15:46       ` Chris Wilson
2012-01-26 13:55         ` Daniel Vetter
2012-01-25 13:03 ` [PATCH 3/4] drm/i915: adjust ring->add_request to no longer pass back the seqno Daniel Vetter
2012-01-25 15:33   ` [PATCH] " Daniel Vetter
2012-01-25 13:04 ` [PATCH 4/4] drm/i915: enable forcewake voodoo also for gen6 Daniel Vetter
2012-01-30 22:33   ` Daniel Vetter
2012-02-08 10:00   ` Chris Wilson
2012-02-13 10:00     ` Daniel Vetter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.