* [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful
From: Daniel Vetter @ 2011-12-14 12:56 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

If our semaphore logic gets confused and we have a ring stuck waiting
for one, there's a decent chance it'll just execute garbage when being
kicked. Also, kicking the ring obscures the place where the error
first occurred, making error_state decoding much harder.

So drop this and let gpu reset handle this mess in a clean fashion.

In contrast, kicking rings stuck on MI_WAIT is rather harmless, at
worst there'll be a bit of screen-flickering. There's also old
broken userspace out there which needs this as a work-around.
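
As a rough illustration of the resulting policy (not the driver code
itself), the sketch below decides from a ring control register value
whether a stuck ring would still be kicked. The bit positions and the
name may_kick_ring are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout for illustration; consult i915_reg.h for real values. */
#define SKETCH_RING_WAIT           (1u << 11) /* ring stalled on MI_WAIT */
#define SKETCH_RING_WAIT_SEMAPHORE (1u << 10) /* ring stalled on a semaphore */

static_assert(SKETCH_RING_WAIT != SKETCH_RING_WAIT_SEMAPHORE,
	      "the two stall reasons must use distinct bits");

/* Hypothetical helper: true if the hangcheck code would kick the ring. */
static bool may_kick_ring(uint32_t ctl)
{
	/* Only an MI_WAIT stall is kicked; a semaphore stall on its own
	 * is left untouched so that gpu reset can clean up. */
	return (ctl & SKETCH_RING_WAIT) != 0;
}
```

With this policy a ring that only has the semaphore bit set is never
kicked, so the error state still points at the original failure.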

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c |    7 -------
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index d47a53b..070345b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1649,13 +1649,6 @@ static bool kick_ring(struct intel_ring_buffer *ring)
 		I915_WRITE_CTL(ring, tmp);
 		return true;
 	}
-	if (IS_GEN6(dev) &&
-	    (tmp & RING_WAIT_SEMAPHORE)) {
-		DRM_ERROR("Kicking stuck semaphore on %s\n",
-			  ring->name);
-		I915_WRITE_CTL(ring, tmp);
-		return true;
-	}
 	return false;
 }
 
-- 
1.7.7.3


* [PATCH 02/43] drm/i915: don't bail out of intel_wait_ring_buffer too early
From: Daniel Vetter @ 2011-12-14 12:56 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

In the pre-gem days with non-existing hangcheck and gpu reset code,
this timeout of 3 seconds was pretty important to avoid stuck
processes.

But now we have the hangcheck code in gem that goes to great lengths
to ensure that the gpu is really dead before declaring it wedged.

So there's no need for this timeout anymore. Actually it's even harmful
because we can bail out too early (e.g. with xscreensaver slip)
when running giant batchbuffers. And our code isn't robust enough
to properly unroll any state-changes; we pretty much rely on the gpu
reset code cleaning up the mess (like cache tracking, fencing state,
active list/request tracking, ...).

With this change intel_ring_begin can only fail when the gpu is
wedged, and it will return -EAGAIN (like wait_request in case the
gpu reset is still outstanding).

v2: Chris Wilson noted that on resume timers aren't running and hence
we won't ever get kicked out of this loop by the hangcheck code. Use
an insanely large timeout instead for the HAS_GEM case to prevent
resume bugs from totally hanging the machine.
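
The timeout choice described above can be sketched in plain C as
follows. This is only a userspace approximation: SKETCH_HZ, the
function names, and the has_gem flag are stand-ins invented for the
example, and the comparison helper merely imitates the wrap-safe style
of the kernel's jiffies comparisons.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's HZ (timer ticks per second). */
#define SKETCH_HZ 1000UL

static_assert(SKETCH_HZ > 0, "tick rate must be positive");

/* Compute the deadline, in ticks, for the ring-space wait loop. */
static unsigned long ring_wait_deadline(unsigned long now, bool has_gem)
{
	/* With GEM the hangcheck timer normally ends the wait, so the
	 * deadline is only a backstop (e.g. for resume, when timers are
	 * not running). Without GEM the old 3 second limit stays. */
	unsigned long seconds = has_gem ? 60 : 3;

	return now + seconds * SKETCH_HZ;
}

/* Wrap-safe "a is later than b" test on a free-running tick counter. */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}
```

The wrap-safe comparison matters because the tick counter is free
running, so a deadline computed shortly before it overflows is
numerically smaller than the current time.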

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ca70e2f..6e28301 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1119,7 +1119,16 @@ int intel_wait_ring_buffer(struct intel_ring_buffer *ring, int n)
 	}
 
 	trace_i915_ring_wait_begin(ring);
-	end = jiffies + 3 * HZ;
+	if (drm_core_check_feature(dev, DRIVER_GEM))
+		/* With GEM the hangcheck timer should kick us out of the loop,
+		 * leaving it early runs the risk of corrupting GEM state (due
+		 * to running on almost untested codepaths). But on resume
+		 * timers don't work yet, so prevent a complete hang in that
+		 * case by choosing an insanely large timeout. */
+		end = jiffies + 60 * HZ;
+	else
+		end = jiffies + 3 * HZ;
+
 	do {
 		ring->head = I915_READ_HEAD(ring);
 		ring->space = ring_space(ring);
-- 
1.7.7.3


* [PATCH 03/43] drm/i915: switch ring->id to be a real id
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

... and add a helper function for the places where we want a flag.

This way we can use ring->id to index into arrays.

v2: Resurrect the missing beautification-space Chris Wilson noted.
I'm moving this space around because I'll reuse ring_str in the next
patch.
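
To make the id-versus-flag distinction concrete, here is a minimal
standalone sketch. The sketch_ and SK_ names are invented for the
example; only the idea (dense ids for indexing, 1 << id where a bitmask
is wanted) is taken from the patch.

```c
#include <assert.h>

/* Dense ids that can index arrays directly. */
enum sketch_ring_id { SK_RCS = 0, SK_VCS, SK_BCS, SK_NUM_RINGS };

/* One-hot flag derived from the id, for bitmask bookkeeping such as
 * a "rings that need flushing" mask. */
static unsigned sketch_ring_flag(enum sketch_ring_id id)
{
	assert(id < SK_NUM_RINGS);
	return 1u << id;
}

/* Per-ring table indexed by id, which the old one-hot ids (0x1, 0x2,
 * 0x4) could not do without wasting slots. */
static const char *const sketch_ring_names[SK_NUM_RINGS] = {
	[SK_RCS] = "render",
	[SK_VCS] = "bsd",
	[SK_BCS] = "blt",
};
```

The derived flags come out as 0x1, 0x2 and 0x4, the same values the old
enum assigned, so existing bitmask users keep their meaning.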

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |    9 +++++----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    4 ++--
 drivers/gpu/drm/i915/i915_irq.c            |    2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |   14 +++++++-------
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   20 ++++++++++----------
 5 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d09a6e0..224d164 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -667,9 +667,9 @@ static int i915_ringbuffer_info(struct seq_file *m, void *data)
 static const char *ring_str(int ring)
 {
 	switch (ring) {
-	case RING_RENDER: return " render";
-	case RING_BSD: return " bsd";
-	case RING_BLT: return " blt";
+	case RCS: return "render";
+	case VCS: return "bsd";
+	case BCS: return "blt";
 	default: return "";
 	}
 }
@@ -712,7 +712,7 @@ static void print_error_buffers(struct seq_file *m,
 	seq_printf(m, "%s [%d]:\n", name, count);
 
 	while (count--) {
-		seq_printf(m, "  %08x %8u %04x %04x %08x%s%s%s%s%s%s",
+		seq_printf(m, "  %08x %8u %04x %04x %08x%s%s%s%s%s%s%s",
 			   err->gtt_offset,
 			   err->size,
 			   err->read_domains,
@@ -722,6 +722,7 @@ static void print_error_buffers(struct seq_file *m,
 			   tiling_flag(err->tiling),
 			   dirty_flag(err->dirty),
 			   purgeable_flag(err->purgeable),
+			   err->ring != -1 ? " " : "",
 			   ring_str(err->ring),
 			   cache_level_str(err->cache_level));
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3693e83..926ed48 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -202,9 +202,9 @@ i915_gem_object_set_to_gpu_domain(struct drm_i915_gem_object *obj,
 	cd->invalidate_domains |= invalidate_domains;
 	cd->flush_domains |= flush_domains;
 	if (flush_domains & I915_GEM_GPU_DOMAINS)
-		cd->flush_rings |= obj->ring->id;
+		cd->flush_rings |= intel_ring_flag(obj->ring);
 	if (invalidate_domains & I915_GEM_GPU_DOMAINS)
-		cd->flush_rings |= ring->id;
+		cd->flush_rings |= intel_ring_flag(ring);
 }
 
 struct eb_objects {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 070345b..919d244 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -804,7 +804,7 @@ static u32 capture_bo_list(struct drm_i915_error_buffer *err,
 		err->tiling = obj->tiling_mode;
 		err->dirty = obj->dirty;
 		err->purgeable = obj->madv != I915_MADV_WILLNEED;
-		err->ring = obj->ring ? obj->ring->id : 0;
+		err->ring = obj->ring ? obj->ring->id : -1;
 		err->cache_level = obj->cache_level;
 
 		if (++i == count)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6e28301..3c30dba 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -726,13 +726,13 @@ void intel_ring_setup_status_page(struct intel_ring_buffer *ring)
 	 */
 	if (IS_GEN7(dev)) {
 		switch (ring->id) {
-		case RING_RENDER:
+		case RCS:
 			mmio = RENDER_HWS_PGA_GEN7;
 			break;
-		case RING_BLT:
+		case BCS:
 			mmio = BLT_HWS_PGA_GEN7;
 			break;
-		case RING_BSD:
+		case VCS:
 			mmio = BSD_HWS_PGA_GEN7;
 			break;
 		}
@@ -1185,7 +1185,7 @@ void intel_ring_advance(struct intel_ring_buffer *ring)
 
 static const struct intel_ring_buffer render_ring = {
 	.name			= "render ring",
-	.id			= RING_RENDER,
+	.id			= RCS,
 	.mmio_base		= RENDER_RING_BASE,
 	.size			= 32 * PAGE_SIZE,
 	.init			= init_render_ring,
@@ -1208,7 +1208,7 @@ static const struct intel_ring_buffer render_ring = {
 
 static const struct intel_ring_buffer bsd_ring = {
 	.name                   = "bsd ring",
-	.id			= RING_BSD,
+	.id			= VCS,
 	.mmio_base		= BSD_RING_BASE,
 	.size			= 32 * PAGE_SIZE,
 	.init			= init_ring_common,
@@ -1318,7 +1318,7 @@ gen6_bsd_ring_put_irq(struct intel_ring_buffer *ring)
 /* ring buffer for Video Codec for Gen6+ */
 static const struct intel_ring_buffer gen6_bsd_ring = {
 	.name			= "gen6 bsd ring",
-	.id			= RING_BSD,
+	.id			= VCS,
 	.mmio_base		= GEN6_BSD_RING_BASE,
 	.size			= 32 * PAGE_SIZE,
 	.init			= init_ring_common,
@@ -1453,7 +1453,7 @@ static void blt_ring_cleanup(struct intel_ring_buffer *ring)
 
 static const struct intel_ring_buffer gen6_blt_ring = {
 	.name			= "blt ring",
-	.id			= RING_BLT,
+	.id			= BCS,
 	.mmio_base		= BLT_RING_BASE,
 	.size			= 32 * PAGE_SIZE,
 	.init			= blt_ring_init,
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 68281c9..c8b9cc0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -1,13 +1,6 @@
 #ifndef _INTEL_RINGBUFFER_H_
 #define _INTEL_RINGBUFFER_H_
 
-enum {
-	RCS = 0x0,
-	VCS,
-	BCS,
-	I915_NUM_RINGS,
-};
-
 struct  intel_hw_status_page {
 	u32	__iomem	*page_addr;
 	unsigned int	gfx_addr;
@@ -36,10 +29,11 @@ struct  intel_hw_status_page {
 struct  intel_ring_buffer {
 	const char	*name;
 	enum intel_ring_id {
-		RING_RENDER = 0x1,
-		RING_BSD = 0x2,
-		RING_BLT = 0x4,
+		RCS = 0x0,
+		VCS,
+		BCS,
 	} id;
+#define I915_NUM_RINGS 3
 	u32		mmio_base;
 	void		__iomem *virtual_start;
 	struct		drm_device *dev;
@@ -119,6 +113,12 @@ struct  intel_ring_buffer {
 	void *private;
 };
 
+static inline unsigned
+intel_ring_flag(struct intel_ring_buffer *ring)
+{
+	return 1 << ring->id;
+}
+
 static inline u32
 intel_ring_sync_index(struct intel_ring_buffer *ring,
 		      struct intel_ring_buffer *other)
-- 
1.7.7.3


* [PATCH 04/43] drm/i915: refactor ring error state capture to use arrays
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx, Ben Widawsky

The code already got unwieldy and we want to dump more per-ring
registers.

Only functional change is that we now also capture the video
ring registers on ilk.

v2: fixup a refactor fumble spotted by Chris Wilson.
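
The shape of the refactor can be shown with a small standalone sketch:
per-ring arrays indexed by ring id, filled by one helper that derives
register addresses from the ring's mmio base. The es_ names and the
fake register read are invented for the example, and the base addresses
are inferred from the absolute gen6 register defines this patch removes,
so treat them as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets relative to a ring's mmio base, as in the patch. */
#define RING_IPEIR(base)    ((base) + 0x64)
#define RING_IPEHR(base)    ((base) + 0x68)
#define RING_INSTDONE(base) ((base) + 0x6c)

enum { ES_RCS, ES_VCS, ES_BCS, ES_NUM_RINGS };

/* Assumed gen6 mmio bases (e.g. 0x12064 - 0x64 for the video ring). */
static const uint32_t es_mmio_base[ES_NUM_RINGS] = {
	[ES_RCS] = 0x02000,
	[ES_VCS] = 0x12000,
	[ES_BCS] = 0x22000,
};

struct es_error_state {
	uint32_t ipeir[ES_NUM_RINGS];
	uint32_t ipehr[ES_NUM_RINGS];
	uint32_t instdone[ES_NUM_RINGS];
};

/* Hypothetical register read: returns the address itself so that the
 * recorded values are easy to check. */
static uint32_t es_fake_read(uint32_t reg)
{
	return reg;
}

/* One helper replaces the three copies of near-identical capture code. */
static void es_record_ring(struct es_error_state *e, unsigned ring)
{
	assert(ring < ES_NUM_RINGS);
	e->ipeir[ring] = es_fake_read(RING_IPEIR(es_mmio_base[ring]));
	e->ipehr[ring] = es_fake_read(RING_IPEHR(es_mmio_base[ring]));
	e->instdone[ring] = es_fake_read(RING_INSTDONE(es_mmio_base[ring]));
}
```

Adding another per-ring register then means one new array and one new
line in the helper, rather than a new field and read per ring.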

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   55 ++++++++++++++-------------
 drivers/gpu/drm/i915/i915_drv.h     |   20 ++-------
 drivers/gpu/drm/i915/i915_irq.c     |   70 ++++++++++++++++++-----------------
 drivers/gpu/drm/i915/i915_reg.h     |   11 +----
 4 files changed, 73 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 224d164..37a74cf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -736,6 +736,26 @@ static void print_error_buffers(struct seq_file *m,
 	}
 }
 
+static void i915_ring_error_state(struct seq_file *m,
+				  struct drm_device *dev,
+				  struct drm_i915_error_state *error,
+				  unsigned ring)
+{
+	seq_printf(m, "%s command stream:\n", ring_str(ring));
+	seq_printf(m, "  ACTHD: 0x%08x\n", error->acthd[ring]);
+	seq_printf(m, "  IPEIR: 0x%08x\n", error->ipeir[ring]);
+	seq_printf(m, "  IPEHR: 0x%08x\n", error->ipehr[ring]);
+	seq_printf(m, "  INSTDONE: 0x%08x\n", error->instdone[ring]);
+	if (ring == RCS) {
+		if (INTEL_INFO(dev)->gen >= 4) {
+			seq_printf(m, "  INSTDONE1: 0x%08x\n", error->instdone1);
+			seq_printf(m, "  INSTPS: 0x%08x\n", error->instps);
+		}
+		seq_printf(m, "  INSTPM: 0x%08x\n", error->instpm);
+	}
+	seq_printf(m, "  seqno: 0x%08x\n", error->seqno[ring]);
+}
+
 static int i915_error_state(struct seq_file *m, void *unused)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
@@ -758,36 +778,19 @@ static int i915_error_state(struct seq_file *m, void *unused)
 	seq_printf(m, "PCI ID: 0x%04x\n", dev->pci_device);
 	seq_printf(m, "EIR: 0x%08x\n", error->eir);
 	seq_printf(m, "PGTBL_ER: 0x%08x\n", error->pgtbl_er);
-	if (INTEL_INFO(dev)->gen >= 6) {
-		seq_printf(m, "ERROR: 0x%08x\n", error->error);
-		seq_printf(m, "Blitter command stream:\n");
-		seq_printf(m, "  ACTHD:    0x%08x\n", error->bcs_acthd);
-		seq_printf(m, "  IPEIR:    0x%08x\n", error->bcs_ipeir);
-		seq_printf(m, "  IPEHR:    0x%08x\n", error->bcs_ipehr);
-		seq_printf(m, "  INSTDONE: 0x%08x\n", error->bcs_instdone);
-		seq_printf(m, "  seqno:    0x%08x\n", error->bcs_seqno);
-		seq_printf(m, "Video (BSD) command stream:\n");
-		seq_printf(m, "  ACTHD:    0x%08x\n", error->vcs_acthd);
-		seq_printf(m, "  IPEIR:    0x%08x\n", error->vcs_ipeir);
-		seq_printf(m, "  IPEHR:    0x%08x\n", error->vcs_ipehr);
-		seq_printf(m, "  INSTDONE: 0x%08x\n", error->vcs_instdone);
-		seq_printf(m, "  seqno:    0x%08x\n", error->vcs_seqno);
-	}
-	seq_printf(m, "Render command stream:\n");
-	seq_printf(m, "  ACTHD: 0x%08x\n", error->acthd);
-	seq_printf(m, "  IPEIR: 0x%08x\n", error->ipeir);
-	seq_printf(m, "  IPEHR: 0x%08x\n", error->ipehr);
-	seq_printf(m, "  INSTDONE: 0x%08x\n", error->instdone);
-	if (INTEL_INFO(dev)->gen >= 4) {
-		seq_printf(m, "  INSTDONE1: 0x%08x\n", error->instdone1);
-		seq_printf(m, "  INSTPS: 0x%08x\n", error->instps);
-	}
-	seq_printf(m, "  INSTPM: 0x%08x\n", error->instpm);
-	seq_printf(m, "  seqno: 0x%08x\n", error->seqno);
 
 	for (i = 0; i < dev_priv->num_fence_regs; i++)
 		seq_printf(m, "  fence[%d] = %08llx\n", i, error->fence[i]);
 
+	if (INTEL_INFO(dev)->gen >= 6) 
+		seq_printf(m, "ERROR: 0x%08x\n", error->error);
+
+	i915_ring_error_state(m, dev, error, RCS);
+	if (HAS_BLT(dev))
+		i915_ring_error_state(m, dev, error, BCS);
+	if (HAS_BSD(dev))
+		i915_ring_error_state(m, dev, error, VCS);
+
 	if (error->active_bo)
 		print_error_buffers(m, "Active",
 				    error->active_bo,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 86615b8..187cfc0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -152,25 +152,15 @@ struct drm_i915_error_state {
 	u32 eir;
 	u32 pgtbl_er;
 	u32 pipestat[I915_MAX_PIPES];
-	u32 ipeir;
-	u32 ipehr;
-	u32 instdone;
-	u32 acthd;
+	u32 ipeir[I915_NUM_RINGS];
+	u32 ipehr[I915_NUM_RINGS];
+	u32 instdone[I915_NUM_RINGS];
+	u32 acthd[I915_NUM_RINGS];
 	u32 error; /* gen6+ */
-	u32 bcs_acthd; /* gen6+ blt engine */
-	u32 bcs_ipehr;
-	u32 bcs_ipeir;
-	u32 bcs_instdone;
-	u32 bcs_seqno;
-	u32 vcs_acthd; /* gen6+ bsd engine */
-	u32 vcs_ipehr;
-	u32 vcs_ipeir;
-	u32 vcs_instdone;
-	u32 vcs_seqno;
 	u32 instpm;
 	u32 instps;
 	u32 instdone1;
-	u32 seqno;
+	u32 seqno[I915_NUM_RINGS];
 	u64 bbaddr;
 	u64 fence[I915_MAX_NUM_FENCES];
 	struct timeval time;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 919d244..407555b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -876,6 +876,32 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	return NULL;
 }
 
+static void i915_record_ring_state(struct drm_device *dev,
+				   struct drm_i915_error_state *error,
+				   struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (INTEL_INFO(dev)->gen >= 4) {
+		error->ipeir[ring->id] = I915_READ(RING_IPEIR(ring->mmio_base));
+		error->ipehr[ring->id] = I915_READ(RING_IPEHR(ring->mmio_base));
+		error->instdone[ring->id] = I915_READ(RING_INSTDONE(ring->mmio_base));
+		if (ring->id == RCS) {
+			error->instps = I915_READ(INSTPS);
+			error->instdone1 = I915_READ(INSTDONE1);
+			error->bbaddr = I915_READ64(BB_ADDR);
+		}
+	} else {
+		error->ipeir[ring->id] = I915_READ(IPEIR);
+		error->ipehr[ring->id] = I915_READ(IPEHR);
+		error->instdone[ring->id] = I915_READ(INSTDONE);
+		error->bbaddr = 0;
+	}
+
+	error->seqno[ring->id] = ring->get_seqno(ring);
+	error->acthd[ring->id] = intel_ring_get_active_head(ring);
+}
+
 /**
  * i915_capture_error_state - capture an error record for later analysis
  * @dev: drm device
@@ -909,47 +935,23 @@ static void i915_capture_error_state(struct drm_device *dev)
 	DRM_INFO("capturing error event; look for more information in /debug/dri/%d/i915_error_state\n",
 		 dev->primary->index);
 
-	error->seqno = dev_priv->ring[RCS].get_seqno(&dev_priv->ring[RCS]);
 	error->eir = I915_READ(EIR);
 	error->pgtbl_er = I915_READ(PGTBL_ER);
 	for_each_pipe(pipe)
 		error->pipestat[pipe] = I915_READ(PIPESTAT(pipe));
 	error->instpm = I915_READ(INSTPM);
-	error->error = 0;
-	if (INTEL_INFO(dev)->gen >= 6) {
+
+	if (INTEL_INFO(dev)->gen >= 6)
 		error->error = I915_READ(ERROR_GEN6);
+	else
+		error->error = 0;
+
+	i915_record_ring_state(dev, error, &dev_priv->ring[RCS]);
+	if (HAS_BLT(dev))
+		i915_record_ring_state(dev, error, &dev_priv->ring[BCS]);
+	if (HAS_BSD(dev))
+		i915_record_ring_state(dev, error, &dev_priv->ring[VCS]);
 
-		error->bcs_acthd = I915_READ(BCS_ACTHD);
-		error->bcs_ipehr = I915_READ(BCS_IPEHR);
-		error->bcs_ipeir = I915_READ(BCS_IPEIR);
-		error->bcs_instdone = I915_READ(BCS_INSTDONE);
-		error->bcs_seqno = 0;
-		if (dev_priv->ring[BCS].get_seqno)
-			error->bcs_seqno = dev_priv->ring[BCS].get_seqno(&dev_priv->ring[BCS]);
-
-		error->vcs_acthd = I915_READ(VCS_ACTHD);
-		error->vcs_ipehr = I915_READ(VCS_IPEHR);
-		error->vcs_ipeir = I915_READ(VCS_IPEIR);
-		error->vcs_instdone = I915_READ(VCS_INSTDONE);
-		error->vcs_seqno = 0;
-		if (dev_priv->ring[VCS].get_seqno)
-			error->vcs_seqno = dev_priv->ring[VCS].get_seqno(&dev_priv->ring[VCS]);
-	}
-	if (INTEL_INFO(dev)->gen >= 4) {
-		error->ipeir = I915_READ(IPEIR_I965);
-		error->ipehr = I915_READ(IPEHR_I965);
-		error->instdone = I915_READ(INSTDONE_I965);
-		error->instps = I915_READ(INSTPS);
-		error->instdone1 = I915_READ(INSTDONE1);
-		error->acthd = I915_READ(ACTHD_I965);
-		error->bbaddr = I915_READ64(BB_ADDR);
-	} else {
-		error->ipeir = I915_READ(IPEIR);
-		error->ipehr = I915_READ(IPEHR);
-		error->instdone = I915_READ(INSTDONE);
-		error->acthd = I915_READ(ACTHD);
-		error->bbaddr = 0;
-	}
 	i915_gem_record_fences(dev, error);
 
 	/* Record the active batch and ring buffers */
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 6ef68c7..c6a0daa 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -352,6 +352,9 @@
 #define IPEIR_I965	0x02064
 #define IPEHR_I965	0x02068
 #define INSTDONE_I965	0x0206c
+#define RING_IPEIR(base)	((base)+0x64)
+#define RING_IPEHR(base)	((base)+0x68)
+#define RING_INSTDONE(base)	((base)+0x6c)
 #define INSTPS		0x02070 /* 965+ only */
 #define INSTDONE1	0x0207c /* 965+ only */
 #define ACTHD_I965	0x02074
@@ -365,14 +368,6 @@
 #define INSTDONE	0x02090
 #define NOPID		0x02094
 #define HWSTAM		0x02098
-#define VCS_INSTDONE	0x1206C
-#define VCS_IPEIR	0x12064
-#define VCS_IPEHR	0x12068
-#define VCS_ACTHD	0x12074
-#define BCS_INSTDONE	0x2206C
-#define BCS_IPEIR	0x22064
-#define BCS_IPEHR	0x22068
-#define BCS_ACTHD	0x22074
 
 #define ERROR_GEN6	0x040a0
 
-- 
1.7.7.3


* [PATCH 05/43] drm/i915: collect more per ring error state
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx, Ben Widawsky

Based on a patch by Ben Widawsky, but with different colors
for the bikeshed.

In contrast to Ben's patch this one doesn't add the fault regs.
Afaics they're for the optional page fault support which
- we're not enabling
- and which seems to be unsupported by the hw team. Recent bspec
  lacks tons of information about this that the public docs released
  half a year back still contain.

Also dump ring HEAD/TAIL registers - I've recently seen a few
error_state where just guessing these is not good enough.

v2: Also dump INSTPM for every ring.

v3: Fix a few really silly goof-ups spotted by Chris Wilson.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   16 ++++++++++------
 drivers/gpu/drm/i915/i915_drv.h     |    7 +++++--
 drivers/gpu/drm/i915/i915_irq.c     |    9 +++++++--
 drivers/gpu/drm/i915/i915_reg.h     |    3 +++
 4 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 37a74cf..60e8092 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -742,17 +742,21 @@ static void i915_ring_error_state(struct seq_file *m,
 				  unsigned ring)
 {
 	seq_printf(m, "%s command stream:\n", ring_str(ring));
+	seq_printf(m, "  HEAD: 0x%08x\n", error->head[ring]);
+	seq_printf(m, "  TAIL: 0x%08x\n", error->tail[ring]);
 	seq_printf(m, "  ACTHD: 0x%08x\n", error->acthd[ring]);
 	seq_printf(m, "  IPEIR: 0x%08x\n", error->ipeir[ring]);
 	seq_printf(m, "  IPEHR: 0x%08x\n", error->ipehr[ring]);
 	seq_printf(m, "  INSTDONE: 0x%08x\n", error->instdone[ring]);
-	if (ring == RCS) {
-		if (INTEL_INFO(dev)->gen >= 4) {
-			seq_printf(m, "  INSTDONE1: 0x%08x\n", error->instdone1);
-			seq_printf(m, "  INSTPS: 0x%08x\n", error->instps);
-		}
-		seq_printf(m, "  INSTPM: 0x%08x\n", error->instpm);
+	if (ring == RCS && INTEL_INFO(dev)->gen >= 4) {
+		seq_printf(m, "  INSTDONE1: 0x%08x\n", error->instdone1);
+		seq_printf(m, "  BBADDR: 0x%08llx\n", error->bbaddr);
 	}
+	if (INTEL_INFO(dev)->gen >= 4)
+		seq_printf(m, "  INSTPS: 0x%08x\n", error->instps[ring]);
+	seq_printf(m, "  INSTPM: 0x%08x\n", error->instpm[ring]);
+	if (INTEL_INFO(dev)->gen >= 6)
+		seq_printf(m, "  FADDR: 0x%08x\n", error->faddr[ring]);
 	seq_printf(m, "  seqno: 0x%08x\n", error->seqno[ring]);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 187cfc0..8141e97 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -152,16 +152,19 @@ struct drm_i915_error_state {
 	u32 eir;
 	u32 pgtbl_er;
 	u32 pipestat[I915_MAX_PIPES];
+	u32 tail[I915_NUM_RINGS];
+	u32 head[I915_NUM_RINGS];
 	u32 ipeir[I915_NUM_RINGS];
 	u32 ipehr[I915_NUM_RINGS];
 	u32 instdone[I915_NUM_RINGS];
 	u32 acthd[I915_NUM_RINGS];
 	u32 error; /* gen6+ */
-	u32 instpm;
-	u32 instps;
+	u32 instpm[I915_NUM_RINGS];
+	u32 instps[I915_NUM_RINGS];
 	u32 instdone1;
 	u32 seqno[I915_NUM_RINGS];
 	u64 bbaddr;
+	u32 faddr[I915_NUM_RINGS];
 	u64 fence[I915_MAX_NUM_FENCES];
 	struct timeval time;
 	struct drm_i915_error_object {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 407555b..bfa4964 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -882,12 +882,15 @@ static void i915_record_ring_state(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
+	if (INTEL_INFO(dev)->gen >= 6)
+		error->faddr[ring->id] = I915_READ(RING_DMA_FADD(ring->mmio_base));
+
 	if (INTEL_INFO(dev)->gen >= 4) {
 		error->ipeir[ring->id] = I915_READ(RING_IPEIR(ring->mmio_base));
 		error->ipehr[ring->id] = I915_READ(RING_IPEHR(ring->mmio_base));
 		error->instdone[ring->id] = I915_READ(RING_INSTDONE(ring->mmio_base));
+		error->instps[ring->id] = I915_READ(RING_INSTPS(ring->mmio_base));
 		if (ring->id == RCS) {
-			error->instps = I915_READ(INSTPS);
 			error->instdone1 = I915_READ(INSTDONE1);
 			error->bbaddr = I915_READ64(BB_ADDR);
 		}
@@ -898,8 +901,11 @@ static void i915_record_ring_state(struct drm_device *dev,
 		error->bbaddr = 0;
 	}
 
+	error->instpm[ring->id] = I915_READ(RING_INSTPM(ring->mmio_base));
 	error->seqno[ring->id] = ring->get_seqno(ring);
 	error->acthd[ring->id] = intel_ring_get_active_head(ring);
+	error->head[ring->id] = I915_READ_HEAD(ring);
+	error->tail[ring->id] = I915_READ_TAIL(ring);
 }
 
 /**
@@ -939,7 +945,6 @@ static void i915_capture_error_state(struct drm_device *dev)
 	error->pgtbl_er = I915_READ(PGTBL_ER);
 	for_each_pipe(pipe)
 		error->pipestat[pipe] = I915_READ(PIPESTAT(pipe));
-	error->instpm = I915_READ(INSTPM);
 
 	if (INTEL_INFO(dev)->gen >= 6)
 		error->error = I915_READ(ERROR_GEN6);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index c6a0daa..8a9f113 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -355,6 +355,9 @@
 #define RING_IPEIR(base)	((base)+0x64)
 #define RING_IPEHR(base)	((base)+0x68)
 #define RING_INSTDONE(base)	((base)+0x6c)
+#define RING_INSTPS(base)	((base)+0x70)
+#define RING_DMA_FADD(base)	((base)+0x78)
+#define RING_INSTPM(base)	((base)+0xc0)
 #define INSTPS		0x02070 /* 965+ only */
 #define INSTDONE1	0x0207c /* 965+ only */
 #define ACTHD_I965	0x02074
-- 
1.7.7.3


* [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

The problem this patch solves is that the forcewake accounting
necessary for register reads is protected by dev->struct_mutex. But the
hangcheck and error_capture code need to access registers without
grabbing this mutex because we hold it while waiting for the gpu.
So a new lock is required. Because currently the error_state capture
is called from the error irq handler and the hangcheck code runs from
a timer, it needs to be an irqsafe spinlock (note that the irq
handler, neglecting the error handling part, only uses registers that
don't need the forcewake dance).

We could tune this down to a normal spinlock when we rework the
error_state capture and hangcheck code to run from a workqueue.  But
we don't have any read in a fastpath that needs forcewake, so I've
decided to not care much about overhead.

This prevents tests/gem_hangcheck_forcewake from i-g-t from killing my
snb on recent kernels - something must have slightly changed the
timings. On previous kernels it only triggered a WARN about the broken
locking.

v2: Drop the previous patch for the register writes.

v3: Improve the commit message per Chris Wilson's suggestions.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |    8 ++++++--
 drivers/gpu/drm/i915/i915_dma.c     |    1 +
 drivers/gpu/drm/i915/i915_drv.c     |   18 ++++++++++++------
 drivers/gpu/drm/i915/i915_drv.h     |   10 +++++++---
 4 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 60e8092..c130c5d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1321,9 +1321,13 @@ static int i915_gen6_forcewake_count_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	unsigned forcewake_count;
 
-	seq_printf(m, "forcewake count = %d\n",
-		   atomic_read(&dev_priv->forcewake_count));
+	spin_lock_irq(&dev_priv->gt_lock);
+	forcewake_count = dev_priv->forcewake_count;
+	spin_unlock_irq(&dev_priv->gt_lock);
+
+	seq_printf(m, "forcewake count = %u\n", forcewake_count);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a9533c5..448d5b1 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2032,6 +2032,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	if (!IS_I945G(dev) && !IS_I945GM(dev))
 		pci_enable_msi(dev->pdev);
 
+	spin_lock_init(&dev_priv->gt_lock);
 	spin_lock_init(&dev_priv->irq_lock);
 	spin_lock_init(&dev_priv->error_lock);
 	spin_lock_init(&dev_priv->rps_lock);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 28836fe..34f5115 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -368,11 +368,12 @@ void __gen6_gt_force_wake_mt_get(struct drm_i915_private *dev_priv)
  */
 void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 {
-	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+	unsigned long irqflags;
 
-	/* Forcewake is atomic in case we get in here without the lock */
-	if (atomic_add_return(1, &dev_priv->forcewake_count) == 1)
+	spin_lock_irqsave(&dev_priv->gt_lock, irqflags);
+	if (dev_priv->forcewake_count++ == 0)
 		dev_priv->display.force_wake_get(dev_priv);
+	spin_unlock_irqrestore(&dev_priv->gt_lock, irqflags);
 }
 
 void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
@@ -392,10 +393,12 @@ void __gen6_gt_force_wake_mt_put(struct drm_i915_private *dev_priv)
  */
 void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
-	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+	unsigned long irqflags;
 
-	if (atomic_dec_and_test(&dev_priv->forcewake_count))
+	spin_lock_irqsave(&dev_priv->gt_lock, irqflags);
+	if (--dev_priv->forcewake_count == 0)
 		dev_priv->display.force_wake_put(dev_priv);
+	spin_unlock_irqrestore(&dev_priv->gt_lock, irqflags);
 }
 
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv)
@@ -626,6 +629,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
 	 * need to
 	 */
 	bool need_display = true;
+	unsigned long irqflags;
 	int ret;
 
 	if (!i915_try_reset)
@@ -644,8 +648,10 @@ int i915_reset(struct drm_device *dev, u8 flags)
 	case 6:
 		ret = gen6_do_reset(dev, flags);
 		/* If reset with a user forcewake, try to restore */
-		if (atomic_read(&dev_priv->forcewake_count))
+		spin_lock_irqsave(&dev_priv->gt_lock, irqflags);
+		if (dev_priv->forcewake_count)
 			__gen6_gt_force_wake_get(dev_priv);
+		spin_unlock_irqrestore(&dev_priv->gt_lock, irqflags);
 		break;
 	case 5:
 		ret = ironlake_do_reset(dev, flags);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8141e97..9e6ccc2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -279,7 +279,13 @@ typedef struct drm_i915_private {
 	int relative_constants_mode;
 
 	void __iomem *regs;
-	u32 gt_fifo_count;
+	/** gt_fifo_count and the subsequent register write are synchronized
+	 * with dev->struct_mutex. */
+	unsigned gt_fifo_count;
+	/** forcewake_count is protected by gt_lock */
+	unsigned forcewake_count;
+	/** gt_lock is also taken in irq contexts. */
+	struct spinlock gt_lock;
 
 	struct intel_gmbus {
 		struct i2c_adapter adapter;
@@ -730,8 +736,6 @@ typedef struct drm_i915_private {
 
 	struct drm_property *broadcast_rgb_property;
 	struct drm_property *force_audio_property;
-
-	atomic_t forcewake_count;
 } drm_i915_private_t;
 
 enum i915_cache_level {
-- 
1.7.7.3


* [PATCH 07/43] drm/i915: convert force_wake_get to func pointer in the gpu reset code
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (4 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 08/43] drm/i915: drop register special-casing in forcewake Daniel Vetter
                   ` (35 subsequent siblings)
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

This was forgotten in the original multi-threaded forcewake
conversion:

commit 8d715f0024f64ad1b1be85d8c081cf577944c847
Author: Keith Packard <keithp@keithp.com>
Date:   Fri Nov 18 20:39:01 2011 -0800

    drm/i915: add multi-threaded forcewake support

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 34f5115..ac3ad22 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -650,7 +650,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
 		/* If reset with a user forcewake, try to restore */
 		spin_lock_irqsave(&dev_priv->gt_lock, irqflags);
 		if (dev_priv->forcewake_count)
-			__gen6_gt_force_wake_get(dev_priv);
+			dev_priv->display.force_wake_get(dev_priv);
 		spin_unlock_irqrestore(&dev_priv->gt_lock, irqflags);
 		break;
 	case 5:
-- 
1.7.7.3


* [PATCH 08/43] drm/i915: drop register special-casing in forcewake
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (5 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 07/43] drm/i915: convert force_wake_get to func pointer in the gpu reset code Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 15:05   ` Chris Wilson
  2011-12-14 12:57 ` [PATCH 09/43] drm/i915: introduce a vtable for gpu core functions Daniel Vetter
                   ` (34 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

We currently have 3 registers for which we must not grab forcewake:
FORCEWAKE, FORCEWAKE_MT and ECOBUS.
- FORCEWAKE is excluded in the NEEDS_FORCE_WAKE macro and accessed
  with _NOTRACE.
- FORCEWAKE_MT is just accessed with _NOTRACE.
- ECOBUS is only excluded in the macro.

For fear of an ever-growing list of special cases, and to cut down on
the confusion, just access all of them with the _NOTRACE variants.

Also kill the duplicate definition of the macro, which is unused.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
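Note (not part of the commit message): a minimal userspace sketch of the
idea, assuming, as the description above implies, that the _NOTRACE
accessors bypass the forcewake check. Every name and register offset
below (ex_dev, ex_read, ex_read_notrace, EX_FORCEWAKE, EX_ECOBUS) is a
hypothetical stand-in and not the driver's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offsets, chosen only to lie below the 0x40000 limit. */
#define EX_FORCEWAKE 0x1000u
#define EX_ECOBUS    0x2000u

struct ex_dev {
	int gen;        /* hardware generation */
	int wake_grabs; /* how often a checked read took forcewake */
};

/* Simplified predicate: no per-register exceptions any more. */
bool ex_needs_force_wake(const struct ex_dev *dev, uint32_t reg)
{
	return dev->gen >= 6 && reg < 0x40000;
}

/* Checked read: takes forcewake whenever the predicate says so. */
uint32_t ex_read(struct ex_dev *dev, uint32_t reg)
{
	assert(dev != NULL);
	if (ex_needs_force_wake(dev, reg))
		dev->wake_grabs++;
	return reg; /* stand-in for the MMIO value */
}

/* Raw read: never touches forcewake; used for the special registers. */
uint32_t ex_read_notrace(struct ex_dev *dev, uint32_t reg)
{
	(void)dev;
	return reg; /* stand-in for the MMIO value */
}
```

With this split the choice is visible at the call site instead of being
hidden in a growing exclusion list inside the macro.
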
 drivers/gpu/drm/i915/i915_drv.c      |    4 +---
 drivers/gpu/drm/i915/i915_drv.h      |    7 -------
 drivers/gpu/drm/i915/intel_display.c |    2 +-
 3 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ac3ad22..653b6f2 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -931,9 +931,7 @@ MODULE_LICENSE("GPL and additional rights");
 /* We give fast paths for the really cool registers */
 #define NEEDS_FORCE_WAKE(dev_priv, reg) \
 	(((dev_priv)->info->gen >= 6) && \
-	 ((reg) < 0x40000) &&		 \
-	 ((reg) != FORCEWAKE) &&	 \
-	 ((reg) != ECOBUS))
+	 ((reg) < 0x40000))		 \
 
 #define __i915_read(x, y) \
 u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9e6ccc2..4ee4626 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1356,13 +1356,6 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
 void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
 
-/* We give fast paths for the really cool registers */
-#define NEEDS_FORCE_WAKE(dev_priv, reg) \
-	(((dev_priv)->info->gen >= 6) && \
-	 ((reg) < 0x40000) &&		 \
-	 ((reg) != FORCEWAKE) &&	 \
-	 ((reg) != ECOBUS))
-
 #define __i915_read(x, y) \
 	u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg);
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 633c693..70436c7 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8500,7 +8500,7 @@ static void intel_init_display(struct drm_device *dev)
 
 			mutex_lock(&dev->struct_mutex);
 			__gen6_gt_force_wake_mt_get(dev_priv);
-			ecobus = I915_READ(ECOBUS);
+			ecobus = I915_READ_NOTRACE(ECOBUS);
 			__gen6_gt_force_wake_mt_put(dev_priv);
 			mutex_unlock(&dev->struct_mutex);
 
-- 
1.7.7.3


* [PATCH 09/43] drm/i915: introduce a vtable for gpu core functions
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (6 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 08/43] drm/i915: drop register special-casing in forcewake Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 15:06   ` Chris Wilson
  2011-12-14 18:58   ` Kenneth Graunke
  2011-12-14 12:57 ` [PATCH 10/43] drm/i915/ringbuffer: kill snb blt workaround Daniel Vetter
                   ` (33 subsequent siblings)
  41 siblings, 2 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

... like for forcewake, which protects everything _but_ display.
Expect more things (like gtt abstractions, rings, inter-ring sync)
to come.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
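Note (not part of the commit message): a rough userspace sketch of the
pattern, under the assumption that the implementation is picked once at
init time and that callers only go through the table. All ex_* names are
hypothetical, and the locking the driver does around the count is
left out.

```c
#include <assert.h>
#include <stdbool.h>

struct ex_priv;

/* Function table for GPU core operations, kept apart from display ops. */
struct ex_core_funcs {
	void (*force_wake_get)(struct ex_priv *priv);
	void (*force_wake_put)(struct ex_priv *priv);
};

struct ex_priv {
	struct ex_core_funcs core;
	unsigned forcewake_count; /* outstanding references */
	int hw_awake;             /* 1 while the simulated hw is held awake */
	bool used_mt;             /* set when the MT variant was called */
};

void ex_wake_get(struct ex_priv *p)    { p->hw_awake = 1; }
void ex_wake_put(struct ex_priv *p)    { p->hw_awake = 0; }
void ex_wake_mt_get(struct ex_priv *p) { p->hw_awake = 1; p->used_mt = true; }
void ex_wake_mt_put(struct ex_priv *p) { p->hw_awake = 0; }

/* Pick the implementation once; callers never test the hardware type. */
void ex_init_core(struct ex_priv *p, bool mt_enabled)
{
	p->core.force_wake_get = mt_enabled ? ex_wake_mt_get : ex_wake_get;
	p->core.force_wake_put = mt_enabled ? ex_wake_mt_put : ex_wake_put;
}

/* Refcounted wrappers: only 0 -> 1 and 1 -> 0 transitions touch hw. */
void ex_force_wake_get(struct ex_priv *p)
{
	if (p->forcewake_count++ == 0)
		p->core.force_wake_get(p);
}

void ex_force_wake_put(struct ex_priv *p)
{
	assert(p->forcewake_count > 0);
	if (--p->forcewake_count == 0)
		p->core.force_wake_put(p);
}
```

Further core operations would become additional members of the table
without touching the callers.
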
 drivers/gpu/drm/i915/i915_drv.c      |    6 +++---
 drivers/gpu/drm/i915/i915_drv.h      |    8 ++++++--
 drivers/gpu/drm/i915/intel_display.c |    8 ++++----
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 653b6f2..4a2eb68 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -372,7 +372,7 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 
 	spin_lock_irqsave(&dev_priv->gt_lock, irqflags);
 	if (dev_priv->forcewake_count++ == 0)
-		dev_priv->display.force_wake_get(dev_priv);
+		dev_priv->core.force_wake_get(dev_priv);
 	spin_unlock_irqrestore(&dev_priv->gt_lock, irqflags);
 }
 
@@ -397,7 +397,7 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 
 	spin_lock_irqsave(&dev_priv->gt_lock, irqflags);
 	if (--dev_priv->forcewake_count == 0)
-		dev_priv->display.force_wake_put(dev_priv);
+		dev_priv->core.force_wake_put(dev_priv);
 	spin_unlock_irqrestore(&dev_priv->gt_lock, irqflags);
 }
 
@@ -650,7 +650,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
 		/* If reset with a user forcewake, try to restore */
 		spin_lock_irqsave(&dev_priv->gt_lock, irqflags);
 		if (dev_priv->forcewake_count)
-			dev_priv->display.force_wake_get(dev_priv);
+			dev_priv->core.force_wake_get(dev_priv);
 		spin_unlock_irqrestore(&dev_priv->gt_lock, irqflags);
 		break;
 	case 5:
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4ee4626..40e0848 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -215,8 +215,6 @@ struct drm_i915_display_funcs {
 			  struct drm_i915_gem_object *obj);
 	int (*update_plane)(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 			    int x, int y);
-	void (*force_wake_get)(struct drm_i915_private *dev_priv);
-	void (*force_wake_put)(struct drm_i915_private *dev_priv);
 	/* clock updates for mode set */
 	/* cursor updates */
 	/* render clock increase/decrease */
@@ -224,6 +222,11 @@ struct drm_i915_display_funcs {
 	/* pll clock increase/decrease */
 };
 
+struct drm_i915_core_funcs {
+	void (*force_wake_get)(struct drm_i915_private *dev_priv);
+	void (*force_wake_put)(struct drm_i915_private *dev_priv);
+};
+
 struct intel_device_info {
 	u8 gen;
 	u8 is_mobile:1;
@@ -296,6 +299,7 @@ typedef struct drm_i915_private {
 	struct pci_dev *bridge_dev;
 	struct intel_ring_buffer ring[I915_NUM_RINGS];
 	uint32_t next_seqno;
+	struct drm_i915_core_funcs core;
 
 	drm_dma_handle_t *status_page_dmah;
 	uint32_t counter;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 70436c7..2f0cc52 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8491,8 +8491,8 @@ static void intel_init_display(struct drm_device *dev)
 
 	/* For FIFO watermark updates */
 	if (HAS_PCH_SPLIT(dev)) {
-		dev_priv->display.force_wake_get = __gen6_gt_force_wake_get;
-		dev_priv->display.force_wake_put = __gen6_gt_force_wake_put;
+		dev_priv->core.force_wake_get = __gen6_gt_force_wake_get;
+		dev_priv->core.force_wake_put = __gen6_gt_force_wake_put;
 
 		/* IVB configs may use multi-threaded forcewake */
 		if (IS_IVYBRIDGE(dev)) {
@@ -8506,9 +8506,9 @@ static void intel_init_display(struct drm_device *dev)
 
 			if (ecobus & FORCEWAKE_MT_ENABLE) {
 				DRM_DEBUG_KMS("Using MT version of forcewake\n");
-				dev_priv->display.force_wake_get =
+				dev_priv->core.force_wake_get =
 					__gen6_gt_force_wake_mt_get;
-				dev_priv->display.force_wake_put =
+				dev_priv->core.force_wake_put =
 					__gen6_gt_force_wake_mt_put;
 			}
 		}
-- 
1.7.7.3


* [PATCH 10/43] drm/i915/ringbuffer: kill snb blt workaround
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (7 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 09/43] drm/i915: introduce a vtable for gpu core functions Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2012-01-29 16:52   ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 11/43] drm/i915: Separate fence pin counting from normal bind pin counting Daniel Vetter
                   ` (32 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

This was just to facilitate product enablement with pre-production hw.
Allows us to kill quite a bit of cruft.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c |   81 +------------------------------
 1 files changed, 2 insertions(+), 79 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3c30dba..2d476a9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1354,79 +1354,13 @@ blt_ring_put_irq(struct intel_ring_buffer *ring)
 			  GEN6_BLITTER_USER_INTERRUPT);
 }
 
-
-/* Workaround for some stepping of SNB,
- * each time when BLT engine ring tail moved,
- * the first command in the ring to be parsed
- * should be MI_BATCH_BUFFER_START
- */
-#define NEED_BLT_WORKAROUND(dev) \
-	(IS_GEN6(dev) && (dev->pdev->revision < 8))
-
-static inline struct drm_i915_gem_object *
-to_blt_workaround(struct intel_ring_buffer *ring)
-{
-	return ring->private;
-}
-
-static int blt_ring_init(struct intel_ring_buffer *ring)
-{
-	if (NEED_BLT_WORKAROUND(ring->dev)) {
-		struct drm_i915_gem_object *obj;
-		u32 *ptr;
-		int ret;
-
-		obj = i915_gem_alloc_object(ring->dev, 4096);
-		if (obj == NULL)
-			return -ENOMEM;
-
-		ret = i915_gem_object_pin(obj, 4096, true);
-		if (ret) {
-			drm_gem_object_unreference(&obj->base);
-			return ret;
-		}
-
-		ptr = kmap(obj->pages[0]);
-		*ptr++ = MI_BATCH_BUFFER_END;
-		*ptr++ = MI_NOOP;
-		kunmap(obj->pages[0]);
-
-		ret = i915_gem_object_set_to_gtt_domain(obj, false);
-		if (ret) {
-			i915_gem_object_unpin(obj);
-			drm_gem_object_unreference(&obj->base);
-			return ret;
-		}
-
-		ring->private = obj;
-	}
-
-	return init_ring_common(ring);
-}
-
-static int blt_ring_begin(struct intel_ring_buffer *ring,
-			  int num_dwords)
-{
-	if (ring->private) {
-		int ret = intel_ring_begin(ring, num_dwords+2);
-		if (ret)
-			return ret;
-
-		intel_ring_emit(ring, MI_BATCH_BUFFER_START);
-		intel_ring_emit(ring, to_blt_workaround(ring)->gtt_offset);
-
-		return 0;
-	} else
-		return intel_ring_begin(ring, 4);
-}
-
 static int blt_ring_flush(struct intel_ring_buffer *ring,
 			  u32 invalidate, u32 flush)
 {
 	uint32_t cmd;
 	int ret;
 
-	ret = blt_ring_begin(ring, 4);
+	ret = intel_ring_begin(ring, 4);
 	if (ret)
 		return ret;
 
@@ -1441,22 +1375,12 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
 	return 0;
 }
 
-static void blt_ring_cleanup(struct intel_ring_buffer *ring)
-{
-	if (!ring->private)
-		return;
-
-	i915_gem_object_unpin(ring->private);
-	drm_gem_object_unreference(ring->private);
-	ring->private = NULL;
-}
-
 static const struct intel_ring_buffer gen6_blt_ring = {
 	.name			= "blt ring",
 	.id			= BCS,
 	.mmio_base		= BLT_RING_BASE,
 	.size			= 32 * PAGE_SIZE,
-	.init			= blt_ring_init,
+	.init			= init_ring_common,
 	.write_tail		= ring_write_tail,
 	.flush			= blt_ring_flush,
 	.add_request		= gen6_add_request,
@@ -1464,7 +1388,6 @@ static const struct intel_ring_buffer gen6_blt_ring = {
 	.irq_get		= blt_ring_get_irq,
 	.irq_put		= blt_ring_put_irq,
 	.dispatch_execbuffer	= gen6_ring_dispatch_execbuffer,
-	.cleanup		= blt_ring_cleanup,
 	.sync_to		= gen6_blt_ring_sync_to,
 	.semaphore_register	= {MI_SEMAPHORE_SYNC_BR,
 				   MI_SEMAPHORE_SYNC_BV,
-- 
1.7.7.3


* [PATCH 11/43] drm/i915: Separate fence pin counting from normal bind pin counting
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (8 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 10/43] drm/i915/ringbuffer: kill snb blt workaround Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2012-01-29 16:56   ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 12/43] drm/i915: don't trash the gtt when running out of fences Daniel Vetter
                   ` (31 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

In order to correctly account for reserving space in the GTT and fences
for a batch buffer, we need to independently track whether the fence is
pinned due to a fenced GPU access in the batch or whether the buffer is
pinned in the aperture. Currently we count the fence as pinned if the
buffer has already been seen in the execbuffer. This leads to a false
accounting of available fence registers, causing frequent mass evictions.
Worse, if coupled with the change to make i915_gem_object_get_fence()
report EDEADLK upon fence starvation, the batchbuffer can fail with only
one fence required...

Fixes intel-gpu-tools/tests/gem_fenced_exec_thrash

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38735
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Paul Neumann <paul104x@yahoo.de>
---
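Note (not part of the commit message): a simplified userspace sketch of
the accounting described above. The ex_* names are hypothetical, and the
search is reduced to "free register first, otherwise the first register
that is not fence-pinned"; the driver's LRU handling is not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NUM_FENCES 2

struct ex_obj {
	int pin_count; /* object is pinned in the aperture */
	int fence_reg; /* index into the fence table, or -1 for none */
};

struct ex_fence {
	struct ex_obj *obj; /* current owner, NULL if the register is free */
	int pin_count;      /* fence itself is needed by the current batch */
};

/* Return a free fence, else one that is not fence-pinned, else NULL.
 * The owner's aperture pin count is deliberately not consulted. */
struct ex_fence *ex_find_fence(struct ex_fence *regs, size_t n)
{
	struct ex_fence *avail = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!regs[i].obj)
			return &regs[i];
		if (!regs[i].pin_count && !avail)
			avail = &regs[i];
	}
	return avail;
}

void ex_pin_fence(struct ex_fence *regs, struct ex_obj *obj)
{
	if (obj->fence_reg >= 0)
		regs[obj->fence_reg].pin_count++;
}

void ex_unpin_fence(struct ex_fence *regs, struct ex_obj *obj)
{
	if (obj->fence_reg >= 0) {
		assert(regs[obj->fence_reg].pin_count > 0);
		regs[obj->fence_reg].pin_count--;
	}
}
```

An object that is pinned in the aperture but does not need its fence for
the batch no longer blocks that fence register from being reused.
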
 drivers/gpu/drm/i915/i915_drv.h            |   19 ++++
 drivers/gpu/drm/i915/i915_gem.c            |    7 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  139 ++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_display.c       |   16 +++-
 4 files changed, 126 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 40e0848..5abc828 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -135,6 +135,7 @@ struct drm_i915_fence_reg {
 	struct list_head lru_list;
 	struct drm_i915_gem_object *obj;
 	uint32_t setup_seqno;
+	int pin_count;
 };
 
 struct sdvo_device_mapping {
@@ -1174,6 +1175,24 @@ int __must_check i915_gem_object_get_fence(struct drm_i915_gem_object *obj,
 					   struct intel_ring_buffer *pipelined);
 int __must_check i915_gem_object_put_fence(struct drm_i915_gem_object *obj);
 
+static inline void
+i915_gem_object_pin_fence(struct drm_i915_gem_object *obj)
+{
+	if (obj->fence_reg != I915_FENCE_REG_NONE) {
+		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+		dev_priv->fence_regs[obj->fence_reg].pin_count++;
+	}
+}
+
+static inline void
+i915_gem_object_unpin_fence(struct drm_i915_gem_object *obj)
+{
+	if (obj->fence_reg != I915_FENCE_REG_NONE) {
+		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+		dev_priv->fence_regs[obj->fence_reg].pin_count--;
+	}
+}
+
 void i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_reset(struct drm_device *dev);
 void i915_gem_clflush_object(struct drm_i915_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8359dc7..695709a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2432,6 +2432,8 @@ i915_gem_object_put_fence(struct drm_i915_gem_object *obj)
 
 	if (obj->fence_reg != I915_FENCE_REG_NONE) {
 		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+
+		WARN_ON(dev_priv->fence_regs[obj->fence_reg].pin_count);
 		i915_gem_clear_fence_reg(obj->base.dev,
 					 &dev_priv->fence_regs[obj->fence_reg]);
 
@@ -2456,7 +2458,7 @@ i915_find_fence_reg(struct drm_device *dev,
 		if (!reg->obj)
 			return reg;
 
-		if (!reg->obj->pin_count)
+		if (!reg->pin_count)
 			avail = reg;
 	}
 
@@ -2466,7 +2468,7 @@ i915_find_fence_reg(struct drm_device *dev,
 	/* None available, try to steal one or wait for a user to finish */
 	avail = first = NULL;
 	list_for_each_entry(reg, &dev_priv->mm.fence_list, lru_list) {
-		if (reg->obj->pin_count)
+		if (reg->pin_count)
 			continue;
 
 		if (first == NULL)
@@ -2660,6 +2662,7 @@ i915_gem_clear_fence_reg(struct drm_device *dev,
 	list_del_init(&reg->lru_list);
 	reg->obj = NULL;
 	reg->setup_seqno = 0;
+	reg->pin_count = 0;
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 926ed48..c918124 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -460,6 +460,54 @@ i915_gem_execbuffer_relocate(struct drm_device *dev,
 	return ret;
 }
 
+#define  __EXEC_OBJECT_HAS_FENCE (1<<31)
+
+static int
+pin_and_fence_object(struct drm_i915_gem_object *obj,
+		     struct intel_ring_buffer *ring)
+{
+	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
+	bool need_fence, need_mappable;
+	int ret;
+
+	need_fence =
+		has_fenced_gpu_access &&
+		entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
+		obj->tiling_mode != I915_TILING_NONE;
+	need_mappable =
+		entry->relocation_count ? true : need_fence;
+
+	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable);
+	if (ret)
+		return ret;
+
+	if (has_fenced_gpu_access) {
+		if (entry->flags & EXEC_OBJECT_NEEDS_FENCE) {
+			if (obj->tiling_mode) {
+				ret = i915_gem_object_get_fence(obj, ring);
+				if (ret)
+					goto err_unpin;
+
+				entry->flags |= __EXEC_OBJECT_HAS_FENCE;
+				i915_gem_object_pin_fence(obj);
+			} else {
+				ret = i915_gem_object_put_fence(obj);
+				if (ret)
+					goto err_unpin;
+			}
+		}
+		obj->pending_fenced_gpu_access = need_fence;
+	}
+
+	entry->offset = obj->gtt_offset;
+	return 0;
+
+err_unpin:
+	i915_gem_object_unpin(obj);
+	return ret;
+}
+
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			    struct drm_file *file,
@@ -517,6 +565,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 		list_for_each_entry(obj, objects, exec_list) {
 			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
 			bool need_fence, need_mappable;
+
 			if (!obj->gtt_space)
 				continue;
 
@@ -531,58 +580,47 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			    (need_mappable && !obj->map_and_fenceable))
 				ret = i915_gem_object_unbind(obj);
 			else
-				ret = i915_gem_object_pin(obj,
-							  entry->alignment,
-							  need_mappable);
+				ret = pin_and_fence_object(obj, ring);
 			if (ret)
 				goto err;
-
-			entry++;
 		}
 
 		/* Bind fresh objects */
 		list_for_each_entry(obj, objects, exec_list) {
-			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
-			bool need_fence;
-
-			need_fence =
-				has_fenced_gpu_access &&
-				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
-				obj->tiling_mode != I915_TILING_NONE;
-
-			if (!obj->gtt_space) {
-				bool need_mappable =
-					entry->relocation_count ? true : need_fence;
-
-				ret = i915_gem_object_pin(obj,
-							  entry->alignment,
-							  need_mappable);
-				if (ret)
-					break;
-			}
+			if (obj->gtt_space)
+				continue;
 
-			if (has_fenced_gpu_access) {
-				if (need_fence) {
-					ret = i915_gem_object_get_fence(obj, ring);
-					if (ret)
-						break;
-				} else if (entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
-					   obj->tiling_mode == I915_TILING_NONE) {
-					/* XXX pipelined! */
-					ret = i915_gem_object_put_fence(obj);
-					if (ret)
-						break;
-				}
-				obj->pending_fenced_gpu_access = need_fence;
+			ret = pin_and_fence_object(obj, ring);
+			if (ret) {
+				int ret_ignore;
+
+				/* This can potentially raise a harmless
+				 * -EINVAL if we failed to bind in the above
+				 * call. It cannot raise -EINTR since we know
+				 * that the bo is freshly bound and so will
+				 * not need to be flushed or waited upon.
+				 */
+				ret_ignore = i915_gem_object_unbind(obj);
+				(void)ret_ignore;
+				WARN_ON(obj->gtt_space);
+				break;
 			}
-
-			entry->offset = obj->gtt_offset;
 		}
 
 		/* Decrement pin count for bound objects */
 		list_for_each_entry(obj, objects, exec_list) {
-			if (obj->gtt_space)
-				i915_gem_object_unpin(obj);
+			struct drm_i915_gem_exec_object2 *entry;
+
+			if (!obj->gtt_space)
+				continue;
+
+			entry = obj->exec_entry;
+			if (entry->flags & __EXEC_OBJECT_HAS_FENCE) {
+				i915_gem_object_unpin_fence(obj);
+				entry->flags &= ~__EXEC_OBJECT_HAS_FENCE;
+			}
+
+			i915_gem_object_unpin(obj);
 		}
 
 		if (ret != -ENOSPC || retry > 1)
@@ -599,16 +637,19 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 	} while (1);
 
 err:
-	obj = list_entry(obj->exec_list.prev,
-			 struct drm_i915_gem_object,
-			 exec_list);
-	while (objects != &obj->exec_list) {
-		if (obj->gtt_space)
-			i915_gem_object_unpin(obj);
+	list_for_each_entry_continue_reverse(obj, objects, exec_list) {
+		struct drm_i915_gem_exec_object2 *entry;
+
+		if (!obj->gtt_space)
+			continue;
+
+		entry = obj->exec_entry;
+		if (entry->flags & __EXEC_OBJECT_HAS_FENCE) {
+			i915_gem_object_unpin_fence(obj);
+			entry->flags &= ~__EXEC_OBJECT_HAS_FENCE;
+		}
 
-		obj = list_entry(obj->exec_list.prev,
-				 struct drm_i915_gem_object,
-				 exec_list);
+		i915_gem_object_unpin(obj);
 	}
 
 	return ret;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 2f0cc52..571bf4e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2004,6 +2004,8 @@ intel_pin_and_fence_fb_obj(struct drm_device *dev,
 		ret = i915_gem_object_get_fence(obj, pipelined);
 		if (ret)
 			goto err_unpin;
+
+		i915_gem_object_pin_fence(obj);
 	}
 
 	dev_priv->mm.interruptible = true;
@@ -2016,6 +2018,12 @@ err_interruptible:
 	return ret;
 }
 
+static void intel_unpin_fb_obj(struct drm_i915_gem_object *obj)
+{
+	i915_gem_object_unpin_fence(obj);
+	i915_gem_object_unpin(obj);
+}
+
 static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 			     int x, int y)
 {
@@ -2247,7 +2255,7 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y,
 	ret = intel_pipe_set_base_atomic(crtc, crtc->fb, x, y,
 					 LEAVE_ATOMIC_MODE_SET);
 	if (ret) {
-		i915_gem_object_unpin(to_intel_framebuffer(crtc->fb)->obj);
+		intel_unpin_fb_obj(to_intel_framebuffer(crtc->fb)->obj);
 		mutex_unlock(&dev->struct_mutex);
 		DRM_ERROR("failed to update base address\n");
 		return ret;
@@ -2255,7 +2263,7 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y,
 
 	if (old_fb) {
 		intel_wait_for_vblank(dev, intel_crtc->pipe);
-		i915_gem_object_unpin(to_intel_framebuffer(old_fb)->obj);
+		intel_unpin_fb_obj(to_intel_framebuffer(old_fb)->obj);
 	}
 
 	mutex_unlock(&dev->struct_mutex);
@@ -3316,7 +3324,7 @@ static void intel_crtc_disable(struct drm_crtc *crtc)
 
 	if (crtc->fb) {
 		mutex_lock(&dev->struct_mutex);
-		i915_gem_object_unpin(to_intel_framebuffer(crtc->fb)->obj);
+		intel_unpin_fb_obj(to_intel_framebuffer(crtc->fb)->obj);
 		mutex_unlock(&dev->struct_mutex);
 	}
 }
@@ -6860,7 +6868,7 @@ static void intel_unpin_work_fn(struct work_struct *__work)
 		container_of(__work, struct intel_unpin_work, work);
 
 	mutex_lock(&work->dev->struct_mutex);
-	i915_gem_object_unpin(work->old_fb_obj);
+	intel_unpin_fb_obj(work->old_fb_obj);
 	drm_gem_object_unreference(&work->pending_flip_obj->base);
 	drm_gem_object_unreference(&work->old_fb_obj->base);
 
-- 
1.7.7.3


* [PATCH 12/43] drm/i915: don't trash the gtt when running out of fences
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (9 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 11/43] drm/i915: Separate fence pin counting from normal bind pin counting Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 15:09   ` Chris Wilson
  2011-12-14 12:57 ` [PATCH 13/43] drm/i915: refactor debugfs open function Daniel Vetter
                   ` (30 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

With the fence accounting fixed up in the previous commit, not finding
enough fences is a fatal error and a userspace bug. Trashing the entire
gtt is not going to turn up that missing fence, so avoid doing so by
returning an error other than ENOSPC.

This has the added benefit that it's easier to distinguish fence
accounting errors from gtt space accounting issues.

TTM serves as precedent for the EDEADLK error code - it returns it
when the reservation code needs resources already blocked by the
current reservation.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
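Note (not part of the commit message): a userspace sketch of why the
error code matters, assuming the reservation loop retries with eviction
only on -ENOSPC, as the "ret != -ENOSPC || retry > 1" check in the
execbuffer reserve code suggests. The ex_* names are hypothetical and
the eviction is reduced to a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical reservation step supplied by the caller. */
typedef int (*ex_reserve_fn)(void *ctx);

struct ex_stats {
	int evictions; /* how many times the eviction stand-in ran */
};

/* Retry with eviction only for -ENOSPC; any other error, including
 * -EDEADLK from fence starvation, is returned immediately. */
int ex_reserve_with_retry(ex_reserve_fn reserve, void *ctx,
			  struct ex_stats *stats)
{
	int retry = 0;

	assert(reserve != NULL);
	for (;;) {
		int ret = reserve(ctx);

		if (ret != -ENOSPC || retry > 1)
			return ret;

		stats->evictions++; /* stand-in for evicting the gtt */
		retry++;
	}
}

/* Two trivial reservation steps for demonstration. */
int ex_always_enospc(void *ctx)  { (void)ctx; return -ENOSPC; }
int ex_always_edeadlk(void *ctx) { (void)ctx; return -EDEADLK; }
```

A fence shortage reported as -EDEADLK therefore leaves the gtt
untouched, while a genuine space shortage still goes through the
eviction path.
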
 drivers/gpu/drm/i915/i915_gem.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 695709a..e995248 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2562,7 +2562,7 @@ i915_gem_object_get_fence(struct drm_i915_gem_object *obj,
 
 	reg = i915_find_fence_reg(dev, pipelined);
 	if (reg == NULL)
-		return -ENOSPC;
+		return -EDEADLK;
 
 	ret = i915_gem_object_flush_fence(obj, pipelined);
 	if (ret)
-- 
1.7.7.3


* [PATCH 13/43] drm/i915: refactor debugfs open function
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (10 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 12/43] drm/i915: don't trash the gtt when running out of fences Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 15:10   ` Chris Wilson
                     ` (2 more replies)
  2011-12-14 12:57 ` [PATCH 14/43] drm/i915: refactor debugfs create functions Daniel Vetter
                   ` (29 subsequent siblings)
  41 siblings, 3 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

Only forcewake has an open with special semantics; the other r/w
debugfs files only assign the file private pointer.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
---
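Note (not part of the commit message): a small userspace sketch of the
refactoring. The ex_* types are hypothetical stand-ins for the kernel's
inode, file and file_operations structures, reduced to the fields that
matter here.

```c
#include <assert.h>

/* Minimal stand-ins for the kernel structures involved. */
struct ex_inode { void *i_private; };
struct ex_file  { void *private_data; };

struct ex_file_ops {
	int (*open)(struct ex_inode *inode, struct ex_file *filp);
};

/* One open handler for every file that only needs its private pointer. */
int ex_common_open(struct ex_inode *inode, struct ex_file *filp)
{
	assert(inode && filp);
	filp->private_data = inode->i_private;
	return 0;
}

/* Each file keeps its own ops table but shares the open handler. */
const struct ex_file_ops ex_wedged_fops   = { .open = ex_common_open };
const struct ex_file_ops ex_max_freq_fops = { .open = ex_common_open };
```

Files with special open semantics, such as the forcewake one, would keep
their own handler next to the shared one.
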
 drivers/gpu/drm/i915/i915_debugfs.c |   26 +++++---------------------
 1 files changed, 5 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c130c5d..21547cd 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1333,8 +1333,8 @@ static int i915_gen6_forcewake_count_info(struct seq_file *m, void *data)
 }
 
 static int
-i915_wedged_open(struct inode *inode,
-		 struct file *filp)
+i915_debugfs_common_open(struct inode *inode,
+			 struct file *filp)
 {
 	filp->private_data = inode->i_private;
 	return 0;
@@ -1390,20 +1390,12 @@ i915_wedged_write(struct file *filp,
 
 static const struct file_operations i915_wedged_fops = {
 	.owner = THIS_MODULE,
-	.open = i915_wedged_open,
+	.open = i915_debugfs_common_open,
 	.read = i915_wedged_read,
 	.write = i915_wedged_write,
 	.llseek = default_llseek,
 };
 
-static int
-i915_max_freq_open(struct inode *inode,
-		   struct file *filp)
-{
-	filp->private_data = inode->i_private;
-	return 0;
-}
-
 static ssize_t
 i915_max_freq_read(struct file *filp,
 		   char __user *ubuf,
@@ -1460,20 +1452,12 @@ i915_max_freq_write(struct file *filp,
 
 static const struct file_operations i915_max_freq_fops = {
 	.owner = THIS_MODULE,
-	.open = i915_max_freq_open,
+	.open = i915_debugfs_common_open,
 	.read = i915_max_freq_read,
 	.write = i915_max_freq_write,
 	.llseek = default_llseek,
 };
 
-static int
-i915_cache_sharing_open(struct inode *inode,
-		   struct file *filp)
-{
-	filp->private_data = inode->i_private;
-	return 0;
-}
-
 static ssize_t
 i915_cache_sharing_read(struct file *filp,
 		   char __user *ubuf,
@@ -1539,7 +1523,7 @@ i915_cache_sharing_write(struct file *filp,
 
 static const struct file_operations i915_cache_sharing_fops = {
 	.owner = THIS_MODULE,
-	.open = i915_cache_sharing_open,
+	.open = i915_debugfs_common_open,
 	.read = i915_cache_sharing_read,
 	.write = i915_cache_sharing_write,
 	.llseek = default_llseek,
-- 
1.7.7.3


* [PATCH 14/43] drm/i915: refactor debugfs create functions
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (11 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 13/43] drm/i915: refactor debugfs open function Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 18:44   ` Eugeni Dodonov
  2011-12-14 12:57 ` [PATCH 15/43] drm/i915: add interface to simulate gpu hangs Daniel Vetter
                   ` (28 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

All r/w debugfs files are created equal.

v2: Add some newlines to make the code easier on the eyes as requested
by Ben Widawsky.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   55 +++++++++++-----------------------
 1 files changed, 18 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 21547cd..db83552 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1555,21 +1555,6 @@ drm_add_fake_info_node(struct drm_minor *minor,
 	return 0;
 }
 
-static int i915_wedged_create(struct dentry *root, struct drm_minor *minor)
-{
-	struct drm_device *dev = minor->dev;
-	struct dentry *ent;
-
-	ent = debugfs_create_file("i915_wedged",
-				  S_IRUGO | S_IWUSR,
-				  root, dev,
-				  &i915_wedged_fops);
-	if (IS_ERR(ent))
-		return PTR_ERR(ent);
-
-	return drm_add_fake_info_node(minor, ent, &i915_wedged_fops);
-}
-
 static int i915_forcewake_open(struct inode *inode, struct file *file)
 {
 	struct drm_device *dev = inode->i_private;
@@ -1631,34 +1616,22 @@ static int i915_forcewake_create(struct dentry *root, struct drm_minor *minor)
 	return drm_add_fake_info_node(minor, ent, &i915_forcewake_fops);
 }
 
-static int i915_max_freq_create(struct dentry *root, struct drm_minor *minor)
-{
-	struct drm_device *dev = minor->dev;
-	struct dentry *ent;
-
-	ent = debugfs_create_file("i915_max_freq",
-				  S_IRUGO | S_IWUSR,
-				  root, dev,
-				  &i915_max_freq_fops);
-	if (IS_ERR(ent))
-		return PTR_ERR(ent);
-
-	return drm_add_fake_info_node(minor, ent, &i915_max_freq_fops);
-}
-
-static int i915_cache_sharing_create(struct dentry *root, struct drm_minor *minor)
+static int i915_debugfs_create(struct dentry *root,
+			       struct drm_minor *minor,
+			       const char *name,
+			       const struct file_operations *fops)
 {
 	struct drm_device *dev = minor->dev;
 	struct dentry *ent;
 
-	ent = debugfs_create_file("i915_cache_sharing",
+	ent = debugfs_create_file(name,
 				  S_IRUGO | S_IWUSR,
 				  root, dev,
-				  &i915_cache_sharing_fops);
+				  fops);
 	if (IS_ERR(ent))
 		return PTR_ERR(ent);
 
-	return drm_add_fake_info_node(minor, ent, &i915_cache_sharing_fops);
+	return drm_add_fake_info_node(minor, ent, fops);
 }
 
 static struct drm_info_list i915_debugfs_list[] = {
@@ -1707,17 +1680,25 @@ int i915_debugfs_init(struct drm_minor *minor)
 {
 	int ret;
 
-	ret = i915_wedged_create(minor->debugfs_root, minor);
+	ret = i915_debugfs_create(minor->debugfs_root, minor,
+				  "i915_wedged",
+				  &i915_wedged_fops);
 	if (ret)
 		return ret;
 
 	ret = i915_forcewake_create(minor->debugfs_root, minor);
 	if (ret)
 		return ret;
-	ret = i915_max_freq_create(minor->debugfs_root, minor);
+
+	ret = i915_debugfs_create(minor->debugfs_root, minor,
+				  "i915_max_freq",
+				  &i915_max_freq_fops);
 	if (ret)
 		return ret;
-	ret = i915_cache_sharing_create(minor->debugfs_root, minor);
+
+	ret = i915_debugfs_create(minor->debugfs_root, minor,
+				  "i915_cache_sharing",
+				  &i915_cache_sharing_fops);
 	if (ret)
 		return ret;
 
-- 
1.7.7.3


* [PATCH 15/43] drm/i915: add interface to simulate gpu hangs
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (12 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 14/43] drm/i915: refactor debugfs create functions Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 19:00   ` Eugeni Dodonov
  2011-12-14 12:57 ` [PATCH 16/43] drm/i915: rework dev->first_error locking Daniel Vetter
                   ` (27 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

gpu reset is a very important piece of our infrastructure.
Unfortunately we only really test it by actually hanging the gpu,
which often has bad side-effects for the entire system. And the gpu
hang handling code is one of the rather complicated pieces of code we
have, consisting of
- hang detection
- error capture
- actual gpu reset
- reset of all the gem bookkeeping
- reinitialization of the entire gpu

This patch adds a debugfs file to selectively stop rings by ceasing to
update the hw tail pointer, which will result in the gpu no longer
updating its head pointer and eventually in the hangcheck firing.
This way we can exercise the gpu hang code under controlled conditions
without a dying gpu taking down the entire system.

Patch motivated by me forgetting to properly reinitialize ppgtt after
a gpu reset.

Usage:

echo $((1 << $ringnum)) > i915_ring_stop # stops one ring

echo 0xffffffff > i915_ring_stop # stops all, future-proof version

then run whatever test load is desired. i915_ring_stop automatically
resets after a gpu hang is detected to avoid hanging the gpu too fast
and declaring it wedged.
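
For illustration only (not part of this patch), the mask semantics and
the effect on the tail update can be sketched in plain userspace C. The
sketch_* names and the struct layout are made up for this example, and
it assumes that intel_ring_flag() boils down to 1 << ring id, which is
what the usage above suggests:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's ring state, not the real struct. */
struct sketch_ring {
	unsigned int id;	/* ring number, as in $ringnum above */
	uint32_t size;		/* ring size in bytes, power of two */
	uint32_t tail;		/* software tail */
	uint32_t hw_tail;	/* last tail value the "hardware" has seen */
};

/* Mask with only this ring's bit set, like $((1 << $ringnum)). */
static uint32_t sketch_ring_flag(unsigned int ringnum)
{
	assert(ringnum < 32);
	return UINT32_C(1) << ringnum;
}

/*
 * Same shape as intel_ring_advance() with this patch applied: wrap the
 * software tail, then skip the hw tail write if the ring is stopped.
 * Returns true if the hw tail was updated.
 */
static bool sketch_ring_advance(struct sketch_ring *ring, uint32_t stop_rings)
{
	ring->tail &= ring->size - 1;
	if (stop_rings & sketch_ring_flag(ring->id))
		return false;
	ring->hw_tail = ring->tail;
	return true;
}
```

With stop_rings == 0 the hw tail follows the software tail as before.
Once the ring's bit is set, the software tail keeps advancing but the
hw never sees it, which is what eventually trips the hangcheck.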

v2: Incorporate feedback from Chris Wilson.

v3: Add the missing cleanup.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |   65 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.c         |    2 +
 drivers/gpu/drm/i915/i915_drv.h         |    2 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |    4 ++
 4 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index db83552..67d7567 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1397,6 +1397,64 @@ static const struct file_operations i915_wedged_fops = {
 };
 
 static ssize_t
+i915_ring_stop_read(struct file *filp,
+		    char __user *ubuf,
+		    size_t max,
+		    loff_t *ppos)
+{
+	struct drm_device *dev = filp->private_data;
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	char buf[80];
+	int len;
+
+	len = snprintf(buf, sizeof(buf),
+		       "0x%08x\n", dev_priv->stop_rings);
+
+	if (len > sizeof(buf))
+		len = sizeof(buf);
+
+	return simple_read_from_buffer(ubuf, max, ppos, buf, len);
+}
+
+static ssize_t
+i915_ring_stop_write(struct file *filp,
+		     const char __user *ubuf,
+		     size_t cnt,
+		     loff_t *ppos)
+{
+	struct drm_device *dev = filp->private_data;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	char buf[20];
+	int val = 0;
+
+	if (cnt > 0) {
+		if (cnt > sizeof(buf) - 1)
+			return -EINVAL;
+
+		if (copy_from_user(buf, ubuf, cnt))
+			return -EFAULT;
+		buf[cnt] = 0;
+
+		val = simple_strtoul(buf, NULL, 0);
+	}
+
+	DRM_DEBUG_DRIVER("Stopping rings 0x%08x\n", val);
+
+	mutex_lock(&dev->struct_mutex);
+	dev_priv->stop_rings = val;
+	mutex_unlock(&dev->struct_mutex);
+
+	return cnt;
+}
+
+static const struct file_operations i915_ring_stop_fops = {
+	.owner = THIS_MODULE,
+	.open = i915_debugfs_common_open,
+	.read = i915_ring_stop_read,
+	.write = i915_ring_stop_write,
+	.llseek = default_llseek,
+};
+static ssize_t
 i915_max_freq_read(struct file *filp,
 		   char __user *ubuf,
 		   size_t max,
@@ -1701,6 +1759,11 @@ int i915_debugfs_init(struct drm_minor *minor)
 				  &i915_cache_sharing_fops);
 	if (ret)
 		return ret;
+	ret = i915_debugfs_create(minor->debugfs_root, minor,
+				  "i915_ring_stop",
+				  &i915_ring_stop_fops);
+	if (ret)
+		return ret;
 
 	return drm_debugfs_create_files(i915_debugfs_list,
 					I915_DEBUGFS_ENTRIES,
@@ -1719,6 +1782,8 @@ void i915_debugfs_cleanup(struct drm_minor *minor)
 				 1, minor);
 	drm_debugfs_remove_files((struct drm_info_list *) &i915_cache_sharing_fops,
 				 1, minor);
+	drm_debugfs_remove_files((struct drm_info_list *) &i915_ring_stop_fops,
+				 1, minor);
 }
 
 #endif /* CONFIG_DEBUG_FS */
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 4a2eb68..6dd219b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -638,6 +638,8 @@ int i915_reset(struct drm_device *dev, u8 flags)
 	if (!mutex_trylock(&dev->struct_mutex))
 		return -EBUSY;
 
+	dev_priv->stop_rings = 0;
+
 	i915_gem_reset(dev);
 
 	ret = -ENODEV;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5abc828..621349e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -346,6 +346,8 @@ typedef struct drm_i915_private {
 	uint32_t last_instdone;
 	uint32_t last_instdone1;
 
+	unsigned int stop_rings;
+
 	unsigned long cfb_size;
 	unsigned int cfb_fb;
 	enum plane cfb_plane;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 2d476a9..cf6e159 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1179,7 +1179,11 @@ int intel_ring_begin(struct intel_ring_buffer *ring,
 
 void intel_ring_advance(struct intel_ring_buffer *ring)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
 	ring->tail &= ring->size - 1;
+	if (dev_priv->stop_rings & intel_ring_flag(ring))
+		return;
 	ring->write_tail(ring, ring->tail);
 }
 
-- 
1.7.7.3


* [PATCH 16/43] drm/i915: rework dev->first_error locking
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (13 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 15/43] drm/i915: add interface to simulate gpu hangs Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 15:13   ` Chris Wilson
  2011-12-14 18:37   ` Eugeni Dodonov
  2011-12-14 12:57 ` [PATCH 17/43] drm/i915: destroy existing error_state when simulating a gpu hang Daniel Vetter
                   ` (26 subsequent siblings)
  41 siblings, 2 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

- reduce the irq disabled section, even for a debugfs file this was
  way too long.
- always disable irqs when taking the lock.

v2: Thou shalt not mistake locking for reference counting, so:
- reference count the error_state to protect from concurrent freeing.
  This will only really be used in the next patch.
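
A rough userspace sketch of the resulting pattern (illustration only,
not part of this patch): the lock only protects the first_error
pointer, while the reference count keeps the object alive once the lock
has been dropped. The sketch_* names are made up, a plain int stands in
for struct kref (so this is not thread-safe), and the locking is only
hinted at in comments:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_i915_error_state. */
struct sketch_error_state {
	int refcount;	/* stand-in for struct kref */
	int eir;
};

static struct sketch_error_state *first_error;	/* protected by error_lock */
static int sketch_frees;			/* number of objects released */

/* Capture path: the initial reference is owned by first_error. */
static struct sketch_error_state *sketch_capture(int eir)
{
	struct sketch_error_state *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->refcount = 1;	/* like kref_init() */
	e->eir = eir;
	first_error = e;	/* would be done under error_lock */
	return e;
}

/* Drop one reference and free the object on the last put, like kref_put(). */
static void sketch_put(struct sketch_error_state *e)
{
	assert(e->refcount > 0);
	if (--e->refcount == 0) {
		free(e);
		sketch_frees++;
	}
}

/* Reader: only the pointer load and the "get" sit inside the locked section. */
static struct sketch_error_state *sketch_get_first_error(void)
{
	struct sketch_error_state *e;

	/* spin_lock_irqsave() would go here */
	e = first_error;
	if (e)
		e->refcount++;	/* like kref_get() */
	/* spin_unlock_irqrestore() would go here */
	return e;
}

/* Destroy path: unpublish under the lock, drop the reference outside. */
static void sketch_destroy(void)
{
	struct sketch_error_state *e;

	/* spin_lock_irqsave() would go here */
	e = first_error;
	first_error = NULL;
	/* spin_unlock_irqrestore() would go here */
	if (e)
		sketch_put(e);
}
```

A reader that got its reference before sketch_destroy() ran can keep
printing the state with interrupts enabled; the memory only goes away
on the reader's own put.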

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   13 ++++++++-----
 drivers/gpu/drm/i915/i915_drv.h     |    4 ++++
 drivers/gpu/drm/i915/i915_irq.c     |   17 ++++++++++-------
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 67d7567..b8a1770 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -770,12 +770,16 @@ static int i915_error_state(struct seq_file *m, void *unused)
 	int i, page, offset, elt;
 
 	spin_lock_irqsave(&dev_priv->error_lock, flags);
-	if (!dev_priv->first_error) {
+	error = dev_priv->first_error;
+	if (error)
+		kref_get(&error->ref);
+	spin_unlock_irqrestore(&dev_priv->error_lock, flags);
+
+	if (!error) {
 		seq_printf(m, "no error state collected\n");
-		goto out;
+		return 0;
 	}
 
-	error = dev_priv->first_error;
 
 	seq_printf(m, "Time: %ld s %ld us\n", error->time.tv_sec,
 		   error->time.tv_usec);
@@ -846,8 +850,7 @@ static int i915_error_state(struct seq_file *m, void *unused)
 	if (error->display)
 		intel_display_print_error_state(m, dev, error->display);
 
-out:
-	spin_unlock_irqrestore(&dev_priv->error_lock, flags);
+	kref_put(&error->ref, i915_error_state_free);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 621349e..b46fac5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -37,6 +37,7 @@
 #include <linux/i2c.h>
 #include <drm/intel-gtt.h>
 #include <linux/backlight.h>
+#include <linux/kref.h>
 
 /* General customization:
  */
@@ -150,6 +151,7 @@ struct sdvo_device_mapping {
 struct intel_display_error_state;
 
 struct drm_i915_error_state {
+	struct kref ref;
 	u32 eir;
 	u32 pgtbl_er;
 	u32 pipestat[I915_MAX_PIPES];
@@ -396,6 +398,7 @@ typedef struct drm_i915_private {
 	unsigned int fsb_freq, mem_freq, is_ddr3;
 
 	spinlock_t error_lock;
+	/* Protected by dev_priv->error_lock. */
 	struct drm_i915_error_state *first_error;
 	struct work_struct error_work;
 	struct completion error_completion;
@@ -1059,6 +1062,7 @@ extern int i915_vblank_pipe_get(struct drm_device *dev, void *data,
 				struct drm_file *file_priv);
 extern int i915_vblank_swap(struct drm_device *dev, void *data,
 			    struct drm_file *file_priv);
+void i915_error_state_free(struct kref *error_ref);
 
 void
 i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index bfa4964..56f2b2f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -764,10 +764,11 @@ i915_error_object_free(struct drm_i915_error_object *obj)
 	kfree(obj);
 }
 
-static void
-i915_error_state_free(struct drm_device *dev,
-		      struct drm_i915_error_state *error)
+void
+i915_error_state_free(struct kref *error_ref)
 {
+	struct drm_i915_error_state *error = container_of(error_ref,
+							  typeof(*error), ref);
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(error->batchbuffer); i++)
@@ -941,6 +942,7 @@ static void i915_capture_error_state(struct drm_device *dev)
 	DRM_INFO("capturing error event; look for more information in /debug/dri/%d/i915_error_state\n",
 		 dev->primary->index);
 
+	kref_init(&error->ref);
 	error->eir = I915_READ(EIR);
 	error->pgtbl_er = I915_READ(PGTBL_ER);
 	for_each_pipe(pipe)
@@ -1017,21 +1019,22 @@ static void i915_capture_error_state(struct drm_device *dev)
 	spin_unlock_irqrestore(&dev_priv->error_lock, flags);
 
 	if (error)
-		i915_error_state_free(dev, error);
+		i915_error_state_free(&error->ref);
 }
 
 void i915_destroy_error_state(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_error_state *error;
+	unsigned long flags;
 
-	spin_lock(&dev_priv->error_lock);
+	spin_lock_irqsave(&dev_priv->error_lock, flags);
 	error = dev_priv->first_error;
 	dev_priv->first_error = NULL;
-	spin_unlock(&dev_priv->error_lock);
+	spin_unlock_irqrestore(&dev_priv->error_lock, flags);
 
 	if (error)
-		i915_error_state_free(dev, error);
+		kref_put(&error->ref, i915_error_state_free);
 }
 #else
 #define i915_capture_error_state(x)
-- 
1.7.7.3


* [PATCH 17/43] drm/i915: destroy existing error_state when simulating a gpu hang
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (14 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 16/43] drm/i915: rework dev->first_error locking Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 18:45   ` Eugeni Dodonov
  2011-12-14 12:57 ` [PATCH 18/43] drm/i915: fix swizzle detection for gen3 Daniel Vetter
                   ` (25 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

This way we can simulate a bunch of gpu hangs and run the error_state
capture code every time (without the need to reload the module).

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b8a1770..4a30756 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1444,6 +1444,8 @@ i915_ring_stop_write(struct file *filp,
 	DRM_DEBUG_DRIVER("Stopping rings 0x%08x\n", val);
 
 	mutex_lock(&dev->struct_mutex);
+	i915_destroy_error_state(dev);
+
 	dev_priv->stop_rings = val;
 	mutex_unlock(&dev->struct_mutex);
 
-- 
1.7.7.3


* [PATCH 18/43] drm/i915: fix swizzle detection for gen3
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (15 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 17/43] drm/i915: destroy existing error_state when simulating a gpu hang Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2012-01-29 17:36   ` Chris Wilson
  2011-12-14 12:57 ` [PATCH 19/43] drm/i915: add debugfs file for swizzling information Daniel Vetter
                   ` (24 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx, stable

It looks like the desktop variants of i915 and i945 also have the DCC
register to control dram channel interleave and cpu side bit6
swizzling.

Unfortunately internal Cspec/ConfigDB documentation for these ancient chips
has already been dropped and there seem to be no archives. Also
somebody thought the swizzling behaviour is surely a worthy secret to
keep and redacted any mention of these fields from the published Intel
datasheets.

I suspect the hw engineers were really proud of the page coloring
they've achieved in their first dual channel dram controller with
bit17 - after all Bspec explains at great length the optimal layout of
page frame numbers modulo 4 for the color and depth buffers, too.
Later on when they started to work on VT-d they shamefully
discovered their stupidity and tried to cover the tracks ...

Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch> (i915g)
Tested-by: Pavel Ondračka <pavel.ondracka@email.cz> (i945g)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42625
Cc: stable@kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem_tiling.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 31d334d..861223b 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -107,10 +107,10 @@ i915_gem_detect_bit_6_swizzle(struct drm_device *dev)
 		 */
 		swizzle_x = I915_BIT_6_SWIZZLE_NONE;
 		swizzle_y = I915_BIT_6_SWIZZLE_NONE;
-	} else if (IS_MOBILE(dev)) {
+	} else if (IS_MOBILE(dev) || (IS_GEN3(dev) && !IS_G33(dev))) {
 		uint32_t dcc;
 
-		/* On mobile 9xx chipsets, channel interleave by the CPU is
+		/* On 9xx chipsets, channel interleave by the CPU is
 		 * determined by DCC.  For single-channel, neither the CPU
 		 * nor the GPU do swizzling.  For dual channel interleaved,
 		 * the GPU's interleave is bit 9 and 10 for X tiled, and bit
-- 
1.7.7.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 19/43] drm/i915: add debugfs file for swizzling information
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (16 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 18/43] drm/i915: fix swizzle detection for gen3 Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2012-01-29 17:37   ` Chris Wilson
  2011-12-14 12:57 ` [PATCH 20/43] drm/i915: swizzling support for snb/ivb Daniel Vetter
                   ` (23 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

This will also come in handy for the gen6+ swizzling support, where the
driver is supposed to control swizzling depending upon dram
configuration.

v2: CxDRB3 are 16 bit regs! Noticed by Chris Wilson.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   50 +++++++++++++++++++++++++++++++++++
 1 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 4a30756..3eab427 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1335,6 +1335,55 @@ static int i915_gen6_forcewake_count_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static const char *swizzle_string(unsigned swizzle)
+{
+	switch(swizzle) {
+	case I915_BIT_6_SWIZZLE_NONE:
+		return "none";
+	case I915_BIT_6_SWIZZLE_9:
+		return "bit9";
+	case I915_BIT_6_SWIZZLE_9_10:
+		return "bit9/bit10";
+	case I915_BIT_6_SWIZZLE_9_11:
+		return "bit9/bit11";
+	case I915_BIT_6_SWIZZLE_9_10_11:
+		return "bit9/bit10/bit11";
+	case I915_BIT_6_SWIZZLE_9_17:
+		return "bit9/bit17";
+	case I915_BIT_6_SWIZZLE_9_10_17:
+		return "bit9/bit10/bit17";
+	case I915_BIT_6_SWIZZLE_UNKNOWN:
+		return "unknown";
+	}
+
+	return "bug";
+}
+
+static int i915_swizzle_info(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	mutex_lock(&dev->struct_mutex);
+	seq_printf(m, "bit6 swizzle for X-tiling = %s\n",
+		   swizzle_string(dev_priv->mm.bit_6_swizzle_x));
+	seq_printf(m, "bit6 swizzle for Y-tiling = %s\n",
+		   swizzle_string(dev_priv->mm.bit_6_swizzle_y));
+
+	if (IS_GEN3(dev) || IS_GEN4(dev)) {
+		seq_printf(m, "DCC = 0x%08x\n",
+			   I915_READ(DCC));
+		seq_printf(m, "C0DRB3 = 0x%04x\n",
+			   I915_READ16(C0DRB3));
+		seq_printf(m, "C1DRB3 = 0x%04x\n",
+			   I915_READ16(C1DRB3));
+	}
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
 static int
 i915_debugfs_common_open(struct inode *inode,
 			 struct file *filp)
@@ -1736,6 +1785,7 @@ static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_gem_framebuffer", i915_gem_framebuffer_info, 0},
 	{"i915_context_status", i915_context_status, 0},
 	{"i915_gen6_forcewake_count", i915_gen6_forcewake_count_info, 0},
+	{"i915_swizzle_info", i915_swizzle_info, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
-- 
1.7.7.3


* [PATCH 20/43] drm/i915: swizzling support for snb/ivb
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (17 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 19/43] drm/i915: add debugfs file for swizzling information Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2012-01-29 18:34   ` Chris Wilson
  2012-01-31  7:44   ` Ben Widawsky
  2011-12-14 12:57 ` [PATCH 21/43] drm/i915: add gen6+ registers to i915_swizzle_info Daniel Vetter
                   ` (22 subsequent siblings)
  41 siblings, 2 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

We have to do this manually. Somebody had a Great Idea.
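
A minimal sketch of the detection rule (illustration only, not part of
this patch): swizzling is only enabled when both channels report the
same DIMM sizes. The SKETCH_* definitions mirror the layout of the
MAD_DIMM size fields (8 bits each, in multiples of 256 MB); comparing
both the DIMM A and the DIMM B field is an assumption about the intent:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size fields of a MAD_DIMM_Cx register, in multiples of 256 MB. */
#define SKETCH_DIMM_A_SIZE_SHIFT	0
#define SKETCH_DIMM_B_SIZE_SHIFT	8
#define SKETCH_DIMM_SIZE_FIELD		0xffu

/* Extract one size field and convert it to megabytes. */
static uint32_t sketch_dimm_size_mb(uint32_t reg, unsigned int shift)
{
	assert(shift == SKETCH_DIMM_A_SIZE_SHIFT ||
	       shift == SKETCH_DIMM_B_SIZE_SHIFT);
	return ((reg >> shift) & SKETCH_DIMM_SIZE_FIELD) * 256;
}

/* True if both channels report identical DIMM A and DIMM B sizes. */
static bool sketch_want_swizzling(uint32_t dimm_c0, uint32_t dimm_c1)
{
	const uint32_t mask =
		(SKETCH_DIMM_SIZE_FIELD << SKETCH_DIMM_A_SIZE_SHIFT) |
		(SKETCH_DIMM_SIZE_FIELD << SKETCH_DIMM_B_SIZE_SHIFT);

	/* Bits outside the size fields (ECC, rank, width) are ignored. */
	return (dimm_c0 & mask) == (dimm_c1 & mask);
}
```

Note that a mask covering only the DIMM A field would treat a channel
with 2x2GB and a channel with 1x2GB as identical, so both fields need
to take part in the comparison.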

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_dma.c        |    2 +-
 drivers/gpu/drm/i915/i915_drv.c        |    4 ++-
 drivers/gpu/drm/i915/i915_drv.h        |    3 +-
 drivers/gpu/drm/i915/i915_gem.c        |   23 ++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_tiling.c |   16 +++++++++++++-
 drivers/gpu/drm/i915/i915_reg.h        |   33 ++++++++++++++++++++++++++++++++
 6 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 448d5b1..4c21c67 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1202,7 +1202,7 @@ static int i915_load_gem_init(struct drm_device *dev)
 	i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
 
 	mutex_lock(&dev->struct_mutex);
-	ret = i915_gem_init_ringbuffer(dev);
+	ret = i915_gem_init_hw(dev);
 	mutex_unlock(&dev->struct_mutex);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6dd219b..f12b43e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -494,7 +494,7 @@ static int i915_drm_thaw(struct drm_device *dev)
 		mutex_lock(&dev->struct_mutex);
 		dev_priv->mm.suspended = 0;
 
-		error = i915_gem_init_ringbuffer(dev);
+		error = i915_gem_init_hw(dev);
 		mutex_unlock(&dev->struct_mutex);
 
 		if (HAS_PCH_SPLIT(dev))
@@ -690,6 +690,8 @@ int i915_reset(struct drm_device *dev, u8 flags)
 			!dev_priv->mm.suspended) {
 		dev_priv->mm.suspended = 0;
 
+		i915_gem_init_swizzling(dev);
+
 		dev_priv->ring[RCS].init(&dev_priv->ring[RCS]);
 		if (HAS_BSD(dev))
 		    dev_priv->ring[VCS].init(&dev_priv->ring[VCS]);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b46fac5..311a4e1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1206,7 +1206,8 @@ int __must_check i915_gem_object_set_domain(struct drm_i915_gem_object *obj,
 					    uint32_t read_domains,
 					    uint32_t write_domain);
 int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
-int __must_check i915_gem_init_ringbuffer(struct drm_device *dev);
+int __must_check i915_gem_init_hw(struct drm_device *dev);
+void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 void i915_gem_do_init(struct drm_device *dev,
 		      unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e995248..39459d2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3744,12 +3744,31 @@ i915_gem_idle(struct drm_device *dev)
 	return 0;
 }
 
+void i915_gem_init_swizzling(struct drm_device *dev)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+
+	if (INTEL_INFO(dev)->gen < 6 ||
+	    dev_priv->mm.bit_6_swizzle_x == I915_BIT_6_SWIZZLE_NONE)
+		return;
+
+	I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_SWZCTL);
+	if (IS_GEN6(dev))
+		I915_WRITE(ARB_MODE, ARB_MODE_ENABLE(ARB_MODE_SWIZZLE_SNB));
+	else
+		I915_WRITE(ARB_MODE, ARB_MODE_ENABLE(ARB_MODE_SWIZZLE_IVB));
+	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
+				 DISP_TILE_SURFACE_SWIZZLING);
+
+}
 int
-i915_gem_init_ringbuffer(struct drm_device *dev)
+i915_gem_init_hw(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	int ret;
 
+	i915_gem_init_swizzling(dev);
+
 	ret = intel_init_render_ring_buffer(dev);
 	if (ret)
 		return ret;
@@ -3805,7 +3824,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
 	mutex_lock(&dev->struct_mutex);
 	dev_priv->mm.suspended = 0;
 
-	ret = i915_gem_init_ringbuffer(dev);
+	ret = i915_gem_init_hw(dev);
 	if (ret != 0) {
 		mutex_unlock(&dev->struct_mutex);
 		return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 861223b..af0a2fc 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -93,8 +93,20 @@ i915_gem_detect_bit_6_swizzle(struct drm_device *dev)
 	uint32_t swizzle_y = I915_BIT_6_SWIZZLE_UNKNOWN;
 
 	if (INTEL_INFO(dev)->gen >= 6) {
-		swizzle_x = I915_BIT_6_SWIZZLE_NONE;
-		swizzle_y = I915_BIT_6_SWIZZLE_NONE;
+		uint32_t dimm_c0, dimm_c1;
+		dimm_c0 = I915_READ(MAD_DIMM_C0);
+		dimm_c1 = I915_READ(MAD_DIMM_C1);
+		dimm_c0 &= MAD_DIMM_A_SIZE_MASK | MAD_DIMM_B_SIZE_MASK;
+		dimm_c1 &= MAD_DIMM_A_SIZE_MASK | MAD_DIMM_B_SIZE_MASK;
+		/* Enable swizzling when the channels are populated with
+		 * identically sized dimms. */
+		if (dimm_c0 == dimm_c1) {
+			swizzle_x = I915_BIT_6_SWIZZLE_9_10;
+			swizzle_y = I915_BIT_6_SWIZZLE_9;
+		} else {
+			swizzle_x = I915_BIT_6_SWIZZLE_NONE;
+			swizzle_y = I915_BIT_6_SWIZZLE_NONE;
+		}
 	} else if (IS_GEN5(dev)) {
 		/* On Ironlake whatever DRAM config, GPU always do
 		 * same swizzling setup.
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 8a9f113..e810723 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -295,6 +295,12 @@
 #define FENCE_REG_SANDYBRIDGE_0		0x100000
 #define   SANDYBRIDGE_FENCE_PITCH_SHIFT	32
 
+/* control register for cpu gtt access */
+#define TILECTL				0x101000
+#define   TILECTL_SWZCTL			(1 << 0)
+#define   TILECTL_TLB_PREFETCH_DIS	(1 << 2)
+#define   TILECTL_BACKSNOOP_DIS		(1 << 3)
+
 /*
  * Instruction and interrupt control regs
  */
@@ -318,6 +324,11 @@
 #define RING_MAX_IDLE(base)	((base)+0x54)
 #define RING_HWS_PGA(base)	((base)+0x80)
 #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
+#define ARB_MODE		0x04030
+#define   ARB_MODE_SWIZZLE_SNB	(1<<4)
+#define   ARB_MODE_SWIZZLE_IVB	(1<<5)
+#define   ARB_MODE_ENABLE(x)	GFX_MODE_ENABLE(x)
+#define   ARB_MODE_DISABLE(x)	GFX_MODE_DISABLE(x)
 #define RENDER_HWS_PGA_GEN7	(0x04080)
 #define BSD_HWS_PGA_GEN7	(0x04180)
 #define BLT_HWS_PGA_GEN7	(0x04280)
@@ -1034,6 +1045,28 @@
 #define C0DRB3			0x10206
 #define C1DRB3			0x10606
 
+/** snb MCH registers for reading the DRAM channel configuration */
+#define MAD_DIMM_C0			(MCHBAR_MIRROR_BASE_SNB + 0x5004)
+#define   MAD_DIMM_C1			(MCHBAR_MIRROR_BASE_SNB + 0x5008)
+#define   MAD_DIMM_C2			(MCHBAR_MIRROR_BASE_SNB + 0x500C)
+#define   MAD_DIMM_ECC_MASK		(0x3 << 24)
+#define   MAD_DIMM_ECC_OFF		(0x0 << 24)
+#define   MAD_DIMM_ECC_IO_ON_LOGIC_OFF	(0x1 << 24)
+#define   MAD_DIMM_ECC_IO_OFF_LOGIC_ON	(0x2 << 24)
+#define   MAD_DIMM_ECC_ON		(0x3 << 24)
+#define   MAD_DIMM_ENH_INTERLEAVE	(0x1 << 22)
+#define   MAD_DIMM_RANK_INTERLEAVE	(0x1 << 21)
+#define   MAD_DIMM_B_WIDTH_X16		(0x1 << 20) /* X8 chips if unset */
+#define   MAD_DIMM_A_WIDTH_X16		(0x1 << 19) /* X8 chips if unset */
+#define   MAD_DIMM_B_DUAL_RANK		(0x1 << 18)
+#define   MAD_DIMM_A_DUAL_RANK		(0x1 << 17)
+#define   MAD_DIMM_A_SELECT		(0x1 << 16)
+#define   MAD_DIMM_B_SIZE_MASK		(0xff << 8) /* in multiples of 256mb */
+#define   MAD_DIMM_B_SIZE_SHIFT		8
+#define   MAD_DIMM_A_SIZE_MASK		(0xff << 0) /* in multiples of 256mb */
+#define   MAD_DIMM_A_SIZE_SHIFT		0
+
+
 /* Clocking configuration register */
 #define CLKCFG			0x10c00
 #define CLKCFG_FSB_400					(5 << 0)	/* hrawclk 100 */
-- 
1.7.7.3


* [PATCH 21/43] drm/i915: add gen6+ registers to i915_swizzle_info
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (18 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 20/43] drm/i915: swizzling support for snb/ivb Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 22/43] drm/i915: prevent division by zero when asking for chipset power Daniel Vetter
                   ` (21 subsequent siblings)
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 3eab427..5ab7784 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1378,6 +1378,19 @@ static int i915_swizzle_info(struct seq_file *m, void *data)
 			   I915_READ16(C0DRB3));
 		seq_printf(m, "C1DRB3 = 0x%04x\n",
 			   I915_READ16(C1DRB3));
+	} else if (IS_GEN6(dev) || IS_GEN7(dev)) {
+		seq_printf(m, "MAD_DIMM_C0 = 0x%08x\n",
+			   I915_READ(MAD_DIMM_C0));
+		seq_printf(m, "MAD_DIMM_C1 = 0x%08x\n",
+			   I915_READ(MAD_DIMM_C1));
+		seq_printf(m, "MAD_DIMM_C2 = 0x%08x\n",
+			   I915_READ(MAD_DIMM_C2));
+		seq_printf(m, "TILECTL = 0x%08x\n",
+			   I915_READ(TILECTL));
+		seq_printf(m, "ARB_MODE = 0x%08x\n",
+			   I915_READ(ARB_MODE));
+		seq_printf(m, "DISP_ARB_CTL = 0x%08x\n",
+			   I915_READ(DISP_ARB_CTL));
 	}
 	mutex_unlock(&dev->struct_mutex);
 
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 22/43] drm/i915: prevent division by zero when asking for chipset power
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (19 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 21/43] drm/i915: add gen6+ registers to i915_swizzle_info Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 19:05   ` Kenneth Graunke
  2011-12-14 12:57 ` [PATCH 23/43] drm/i915: multithreaded forcewake is an ivb+ feature Daniel Vetter
                   ` (20 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx, stable, Eugeni Dodonov

From: Eugeni Dodonov <eugeni.dodonov@intel.com>

This prevents an in-kernel division by zero which happens when we are
asking for i915_chipset_val too quickly, or within a race condition
between the power monitoring thread and userspace accesses via debugfs.

The issue can be reproduced easily via the following command:
while ``; do cat /sys/kernel/debug/dri/0/i915_emon_status; done

This is particularly dangerous because it can be triggered by
a non-privileged user by just reading the debugfs entry.

This issue was also found independently by Konstantin Belousov
<kostikbel@gmail.com>, who proposed a similar patch.

Reported-by: Konstantin Belousov <kostikbel@gmail.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Keith Packard <keithp@keithp.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_dma.c |   10 ++++++++++
 drivers/gpu/drm/i915/i915_drv.h |    1 +
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 4c21c67..fd617fb 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1454,6 +1454,14 @@ unsigned long i915_chipset_val(struct drm_i915_private *dev_priv)
 
 	diff1 = now - dev_priv->last_time1;
 
+	/* Prevent division-by-zero if we are asking too fast.
+	 * Also, we don't get interesting results if we are polling
+	 * faster than once in 10ms, so just return the saved value
+	 * in such cases.
+	 */
+	if (diff1 <= 10)
+		return dev_priv->chipset_power;
+
 	count1 = I915_READ(DMIEC);
 	count2 = I915_READ(DDREC);
 	count3 = I915_READ(CSIEC);
@@ -1484,6 +1492,8 @@ unsigned long i915_chipset_val(struct drm_i915_private *dev_priv)
 	dev_priv->last_count1 = total_count;
 	dev_priv->last_time1 = now;
 
+	dev_priv->chipset_power = ret;
+
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 311a4e1..6f1830f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -724,6 +724,7 @@ typedef struct drm_i915_private {
 
 	u64 last_count1;
 	unsigned long last_time1;
+	unsigned long chipset_power;
 	u64 last_count2;
 	struct timespec last_time2;
 	unsigned long gfx_power;
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread
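
The guard added by the patch above can be modelled in userspace roughly as follows. This is only a sketch: the struct, the function name, and the simplified rate computation are assumptions for illustration, whereas the real i915_chipset_val() reads hardware counters and applies a lookup table. What it shows is the pattern of returning a cached result whenever the elapsed time is too short to divide by safely.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fields the patch touches in dev_priv. */
struct ex_power_state {
	unsigned long last_time_ms;	/* time of the last real sample */
	uint64_t last_count;		/* counter total at that time */
	unsigned long cached;		/* last computed result */
};

/* Return counts per millisecond, reusing the cached value when polled
 * again within 10 ms (which also covers an elapsed time of zero). */
static unsigned long ex_chipset_val(struct ex_power_state *st,
				    unsigned long now_ms,
				    uint64_t total_count)
{
	unsigned long dt = now_ms - st->last_time_ms;

	if (dt <= 10)
		return st->cached;

	st->cached = (unsigned long)((total_count - st->last_count) / dt);
	st->last_count = total_count;
	st->last_time_ms = now_ms;
	return st->cached;
}
```

Note that the early return leaves the saved timestamp and counter untouched, so the next sufficiently late call still measures against the last real sample.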

* [PATCH 23/43] drm/i915: multithreaded forcewake is an ivb+ feature
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (20 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 22/43] drm/i915: prevent division by zero when asking for chipset power Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 21:07   ` Eric Anholt
  2011-12-14 12:57 ` [PATCH 24/43] drm/i915: capture error_state also for stuck rings Daniel Vetter
                   ` (19 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

Name the function accordingly. Suggested by Chris Wilson.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c      |    4 ++--
 drivers/gpu/drm/i915/i915_drv.h      |    4 ++--
 drivers/gpu/drm/i915/intel_display.c |    8 ++++----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f12b43e..6e0f868 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -344,7 +344,7 @@ void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 		udelay(10);
 }
 
-void __gen6_gt_force_wake_mt_get(struct drm_i915_private *dev_priv)
+void __gen7_gt_force_wake_mt_get(struct drm_i915_private *dev_priv)
 {
 	int count;
 
@@ -382,7 +382,7 @@ void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 	POSTING_READ(FORCEWAKE);
 }
 
-void __gen6_gt_force_wake_mt_put(struct drm_i915_private *dev_priv)
+void __gen7_gt_force_wake_mt_put(struct drm_i915_private *dev_priv)
 {
 	I915_WRITE_NOTRACE(FORCEWAKE_MT, (1<<16) | 0);
 	POSTING_READ(FORCEWAKE_MT);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6f1830f..bdbd6d8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1342,9 +1342,9 @@ extern void intel_detect_pch(struct drm_device *dev);
 extern int intel_trans_dp_port_sel(struct drm_crtc *crtc);
 
 extern void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
-extern void __gen6_gt_force_wake_mt_get(struct drm_i915_private *dev_priv);
+extern void __gen7_gt_force_wake_mt_get(struct drm_i915_private *dev_priv);
 extern void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
-extern void __gen6_gt_force_wake_mt_put(struct drm_i915_private *dev_priv);
+extern void __gen7_gt_force_wake_mt_put(struct drm_i915_private *dev_priv);
 
 /* overlay */
 #ifdef CONFIG_DEBUG_FS
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 571bf4e..6ac2e73 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8507,17 +8507,17 @@ static void intel_init_display(struct drm_device *dev)
 			u32	ecobus;
 
 			mutex_lock(&dev->struct_mutex);
-			__gen6_gt_force_wake_mt_get(dev_priv);
+			__gen7_gt_force_wake_mt_get(dev_priv);
 			ecobus = I915_READ_NOTRACE(ECOBUS);
-			__gen6_gt_force_wake_mt_put(dev_priv);
+			__gen7_gt_force_wake_mt_put(dev_priv);
 			mutex_unlock(&dev->struct_mutex);
 
 			if (ecobus & FORCEWAKE_MT_ENABLE) {
 				DRM_DEBUG_KMS("Using MT version of forcewake\n");
 				dev_priv->core.force_wake_get =
-					__gen6_gt_force_wake_mt_get;
+					__gen7_gt_force_wake_mt_get;
 				dev_priv->core.force_wake_put =
-					__gen6_gt_force_wake_mt_put;
+					__gen7_gt_force_wake_mt_put;
 			}
 		}
 
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 24/43] drm/i915: capture error_state also for stuck rings
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (21 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 23/43] drm/i915: multithreaded forcewake is an ivb+ feature Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2012-01-29 17:36   ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 25/43] drm/i915: properly flush the wc buffer in pwrites to phys objects Daniel Vetter
                   ` (18 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

For quite a while now we have also captured the basic output configuration
in the error_state, so it should contain enough information to diagnose
these MI_WAIT hangs.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 56f2b2f..0dac9af 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1708,6 +1708,7 @@ void i915_hangcheck_elapsed(unsigned long data)
 	    dev_priv->last_instdone1 == instdone1) {
 		if (dev_priv->hangcheck_count++ > 1) {
 			DRM_ERROR("Hangcheck timer elapsed... GPU hung\n");
+			i915_handle_error(dev, true);
 
 			if (!IS_GEN2(dev)) {
 				/* Is the chip hanging on a WAIT_FOR_EVENT?
@@ -1715,7 +1716,6 @@ void i915_hangcheck_elapsed(unsigned long data)
 				 * and break the hang. This should work on
 				 * all but the second generation chipsets.
 				 */
-
 				if (kick_ring(&dev_priv->ring[RCS]))
 					goto repeat;
 
@@ -1728,7 +1728,6 @@ void i915_hangcheck_elapsed(unsigned long data)
 					goto repeat;
 			}
 
-			i915_handle_error(dev, true);
 			return;
 		}
 	} else {
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 25/43] drm/i915: properly flush the wc buffer in pwrites to phys objects
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (22 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 24/43] drm/i915: capture error_state also for stuck rings Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 15:23   ` Chris Wilson
  2011-12-14 12:57 ` [PATCH 26/43] drm/i915: Only clear the GPU domains upon a successful finish Daniel Vetter
                   ` (17 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx, stable

Usually results in (rare) cursor corruptions on platforms
requiring physically addressed cursors.

Cc: stable@kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35460
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=21442
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 39459d2..d560175 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4123,6 +4123,7 @@ i915_gem_phys_pwrite(struct drm_device *dev,
 			return -EFAULT;
 	}
 
+	wmb();
 	intel_gtt_chipset_flush();
 	return 0;
 }
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 26/43] drm/i915: Only clear the GPU domains upon a successful finish
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (23 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 25/43] drm/i915: properly flush the wc buffer in pwrites to phys objects Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-16 20:07   ` Eric Anholt
  2012-03-01 20:40   ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 27/43] drm/i915: flush overlay regfile writes Daniel Vetter
                   ` (16 subsequent siblings)
  41 siblings, 2 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: intel-gfx, stable

From: Chris Wilson <chris@chris-wilson.co.uk>

By clearing the GPU read domains before waiting upon the buffer, we run
the risk of the wait being interrupted and the domains prematurely
cleared. The next time we attempt to wait upon the buffer (after
userspace handles the signal), we believe that the buffer is idle and so
skip the wait.

There are a number of bugs across all generations which show signs of an
overly hasty reuse of active buffers.

Such as:

  https://bugs.freedesktop.org/show_bug.cgi?id=29046
  https://bugs.freedesktop.org/show_bug.cgi?id=35863
  https://bugs.freedesktop.org/show_bug.cgi?id=38952
  https://bugs.freedesktop.org/show_bug.cgi?id=40282
  https://bugs.freedesktop.org/show_bug.cgi?id=41098
  https://bugs.freedesktop.org/show_bug.cgi?id=41102
  https://bugs.freedesktop.org/show_bug.cgi?id=41284
  https://bugs.freedesktop.org/show_bug.cgi?id=42141

A couple of those pre-date i915_gem_object_finish_gpu(), so may be
unrelated (such as a wild write from a userspace command buffer), but
this does look like a convincing cause for most of those bugs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d560175..036bc58 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3087,10 +3087,13 @@ i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj)
 			return ret;
 	}
 
+	ret = i915_gem_object_wait_rendering(obj);
+	if (ret)
+		return ret;
+
 	/* Ensure that we invalidate the GPU's caches and TLBs. */
 	obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
-
-	return i915_gem_object_wait_rendering(obj);
+	return 0;
 }
 
 /**
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread
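
The ordering problem fixed by the patch above can be sketched with a small userspace model. Everything here is hypothetical (the struct, the EX_ constants, and the use of -EINTR as a stand-in for an interrupted kernel wait); it only illustrates why the domain bits should be cleared after the wait has succeeded rather than before.

```c
#include <errno.h>
#include <stdint.h>

#define EX_DOMAIN_CPU	0x1u	/* illustrative bit values only */
#define EX_DOMAIN_GPU	0x2u

/* Hypothetical model of a buffer object. */
struct ex_gem_obj {
	uint32_t read_domains;
	int busy;		/* GPU is still using the buffer */
	int pending_signal;	/* a signal will interrupt the next wait */
};

/* Model of an interruptible wait: fails once if a signal is pending. */
static int ex_wait_rendering(struct ex_gem_obj *obj)
{
	if (!obj->busy)
		return 0;
	if (obj->pending_signal) {
		obj->pending_signal = 0;
		return -EINTR;
	}
	obj->busy = 0;
	return 0;
}

/* Clear the GPU read domain only once the wait has really finished, so
 * an interrupted call leaves the object marked as GPU-active. */
static int ex_finish_gpu(struct ex_gem_obj *obj)
{
	int ret = ex_wait_rendering(obj);

	if (ret)
		return ret;

	obj->read_domains &= ~EX_DOMAIN_GPU;
	return 0;
}
```

With the old ordering, the interrupted first call would already have cleared the GPU bit, and a retry that trusts read_domains would skip the wait while the buffer is still busy.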

* [PATCH 27/43] drm/i915: flush overlay regfile writes
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (24 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 26/43] drm/i915: Only clear the GPU domains upon a successful finish Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 15:24   ` Chris Wilson
  2011-12-14 12:57 ` [PATCH 28/43] drm/i915: Handle unmappable buffers during error state capture Daniel Vetter
                   ` (15 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

Better be paranoid. The wmb should flush the wc writes, and
the chipset_flush hopefully flushes any mch buffers. There've been a
few overlay hangs I've never really diagnosed, unfortunately all the
reporters disappeared.

Maybe-related: https://bugs.freedesktop.org/show_bug.cgi?id=33309
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_overlay.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index cdf17d4..f75e892 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -209,6 +209,9 @@ static void intel_overlay_unmap_regs(struct intel_overlay *overlay,
 {
 	if (!OVERLAY_NEEDS_PHYSICAL(overlay->dev))
 		io_mapping_unmap(regs);
+
+	wmb();
+	intel_gtt_chipset_flush();
 }
 
 static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 28/43] drm/i915: Handle unmappable buffers during error state capture
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (25 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 27/43] drm/i915: flush overlay regfile writes Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 18:46   ` Eugeni Dodonov
  2011-12-14 12:57 ` [PATCH 29/43] drm/i915: remove the i915_batchbuffer_info debugfs file Daniel Vetter
                   ` (14 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

As the buffer is not necessarily accessible through the GTT at the time
of a GPU hang, and capturing some of its contents is far more valuable
than skipping it, provide a clflushed fallback read path. We still
prefer to read through the GTT as that is more consistent with the GPU
access of the same buffer. For example, it will demonstrate any erroneous
tiling or swizzling of the command buffer as seen by the GPU.

This becomes necessary with use of CPU relocations and lazy GTT binding,
but could potentially happen anyway as a result of a pathological error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_irq.c |   28 +++++++++++++++++++++++-----
 1 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 0dac9af..7d06bc5 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -720,7 +720,6 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 	reloc_offset = src->gtt_offset;
 	for (page = 0; page < page_count; page++) {
 		unsigned long flags;
-		void __iomem *s;
 		void *d;
 
 		d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
@@ -728,10 +727,29 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 			goto unwind;
 
 		local_irq_save(flags);
-		s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
-					     reloc_offset);
-		memcpy_fromio(d, s, PAGE_SIZE);
-		io_mapping_unmap_atomic(s);
+		if (reloc_offset < dev_priv->mm.gtt_mappable_end) {
+			void __iomem *s;
+
+			/* Simply ignore tiling or any overlapping fence.
+			 * It's part of the error state, and this hopefully
+			 * captures what the GPU read.
+			 */
+
+			s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
+						     reloc_offset);
+			memcpy_fromio(d, s, PAGE_SIZE);
+			io_mapping_unmap_atomic(s);
+		} else {
+			void *s;
+
+			drm_clflush_pages(&src->pages[page], 1);
+
+			s = kmap_atomic(src->pages[page]);
+			memcpy(d, s, PAGE_SIZE);
+			kunmap_atomic(s);
+
+			drm_clflush_pages(&src->pages[page], 1);
+		}
 		local_irq_restore(flags);
 
 		dst->pages[page] = d;
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 29/43] drm/i915: remove the i915_batchbuffer_info debugfs file
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (26 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 28/43] drm/i915: Handle unmappable buffers during error state capture Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2012-01-29 17:35   ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 30/43] drm/i915: reject GTT domain in relocations Daniel Vetter
                   ` (13 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

With the error_state facility in place, this has outlived its
usefulness. It also oopses with the latest llc-reloc patches because
it directly accesses objects through the gtt without any checks.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   40 -----------------------------------
 1 files changed, 0 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 5ab7784..aa305a6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -562,45 +562,6 @@ static int i915_hws_info(struct seq_file *m, void *data)
 	return 0;
 }
 
-static void i915_dump_object(struct seq_file *m,
-			     struct io_mapping *mapping,
-			     struct drm_i915_gem_object *obj)
-{
-	int page, page_count, i;
-
-	page_count = obj->base.size / PAGE_SIZE;
-	for (page = 0; page < page_count; page++) {
-		u32 *mem = io_mapping_map_wc(mapping,
-					     obj->gtt_offset + page * PAGE_SIZE);
-		for (i = 0; i < PAGE_SIZE; i += 4)
-			seq_printf(m, "%08x :  %08x\n", i, mem[i / 4]);
-		io_mapping_unmap(mem);
-	}
-}
-
-static int i915_batchbuffer_info(struct seq_file *m, void *data)
-{
-	struct drm_info_node *node = (struct drm_info_node *) m->private;
-	struct drm_device *dev = node->minor->dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *obj;
-	int ret;
-
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		return ret;
-
-	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
-		if (obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) {
-		    seq_printf(m, "--- gtt_offset = 0x%08x\n", obj->gtt_offset);
-		    i915_dump_object(m, dev_priv->mm.gtt_mapping, obj);
-		}
-	}
-
-	mutex_unlock(&dev->struct_mutex);
-	return 0;
-}
-
 static int i915_ringbuffer_data(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
@@ -1782,7 +1743,6 @@ static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_bsd_ringbuffer_info", i915_ringbuffer_info, 0, (void *)VCS},
 	{"i915_blt_ringbuffer_data", i915_ringbuffer_data, 0, (void *)BCS},
 	{"i915_blt_ringbuffer_info", i915_ringbuffer_info, 0, (void *)BCS},
-	{"i915_batchbuffers", i915_batchbuffer_info, 0},
 	{"i915_error_state", i915_error_state, 0},
 	{"i915_rstdby_delays", i915_rstdby_delays, 0},
 	{"i915_cur_delayinfo", i915_cur_delayinfo, 0},
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 30/43] drm/i915: reject GTT domain in relocations
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (27 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 29/43] drm/i915: remove the i915_batchbuffer_info debugfs file Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2012-01-29 17:38   ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 31/43] drm/i915: Use kcalloc instead of kzalloc to allocate array Daniel Vetter
                   ` (12 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

This confuses our domain tracking and can (for gtt write domains) lead
to a subsequent oops.

Tested by tests/gem_exec_bad_domains from i-g-t.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c918124..30b7862 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -302,8 +302,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 			  reloc->write_domain);
 		return ret;
 	}
-	if (unlikely((reloc->write_domain | reloc->read_domains) & I915_GEM_DOMAIN_CPU)) {
-		DRM_ERROR("reloc with read/write CPU domains: "
+	if (unlikely((reloc->write_domain | reloc->read_domains)
+		     & ~I915_GEM_GPU_DOMAINS)) {
+		DRM_ERROR("reloc with read/write non-GPU domains: "
 			  "obj %p target %d offset %d "
 			  "read %08x write %08x",
 			  obj, reloc->target_handle,
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread
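
The tightened check in the patch above amounts to a single mask test. The sketch below reproduces it in userspace. The EX_ constants mirror the domain bits as I recall them from i915_drm.h (CPU = 0x1, GTT = 0x40, GPU domains = everything else) and should be treated as assumptions, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Assumed domain bit values; verify against i915_drm.h before relying
 * on them. */
#define EX_DOMAIN_CPU		0x00000001u
#define EX_DOMAIN_RENDER	0x00000002u
#define EX_DOMAIN_GTT		0x00000040u
#define EX_GPU_DOMAINS		(~(EX_DOMAIN_CPU | EX_DOMAIN_GTT))

/* Return 1 if a relocation names only GPU domains, 0 otherwise. The old
 * check tested only the CPU bit and so let the GTT bit through. */
static int ex_reloc_domains_valid(uint32_t read_domains,
				  uint32_t write_domain)
{
	return ((read_domains | write_domain) & ~EX_GPU_DOMAINS) == 0;
}
```

Masking with the complement of the GPU domains rejects CPU and GTT in one comparison and would also reject any further non-GPU domain added to the mask definition.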

* [PATCH 31/43] drm/i915: Use kcalloc instead of kzalloc to allocate array
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (28 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 30/43] drm/i915: reject GTT domain in relocations Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 18:48   ` Eugeni Dodonov
  2011-12-14 12:57 ` [PATCH 32/43] drm/i915: Avoid using mappable space for relocation processing through the CPU Daniel Vetter
                   ` (11 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx, Thomas Meyer

From: Thomas Meyer <thomas@m3y3r.de>

The advantage of kcalloc is that it prevents integer overflows which could
result from the multiplication of number of elements and size, and it is also
a bit nicer to read.

The semantic patch that makes this change is available
in https://lkml.org/lkml/2011/11/25/107

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_bios.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_bios.c b/drivers/gpu/drm/i915/intel_bios.c
index 63880e2..5065633 100644
--- a/drivers/gpu/drm/i915/intel_bios.c
+++ b/drivers/gpu/drm/i915/intel_bios.c
@@ -572,7 +572,7 @@ parse_device_mapping(struct drm_i915_private *dev_priv,
 		DRM_DEBUG_KMS("no child dev is parsed from VBT\n");
 		return;
 	}
-	dev_priv->child_dev = kzalloc(sizeof(*p_child) * count, GFP_KERNEL);
+	dev_priv->child_dev = kcalloc(count, sizeof(*p_child), GFP_KERNEL);
 	if (!dev_priv->child_dev) {
 		DRM_DEBUG_KMS("No memory space for child device\n");
 		return;
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread
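
To make the overflow argument in the patch above concrete, here is a userspace sketch of the kind of check an array allocator performs before multiplying. It is not the kernel's kcalloc implementation; ex_calloc_array is a hypothetical helper built on malloc and memset purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate a zeroed array of n elements of the given size, refusing any
 * request whose total byte count would overflow size_t. */
static void *ex_calloc_array(size_t n, size_t size)
{
	void *p;

	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would wrap around */

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

An open-coded kzalloc(sizeof(*p) * count, ...) has no such guard, so a large count could wrap the product and yield a buffer much smaller than the caller expects.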

* [PATCH 32/43] drm/i915: Avoid using mappable space for relocation processing through the CPU
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (29 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 31/43] drm/i915: Use kcalloc instead of kzalloc to allocate array Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 33/43] drm/i915: fall through pwrite_gtt_slow to the shmem slow path Daniel Vetter
                   ` (10 subsequent siblings)
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

We try to avoid writing the relocations through the uncached GTT, if the
buffer is currently in the CPU write domain and so will be flushed out to
main memory afterwards anyway. Also on SandyBridge we can safely write
to the pages in cacheable memory, so long as the buffer is LLC mapped.
In either of these cases, we therefore do not need to force the
reallocation of the buffer into the mappable region of the GTT, reducing
the aperture pressure.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h            |    2 +
 drivers/gpu/drm/i915/i915_gem.c            |    4 +--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   36 +++++++++++++++++++--------
 3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bdbd6d8..1cafe32 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1226,6 +1226,8 @@ int __must_check
 i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj,
 				  bool write);
 int __must_check
+i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
+int __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
 				     struct intel_ring_buffer *pipelined);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 036bc58..7ada9d2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -39,8 +39,6 @@
 static __must_check int i915_gem_object_flush_gpu_write_domain(struct drm_i915_gem_object *obj);
 static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
-static __must_check int i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj,
-							  bool write);
 static __must_check int i915_gem_object_set_cpu_read_domain_range(struct drm_i915_gem_object *obj,
 								  uint64_t offset,
 								  uint64_t size);
@@ -3102,7 +3100,7 @@ i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj)
  * This function returns when the move is complete, including waiting on
  * flushes to occur.
  */
-static int
+int
 i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 {
 	uint32_t old_write_domain, old_read_domains;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 30b7862..7e65505 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -265,6 +265,12 @@ eb_destroy(struct eb_objects *eb)
 	kfree(eb);
 }
 
+static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
+{
+	return (obj->base.write_domain == I915_GEM_DOMAIN_CPU ||
+		obj->cache_level != I915_CACHE_NONE);
+}
+
 static int
 i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 				   struct eb_objects *eb,
@@ -351,11 +357,19 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 		return ret;
 	}
 
+	/* We can't wait for rendering with pagefaults disabled */
+	if (obj->active && in_atomic())
+		return -EFAULT;
+
 	reloc->delta += target_offset;
-	if (obj->base.write_domain == I915_GEM_DOMAIN_CPU) {
+	if (use_cpu_reloc(obj)) {
 		uint32_t page_offset = reloc->offset & ~PAGE_MASK;
 		char *vaddr;
 
+		ret = i915_gem_object_set_to_cpu_domain(obj, 1);
+		if (ret)
+			return ret;
+
 		vaddr = kmap_atomic(obj->pages[reloc->offset >> PAGE_SHIFT]);
 		*(uint32_t *)(vaddr + page_offset) = reloc->delta;
 		kunmap_atomic(vaddr);
@@ -364,10 +378,6 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 		uint32_t __iomem *reloc_entry;
 		void __iomem *reloc_page;
 
-		/* We can't wait for rendering with pagefaults disabled */
-		if (obj->active && in_atomic())
-			return -EFAULT;
-
 		ret = i915_gem_object_set_to_gtt_domain(obj, 1);
 		if (ret)
 			return ret;
@@ -464,6 +474,13 @@ i915_gem_execbuffer_relocate(struct drm_device *dev,
 #define  __EXEC_OBJECT_HAS_FENCE (1<<31)
 
 static int
+need_reloc_mappable(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+	return entry->relocation_count && !use_cpu_reloc(obj);
+}
+
+static int
 pin_and_fence_object(struct drm_i915_gem_object *obj,
 		     struct intel_ring_buffer *ring)
 {
@@ -476,8 +493,7 @@ pin_and_fence_object(struct drm_i915_gem_object *obj,
 		has_fenced_gpu_access &&
 		entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 		obj->tiling_mode != I915_TILING_NONE;
-	need_mappable =
-		entry->relocation_count ? true : need_fence;
+	need_mappable = need_fence || need_reloc_mappable(obj);
 
 	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable);
 	if (ret)
@@ -533,8 +549,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			has_fenced_gpu_access &&
 			entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 			obj->tiling_mode != I915_TILING_NONE;
-		need_mappable =
-			entry->relocation_count ? true : need_fence;
+		need_mappable = need_fence || need_reloc_mappable(obj);
 
 		if (need_mappable)
 			list_move(&obj->exec_list, &ordered_objects);
@@ -574,8 +589,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 				has_fenced_gpu_access &&
 				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 				obj->tiling_mode != I915_TILING_NONE;
-			need_mappable =
-				entry->relocation_count ? true : need_fence;
+			need_mappable = need_fence || need_reloc_mappable(obj);
 
 			if ((entry->alignment && obj->gtt_offset & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
-- 
1.7.7.3


* [PATCH 33/43] drm/i915: fall through pwrite_gtt_slow to the shmem slow path
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (30 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 32/43] drm/i915: Avoid using mappable space for relocation processing through the CPU Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 34/43] drm/i915: rewrite shmem_pwrite_slow to use copy_from_user Daniel Vetter
                   ` (9 subsequent siblings)
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

The gtt_pwrite slowpath grabs the userspace memory with
get_user_pages. This will not work for non-page backed memory, like a
gtt mmapped gem object. Hence fall through to the shmem paths if we hit
-EFAULT in the gtt paths.

Now the shmem paths have exactly the same problem, but this way we
only need to rearrange the code in one write path.
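
The resulting control flow can be sketched outside the kernel as
follows. This is only a model: struct fake_obj and the fake_* helpers
are hypothetical stand-ins for the real pwrite paths and simply return
a preset errno value.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a gem object; none of these fields exist in i915. */
struct fake_obj {
	bool is_phys;       /* models obj->phys_obj */
	bool gtt_eligible;  /* models gtt_space && write_domain != CPU */
	int gtt_result;     /* what the gtt write path would return */
	int shmem_result;   /* what the shmem write path would return */
	int shmem_calls;    /* how often the shmem path was entered */
};

static int fake_phys_pwrite(struct fake_obj *obj)
{
	(void)obj;
	return 0;
}

static int fake_gtt_pwrite(struct fake_obj *obj)
{
	return obj->gtt_result;
}

static int fake_shmem_pwrite(struct fake_obj *obj)
{
	obj->shmem_calls++;
	return obj->shmem_result;
}

static int fake_pwrite(struct fake_obj *obj)
{
	int ret;

	/* phys objects never reach the shmem paths (the v2 fix). */
	if (obj->is_phys)
		return fake_phys_pwrite(obj);

	if (obj->gtt_eligible) {
		ret = fake_gtt_pwrite(obj);
		/* Only -EFAULT falls through; success and all other
		 * errors are returned as they are. */
		if (ret != -EFAULT)
			return ret;
	}

	return fake_shmem_pwrite(obj);
}
```

Any other error from the gtt path is still reported to the caller
directly, so the fallback only changes behaviour for the -EFAULT case.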

v2: v1 accidentally falls back to shmem pwrite for phys objects. Fixed.

v3: Make the codeflow around phys_pwrite clearer as suggested by Chris
Wilson.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |   33 +++++++++++++++++++++------------
 1 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7ada9d2..f74ecb8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -994,10 +994,13 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 	 * pread/pwrite currently are reading and writing from the CPU
 	 * perspective, requiring manual detiling by the client.
 	 */
-	if (obj->phys_obj)
+	if (obj->phys_obj) {
 		ret = i915_gem_phys_pwrite(dev, obj, args, file);
-	else if (obj->gtt_space &&
-		 obj->base.write_domain != I915_GEM_DOMAIN_CPU) {
+		goto out;
+	}
+
+	if (obj->gtt_space &&
+	    obj->base.write_domain != I915_GEM_DOMAIN_CPU) {
 		ret = i915_gem_object_pin(obj, 0, true);
 		if (ret)
 			goto out;
@@ -1016,18 +1019,24 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 
 out_unpin:
 		i915_gem_object_unpin(obj);
-	} else {
-		ret = i915_gem_object_set_to_cpu_domain(obj, 1);
-		if (ret)
-			goto out;
 
-		ret = -EFAULT;
-		if (!i915_gem_object_needs_bit17_swizzle(obj))
-			ret = i915_gem_shmem_pwrite_fast(dev, obj, args, file);
-		if (ret == -EFAULT)
-			ret = i915_gem_shmem_pwrite_slow(dev, obj, args, file);
+		if (ret != -EFAULT)
+			goto out;
+		/* Fall through to the shmfs paths because the gtt paths might
+		 * fail with non-page-backed user pointers (e.g. gtt mappings
+		 * when moving data between textures). */
 	}
 
+	ret = i915_gem_object_set_to_cpu_domain(obj, 1);
+	if (ret)
+		goto out;
+
+	ret = -EFAULT;
+	if (!i915_gem_object_needs_bit17_swizzle(obj))
+		ret = i915_gem_shmem_pwrite_fast(dev, obj, args, file);
+	if (ret == -EFAULT)
+		ret = i915_gem_shmem_pwrite_slow(dev, obj, args, file);
+
 out:
 	drm_gem_object_unreference(&obj->base);
 unlock:
-- 
1.7.7.3


* [PATCH 34/43] drm/i915: rewrite shmem_pwrite_slow to use copy_from_user
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (31 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 33/43] drm/i915: fall through pwrite_gtt_slow to the shmem slow path Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 35/43] drm/i915: rewrite shmem_pread_slow to use copy_to_user Daniel Vetter
                   ` (8 subsequent siblings)
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

... instead of get_user_pages, because that fails on non page-backed
user addresses, e.g. a gtt mapping of a bo.

To get there essentially copy the vfs read path into pagecache. We
can't call that right away because we have to take care of bit17
swizzling. To not deadlock with our own pagefault handler we need
to completely drop struct_mutex, reducing the atomicity guarantees
of our userspace abi. Implications for racing with other gem ioctls:

- execbuf, pwrite, pread: Due to -EFAULT fallback to slow paths there's
  already the risk of the pwrite call not being atomic, no degradation.
- read/write access to mmaps: already fully racy, no degradation.
- set_tiling: Calling set_tiling while reading/writing is already
  pretty much undefined, now it just got a bit worse. set_tiling is
  only called by libdrm on unused/new bos, so no problem.
- set_domain: When changing to the gtt domain while copying (without any
  read/write access, e.g. for synchronization), we might leave unflushed
  data in the cpu caches. The clflush_object at the end of pwrite_slow
  takes care of this problem.
- truncating of purgeable objects: the shmem_read_mapping_page call could
  reinstate backing storage for truncated objects. The check at the end
  of pwrite_slow takes care of this.
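
For reference, the bit17 swizzled copy can be modelled in userspace
roughly like this. It is a sketch only: copy_swizzled is a made-up
name, plain memcpy stands in for __copy_from_user (so there is no
fault handling), and the destination buffer is assumed to cover whole
128-byte cacheline pairs, as a page does.

```c
#include <string.h>

/* Userspace model of the swizzled write: copy cacheline by cacheline and
 * flip address bit 6 of the destination offset, which swaps the two
 * 64-byte halves of every 128-byte pair. */
static void copy_swizzled(char *gpu_vaddr, int gpu_offset,
			  const char *cpu_vaddr, int length)
{
	int cpu_offset = 0;

	while (length > 0) {
		/* Same value as ALIGN(gpu_offset + 1, 64) in the kernel. */
		int cacheline_end = (gpu_offset + 64) & ~63;
		int this_length = cacheline_end - gpu_offset;
		int swizzled_gpu_offset = gpu_offset ^ 64;

		/* Never copy past the end of the request. */
		if (this_length > length)
			this_length = length;

		memcpy(gpu_vaddr + swizzled_gpu_offset,
		       cpu_vaddr + cpu_offset,
		       this_length);

		cpu_offset += this_length;
		gpu_offset += this_length;
		length -= this_length;
	}
}
```

Writing 100 bytes at offset 10 with this model puts the first 54 bytes
at offsets 74..127 and the remaining 46 bytes at offsets 0..45.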

v2:
- add missing intel_gtt_chipset_flush
- add __ to copy_from_user_swizzled as suggested by Chris Wilson.

v3: Fixup bit17 swizzling, it swizzled the wrong pages.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |  127 ++++++++++++++++++++-------------------
 1 files changed, 65 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f74ecb8..78776a4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -56,6 +56,7 @@ static void i915_gem_free_object_tail(struct drm_i915_gem_object *obj);
 
 static int i915_gem_inactive_shrink(struct shrinker *shrinker,
 				    struct shrink_control *sc);
+static void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
 
 /* some bookkeeping */
 static void i915_gem_info_add_obj(struct drm_i915_private *dev_priv,
@@ -383,6 +384,32 @@ i915_gem_shmem_pread_fast(struct drm_device *dev,
 	return 0;
 }
 
+static inline int
+__copy_from_user_swizzled(char __user *gpu_vaddr, int gpu_offset,
+			  const char *cpu_vaddr,
+			  int length)
+{
+	int ret, cpu_offset = 0;
+
+	while (length > 0) {
+		int cacheline_end = ALIGN(gpu_offset + 1, 64);
+		int this_length = min(cacheline_end - gpu_offset, length);
+		int swizzled_gpu_offset = gpu_offset ^ 64;
+
+		ret = __copy_from_user(gpu_vaddr + swizzled_gpu_offset,
+				       cpu_vaddr + cpu_offset,
+				       this_length);
+		if (ret)
+			return ret + length;
+
+		cpu_offset += this_length;
+		gpu_offset += this_length;
+		length -= this_length;
+	}
+
+	return 0;
+}
+
 /**
  * This is the fallback shmem pread path, which allocates temporary storage
  * in kernel space to copy_to_user into outside of the struct_mutex, so we
@@ -839,71 +866,36 @@ i915_gem_shmem_pwrite_slow(struct drm_device *dev,
 			   struct drm_file *file)
 {
 	struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
-	struct mm_struct *mm = current->mm;
-	struct page **user_pages;
 	ssize_t remain;
-	loff_t offset, pinned_pages, i;
-	loff_t first_data_page, last_data_page, num_pages;
-	int shmem_page_offset;
-	int data_page_index,  data_page_offset;
-	int page_length;
-	int ret;
-	uint64_t data_ptr = args->data_ptr;
-	int do_bit17_swizzling;
+	loff_t offset;
+	char __user *user_data;
+	int shmem_page_offset, page_length, ret;
+	int obj_do_bit17_swizzling, page_do_bit17_swizzling;
 
+	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
 
-	/* Pin the user pages containing the data.  We can't fault while
-	 * holding the struct mutex, and all of the pwrite implementations
-	 * want to hold it while dereferencing the user data.
-	 */
-	first_data_page = data_ptr / PAGE_SIZE;
-	last_data_page = (data_ptr + args->size - 1) / PAGE_SIZE;
-	num_pages = last_data_page - first_data_page + 1;
-
-	user_pages = drm_malloc_ab(num_pages, sizeof(struct page *));
-	if (user_pages == NULL)
-		return -ENOMEM;
-
-	mutex_unlock(&dev->struct_mutex);
-	down_read(&mm->mmap_sem);
-	pinned_pages = get_user_pages(current, mm, (uintptr_t)args->data_ptr,
-				      num_pages, 0, 0, user_pages, NULL);
-	up_read(&mm->mmap_sem);
-	mutex_lock(&dev->struct_mutex);
-	if (pinned_pages < num_pages) {
-		ret = -EFAULT;
-		goto out;
-	}
-
-	ret = i915_gem_object_set_to_cpu_domain(obj, 1);
-	if (ret)
-		goto out;
-
-	do_bit17_swizzling = i915_gem_object_needs_bit17_swizzle(obj);
+	obj_do_bit17_swizzling = i915_gem_object_needs_bit17_swizzle(obj);
 
 	offset = args->offset;
 	obj->dirty = 1;
 
+	mutex_unlock(&dev->struct_mutex);
+
 	while (remain > 0) {
 		struct page *page;
+		char *vaddr;
 
 		/* Operation in this page
 		 *
 		 * shmem_page_offset = offset within page in shmem file
-		 * data_page_index = page number in get_user_pages return
-		 * data_page_offset = offset with data_page_index page.
 		 * page_length = bytes to copy for this page
 		 */
 		shmem_page_offset = offset_in_page(offset);
-		data_page_index = data_ptr / PAGE_SIZE - first_data_page;
-		data_page_offset = offset_in_page(data_ptr);
 
 		page_length = remain;
 		if ((shmem_page_offset + page_length) > PAGE_SIZE)
 			page_length = PAGE_SIZE - shmem_page_offset;
-		if ((data_page_offset + page_length) > PAGE_SIZE)
-			page_length = PAGE_SIZE - data_page_offset;
 
 		page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
 		if (IS_ERR(page)) {
@@ -911,34 +903,45 @@ i915_gem_shmem_pwrite_slow(struct drm_device *dev,
 			goto out;
 		}
 
-		if (do_bit17_swizzling) {
-			slow_shmem_bit17_copy(page,
-					      shmem_page_offset,
-					      user_pages[data_page_index],
-					      data_page_offset,
-					      page_length,
-					      0);
-		} else {
-			slow_shmem_copy(page,
-					shmem_page_offset,
-					user_pages[data_page_index],
-					data_page_offset,
-					page_length);
-		}
+		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+			(page_to_phys(page) & (1 << 17)) != 0;
+
+		vaddr = kmap(page);
+		if (page_do_bit17_swizzling)
+			ret = __copy_from_user_swizzled(vaddr, shmem_page_offset,
+							user_data,
+							page_length);
+		else
+			ret = __copy_from_user(vaddr + shmem_page_offset,
+					       user_data,
+					       page_length);
+		kunmap(page);
 
 		set_page_dirty(page);
 		mark_page_accessed(page);
 		page_cache_release(page);
 
+		if (ret) {
+			ret = -EFAULT;
+			goto out;
+		}
+
 		remain -= page_length;
-		data_ptr += page_length;
+		user_data += page_length;
 		offset += page_length;
 	}
 
 out:
-	for (i = 0; i < pinned_pages; i++)
-		page_cache_release(user_pages[i]);
-	drm_free_large(user_pages);
+	mutex_lock(&dev->struct_mutex);
+	/* Fixup: Kill any reinstated backing storage pages */
+	if (obj->madv == __I915_MADV_PURGED)
+		i915_gem_object_truncate(obj);
+	/* and flush dirty cachelines in case the object isn't in the cpu write
+	 * domain anymore. */
+	if (obj->base.write_domain != I915_GEM_DOMAIN_CPU) {
+		i915_gem_clflush_object(obj);
+		intel_gtt_chipset_flush();
+	}
 
 	return ret;
 }
-- 
1.7.7.3


* [PATCH 35/43] drm/i915: rewrite shmem_pread_slow to use copy_to_user
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (32 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 34/43] drm/i915: rewrite shmem_pwrite_slow to use copy_from_user Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2012-01-30 22:37   ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 36/43] agp/intel-gtt: export the scratch page dma address Daniel Vetter
                   ` (7 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

Like for shmem_pwrite_slow. The only difference is that because we
read data, we can leave the fetched cachelines in the cpu: In the case
that the object isn't in the cpu read domain anymore, the clflush for
the next cpu read domain invalidation will simply drop these
cachelines.
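
As a rough userspace model of the read side: the swizzle decision is
made per page from physical address bit 17, and the read applies the
same bit 6 flip to the source offset. page_needs_swizzle and
read_swizzled are illustrative names, and memcpy stands in for
__copy_to_user, so faults are not modelled.

```c
#include <stdint.h>
#include <string.h>

/* Swizzle a page only if the object needs bit17 swizzling at all and
 * physical address bit 17 of this particular backing page is set. */
static int page_needs_swizzle(int obj_needs_swizzle, uint64_t page_phys_addr)
{
	return obj_needs_swizzle &&
		(page_phys_addr & (UINT64_C(1) << 17)) != 0;
}

/* Read length bytes starting at gpu_offset, undoing the swap of the two
 * 64-byte halves of each 128-byte pair. */
static void read_swizzled(char *cpu_vaddr,
			  const char *gpu_vaddr, int gpu_offset,
			  int length)
{
	int cpu_offset = 0;

	while (length > 0) {
		/* Same value as ALIGN(gpu_offset + 1, 64) in the kernel. */
		int cacheline_end = (gpu_offset + 64) & ~63;
		int this_length = cacheline_end - gpu_offset;
		int swizzled_gpu_offset = gpu_offset ^ 64;

		if (this_length > length)
			this_length = length;

		memcpy(cpu_vaddr + cpu_offset,
		       gpu_vaddr + swizzled_gpu_offset,
		       this_length);

		cpu_offset += this_length;
		gpu_offset += this_length;
		length -= this_length;
	}
}
```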

slow_shmem_bit17_copy is now unused, so kill it.

With this patch tests/gem_mmap_gtt now actually works.

v2: add __ to copy_to_user_swizzled as suggested by Chris Wilson.

v3: Fixup the swizzling logic, it swizzled the wrong pages.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38115
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |  191 ++++++++++++---------------------------
 1 files changed, 57 insertions(+), 134 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 78776a4..d38e2e6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -257,73 +257,6 @@ static int i915_gem_object_needs_bit17_swizzle(struct drm_i915_gem_object *obj)
 		obj->tiling_mode != I915_TILING_NONE;
 }
 
-static inline void
-slow_shmem_copy(struct page *dst_page,
-		int dst_offset,
-		struct page *src_page,
-		int src_offset,
-		int length)
-{
-	char *dst_vaddr, *src_vaddr;
-
-	dst_vaddr = kmap(dst_page);
-	src_vaddr = kmap(src_page);
-
-	memcpy(dst_vaddr + dst_offset, src_vaddr + src_offset, length);
-
-	kunmap(src_page);
-	kunmap(dst_page);
-}
-
-static inline void
-slow_shmem_bit17_copy(struct page *gpu_page,
-		      int gpu_offset,
-		      struct page *cpu_page,
-		      int cpu_offset,
-		      int length,
-		      int is_read)
-{
-	char *gpu_vaddr, *cpu_vaddr;
-
-	/* Use the unswizzled path if this page isn't affected. */
-	if ((page_to_phys(gpu_page) & (1 << 17)) == 0) {
-		if (is_read)
-			return slow_shmem_copy(cpu_page, cpu_offset,
-					       gpu_page, gpu_offset, length);
-		else
-			return slow_shmem_copy(gpu_page, gpu_offset,
-					       cpu_page, cpu_offset, length);
-	}
-
-	gpu_vaddr = kmap(gpu_page);
-	cpu_vaddr = kmap(cpu_page);
-
-	/* Copy the data, XORing A6 with A17 (1). The user already knows he's
-	 * XORing with the other bits (A9 for Y, A9 and A10 for X)
-	 */
-	while (length > 0) {
-		int cacheline_end = ALIGN(gpu_offset + 1, 64);
-		int this_length = min(cacheline_end - gpu_offset, length);
-		int swizzled_gpu_offset = gpu_offset ^ 64;
-
-		if (is_read) {
-			memcpy(cpu_vaddr + cpu_offset,
-			       gpu_vaddr + swizzled_gpu_offset,
-			       this_length);
-		} else {
-			memcpy(gpu_vaddr + swizzled_gpu_offset,
-			       cpu_vaddr + cpu_offset,
-			       this_length);
-		}
-		cpu_offset += this_length;
-		gpu_offset += this_length;
-		length -= this_length;
-	}
-
-	kunmap(cpu_page);
-	kunmap(gpu_page);
-}
-
 /**
  * This is the fast shmem pread path, which attempts to copy_from_user directly
  * from the backing pages of the object to the user's address space.  On a
@@ -385,6 +318,32 @@ i915_gem_shmem_pread_fast(struct drm_device *dev,
 }
 
 static inline int
+__copy_to_user_swizzled(char __user *cpu_vaddr,
+			const char *gpu_vaddr, int gpu_offset,
+			int length)
+{
+	int ret, cpu_offset = 0;
+
+	while (length > 0) {
+		int cacheline_end = ALIGN(gpu_offset + 1, 64);
+		int this_length = min(cacheline_end - gpu_offset, length);
+		int swizzled_gpu_offset = gpu_offset ^ 64;
+
+		ret = __copy_to_user(cpu_vaddr + cpu_offset,
+				     gpu_vaddr + swizzled_gpu_offset,
+				     this_length);
+		if (ret)
+			return ret + length;
+
+		cpu_offset += this_length;
+		gpu_offset += this_length;
+		length -= this_length;
+	}
+
+	return 0;
+}
+
+static inline int
 __copy_from_user_swizzled(char __user *gpu_vaddr, int gpu_offset,
 			  const char *cpu_vaddr,
 			  int length)
@@ -423,72 +382,34 @@ i915_gem_shmem_pread_slow(struct drm_device *dev,
 			  struct drm_file *file)
 {
 	struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
-	struct mm_struct *mm = current->mm;
-	struct page **user_pages;
+	char __user *user_data;
 	ssize_t remain;
-	loff_t offset, pinned_pages, i;
-	loff_t first_data_page, last_data_page, num_pages;
-	int shmem_page_offset;
-	int data_page_index, data_page_offset;
-	int page_length;
-	int ret;
-	uint64_t data_ptr = args->data_ptr;
-	int do_bit17_swizzling;
+	loff_t offset;
+	int shmem_page_offset, page_length, ret;
+	int obj_do_bit17_swizzling, page_do_bit17_swizzling;
 
+	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
 
-	/* Pin the user pages containing the data.  We can't fault while
-	 * holding the struct mutex, yet we want to hold it while
-	 * dereferencing the user data.
-	 */
-	first_data_page = data_ptr / PAGE_SIZE;
-	last_data_page = (data_ptr + args->size - 1) / PAGE_SIZE;
-	num_pages = last_data_page - first_data_page + 1;
+	obj_do_bit17_swizzling = i915_gem_object_needs_bit17_swizzle(obj);
 
-	user_pages = drm_malloc_ab(num_pages, sizeof(struct page *));
-	if (user_pages == NULL)
-		return -ENOMEM;
+	offset = args->offset;
 
 	mutex_unlock(&dev->struct_mutex);
-	down_read(&mm->mmap_sem);
-	pinned_pages = get_user_pages(current, mm, (uintptr_t)args->data_ptr,
-				      num_pages, 1, 0, user_pages, NULL);
-	up_read(&mm->mmap_sem);
-	mutex_lock(&dev->struct_mutex);
-	if (pinned_pages < num_pages) {
-		ret = -EFAULT;
-		goto out;
-	}
-
-	ret = i915_gem_object_set_cpu_read_domain_range(obj,
-							args->offset,
-							args->size);
-	if (ret)
-		goto out;
-
-	do_bit17_swizzling = i915_gem_object_needs_bit17_swizzle(obj);
-
-	offset = args->offset;
 
 	while (remain > 0) {
 		struct page *page;
+		char *vaddr;
 
 		/* Operation in this page
 		 *
 		 * shmem_page_offset = offset within page in shmem file
-		 * data_page_index = page number in get_user_pages return
-		 * data_page_offset = offset with data_page_index page.
 		 * page_length = bytes to copy for this page
 		 */
 		shmem_page_offset = offset_in_page(offset);
-		data_page_index = data_ptr / PAGE_SIZE - first_data_page;
-		data_page_offset = offset_in_page(data_ptr);
-
 		page_length = remain;
 		if ((shmem_page_offset + page_length) > PAGE_SIZE)
 			page_length = PAGE_SIZE - shmem_page_offset;
-		if ((data_page_offset + page_length) > PAGE_SIZE)
-			page_length = PAGE_SIZE - data_page_offset;
 
 		page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
 		if (IS_ERR(page)) {
@@ -496,36 +417,38 @@ i915_gem_shmem_pread_slow(struct drm_device *dev,
 			goto out;
 		}
 
-		if (do_bit17_swizzling) {
-			slow_shmem_bit17_copy(page,
-					      shmem_page_offset,
-					      user_pages[data_page_index],
-					      data_page_offset,
-					      page_length,
-					      1);
-		} else {
-			slow_shmem_copy(user_pages[data_page_index],
-					data_page_offset,
-					page,
-					shmem_page_offset,
-					page_length);
-		}
+		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+			(page_to_phys(page) & (1 << 17)) != 0;
+
+		vaddr = kmap(page);
+		if (page_do_bit17_swizzling)
+			ret = __copy_to_user_swizzled(user_data,
+						      vaddr, shmem_page_offset,
+						      page_length);
+		else
+			ret = __copy_to_user(user_data,
+					     vaddr + shmem_page_offset,
+					     page_length);
+		kunmap(page);
 
 		mark_page_accessed(page);
 		page_cache_release(page);
 
+		if (ret) {
+			ret = -EFAULT;
+			goto out;
+		}
+
 		remain -= page_length;
-		data_ptr += page_length;
+		user_data += page_length;
 		offset += page_length;
 	}
 
 out:
-	for (i = 0; i < pinned_pages; i++) {
-		SetPageDirty(user_pages[i]);
-		mark_page_accessed(user_pages[i]);
-		page_cache_release(user_pages[i]);
-	}
-	drm_free_large(user_pages);
+	mutex_lock(&dev->struct_mutex);
+	/* Fixup: Kill any reinstated backing storage pages */
+	if (obj->madv == __I915_MADV_PURGED)
+		i915_gem_object_truncate(obj);
 
 	return ret;
 }
-- 
1.7.7.3


* [PATCH 36/43] agp/intel-gtt: export the scratch page dma address
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (33 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 35/43] drm/i915: rewrite shmem_pread_slow to use copy_to_user Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 37/43] agp/intel-gtt: export the gtt pagetable iomapping Daniel Vetter
                   ` (6 subsequent siblings)
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

To implement a PPGTT for drm/i915 that fully aliases the GTT, we also
need to properly alias the scratch page.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/char/agp/intel-gtt.c |    9 ++++-----
 include/drm/intel-gtt.h      |    2 ++
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index c92424c..0a305ac 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -76,7 +76,6 @@ static struct _intel_private {
 	struct resource ifp_resource;
 	int resource_valid;
 	struct page *scratch_page;
-	dma_addr_t scratch_page_dma;
 } intel_private;
 
 #define INTEL_GTT_GEN	intel_private.driver->gen
@@ -306,9 +305,9 @@ static int intel_gtt_setup_scratch_page(void)
 		if (pci_dma_mapping_error(intel_private.pcidev, dma_addr))
 			return -EINVAL;
 
-		intel_private.scratch_page_dma = dma_addr;
+		intel_private.base.scratch_page_dma = dma_addr;
 	} else
-		intel_private.scratch_page_dma = page_to_phys(page);
+		intel_private.base.scratch_page_dma = page_to_phys(page);
 
 	intel_private.scratch_page = page;
 
@@ -631,7 +630,7 @@ static unsigned int intel_gtt_mappable_entries(void)
 static void intel_gtt_teardown_scratch_page(void)
 {
 	set_pages_wb(intel_private.scratch_page, 1);
-	pci_unmap_page(intel_private.pcidev, intel_private.scratch_page_dma,
+	pci_unmap_page(intel_private.pcidev, intel_private.base.scratch_page_dma,
 		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
 	put_page(intel_private.scratch_page);
 	__free_page(intel_private.scratch_page);
@@ -975,7 +974,7 @@ void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
 	unsigned int i;
 
 	for (i = first_entry; i < (first_entry + num_entries); i++) {
-		intel_private.driver->write_entry(intel_private.scratch_page_dma,
+		intel_private.driver->write_entry(intel_private.base.scratch_page_dma,
 						  i, 0);
 	}
 	readl(intel_private.gtt+i-1);
diff --git a/include/drm/intel-gtt.h b/include/drm/intel-gtt.h
index b174620..6d4c77a 100644
--- a/include/drm/intel-gtt.h
+++ b/include/drm/intel-gtt.h
@@ -15,6 +15,8 @@ const struct intel_gtt {
 	unsigned int needs_dmar : 1;
 	/* Whether we idle the gpu before mapping/unmapping */
 	unsigned int do_idle_maps : 1;
+	/* Share the scratch page dma with ppgtts. */
+	dma_addr_t scratch_page_dma;
 } *intel_gtt_get(void);
 
 void intel_gtt_chipset_flush(void);
-- 
1.7.7.3


* [PATCH 37/43] agp/intel-gtt: export the gtt pagetable iomapping
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (34 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 36/43] agp/intel-gtt: export the scratch page dma address Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 38/43] drm/i915: initialization/teardown for the aliasing ppgtt Daniel Vetter
                   ` (5 subsequent siblings)
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

We need this because ppgtt page directory entries need to be in the
global gtt pagetable.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/char/agp/intel-gtt.c |    1 +
 include/drm/intel-gtt.h      |    2 ++
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 0a305ac..5cf47ac 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -680,6 +680,7 @@ static int intel_gtt_init(void)
 		iounmap(intel_private.registers);
 		return -ENOMEM;
 	}
+	intel_private.base.gtt = intel_private.gtt;
 
 	global_cache_flush();   /* FIXME: ? */
 
diff --git a/include/drm/intel-gtt.h b/include/drm/intel-gtt.h
index 6d4c77a..0a0001b 100644
--- a/include/drm/intel-gtt.h
+++ b/include/drm/intel-gtt.h
@@ -17,6 +17,8 @@ const struct intel_gtt {
 	unsigned int do_idle_maps : 1;
 	/* Share the scratch page dma with ppgtts. */
 	dma_addr_t scratch_page_dma;
+	/* for ppgtt PDE access */
+	u32 __iomem *gtt;
 } *intel_gtt_get(void);
 
 void intel_gtt_chipset_flush(void);
-- 
1.7.7.3


* [PATCH 38/43] drm/i915: initialization/teardown for the aliasing ppgtt
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (35 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 37/43] agp/intel-gtt: export the gtt pagetable iomapping Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 39/43] drm/i915: ppgtt binding/unbinding support Daniel Vetter
                   ` (4 subsequent siblings)
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

This just adds the setup and teardown code for the ppgtt PDE and the
last-level pagetables, which are fixed for the entire lifetime, at
least for the moment.
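
The sizes involved can be worked out with a few lines of arithmetic.
The sketch below assumes 4 KiB pages and that each last-level PTE maps
one such page; the macro and function names are illustrative, only the
values 512, 1024 and 512*1024 come from the patch.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES    4096u            /* assumption: 4 KiB pages */
#define PPGTT_PD_ENTRIES   512u             /* I915_PPGTT_PD_ENTRIES */
#define PPGTT_PT_ENTRIES   1024u            /* I915_PPGTT_PT_ENTRIES */
#define GLOBAL_GTT_ENTRIES (512u * 1024u)   /* global gtt pagetable size */

/* The PDEs are stolen from the end of the global gtt pagetable. */
static uint32_t first_pd_entry_in_global_pt(void)
{
	return GLOBAL_GTT_ENTRIES - PPGTT_PD_ENTRIES;
}

/* Address space covered by a fully populated ppgtt. */
static uint64_t ppgtt_address_space(void)
{
	return (uint64_t)PPGTT_PD_ENTRIES * PPGTT_PT_ENTRIES * PAGE_SIZE_BYTES;
}

/* Aperture left for GEM: one page per stolen PDE slot is given up,
 * plus the guard page kept in between. */
static uint64_t usable_gtt_size(uint64_t gtt_size)
{
	return gtt_size - (uint64_t)PPGTT_PD_ENTRIES * PAGE_SIZE_BYTES
			- PAGE_SIZE_BYTES;
}
```

Under these assumptions the PDEs start at global gtt entry 523776, the
ppgtt spans 2 GiB, and GEM loses 2 MiB plus one guard page of aperture.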

v2: Kill the stray debug printk noted by and improve the pte
definitions as suggested by Chris Wilson.

v3: Clean up the aperture stealing code as noted by Ben Widawsky.

v4: Paint the init code in a more pleasing colour as suggested by Chris
Wilson.

v5: Explain the magic numbers noticed by Ben Widawsky.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c     |   41 ++++++++---
 drivers/gpu/drm/i915/i915_drv.h     |   18 +++++
 drivers/gpu/drm/i915/i915_gem_gtt.c |  139 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h     |   16 ++++
 4 files changed, 203 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index fd617fb..2cc0ba4 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1190,22 +1190,39 @@ static int i915_load_gem_init(struct drm_device *dev)
 	/* Basic memrange allocator for stolen space */
 	drm_mm_init(&dev_priv->mm.stolen, 0, prealloc_size);
 
-	/* Let GEM Manage all of the aperture.
-	 *
-	 * However, leave one page at the end still bound to the scratch page.
-	 * There are a number of places where the hardware apparently
-	 * prefetches past the end of the object, and we've seen multiple
-	 * hangs with the GPU head pointer stuck in a batchbuffer bound
-	 * at the last page of the aperture.  One page should be enough to
-	 * keep any prefetching inside of the aperture.
-	 */
-	i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
+	if (HAS_ALIASING_PPGTT(dev)) {
+		/* PPGTT pdes are stolen from global gtt ptes, so shrink the
+		 * aperture accordingly when using aliasing ppgtt. */
+		gtt_size -= I915_PPGTT_PD_ENTRIES*PAGE_SIZE;
+		/* For paranoia keep the guard page in between. */
+		gtt_size -= PAGE_SIZE;
+
+		i915_gem_do_init(dev, 0, mappable_size, gtt_size);
+
+		ret = i915_gem_init_aliasing_ppgtt(dev);
+		if (ret)
+			return ret;
+	} else {
+		/* Let GEM Manage all of the aperture.
+		 *
+		 * However, leave one page at the end still bound to the scratch
+		 * page.  There are a number of places where the hardware
+		 * apparently prefetches past the end of the object, and we've
+		 * seen multiple hangs with the GPU head pointer stuck in a
+		 * batchbuffer bound at the last page of the aperture.  One page
+		 * should be enough to keep any prefetching inside of the
+		 * aperture.
+		 */
+		i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
+	}
 
 	mutex_lock(&dev->struct_mutex);
 	ret = i915_gem_init_hw(dev);
 	mutex_unlock(&dev->struct_mutex);
-	if (ret)
+	if (ret) {
+		i915_gem_cleanup_aliasing_ppgtt(dev);
 		return ret;
+	}
 
 	/* Try to set up FBC with a reasonable compressed buffer size */
 	if (I915_HAS_FBC(dev) && i915_powersave) {
@@ -1292,6 +1309,7 @@ cleanup_gem:
 	mutex_lock(&dev->struct_mutex);
 	i915_gem_cleanup_ringbuffer(dev);
 	mutex_unlock(&dev->struct_mutex);
+	i915_gem_cleanup_aliasing_ppgtt(dev);
 cleanup_vga_switcheroo:
 	vga_switcheroo_unregister_client(dev->pdev);
 cleanup_vga_client:
@@ -2179,6 +2197,7 @@ int i915_driver_unload(struct drm_device *dev)
 		i915_gem_free_all_phys_object(dev);
 		i915_gem_cleanup_ringbuffer(dev);
 		mutex_unlock(&dev->struct_mutex);
+		i915_gem_cleanup_aliasing_ppgtt(dev);
 		if (I915_HAS_FBC(dev) && i915_powersave)
 			i915_cleanup_compression(dev);
 		drm_mm_takedown(&dev_priv->mm.stolen);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1cafe32..2dc5b27 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -254,6 +254,16 @@ struct intel_device_info {
 	u8 has_blt_ring:1;
 };
 
+#define I915_PPGTT_PD_ENTRIES 512
+#define I915_PPGTT_PT_ENTRIES 1024
+struct i915_hw_ppgtt {
+	unsigned num_pd_entries;
+	struct page **pt_pages;
+	uint32_t pd_offset;
+	dma_addr_t *pt_dma_addr;
+	dma_addr_t scratch_page_dma_addr;
+};
+
 enum no_fbc_reason {
 	FBC_NO_OUTPUT, /* no outputs enabled to compress */
 	FBC_STOLEN_TOO_SMALL, /* not enough space to hold compressed buffers */
@@ -584,6 +594,9 @@ typedef struct drm_i915_private {
 		struct io_mapping *gtt_mapping;
 		int gtt_mtrr;
 
+		/** PPGTT used for aliasing the PPGTT with the GTT */
+		struct i915_hw_ppgtt *aliasing_ppgtt;
+
 		struct shrinker inactive_shrinker;
 
 		/**
@@ -976,6 +989,8 @@ struct drm_i915_file_private {
 #define HAS_BLT(dev)            (INTEL_INFO(dev)->has_blt_ring)
 #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
 
+#define HAS_ALIASING_PPGTT(dev)	(INTEL_INFO(dev)->gen >= 6)
+
 #define HAS_OVERLAY(dev)		(INTEL_INFO(dev)->has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev)	(INTEL_INFO(dev)->overlay_needs_physical)
 
@@ -1249,6 +1264,9 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level);
 
 /* i915_gem_gtt.c */
+int __must_check i915_gem_init_aliasing_ppgtt(struct drm_device *dev);
+void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
+
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_rebind_object(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6042c5e..548b885 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -29,6 +29,145 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 
+/* PPGTT support for Sandybridge/Gen6 and later */
+static void i915_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
+				   unsigned first_entry,
+				   unsigned num_entries)
+{
+	int i, j;
+	uint32_t *pt_vaddr;
+	uint32_t scratch_pte;
+
+	scratch_pte = GEN6_PTE_ADDR_ENCODE(ppgtt->scratch_page_dma_addr);
+	scratch_pte |= GEN6_PTE_VALID | GEN6_PTE_CACHE_LLC;
+
+	for (i = 0; i < ppgtt->num_pd_entries; i++) {
+		pt_vaddr = kmap_atomic(ppgtt->pt_pages[i]);
+
+		for (j = 0; j < I915_PPGTT_PT_ENTRIES; j++)
+			pt_vaddr[j] = scratch_pte;
+
+		kunmap_atomic(pt_vaddr);
+	}
+
+}
+
+int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_hw_ppgtt *ppgtt;
+	uint32_t pd_entry;
+	unsigned first_pd_entry_in_global_pt;
+	uint32_t __iomem *pd_addr;
+	int i;
+	int ret = -ENOMEM;
+
+	/* ppgtt PDEs reside in the global gtt pagetable, which has 512*1024
+	 * entries. For aliasing ppgtt support we just steal them at the end for
+	 * now. */
+	first_pd_entry_in_global_pt = 512*1024 - I915_PPGTT_PD_ENTRIES;
+
+	ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL);
+	if (!ppgtt)
+		return ret;
+
+	ppgtt->num_pd_entries = I915_PPGTT_PD_ENTRIES;
+	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
+				  GFP_KERNEL);
+	if (!ppgtt->pt_pages)
+		goto err_ppgtt;
+
+	for (i = 0; i < ppgtt->num_pd_entries; i++) {
+		ppgtt->pt_pages[i] = alloc_page(GFP_KERNEL);
+		if (!ppgtt->pt_pages[i])
+			goto err_pt_alloc;
+	}
+
+	if (dev_priv->mm.gtt->needs_dmar) {
+		ppgtt->pt_dma_addr = kzalloc(sizeof(dma_addr_t)
+						*ppgtt->num_pd_entries,
+					     GFP_KERNEL);
+		if (!ppgtt->pt_dma_addr)
+			goto err_pt_alloc;
+	}
+
+	pd_addr = dev_priv->mm.gtt->gtt + first_pd_entry_in_global_pt;
+	for (i = 0; i < ppgtt->num_pd_entries; i++) {
+		dma_addr_t pt_addr;
+		if (dev_priv->mm.gtt->needs_dmar) {
+			pt_addr = pci_map_page(dev->pdev, ppgtt->pt_pages[i],
+					       0, 4096,
+					       PCI_DMA_BIDIRECTIONAL);
+
+			if (pci_dma_mapping_error(dev->pdev,
+						  pt_addr)) {
+				ret = -EIO;
+				goto err_pd_pin;
+
+			}
+			ppgtt->pt_dma_addr[i] = pt_addr;
+		} else
+			pt_addr = page_to_phys(ppgtt->pt_pages[i]);
+
+		pd_entry = GEN6_PDE_ADDR_ENCODE(pt_addr);
+		pd_entry |= GEN6_PDE_VALID;
+
+		writel(pd_entry, pd_addr + i);
+	}
+	readl(pd_addr);
+
+	ppgtt->scratch_page_dma_addr = dev_priv->mm.gtt->scratch_page_dma;
+
+	i915_ppgtt_clear_range(ppgtt, 0,
+			       ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
+
+	ppgtt->pd_offset = (first_pd_entry_in_global_pt)*sizeof(uint32_t);
+
+	dev_priv->mm.aliasing_ppgtt = ppgtt;
+
+	return 0;
+
+err_pd_pin:
+	if (ppgtt->pt_dma_addr) {
+		for (i--; i >= 0; i--)
+			pci_unmap_page(dev->pdev, ppgtt->pt_dma_addr[i],
+				       4096, PCI_DMA_BIDIRECTIONAL);
+	}
+err_pt_alloc:
+	kfree(ppgtt->pt_dma_addr);
+	for (i = 0; i < ppgtt->num_pd_entries; i++) {
+		if (ppgtt->pt_pages[i])
+			__free_page(ppgtt->pt_pages[i]);
+	}
+	kfree(ppgtt->pt_pages);
+err_ppgtt:
+	kfree(ppgtt);
+
+	return ret;
+}
+
+void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
+	int i;
+
+	if (!ppgtt)
+		return;
+
+	if (ppgtt->pt_dma_addr) {
+		for (i = 0; i < ppgtt->num_pd_entries; i++)
+			pci_unmap_page(dev->pdev, ppgtt->pt_dma_addr[i],
+				       4096, PCI_DMA_BIDIRECTIONAL);
+	}
+
+	kfree(ppgtt->pt_dma_addr);
+	for (i = 0; i < ppgtt->num_pd_entries; i++)
+		__free_page(ppgtt->pt_pages[i]);
+	kfree(ppgtt->pt_pages);
+	kfree(ppgtt);
+}
+
 /* XXX kill agp_type! */
 static unsigned int cache_level_to_agp_type(struct drm_device *dev,
 					    enum i915_cache_level cache_level)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e810723..51ec59d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -92,6 +92,22 @@
 #define  GEN6_GRDOM_MEDIA		(1 << 2)
 #define  GEN6_GRDOM_BLT			(1 << 3)
 
+/* PPGTT stuff */
+#define GEN6_GTT_ADDR_ENCODE(addr)	((addr) | (((addr) >> 28) & 0xff0))
+
+#define GEN6_PDE_VALID			(1 << 0)
+#define GEN6_PDE_LARGE_PAGE		(2 << 0) /* use 32kb pages */
+/* gen6+ has bit 11-4 for physical addr bit 39-32 */
+#define GEN6_PDE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
+
+#define GEN6_PTE_VALID			(1 << 0)
+#define GEN6_PTE_UNCACHED		(1 << 1)
+#define GEN6_PTE_CACHE_LLC		(2 << 1)
+#define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
+#define GEN6_PTE_CACHE_BITS		(3 << 1)
+#define GEN6_PTE_GFDT			(1 << 3)
+#define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
+
 /* VGA stuff */
 
 #define VGA_ST01_MDA 0x3ba
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread
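
Note on the address encoding used by the patch above: GEN6_GTT_ADDR_ENCODE
folds physical address bits 39:32 into bits 11:4 of a 32-bit entry. The
following is a minimal userspace sketch of that arithmetic, not driver code.
The sketch_* names are hypothetical, and it assumes a page-aligned address
that fits in 40 bits, held in a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GEN6_PTE_VALID / GEN6_PDE_VALID (bit 0). */
#define SKETCH_ENTRY_VALID (1u << 0)

/* Fold address bits 39:32 into bits 11:4 of a 32-bit entry, mirroring
 * the expression in GEN6_GTT_ADDR_ENCODE. */
static uint32_t sketch_gen6_addr_encode(uint64_t addr)
{
	/* Assumption: page aligned and at most 40 bits wide, so the low
	 * 12 bits are free to carry the high address bits and flags. */
	assert((addr & 0xfffu) == 0 && (addr >> 40) == 0);
	return (uint32_t)(addr | ((addr >> 28) & 0xff0));
}

/* Inverse of the above: recover the 40-bit address, ignoring flag bits. */
static uint64_t sketch_gen6_addr_decode(uint32_t entry)
{
	uint64_t high = (uint64_t)((entry >> 4) & 0xff) << 32;

	return high | (entry & 0xfffff000u);
}
```

For example, the address 0x123456000 has bit 32 set, so the encoded entry
carries 0x01 in bits 11:4, and OR-ing in the valid bit leaves the decoded
address unchanged.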

* [PATCH 39/43] drm/i915: ppgtt binding/unbinding support
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (36 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 38/43] drm/i915: initialization/teardown for the aliasing ppgtt Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 40/43] drm/i915: ppgtt register definitions Daniel Vetter
                   ` (3 subsequent siblings)
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

This adds support to bind/unbind objects and wires it up. Objects are
only put into the ppgtt when necessary, i.e. at execbuf time.

Objects are still unconditionally put into the global gtt.

v2: Kill the quick hack and explicitly pass cache_level to ppgtt_bind
like for the global gtt function. Noticed by Chris Wilson.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |    7 ++
 drivers/gpu/drm/i915/i915_gem.c            |   11 ++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    9 ++
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  146 ++++++++++++++++++++++++++-
 4 files changed, 167 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2dc5b27..66b4c87 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -854,6 +854,8 @@ struct drm_i915_gem_object {
 
 	unsigned int cache_level:2;
 
+	unsigned int has_aliasing_ppgtt_mapping:1;
+
 	struct page **pages;
 
 	/**
@@ -1266,6 +1268,11 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 /* i915_gem_gtt.c */
 int __must_check i915_gem_init_aliasing_ppgtt(struct drm_device *dev);
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
+void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
+			    struct drm_i915_gem_object *obj,
+			    enum i915_cache_level cache_level);
+void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
+			      struct drm_i915_gem_object *obj);
 
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d38e2e6..f231b80 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2022,6 +2022,7 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
 int
 i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 {
+	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
 	int ret = 0;
 
 	if (obj->gtt_space == NULL)
@@ -2066,6 +2067,11 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	trace_i915_gem_object_unbind(obj);
 
 	i915_gem_gtt_unbind_object(obj);
+	if (obj->has_aliasing_ppgtt_mapping) {
+		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
+		obj->has_aliasing_ppgtt_mapping = 0;
+	}
+
 	i915_gem_object_put_pages_gtt(obj);
 
 	list_del_init(&obj->gtt_list);
@@ -2882,6 +2888,8 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
+	struct drm_device *dev = obj->base.dev;
+	drm_i915_private_t *dev_priv = dev->dev_private;
 	int ret;
 
 	if (obj->cache_level == cache_level)
@@ -2910,6 +2918,9 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		}
 
 		i915_gem_gtt_rebind_object(obj, cache_level);
+		if (obj->has_aliasing_ppgtt_mapping)
+			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
+					       obj, cache_level);
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 7e65505..77b2473 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -530,6 +530,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			    struct drm_file *file,
 			    struct list_head *objects)
 {
+	drm_i915_private_t *dev_priv = ring->dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	int ret, retry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
@@ -636,6 +637,14 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			}
 
 			i915_gem_object_unpin(obj);
+
+			/* ... and ensure ppgtt mapping exists if needed. */
+			if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
+				i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
+						       obj, obj->cache_level);
+
+				obj->has_aliasing_ppgtt_mapping = 1;
+			}
 		}
 
 		if (ret != -ENOSPC || retry > 1)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 548b885..f895e20 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -34,22 +34,31 @@ static void i915_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
 				   unsigned first_entry,
 				   unsigned num_entries)
 {
-	int i, j;
 	uint32_t *pt_vaddr;
 	uint32_t scratch_pte;
+	unsigned act_pd = first_entry / I915_PPGTT_PT_ENTRIES;
+	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
+	unsigned last_pte, i;
 
 	scratch_pte = GEN6_PTE_ADDR_ENCODE(ppgtt->scratch_page_dma_addr);
 	scratch_pte |= GEN6_PTE_VALID | GEN6_PTE_CACHE_LLC;
 
-	for (i = 0; i < ppgtt->num_pd_entries; i++) {
-		pt_vaddr = kmap_atomic(ppgtt->pt_pages[i]);
+	while (num_entries) {
+		last_pte = first_pte + num_entries;
+		if (last_pte > I915_PPGTT_PT_ENTRIES)
+			last_pte = I915_PPGTT_PT_ENTRIES;
+
+		pt_vaddr = kmap_atomic(ppgtt->pt_pages[act_pd]);
 
-		for (j = 0; j < I915_PPGTT_PT_ENTRIES; j++)
-			pt_vaddr[j] = scratch_pte;
+		for (i = first_pte; i < last_pte; i++)
+			pt_vaddr[i] = scratch_pte;
 
 		kunmap_atomic(pt_vaddr);
-	}
 
+		num_entries -= last_pte - first_pte;
+		first_pte = 0;
+		act_pd++;
+	}
 }
 
 int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
@@ -168,6 +177,131 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	kfree(ppgtt);
 }
 
+static void i915_ppgtt_insert_sg_entries(struct i915_hw_ppgtt *ppgtt,
+					 struct scatterlist *sg_list,
+					 unsigned sg_len,
+					 unsigned first_entry,
+					 uint32_t pte_flags)
+{
+	uint32_t *pt_vaddr, pte;
+	unsigned act_pd = first_entry / I915_PPGTT_PT_ENTRIES;
+	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
+	unsigned i, j, m, segment_len;
+	dma_addr_t page_addr;
+	struct scatterlist *sg;
+
+	/* init sg walking */
+	sg = sg_list;
+	i = 0;
+	segment_len = sg_dma_len(sg) >> PAGE_SHIFT;
+	m = 0;
+
+	while (i < sg_len) {
+		pt_vaddr = kmap_atomic(ppgtt->pt_pages[act_pd]);
+
+		for (j = first_pte; j < I915_PPGTT_PT_ENTRIES; j++) {
+			page_addr = sg_dma_address(sg) + (m << PAGE_SHIFT);
+			pte = GEN6_PTE_ADDR_ENCODE(page_addr);
+			pt_vaddr[j] = pte | pte_flags;
+
+			/* grab the next page */
+			m++;
+			if (m == segment_len) {
+				sg = sg_next(sg);
+				i++;
+				if (i == sg_len)
+					break;
+
+				segment_len = sg_dma_len(sg) >> PAGE_SHIFT;
+				m = 0;
+			}
+		}
+
+		kunmap_atomic(pt_vaddr);
+
+		first_pte = 0;
+		act_pd++;
+	}
+}
+
+static void i915_ppgtt_insert_pages(struct i915_hw_ppgtt *ppgtt,
+				    unsigned first_entry, unsigned num_entries,
+				    struct page **pages, uint32_t pte_flags)
+{
+	uint32_t *pt_vaddr, pte;
+	unsigned act_pd = first_entry / I915_PPGTT_PT_ENTRIES;
+	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
+	unsigned last_pte, i;
+	dma_addr_t page_addr;
+
+	while (num_entries) {
+		last_pte = first_pte + num_entries;
+		last_pte = min_t(unsigned, last_pte, I915_PPGTT_PT_ENTRIES);
+
+		pt_vaddr = kmap_atomic(ppgtt->pt_pages[act_pd]);
+
+		for (i = first_pte; i < last_pte; i++) {
+			page_addr = page_to_phys(*pages);
+			pte = GEN6_PTE_ADDR_ENCODE(page_addr);
+			pt_vaddr[i] = pte | pte_flags;
+
+			pages++;
+		}
+
+		kunmap_atomic(pt_vaddr);
+
+		num_entries -= last_pte - first_pte;
+		first_pte = 0;
+		act_pd++;
+	}
+}
+
+void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
+			    struct drm_i915_gem_object *obj,
+			    enum i915_cache_level cache_level)
+{
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	uint32_t pte_flags = GEN6_PTE_VALID;
+
+	switch (cache_level) {
+	case I915_CACHE_LLC_MLC:
+		pte_flags |= GEN6_PTE_CACHE_LLC_MLC;
+		break;
+	case I915_CACHE_LLC:
+		pte_flags |= GEN6_PTE_CACHE_LLC;
+		break;
+	case I915_CACHE_NONE:
+		pte_flags |= GEN6_PTE_UNCACHED;
+		break;
+	default:
+		BUG();
+	}
+
+	if (dev_priv->mm.gtt->needs_dmar) {
+		BUG_ON(!obj->sg_list);
+
+		i915_ppgtt_insert_sg_entries(ppgtt,
+					     obj->sg_list,
+					     obj->num_sg,
+					     obj->gtt_space->start >> PAGE_SHIFT,
+					     pte_flags);
+	} else
+		i915_ppgtt_insert_pages(ppgtt,
+					obj->gtt_space->start >> PAGE_SHIFT,
+					obj->base.size >> PAGE_SHIFT,
+					obj->pages,
+					pte_flags);
+}
+
+void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
+			      struct drm_i915_gem_object *obj)
+{
+	i915_ppgtt_clear_range(ppgtt,
+			       obj->gtt_space->start >> PAGE_SHIFT,
+			       obj->base.size >> PAGE_SHIFT);
+}
+
 /* XXX kill agp_type! */
 static unsigned int cache_level_to_agp_type(struct drm_device *dev,
 					    enum i915_cache_level cache_level)
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread
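
Note on the range walk in the patch above: i915_ppgtt_clear_range and
i915_ppgtt_insert_pages both split a (first_entry, num_entries) range into
one chunk per page table. The sketch below reproduces only that index
arithmetic in userspace, assuming a small in-memory array in place of the
kmap'ed page-table pages. The sketch_* names and the table count of 4 are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PT_ENTRIES 1024u	/* same value as I915_PPGTT_PT_ENTRIES */
#define SKETCH_NUM_PT     4u	/* hypothetical, kept small for the sketch */

/* Stand-in for the page-table pages that the driver maps with kmap_atomic. */
static uint32_t sketch_pt[SKETCH_NUM_PT][SKETCH_PT_ENTRIES];

/* Write 'value' into num_entries consecutive PTEs starting at first_entry,
 * crossing page-table boundaries the same way the patch does. */
static void sketch_fill_range(unsigned first_entry, unsigned num_entries,
			      uint32_t value)
{
	unsigned act_pd = first_entry / SKETCH_PT_ENTRIES;
	unsigned first_pte = first_entry % SKETCH_PT_ENTRIES;
	unsigned last_pte, i;

	/* The sketch refuses ranges that run past its last page table. */
	assert((uint64_t)first_entry + num_entries <=
	       SKETCH_NUM_PT * SKETCH_PT_ENTRIES);

	while (num_entries) {
		/* Clamp the chunk to the end of the current page table. */
		last_pte = first_pte + num_entries;
		if (last_pte > SKETCH_PT_ENTRIES)
			last_pte = SKETCH_PT_ENTRIES;

		for (i = first_pte; i < last_pte; i++)
			sketch_pt[act_pd][i] = value;

		/* Later chunks start at the first entry of the next table. */
		num_entries -= last_pte - first_pte;
		first_pte = 0;
		act_pd++;
	}
}
```

Filling 100 entries starting at entry 1000 touches entries 1000..1023 of
table 0 and entries 0..75 of table 1.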

* [PATCH 40/43] drm/i915: ppgtt register definitions
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (37 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 39/43] drm/i915: ppgtt binding/unbinding support Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 18:58   ` Eugeni Dodonov
  2011-12-14 12:57 ` [PATCH 41/43] drm/i915: ppgtt debugfs info Daniel Vetter
                   ` (2 subsequent siblings)
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_reg.h |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 51ec59d..2fa7dfa 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -86,6 +86,13 @@
 #define   GEN6_MBC_SNPCR_LOW	(2<<21)
 #define   GEN6_MBC_SNPCR_MIN	(3<<21) /* only 1/16th of the cache is shared */
 
+#define GEN6_MBCTL		0x0907c
+#define   GEN6_MBCTL_ENABLE_BOOT_FETCH	(1 << 4)
+#define   GEN6_MBCTL_CTX_FETCH_NEEDED	(1 << 3)
+#define   GEN6_MBCTL_BME_UPDATE_ENABLE	(1 << 2)
+#define   GEN6_MBCTL_MAE_UPDATE_ENABLE	(1 << 1)
+#define   GEN6_MBCTL_BOOT_FETCH_MECH	(1 << 0)
+
 #define GEN6_GDRST	0x941c
 #define  GEN6_GRDOM_FULL		(1 << 0)
 #define  GEN6_GRDOM_RENDER		(1 << 1)
@@ -108,6 +115,16 @@
 #define GEN6_PTE_GFDT			(1 << 3)
 #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
 
+#define RING_PP_DIR_BASE(ring)		((ring)->mmio_base+0x228)
+#define RING_PP_DIR_BASE_READ(ring)	((ring)->mmio_base+0x518)
+#define RING_PP_DIR_DCLV(ring)		((ring)->mmio_base+0x220)
+#define   PP_DIR_DCLV_2G		0xffffffff
+
+#define GAM_ECOCHK			0x4090
+#define   ECOCHK_SNB_BIT		(1<<10)
+#define   ECOCHK_PPGTT_CACHE64B		(0x3<<3)
+#define   ECOCHK_PPGTT_CACHE4B		(0x0<<3)
+
 /* VGA stuff */
 
 #define VGA_ST01_MDA 0x3ba
@@ -420,6 +437,7 @@
 
 #define GFX_MODE	0x02520
 #define GFX_MODE_GEN7	0x0229c
+#define RING_MODE_GEN7(ring)	((ring)->mmio_base+0x29c)
 #define   GFX_RUN_LIST_ENABLE		(1<<15)
 #define   GFX_TLB_INVALIDATE_ALWAYS	(1<<13)
 #define   GFX_SURFACE_FAULT_ENABLE	(1<<12)
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 41/43] drm/i915: ppgtt debugfs info
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (38 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 40/43] drm/i915: ppgtt register definitions Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 12:57 ` [PATCH 42/43] drm/i915: per-ring fault reg Daniel Vetter
  2011-12-14 12:57 ` [PATCH 43/43] drm/i915: enable ppgtt Daniel Vetter
  41 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   38 +++++++++++++++++++++++++++++++++++
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index aa305a6..c4f8c7e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1358,6 +1358,43 @@ static int i915_swizzle_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int i915_ppgtt_info(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ring_buffer *ring;
+	int i, ret;
+
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+	if (INTEL_INFO(dev)->gen == 6)
+		seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(GFX_MODE));
+
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		ring = &dev_priv->ring[i];
+
+		seq_printf(m, "%s\n", ring->name);
+		if (INTEL_INFO(dev)->gen == 7)
+			seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(RING_MODE_GEN7(ring)));
+		seq_printf(m, "PP_DIR_BASE: 0x%08x\n", I915_READ(RING_PP_DIR_BASE(ring)));
+		seq_printf(m, "PP_DIR_BASE_READ: 0x%08x\n", I915_READ(RING_PP_DIR_BASE_READ(ring)));
+		seq_printf(m, "PP_DIR_DCLV: 0x%08x\n", I915_READ(RING_PP_DIR_DCLV(ring)));
+	}
+	if (dev_priv->mm.aliasing_ppgtt) {
+		struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
+
+		seq_printf(m, "aliasing PPGTT:\n");
+		seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd_offset);
+	}
+	seq_printf(m, "ECOCHK: 0x%08x\n", I915_READ(GAM_ECOCHK));
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
 static int
 i915_debugfs_common_open(struct inode *inode,
 			 struct file *filp)
@@ -1759,6 +1796,7 @@ static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_context_status", i915_context_status, 0},
 	{"i915_gen6_forcewake_count", i915_gen6_forcewake_count_info, 0},
 	{"i915_swizzle_info", i915_swizzle_info, 0},
+	{"i915_ppgtt_info", i915_ppgtt_info, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 42/43] drm/i915: per-ring fault reg
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (39 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 41/43] drm/i915: ppgtt debugfs info Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 19:00   ` Eugeni Dodonov
  2011-12-14 12:57 ` [PATCH 43/43] drm/i915: enable ppgtt Daniel Vetter
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

v2: Chris Wilson suggested allocating the error_state with kzalloc
for better paranoia. Also kill existing spurious clears of the
error_state while at it.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |    8 ++++++--
 drivers/gpu/drm/i915/i915_drv.h     |    2 ++
 drivers/gpu/drm/i915/i915_irq.c     |   13 +++++++------
 drivers/gpu/drm/i915/i915_reg.h     |    2 ++
 4 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c4f8c7e..8213257 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -716,8 +716,10 @@ static void i915_ring_error_state(struct seq_file *m,
 	if (INTEL_INFO(dev)->gen >= 4)
 		seq_printf(m, "  INSTPS: 0x%08x\n", error->instps[ring]);
 	seq_printf(m, "  INSTPM: 0x%08x\n", error->instpm[ring]);
-	if (INTEL_INFO(dev)->gen >= 6)
+	if (INTEL_INFO(dev)->gen >= 6) {
 		seq_printf(m, "  FADDR: 0x%08x\n", error->faddr[ring]);
+		seq_printf(m, "  FAULT_REG: 0x%08x\n", error->fault_reg[ring]);
+	}
 	seq_printf(m, "  seqno: 0x%08x\n", error->seqno[ring]);
 }
 
@@ -751,8 +753,10 @@ static int i915_error_state(struct seq_file *m, void *unused)
 	for (i = 0; i < dev_priv->num_fence_regs; i++)
 		seq_printf(m, "  fence[%d] = %08llx\n", i, error->fence[i]);
 
-	if (INTEL_INFO(dev)->gen >= 6) 
+	if (INTEL_INFO(dev)->gen >= 6) {
 		seq_printf(m, "ERROR: 0x%08x\n", error->error);
+		seq_printf(m, "DONE_REG: 0x%08x\n", error->done_reg);
+	}
 
 	i915_ring_error_state(m, dev, error, RCS);
 	if (HAS_BLT(dev))
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 66b4c87..9b1a7ad 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -167,6 +167,8 @@ struct drm_i915_error_state {
 	u32 instdone1;
 	u32 seqno[I915_NUM_RINGS];
 	u64 bbaddr;
+	u32 fault_reg[I915_NUM_RINGS];
+	u32 done_reg;
 	u32 faddr[I915_NUM_RINGS];
 	u64 fence[I915_MAX_NUM_FENCES];
 	struct timeval time;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7d06bc5..b9ba180 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -901,8 +901,10 @@ static void i915_record_ring_state(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	if (INTEL_INFO(dev)->gen >= 6)
+	if (INTEL_INFO(dev)->gen >= 6) {
 		error->faddr[ring->id] = I915_READ(RING_DMA_FADD(ring->mmio_base));
+		error->fault_reg[ring->id] = I915_READ(RING_FAULT_REG(ring));
+	}
 
 	if (INTEL_INFO(dev)->gen >= 4) {
 		error->ipeir[ring->id] = I915_READ(RING_IPEIR(ring->mmio_base));
@@ -917,7 +919,6 @@ static void i915_record_ring_state(struct drm_device *dev,
 		error->ipeir[ring->id] = I915_READ(IPEIR);
 		error->ipehr[ring->id] = I915_READ(IPEHR);
 		error->instdone[ring->id] = I915_READ(INSTDONE);
-		error->bbaddr = 0;
 	}
 
 	error->instpm[ring->id] = I915_READ(RING_INSTPM(ring->mmio_base));
@@ -951,7 +952,7 @@ static void i915_capture_error_state(struct drm_device *dev)
 		return;
 
 	/* Account for pipe specific data like PIPE*STAT */
-	error = kmalloc(sizeof(*error), GFP_ATOMIC);
+	error = kzalloc(sizeof(*error), GFP_ATOMIC);
 	if (!error) {
 		DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
 		return;
@@ -966,10 +967,10 @@ static void i915_capture_error_state(struct drm_device *dev)
 	for_each_pipe(pipe)
 		error->pipestat[pipe] = I915_READ(PIPESTAT(pipe));
 
-	if (INTEL_INFO(dev)->gen >= 6)
+	if (INTEL_INFO(dev)->gen >= 6) {
 		error->error = I915_READ(ERROR_GEN6);
-	else
-		error->error = 0;
+		error->done_reg = I915_READ(DONE_REG);
+	}
 
 	i915_record_ring_state(dev, error, &dev_priv->ring[RCS]);
 	if (HAS_BLT(dev))
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2fa7dfa..8e10a89 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -363,6 +363,8 @@
 #define   ARB_MODE_ENABLE(x)	GFX_MODE_ENABLE(x)
 #define   ARB_MODE_DISABLE(x)	GFX_MODE_DISABLE(x)
 #define RENDER_HWS_PGA_GEN7	(0x04080)
+#define RING_FAULT_REG(ring)	(0x4094 + 0x100*(ring)->id)
+#define DONE_REG		0x40b0
 #define BSD_HWS_PGA_GEN7	(0x04180)
 #define BLT_HWS_PGA_GEN7	(0x04280)
 #define RING_ACTHD(base)	((base)+0x74)
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread
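
Note on the RING_FAULT_REG macro added above: the register offset is
0x4094 plus a stride of 0x100 per ring id. A small sketch of that
arithmetic follows. The function name is hypothetical, and the ring ids
used in the example are assumed to be small consecutive integers starting
at 0, which the patch itself does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as RING_FAULT_REG(ring), taking the ring id directly. */
static uint32_t sketch_ring_fault_reg(unsigned ring_id)
{
	/* Assumption for this sketch only: ids stay below 16. */
	assert(ring_id < 16);
	return 0x4094u + 0x100u * ring_id;
}
```

Under that assumption, ids 0, 1 and 2 map to offsets 0x4094, 0x4194 and
0x4294.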

* [PATCH 43/43] drm/i915: enable ppgtt
  2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
                   ` (40 preceding siblings ...)
  2011-12-14 12:57 ` [PATCH 42/43] drm/i915: per-ring fault reg Daniel Vetter
@ 2011-12-14 12:57 ` Daniel Vetter
  2011-12-14 15:34   ` Chris Wilson
  41 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-14 12:57 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

v2: Don't try to enable ppgtt on pre-snb.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.c |    2 ++
 drivers/gpu/drm/i915/i915_drv.h |    1 +
 drivers/gpu/drm/i915/i915_gem.c |   39 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6e0f868..461356e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -698,6 +698,8 @@ int i915_reset(struct drm_device *dev, u8 flags)
 		if (HAS_BLT(dev))
 		    dev_priv->ring[BCS].init(&dev_priv->ring[BCS]);
 
+		i915_gem_init_ppgtt(dev);
+
 		mutex_unlock(&dev->struct_mutex);
 		drm_irq_uninstall(dev);
 		drm_mode_config_reset(dev);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9b1a7ad..1f83c61 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1228,6 +1228,7 @@ int __must_check i915_gem_object_set_domain(struct drm_i915_gem_object *obj,
 int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
 void i915_gem_init_swizzling(struct drm_device *dev);
+void i915_gem_init_ppgtt(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 void i915_gem_do_init(struct drm_device *dev,
 		      unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f231b80..10fed1e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3708,6 +3708,43 @@ void i915_gem_init_swizzling(struct drm_device *dev)
 				 DISP_TILE_SURFACE_SWIZZLING);
 
 }
+
+void i915_gem_init_ppgtt(struct drm_device *dev)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	uint32_t pd_offset;
+	struct intel_ring_buffer *ring;
+	int i;
+
+	if (!HAS_ALIASING_PPGTT(dev))
+		return;
+
+	pd_offset = dev_priv->mm.aliasing_ppgtt->pd_offset;
+	pd_offset /= 64; /* in cachelines */
+	pd_offset <<= 16;
+
+	if (INTEL_INFO(dev)->gen == 6) {
+		uint32_t ecochk = I915_READ(GAM_ECOCHK);
+		I915_WRITE(GAM_ECOCHK, ecochk | ECOCHK_SNB_BIT |
+				       ECOCHK_PPGTT_CACHE64B);
+		I915_WRITE(GFX_MODE, GFX_MODE_ENABLE(GFX_PPGTT_ENABLE));
+	} else if (INTEL_INFO(dev)->gen >= 7) {
+		I915_WRITE(GAM_ECOCHK, ECOCHK_PPGTT_CACHE64B);
+		/* GFX_MODE is per-ring on gen7+ */
+	}
+
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		ring = &dev_priv->ring[i];
+
+		if (INTEL_INFO(dev)->gen >= 7)
+			I915_WRITE(RING_MODE_GEN7(ring),
+				   GFX_MODE_ENABLE(GFX_PPGTT_ENABLE));
+
+		I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
+		I915_WRITE(RING_PP_DIR_BASE(ring), pd_offset);
+	}
+}
+
 int
 i915_gem_init_hw(struct drm_device *dev)
 {
@@ -3734,6 +3771,8 @@ i915_gem_init_hw(struct drm_device *dev)
 
 	dev_priv->next_seqno = 1;
 
+	i915_gem_init_ppgtt(dev);
+
 	return 0;
 
 cleanup_bsd_ring:
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread
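
Note on the PP_DIR_BASE value computed in the patch above: the page
directory lives in the last 512 entries of the global gtt pagetable, its
byte offset is converted to 64-byte cachelines, and the result is shifted
left by 16 before being written to the register. The sketch below only
redoes that arithmetic in userspace. The sketch_* names are hypothetical,
and the constants are copied from the patches in this series.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_GTT_ENTRIES (512u * 1024u)	/* global gtt pagetable entries */
#define SKETCH_PD_ENTRIES  512u			/* I915_PPGTT_PD_ENTRIES */

/* Byte offset of the first stolen PDE inside the global gtt pagetable. */
static uint32_t sketch_pd_offset_bytes(void)
{
	return (SKETCH_GTT_ENTRIES - SKETCH_PD_ENTRIES) *
	       (uint32_t)sizeof(uint32_t);
}

/* Value written to RING_PP_DIR_BASE: offset in cachelines, shifted by 16. */
static uint32_t sketch_pp_dir_base(uint32_t pd_offset)
{
	/* Assumption: the offset is cacheline aligned, as it is here. */
	assert(pd_offset % 64 == 0);
	return (pd_offset / 64) << 16;
}
```

With these constants the byte offset is 0x1ff800, which is 0x7fe0
cachelines, so the register value comes out as 0x7fe00000.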

* Re: [PATCH 08/43] drm/i915: drop register special-casing in forcewake
  2011-12-14 12:57 ` [PATCH 08/43] drm/i915: drop register special-casing in forcewake Daniel Vetter
@ 2011-12-14 15:05   ` Chris Wilson
  2011-12-15 10:21     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Chris Wilson @ 2011-12-14 15:05 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, 14 Dec 2011 13:57:05 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> We currently have 3 registers for which we must not grab forcewake:
> FORCEWAKE, FORCEWAKE_MT and ECOBUS.
> - FORCEWAKE is excluded in the NEEDS_FORCE_WAKE macro and accessed
>   with _NOTRACE.
> - FORCEWAKE_MT is just accessed with _NOTRACE.
> - ECOBUS is only excluded in the macro.
> 
> In fear of an ever-growing list of special cases and to cut down the
> confusion, just access all of them with the _NOTRACE variants.

Instead you build in future confusion by making us guess wtf is this using
*_NOTRACE. The NOTRACE macro needs a bit of explanation as it now is
more than simply skipping the tracepoints, and why certain registers
must be accessed through the macro. Also add that warning to the
register define.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 09/43] drm/i915: introduce a vtable for gpu core functions
  2011-12-14 12:57 ` [PATCH 09/43] drm/i915: introduce a vtable for gpu core functions Daniel Vetter
@ 2011-12-14 15:06   ` Chris Wilson
  2011-12-21 20:38     ` Daniel Vetter
  2011-12-14 18:58   ` Kenneth Graunke
  1 sibling, 1 reply; 115+ messages in thread
From: Chris Wilson @ 2011-12-14 15:06 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, 14 Dec 2011 13:57:06 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> ... like for forcewake, which protects everything _but_ display.
> Expect more things (like gtt abstractions, rings, inter-ring sync)
> to come.

The only thing that looks odd is the layout in drm_i915_private is a
little haphazard. Otherwise,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 12/43] drm/i915: don't trash the gtt when running out of fences
  2011-12-14 12:57 ` [PATCH 12/43] drm/i915: don't trash the gtt when running out of fences Daniel Vetter
@ 2011-12-14 15:09   ` Chris Wilson
  2012-01-29 16:57     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Chris Wilson @ 2011-12-14 15:09 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, 14 Dec 2011 13:57:09 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> With the fence accounting fixed up in the previous commit not finding
> enough fences is a fatal error and userspace bug. Trashing the entire
> gtt is not gonna turn up that missing fence, so don't do this by
> returning another error than ENOSPC.
> 
> This has the added benefit that it's easier to distinguish fence
> accounting errors from gtt space accounting issues.
> 
> TTM serves as precedent for the EDEADLK error code - it returns it
> when the reservation code needs resources already blocked by the
> current reservation.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 13/43] drm/i915: refactor debugfs open function
  2011-12-14 12:57 ` [PATCH 13/43] drm/i915: refactor debugfs open function Daniel Vetter
@ 2011-12-14 15:10   ` Chris Wilson
  2011-12-14 18:36   ` Eugeni Dodonov
  2012-01-29 17:28   ` Daniel Vetter
  2 siblings, 0 replies; 115+ messages in thread
From: Chris Wilson @ 2011-12-14 15:10 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, 14 Dec 2011 13:57:10 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Only forcewake has an open with special semantics, the other r/w
> debugfs only assign the file private pointer.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 16/43] drm/i915: rework dev->first_error locking
  2011-12-14 12:57 ` [PATCH 16/43] drm/i915: rework dev->first_error locking Daniel Vetter
@ 2011-12-14 15:13   ` Chris Wilson
  2011-12-14 18:37   ` Eugeni Dodonov
  1 sibling, 0 replies; 115+ messages in thread
From: Chris Wilson @ 2011-12-14 15:13 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, 14 Dec 2011 13:57:13 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> - reduce the irq disabled section, even for a debugfs file this was
>   way too long.
> - always disable irqs when taking the lock.
> 
> v2: Thou shalt not mistake locking for reference counting, so:
> - reference count the error_state to protect from concurrent freeing.
>   This will be only really used in the next patch.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
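
The v2 note above (a lock is not a reference count) can be illustrated with a
minimal userspace sketch. This is not the driver code: the struct layout and
the error_state_alloc/get/put names are assumptions made for illustration, and
C11 atomics stand in for whatever the kernel side uses (kref would be the
natural fit there). A reader would take a reference while holding the lock,
drop the lock, and only then read the captured state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the captured error state. */
struct error_state {
	atomic_int ref;
	/* captured registers, buffers, ... would live here */
};

/* Allocate a zeroed state that starts with one reference (the owner's). */
struct error_state *error_state_alloc(void)
{
	struct error_state *e = calloc(1, sizeof(*e));

	if (e)
		atomic_init(&e->ref, 1);
	return e;
}

/* Take an extra reference; the caller must already hold a valid one
 * (or the lock that protects the pointer it was looked up from). */
void error_state_get(struct error_state *e)
{
	atomic_fetch_add(&e->ref, 1);
}

/* Drop a reference. Returns 1 if this call freed the object, else 0. */
int error_state_put(struct error_state *e)
{
	int old = atomic_fetch_sub(&e->ref, 1);

	assert(old > 0);
	if (old == 1) {
		free(e);
		return 1;
	}
	return 0;
}
```

With this shape the irq-disabled section only has to cover the pointer lookup
plus error_state_get(), which is what lets it shrink.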

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 25/43] drm/i915: properly flush the wc buffer in pwrites to phys objects
  2011-12-14 12:57 ` [PATCH 25/43] drm/i915: properly flush the wc buffer in pwrites to phys objects Daniel Vetter
@ 2011-12-14 15:23   ` Chris Wilson
  0 siblings, 0 replies; 115+ messages in thread
From: Chris Wilson @ 2011-12-14 15:23 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx, stable

On Wed, 14 Dec 2011 13:57:22 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Usually results in (rare) cursor corruptions on platforms
> requiring physically addressed cursors.

I have to admit that I am still puzzled as to how this has any effect at
all given that we call wbinvd afterwards. But if it makes 8xx happier,
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 27/43] drm/i915: flush overlay regfile writes
  2011-12-14 12:57 ` [PATCH 27/43] drm/i915: flush overlay regfile writes Daniel Vetter
@ 2011-12-14 15:24   ` Chris Wilson
  2011-12-21 20:41     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Chris Wilson @ 2011-12-14 15:24 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, 14 Dec 2011 13:57:24 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Better be paranoid. The wmb should flush the wc writes, and
> the chipset_flush hopefully flushes any mch buffers. There've been a
> few overlay hangs I've never really diagnosed, unfortunately all the
> reporters disappeared.

At which point we should just put the darn wmb into i830_chipset_flush()
and make snark remarks about wbinvd.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 43/43] drm/i915: enable ppgtt
  2011-12-14 12:57 ` [PATCH 43/43] drm/i915: enable ppgtt Daniel Vetter
@ 2011-12-14 15:34   ` Chris Wilson
  2011-12-21 20:46     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Chris Wilson @ 2011-12-14 15:34 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, 14 Dec 2011 13:57:40 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> v2: Don't try to enable ppgtt on pre-snb.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> ---
> +void i915_gem_init_ppgtt(struct drm_device *dev)
> +{
> +	drm_i915_private_t *dev_priv = dev->dev_private;
> +	uint32_t pd_offset;
> +	struct intel_ring_buffer *ring;
> +	int i;
> +
> +	if (!HAS_ALIASING_PPGTT(dev))
> +		return;
I still think it is safer to use
  if (dev_priv->mm.aliasing_ppgtt == NULL)
    return;
here

And pretty please can I have a module parameter to enable/disable ppgtt on
boot.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 13/43] drm/i915: refactor debugfs open function
  2011-12-14 12:57 ` [PATCH 13/43] drm/i915: refactor debugfs open function Daniel Vetter
  2011-12-14 15:10   ` Chris Wilson
@ 2011-12-14 18:36   ` Eugeni Dodonov
  2012-01-29 17:28   ` Daniel Vetter
  2 siblings, 0 replies; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:36 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 460 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Only forcewake has an open with special semantics, the other r/w
> debugfs only assign the file private pointer.
>
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
>

Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 959 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 16/43] drm/i915: rework dev->first_error locking
  2011-12-14 12:57 ` [PATCH 16/43] drm/i915: rework dev->first_error locking Daniel Vetter
  2011-12-14 15:13   ` Chris Wilson
@ 2011-12-14 18:37   ` Eugeni Dodonov
  1 sibling, 0 replies; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:37 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 568 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> - reduce the irq disabled section, even for a debugfs file this was
>  way too long.
> - always disable irqs when taking the lock.
>
> v2: Thou shalt not mistake locking for reference counting, so:
> - reference count the error_state to protect from concurrent freeing.
>  This will be only really used in the next patch.
>
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>

Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>


-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 990 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 02/43] drm/i915: don't bail out of intel_wait_ring_buffer too early
  2011-12-14 12:56 ` [PATCH 02/43] drm/i915: don't bail out of intel_wait_ring_buffer too early Daniel Vetter
@ 2011-12-14 18:39   ` Eugeni Dodonov
  0 siblings, 0 replies; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:39 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 1452 bytes --]

On Wed, Dec 14, 2011 at 10:56, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> In the pre-gem days with non-existing hangcheck and gpu reset code,
> this timeout of 3 seconds was pretty important to avoid stuck
> processes.
>
> But now we have the hangcheck code in gem that goes to great length
> to ensure that the gpu is really dead before declaring it wedged.
>
> So there's no need for this timeout anymore. Actually it's even harmful
> because we can bail out too early (e.g. with xscreensaver slip)
> when running giant batchbuffers. And our code isn't robust enough
> to properly unroll any state-changes, we pretty much rely on the gpu
> reset code cleaning up the mess (like cache tracking, fencing state,
> active list/request tracking, ...).
>
> With this change intel_begin_ring can only fail when the gpu is
> wedged, and it will return -EAGAIN (like wait_request in case the
> gpu reset is still outstanding).
>
> v2: Chris Wilson noted that on resume timers aren't running and hence
> we won't ever get kicked out of this loop by the hangcheck code. Use
> an insanely large timeout instead for the HAS_GEM case to prevent
> resume bugs from totally hanging the machine.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Acked-by: Ben Widawsky <ben@bwidawsk.net>
>


Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 2055 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 03/43] drm/i915: switch ring->id to be a real id
  2011-12-14 12:57 ` [PATCH 03/43] drm/i915: switch ring->id to be a real id Daniel Vetter
@ 2011-12-14 18:42   ` Eugeni Dodonov
  2012-01-29 16:40     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:42 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 620 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> ... and add a helper function for the places where we want a flag.
>
> This way we can use ring->id to index into arrays.
>
> v2: Resurrect the missing beautification-space Chris Wilson noted.
> I'm moving this space around because I'll reuse ring_str in the next
> patch.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
>


Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 1167 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 04/43] drm/i915: refactor ring error state capture to use arrays
  2011-12-14 12:57 ` [PATCH 04/43] drm/i915: refactor ring error state capture to use arrays Daniel Vetter
@ 2011-12-14 18:43   ` Eugeni Dodonov
  2012-01-29 16:44     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Ben Widawsky


[-- Attachment #1.1: Type: text/plain, Size: 574 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> The code already got unwieldy and we want to dump more per-ring
> registers.
>
> Only functional change is that we now also capture the video
> ring registers on ilk.
>
> v2: fixup a refactor fumble spotted by Chris Wilson.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>


Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 1110 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 05/43] drm/i915: collect more per ring error state
  2011-12-14 12:57 ` [PATCH 05/43] drm/i915: collect more per ring error state Daniel Vetter
@ 2011-12-14 18:43   ` Eugeni Dodonov
  2012-01-29 16:48     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Ben Widawsky


[-- Attachment #1.1: Type: text/plain, Size: 984 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Based on a patch by Ben Widawsky, but with different colors
> for the bikeshed.
>
> In contrast to Ben's patch this one doesn't add the fault regs.
> Afaics they're for the optional page fault support which
> - we're not enabling
> - and which seems to be unsupported by the hw team. Recent bspec
>  lacks tons of information about this that the public docs released
>  half a year back still contain.
>
> Also dump ring HEAD/TAIL registers - I've recently seen a few
> error_state where just guessing these is not good enough.
>
> v2: Also dump INSTPM for every ring.
>
> v3: Fix a few really silly goof-ups spotted by Chris Wilson.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>

Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>


-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 1571 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 14/43] drm/i915: refactor debugfs create functions
  2011-12-14 12:57 ` [PATCH 14/43] drm/i915: refactor debugfs create functions Daniel Vetter
@ 2011-12-14 18:44   ` Eugeni Dodonov
  0 siblings, 0 replies; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:44 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 482 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> All r/w debugfs files are created equal.
>
> v2: Add some newlines to make the code easier on the eyes as requested
> by Ben Widawsky.
>
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
>

Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 998 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 17/43] drm/i915: destroy existing error_state when simulating a gpu hang
  2011-12-14 12:57 ` [PATCH 17/43] drm/i915: destroy existing error_state when simulating a gpu hang Daniel Vetter
@ 2011-12-14 18:45   ` Eugeni Dodonov
  0 siblings, 0 replies; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:45 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 382 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> This way we can simulate a bunch of gpu hangs and run the error_state
> capture code every time (without the need to reload the module).
>
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>

Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 796 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 28/43] drm/i915: Handle unmappable buffers during error state capture
  2011-12-14 12:57 ` [PATCH 28/43] drm/i915: Handle unmappable buffers during error state capture Daniel Vetter
@ 2011-12-14 18:46   ` Eugeni Dodonov
  2012-01-31 19:32     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:46 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 928 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> As the buffer is not necessarily accessible through the GTT at the time
> of a GPU hang, and capturing some of its contents is far more valuable
> than skipping it, provide a clflushed fallback read path. We still
> prefer to read through the GTT as that is more consistent with the GPU
> access of the same buffer. For example it will demonstrate any erroneous
> tiling or swizzling of the command buffer as seen by the GPU.
>
> This becomes necessary with use of CPU relocations and lazy GTT binding,
> but could potentially happen anyway as a result of a pathological error.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>

Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 1480 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 31/43] drm/i915: Use kcalloc instead of kzalloc to allocate array
  2011-12-14 12:57 ` [PATCH 31/43] drm/i915: Use kcalloc instead of kzalloc to allocate array Daniel Vetter
@ 2011-12-14 18:48   ` Eugeni Dodonov
  0 siblings, 0 replies; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:48 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Thomas Meyer


[-- Attachment #1.1: Type: text/plain, Size: 646 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> From: Thomas Meyer <thomas@m3y3r.de>
>
> The advantage of kcalloc is that it will prevent integer overflows which
> could result from the multiplication of number of elements and size, and
> it is also a bit nicer to read.
>
> The semantic patch that makes this change is available
> in https://lkml.org/lkml/2011/11/25/107
>
> Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>

It looks nicer indeed, so:

Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
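
To make the overflow argument in the quoted commit message concrete, here is a
small userspace sketch of the check an array allocator can do before
multiplying. checked_calloc is a hypothetical name and this is not the kernel's
kcalloc implementation; it only shows why passing the element count and the
element size separately lets the allocator refuse a request whose product
would wrap around.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for an overflow-checked array allocator.
 * Returns zeroed memory for n elements of the given size, or NULL if the
 * allocation fails or n * size would not fit in a size_t. */
void *checked_calloc(size_t n, size_t size)
{
	void *p;

	/* Reject the request instead of letting the product wrap. */
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

	/* From here on the multiplication is known to be exact. */
	assert(size == 0 || (n * size) / size == n);

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

An open-coded zeroing allocation of n * size bytes has no chance to notice the
wrap, because it only ever sees the already-truncated product.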

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 1224 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 09/43] drm/i915: introduce a vtable for gpu core functions
  2011-12-14 12:57 ` [PATCH 09/43] drm/i915: introduce a vtable for gpu core functions Daniel Vetter
  2011-12-14 15:06   ` Chris Wilson
@ 2011-12-14 18:58   ` Kenneth Graunke
  1 sibling, 0 replies; 115+ messages in thread
From: Kenneth Graunke @ 2011-12-14 18:58 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On 12/14/2011 04:57 AM, Daniel Vetter wrote:
> ... like for forcewake, which protects everything _but_ display.
> Expect more things (like gtt abstractions, rings, inter-ring sync)
> to come.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c      |    6 +++---
>  drivers/gpu/drm/i915/i915_drv.h      |    8 ++++++--
>  drivers/gpu/drm/i915/intel_display.c |    8 ++++----
>  3 files changed, 13 insertions(+), 9 deletions(-)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 40/43] drm/i915: ppgtt register definitions
  2011-12-14 12:57 ` [PATCH 40/43] drm/i915: ppgtt register definitions Daniel Vetter
@ 2011-12-14 18:58   ` Eugeni Dodonov
  2011-12-14 19:01     ` Eugeni Dodonov
  0 siblings, 1 reply; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 18:58 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 288 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
>

Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 725 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 42/43] drm/i915: per-ring fault reg
  2011-12-14 12:57 ` [PATCH 42/43] drm/i915: per-ring fault reg Daniel Vetter
@ 2011-12-14 19:00   ` Eugeni Dodonov
  2012-01-29 22:20     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 19:00 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 455 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> v2: Chris Wilson suggested to allocate the error_state with kzalloc
> for better paranoia. Also kill existing spurious clears of the
> error_state while at it.
>
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
>


Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>


-- 
Eugeni Dodonov
 <http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 924 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 15/43] drm/i915: add interface to simulate gpu hangs
  2011-12-14 12:57 ` [PATCH 15/43] drm/i915: add interface to simulate gpu hangs Daniel Vetter
@ 2011-12-14 19:00   ` Eugeni Dodonov
  0 siblings, 0 replies; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 19:00 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 1548 bytes --]

On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> gpu reset is a very important piece of our infrastructure.
> Unfortunately we only really test it by actually hanging the gpu,
> which often has bad side-effects for the entire system. And the gpu
> hang handling code is one of the rather complicated pieces of code we
> have, consisting of
> - hang detection
> - error capture
> - actual gpu reset
> - reset of all the gem bookkeeping
> - reinitialization of the entire gpu
>
> This patch adds a debugfs file to selectively stop rings by ceasing to
> update the hw tail pointer, which will result in the gpu no longer
> updating its head pointer and eventually in the hangcheck firing.
> This way we can exercise the gpu hang code under controlled conditions
> without a dying gpu taking down the entire system.
>
> Patch motivated by me forgetting to properly reinitialize ppgtt after
> a gpu reset.
>
> Usage:
>
> echo $((1 << $ringnum)) > i915_ring_stop # stops one ring
>
> echo 0xffffffff > i915_ring_stop # stops all, future-proof version
>
> then run whatever testload is desired. i915_ring_stop automatically
> resets after a gpu hang is detected to avoid hanging the gpu too fast
> and declaring it wedged.
>
> v2: Incorporate feedback from Chris Wilson.
>
> v3: Add the missing cleanup.
>
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
>


Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
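
The usage lines in the quoted description write a bitmask with one bit per
ring. A rough sketch of how such a mask could be interpreted follows. The ring
ids, NUM_RINGS, ring_flag and ring_is_stopped are all assumptions made for
illustration and may not match the names or numbering in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring ids, dense so they can double as array indices. */
enum ring_id { RING_RENDER = 0, RING_VIDEO = 1, RING_BLT = 2, NUM_RINGS = 3 };

/* Turn a ring id into its single-bit flag, i.e. 1 << ringnum. */
uint32_t ring_flag(enum ring_id id)
{
	assert(id < NUM_RINGS);
	return UINT32_C(1) << id;
}

/* True if the stop mask asks for this ring's tail updates to be skipped. */
bool ring_is_stopped(uint32_t stop_mask, enum ring_id id)
{
	return (stop_mask & ring_flag(id)) != 0;
}
```

Writing 0xffffffff sets every bit, so the check stays true for ring ids added
later, which is what makes that form future-proof.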


-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 2138 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 40/43] drm/i915: ppgtt register definitions
  2011-12-14 18:58   ` Eugeni Dodonov
@ 2011-12-14 19:01     ` Eugeni Dodonov
  0 siblings, 0 replies; 115+ messages in thread
From: Eugeni Dodonov @ 2011-12-14 19:01 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 557 bytes --]

On Wed, Dec 14, 2011 at 16:58, Eugeni Dodonov <eugeni@dodonov.net> wrote:

> On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch>wrote:
>
>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
>>
>
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
>

Just clarifying - the R-b is for the ppgtt series. I think I also sent a
Tested-by already, but in any case, also for the series:
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 1439 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 22/43] drm/i915: prevent division by zero when asking for chipset power
  2011-12-14 12:57 ` [PATCH 22/43] drm/i915: prevent division by zero when asking for chipset power Daniel Vetter
@ 2011-12-14 19:05   ` Kenneth Graunke
  0 siblings, 0 replies; 115+ messages in thread
From: Kenneth Graunke @ 2011-12-14 19:05 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Eugeni Dodonov

On 12/14/2011 04:57 AM, Daniel Vetter wrote:
> From: Eugeni Dodonov <eugeni.dodonov@intel.com>
> 
> This prevents an in-kernel division by zero which happens when we are
> asking for i915_chipset_val too quickly, or within a race condition
> between the power monitoring thread and userspace accesses via debugfs.
> 
> The issue can be reproduced easily via the following command:
> while ``; do cat /sys/kernel/debug/dri/0/i915_emon_status; done
> 
> This is particularly dangerous because it can be triggered by
> a non-privileged user by just reading the debugfs entry.
> 
> This issue was also found independently by Konstantin Belousov
> <kostikbel@gmail.com>, who proposed a similar patch.
> 
> Reported-by: Konstantin Belousov <kostikbel@gmail.com>
> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> Acked-by: Keith Packard <keithp@keithp.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_dma.c |   10 ++++++++++
>  drivers/gpu/drm/i915/i915_drv.h |    1 +
>  2 files changed, 11 insertions(+), 0 deletions(-)

This has been in drm-intel-fixes since December 8th.
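
For readers without the diff at hand, the class of fix described in the quoted
commit message can be sketched as follows. This is only an illustration under
assumptions: the struct fields, the 10 ms threshold and the sample_power name
are hypothetical, and the sketch covers the "asked too quickly" case only, not
the locking needed against the power monitoring thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for a rate computed from two samples. */
struct power_state {
	uint64_t last_count;	/* counter value at the previous sample */
	uint64_t last_time_ms;	/* timestamp of the previous sample */
	uint64_t cached_power;	/* last successfully computed result */
};

#define MIN_SAMPLE_INTERVAL_MS 10	/* assumed threshold */

/* Return counter delta per millisecond since the previous sample. */
uint64_t sample_power(struct power_state *s, uint64_t count, uint64_t now_ms)
{
	uint64_t dt;

	assert(s != NULL);
	dt = now_ms - s->last_time_ms;

	/* Asked again too quickly: hand back the cached value instead of
	 * dividing by a zero (or near-zero) time delta. */
	if (dt <= MIN_SAMPLE_INTERVAL_MS)
		return s->cached_power;

	s->cached_power = (count - s->last_count) / dt;
	s->last_count = count;
	s->last_time_ms = now_ms;
	return s->cached_power;
}
```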

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 23/43] drm/i915: multithreaded forcewake is an ivb+ feature
  2011-12-14 12:57 ` [PATCH 23/43] drm/i915: multithreaded forcewake is an ivb+ feature Daniel Vetter
@ 2011-12-14 21:07   ` Eric Anholt
  0 siblings, 0 replies; 115+ messages in thread
From: Eric Anholt @ 2011-12-14 21:07 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 309 bytes --]

On Wed, 14 Dec 2011 13:57:20 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Name the function accordingly. Suggested by Chris Wilson.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

Reviewed-by: Eric Anholt <eric@anholt.net>

[-- Attachment #1.2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 08/43] drm/i915: drop register special-casing in forcewake
  2011-12-14 15:05   ` Chris Wilson
@ 2011-12-15 10:21     ` Daniel Vetter
  2011-12-15 10:44       ` Chris Wilson
  0 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-15 10:21 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Wed, Dec 14, 2011 at 16:05, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Wed, 14 Dec 2011 13:57:05 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>> We currently have 3 registers for which we must not grab forcewake:
>> FORCEWAKE, FORCEWAKE_MT and ECOBUS.
>> - FORCEWAKE is excluded in the NEEDS_FORCE_WAKE macro and accessed
>>   with _NOTRACE.
>> - FORCEWAKE_MT is just accessed with _NOTRACE.
>> - ECOBUS is only excluded in the macro.
>>
>> In fear of an ever-growing list of special cases and to cut down the
>> confusion, just access all of them with the _NOTRACE variants.
>
> Instead you build in future confusion by making us guess wtf is this using
> *_NOTRACE. The NOTRACE macro needs a bit of explanation as it now is
> more than simply skipping the tracepoints, and why certain registers
> must be accessed through the macro. Also add that warning to the
> register define.

When I last checked _NOTRACE was only used to avoid the forcewake
dance, hence why I didn't add any comment. Would renaming it to
_NO_FORCEWAKE make you happy, too? Otherwise I think I'll call it _RAW
and smash a bunch of comments all over the place, but imo that's
overkill (and especially in such architectural corner-cases comments
tend to get stale fast or at least not really reflect reality fully
correctly).
-Daniel
-- 
Daniel Vetter
daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 08/43] drm/i915: drop register special-casing in forcewake
  2011-12-15 10:21     ` Daniel Vetter
@ 2011-12-15 10:44       ` Chris Wilson
  2011-12-22  0:28         ` [PATCH] drm/i915: clear up I915_(READ|WRITE)_NOTRACE confusion Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Chris Wilson @ 2011-12-15 10:44 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 2149 bytes --]

On Thu, 15 Dec 2011 11:21:27 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> On Wed, Dec 14, 2011 at 16:05, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Wed, 14 Dec 2011 13:57:05 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >> We currently have 3 registers for which we must not grab forcewake:
> >> FORCEWAKE, FORCEWAKE_MT and ECOBUS.
> >> - FORCEWAKE is excluded in the NEEDS_FORCE_WAKE macro and accessed
> >>   with _NOTRACE.
> >> - FORCEWAKE_MT is just accessed with _NOTRACE.
> >> - ECOBUS is only excluded in the macro.
> >>
> >> In fear of an ever-growing list of special cases and to cut down the
> >> confusion, just access all of them with the _NOTRACE variants.
> >
> > Instead you build in future confusion by making us guess wtf is this using
> > *_NOTRACE. The NOTRACE macro needs a bit of explanation as it now is
> > more than simply skipping the tracepoints, and why certain registers
> > must be accessed through the macro. Also add that warning to the
> > register define.
> 
> When I last checked _NOTRACE was only used to avoid the forcewake
> dance, hence why I didn't add any comment. Would renaming it to
> _NO_FORCEWAKE make you happy, too?

Yeah, the macro did get abused past its original intentions and I'm just
catching up with my complaint. I'd suggest __I915_READ.

> Otherwise I think I'll call it _RAW
> and smash a bunch of comments all over the place, but imo that's
> overkill (and especially in such architectural corner-cases comments
> tend to get stale fast or at least not really reflect reality fully
> correctly).

Right. So the best place for the warning would be next to the register
define that it needs to avoid the forcewake dance, and a mention in the
forcewake dance that some registers are special. Despite the seemingly
futile nature, keeping the relevant information near the code is even
more important when it changes frequently. Knowing precisely which
assumptions and knowledge were used when the code was written helps
when we need to adapt to new architectures and look for
oversights/bugs.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 26/43] drm/i915: Only clear the GPU domains upon a successful finish
  2011-12-14 12:57 ` [PATCH 26/43] drm/i915: Only clear the GPU domains upon a successful finish Daniel Vetter
@ 2011-12-16 20:07   ` Eric Anholt
  2012-03-01 20:40   ` Daniel Vetter
  1 sibling, 0 replies; 115+ messages in thread
From: Eric Anholt @ 2011-12-16 20:07 UTC (permalink / raw)
  To: Daniel Vetter, Keith Packard; +Cc: intel-gfx, stable


[-- Attachment #1.1: Type: text/plain, Size: 1396 bytes --]

On Wed, 14 Dec 2011 13:57:23 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> By clearing the GPU read domains before waiting upon the buffer, we run
> the risk of the wait being interrupted and the domains prematurely
> cleared. The next time we attempt to wait upon the buffer (after
> userspace handles the signal), we believe that the buffer is idle and so
> skip the wait.
> 
> There are a number of bugs across all generations which show signs of an
> overly hasty reuse of active buffers.

I think this patch is better as:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 683ff1e..cb5ff10 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3126,9 +3126,6 @@ i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj)
 {
        int ret;
 
-       if ((obj->base.read_domains & I915_GEM_GPU_DOMAINS) == 0)
-               return 0;
-
        if (obj->base.write_domain & I915_GEM_GPU_DOMAINS) {
                ret = i915_gem_flush_ring(obj->ring, 0, obj->base.write_domain);
                if (ret)

While I think it was valid other than in this function, the implication
between (read_domains & I915_GEM_GPU_DOMAINS) <-> obj->active is less
clear than just relying on i915_gem_object_wait_rendering()'s use of
obj->active.

[-- Attachment #1.2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply related	[flat|nested] 115+ messages in thread
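
Note on the ordering problem discussed in this message: the following is a
minimal userspace sketch, not driver code, of why clearing the GPU read
domains before the wait lets a retry skip the wait. Every name here
(fake_obj, FAKE_GPU_DOMAINS, FAKE_EINTR, the two finish functions) is
hypothetical, and the mask value is a placeholder rather than the real
I915_GEM_GPU_DOMAINS. The "clear after" variant assumes, as suggested above,
that the wait itself decides idleness from an active flag.

```c
#include <stdbool.h>

#define FAKE_GPU_DOMAINS 0x6u   /* placeholder mask, not the real value */
#define FAKE_EINTR       (-4)   /* stands in for an interrupted wait */

struct fake_obj {
	unsigned read_domains;
	bool active;            /* GPU is still using the buffer */
	int interrupts_left;    /* how many waits a signal will interrupt */
};

/* Wait for the GPU; may be interrupted before the buffer goes idle. */
static int fake_wait_rendering(struct fake_obj *obj)
{
	if (!obj->active)
		return 0;
	if (obj->interrupts_left > 0) {
		obj->interrupts_left--;
		return FAKE_EINTR;
	}
	obj->active = false;
	return 0;
}

/* Problematic order: domains are cleared first, so a retry after a
 * signal sees no GPU domains and returns without waiting. */
static int finish_gpu_clear_first(struct fake_obj *obj)
{
	if ((obj->read_domains & FAKE_GPU_DOMAINS) == 0)
		return 0;
	obj->read_domains &= ~FAKE_GPU_DOMAINS;
	return fake_wait_rendering(obj);
}

/* Safer order: always go through the wait, and clear the domains only
 * once the wait has succeeded. */
static int finish_gpu_clear_after(struct fake_obj *obj)
{
	int ret = fake_wait_rendering(obj);

	if (ret)
		return ret;
	obj->read_domains &= ~FAKE_GPU_DOMAINS;
	return 0;
}
```

With one interrupted wait, the first variant reports success on the retry
while the buffer is still active; the second variant only reports success
once the buffer is idle.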

* Re: [PATCH 09/43] drm/i915: introduce a vtable for gpu core functions
  2011-12-14 15:06   ` Chris Wilson
@ 2011-12-21 20:38     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-21 20:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 03:06:34PM +0000, Chris Wilson wrote:
> On Wed, 14 Dec 2011 13:57:06 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > ... like for forcewake, which protects everything _but_ display.
> > Expect more things (like gtt abstractions, rings, inter-ring sync)
> > to come.
> 
> The only thing that looks odd is the layout in drm_i915_private is a
> little haphazard. Otherwise,

Well, I've tried to smash it together with other gt/render related stuff.
Attaching it to the seqno tracking is the best I could come up with among
a choice of equally random-looking options ...
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 27/43] drm/i915: flush overlay regfile writes
  2011-12-14 15:24   ` Chris Wilson
@ 2011-12-21 20:41     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-21 20:41 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 03:24:49PM +0000, Chris Wilson wrote:
> On Wed, 14 Dec 2011 13:57:24 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > Better be paranoid. The wmb should flush the wc writes, and
> > the chipset_flush hopefully flushes any mch buffers. There've been a
> > few overlay hangs I've never really diagnosed, unfortunately all the
> > reporters disappeared.
> 
> At which point we should just put the darn wmb into i830_chipset_flush()
> and make snark remarks about wbinvd.

Iirc I've also had reports of nutty people trying the overlay on gen3.
Maybe that got fixed with one of the domain tracking fixes, maybe not.
Given that most users of this disappeared, I've thought adding some random
lore here won't hurt and maybe it makes somebody happy ;-)
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 43/43] drm/i915: enable ppgtt
  2011-12-14 15:34   ` Chris Wilson
@ 2011-12-21 20:46     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-21 20:46 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 03:34:38PM +0000, Chris Wilson wrote:
> On Wed, 14 Dec 2011 13:57:40 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > v2: Don't try to enable ppgtt on pre-snb.
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> > +void i915_gem_init_ppgtt(struct drm_device *dev)
> > +{
> > +	drm_i915_private_t *dev_priv = dev->dev_private;
> > +	uint32_t pd_offset;
> > +	struct intel_ring_buffer *ring;
> > +	int i;
> > +
> > +	if (!HAS_ALIASING_PPGTT(dev))
> > +		return;
> I still think it is safer to use
>   if (dev_priv->mm.aliasing_ppgtt == NULL)
>     return;
> here
> 
> And pretty please can I have a module parameter to enable/disable ppgtt on
> boot.

I hope we can really avoid this. Working ppgtt is more or less required by
a bunch of things, and will at least result in strange surprises with
real per-process address spaces. Disabling the use of ppgtt with the
current code can be easily done by resetting the NON_SECURE flag in the
MI_START_BB cmd (or disabling all of it in the HAS_ALIASING_PPGTT macro).
This should be good enough to hammer out any outstanding issues with this.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* [PATCH] drm/i915: clear up I915_(READ|WRITE)_NOTRACE confusion
  2011-12-15 10:44       ` Chris Wilson
@ 2011-12-22  0:28         ` Daniel Vetter
  2011-12-22 17:54           ` Keith Packard
  0 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2011-12-22  0:28 UTC (permalink / raw)
  To: chris; +Cc: Daniel Vetter, intel-gfx

Half of the users actually don't want just no tracing, but need to
avoid the forcewake dance for correctness. So add new variants
__I915_READ and __I915_WRITE for that.

Also improve the _NOTRACE variants to do the forcewake dance.
Currently not required because the only user is the i2c code, which is
in the display block and hence not subject to forcewake issues, but
nice to avoid confusion while debugging.

Finally add a comment explaining what's going on with ECOBUS. This one
is shamelessly inspired by a patch from Keith Packard.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.c      |   30 ++++++++++++++++--------------
 drivers/gpu/drm/i915/i915_drv.h      |   34 ++++++++++++++++++----------------
 drivers/gpu/drm/i915/intel_display.c |    7 ++++++-
 3 files changed, 40 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 3a712cf..112f7f5 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -333,14 +333,14 @@ void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 	int count;
 
 	count = 0;
-	while (count++ < 50 && (I915_READ_NOTRACE(FORCEWAKE_ACK) & 1))
+	while (count++ < 50 && (__I915_READ(FORCEWAKE_ACK) & 1))
 		udelay(10);
 
-	I915_WRITE_NOTRACE(FORCEWAKE, 1);
+	__I915_WRITE(FORCEWAKE, 1);
 	POSTING_READ(FORCEWAKE);
 
 	count = 0;
-	while (count++ < 50 && (I915_READ_NOTRACE(FORCEWAKE_ACK) & 1) == 0)
+	while (count++ < 50 && (__I915_READ(FORCEWAKE_ACK) & 1) == 0)
 		udelay(10);
 }
 
@@ -349,14 +349,14 @@ void __gen6_gt_force_wake_mt_get(struct drm_i915_private *dev_priv)
 	int count;
 
 	count = 0;
-	while (count++ < 50 && (I915_READ_NOTRACE(FORCEWAKE_MT_ACK) & 1))
+	while (count++ < 50 && (__I915_READ(FORCEWAKE_MT_ACK) & 1))
 		udelay(10);
 
-	I915_WRITE_NOTRACE(FORCEWAKE_MT, (1<<16) | 1);
+	__I915_WRITE(FORCEWAKE_MT, (1<<16) | 1);
 	POSTING_READ(FORCEWAKE_MT);
 
 	count = 0;
-	while (count++ < 50 && (I915_READ_NOTRACE(FORCEWAKE_MT_ACK) & 1) == 0)
+	while (count++ < 50 && (__I915_READ(FORCEWAKE_MT_ACK) & 1) == 0)
 		udelay(10);
 }
 
@@ -378,13 +378,13 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 
 void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
-	I915_WRITE_NOTRACE(FORCEWAKE, 0);
+	__I915_WRITE(FORCEWAKE, 0);
 	POSTING_READ(FORCEWAKE);
 }
 
 void __gen6_gt_force_wake_mt_put(struct drm_i915_private *dev_priv)
 {
-	I915_WRITE_NOTRACE(FORCEWAKE_MT, (1<<16) | 0);
+	__I915_WRITE(FORCEWAKE_MT, (1<<16) | 0);
 	POSTING_READ(FORCEWAKE_MT);
 }
 
@@ -405,10 +405,10 @@ void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv)
 {
 	if (dev_priv->gt_fifo_count < GT_FIFO_NUM_RESERVED_ENTRIES) {
 		int loop = 500;
-		u32 fifo = I915_READ_NOTRACE(GT_FIFO_FREE_ENTRIES);
+		u32 fifo = __I915_READ(GT_FIFO_FREE_ENTRIES);
 		while (fifo <= GT_FIFO_NUM_RESERVED_ENTRIES && loop--) {
 			udelay(10);
-			fifo = I915_READ_NOTRACE(GT_FIFO_FREE_ENTRIES);
+			fifo = __I915_READ(GT_FIFO_FREE_ENTRIES);
 		}
 		WARN_ON(loop < 0 && fifo <= GT_FIFO_NUM_RESERVED_ENTRIES);
 		dev_priv->gt_fifo_count = fifo;
@@ -934,7 +934,7 @@ MODULE_LICENSE("GPL and additional rights");
 	 ((reg) < 0x40000))		 \
 
 #define __i915_read(x, y) \
-u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
+u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg, bool trace) { \
 	u##x val = 0; \
 	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
 		gen6_gt_force_wake_get(dev_priv); \
@@ -943,7 +943,8 @@ u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
 	} else { \
 		val = read##y(dev_priv->regs + reg); \
 	} \
-	trace_i915_reg_rw(false, reg, val, sizeof(val)); \
+	if (trace) \
+		trace_i915_reg_rw(false, reg, val, sizeof(val)); \
 	return val; \
 }
 
@@ -954,8 +955,9 @@ __i915_read(64, q)
 #undef __i915_read
 
 #define __i915_write(x, y) \
-void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val) { \
-	trace_i915_reg_rw(true, reg, val, sizeof(val)); \
+void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val, bool trace) { \
+	if (trace) \
+		trace_i915_reg_rw(true, reg, val, sizeof(val)); \
 	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
 		__gen6_gt_wait_for_fifo(dev_priv); \
 	} \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d8f442f..39debc9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1358,7 +1358,7 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
 
 #define __i915_read(x, y) \
-	u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg);
+	u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg, bool trace);
 
 __i915_read(8, b)
 __i915_read(16, w)
@@ -1367,7 +1367,7 @@ __i915_read(64, q)
 #undef __i915_read
 
 #define __i915_write(x, y) \
-	void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val);
+	void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val, bool trace);
 
 __i915_write(8, b)
 __i915_write(16, w)
@@ -1375,24 +1375,26 @@ __i915_write(32, l)
 __i915_write(64, q)
 #undef __i915_write
 
-#define I915_READ8(reg)		i915_read8(dev_priv, (reg))
-#define I915_WRITE8(reg, val)	i915_write8(dev_priv, (reg), (val))
+#define I915_READ8(reg)			i915_read8(dev_priv, (reg), true)
+#define I915_WRITE8(reg, val)		i915_write8(dev_priv, (reg), (val), true)
 
-#define I915_READ16(reg)	i915_read16(dev_priv, (reg))
-#define I915_WRITE16(reg, val)	i915_write16(dev_priv, (reg), (val))
-#define I915_READ16_NOTRACE(reg)	readw(dev_priv->regs + (reg))
-#define I915_WRITE16_NOTRACE(reg, val)	writew(val, dev_priv->regs + (reg))
+#define __I915_READ16(reg)		readw(dev_priv->regs + (reg))
+#define __I915_WRITE16(reg, val)	writew(val, dev_priv->regs + (reg))
+#define I915_READ16(reg)		i915_read16(dev_priv, (reg), true)
+#define I915_WRITE16(reg, val)		i915_write16(dev_priv, (reg), (val), true)
 
-#define I915_READ(reg)		i915_read32(dev_priv, (reg))
-#define I915_WRITE(reg, val)	i915_write32(dev_priv, (reg), (val))
-#define I915_READ_NOTRACE(reg)		readl(dev_priv->regs + (reg))
-#define I915_WRITE_NOTRACE(reg, val)	writel(val, dev_priv->regs + (reg))
+#define __I915_READ(reg)		readl(dev_priv->regs + (reg))
+#define __I915_WRITE(reg, val)		writel(val, dev_priv->regs + (reg))
+#define I915_READ(reg)			i915_read32(dev_priv, (reg), true)
+#define I915_WRITE(reg, val)		i915_write32(dev_priv, (reg), (val), true)
+#define I915_READ_NOTRACE(reg)		i915_read32(dev_priv, (reg), false)
+#define I915_WRITE_NOTRACE(reg, val)	i915_write32(dev_priv, (reg), (val), false)
 
-#define I915_WRITE64(reg, val)	i915_write64(dev_priv, (reg), (val))
-#define I915_READ64(reg)	i915_read64(dev_priv, (reg))
+#define I915_WRITE64(reg, val)		i915_write64(dev_priv, (reg), (val), true)
+#define I915_READ64(reg)		i915_read64(dev_priv, (reg), true)
 
-#define POSTING_READ(reg)	(void)I915_READ_NOTRACE(reg)
-#define POSTING_READ16(reg)	(void)I915_READ16_NOTRACE(reg)
+#define POSTING_READ(reg)		(void)__I915_READ(reg)
+#define POSTING_READ16(reg)		(void)__I915_READ16(reg)
 
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 48b479c..c4f3451 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8587,8 +8587,13 @@ static void intel_init_display(struct drm_device *dev)
 			u32	ecobus;
 
 			mutex_lock(&dev->struct_mutex);
+			/* The ECOBUS reg is actually in the gt power well, but
+			 * if the gt is powered down and mt forcewake not
+			 * enabled we'll just read a bogus 0 leading to the
+			 * correct conclusion that mt forcewake is indeed not
+			 * enabled. */
 			__gen6_gt_force_wake_mt_get(dev_priv);
-			ecobus = I915_READ_NOTRACE(ECOBUS);
+			ecobus = __I915_READ(ECOBUS);
 			__gen6_gt_force_wake_mt_put(dev_priv);
 			mutex_unlock(&dev->struct_mutex);
 
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH] drm/i915: clear up I915_(READ|WRITE)_NOTRACE confusion
  2011-12-22  0:28         ` [PATCH] drm/i915: clear up I915_(READ|WRITE)_NOTRACE confusion Daniel Vetter
@ 2011-12-22 17:54           ` Keith Packard
  2011-12-22 18:16             ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Keith Packard @ 2011-12-22 17:54 UTC (permalink / raw)
  To: chris; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 412 bytes --]

On Thu, 22 Dec 2011 01:28:36 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Half of the users actually don't want just no tracing, but need to
> avoid the forcewake dance for correctness. So add new variants
> __I915_READ and __I915_WRITE for that.

I'd sure like something more descriptive than '__' here. Perhaps
I915_READ_NOWAKE? Or even I915_READ_NO_FORCEWAKE?

-- 
keith.packard@intel.com

[-- Attachment #1.2: Type: application/pgp-signature, Size: 827 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH] drm/i915: clear up I915_(READ|WRITE)_NOTRACE confusion
  2011-12-22 17:54           ` Keith Packard
@ 2011-12-22 18:16             ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2011-12-22 18:16 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Thu, Dec 22, 2011 at 09:54:27AM -0800, Keith Packard wrote:
> On Thu, 22 Dec 2011 01:28:36 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > Half of the users actually don't want just no tracing, but need to
> > avoid the forcewake dance for correctness. So add new variants
> > __I915_READ and __I915_WRITE for that.
> 
> I'd sure like something more descriptive than '__' here. Perhaps
> I915_READ_NOWAKE? Or even I915_READ_NO_FORCEWAKE?

It's actually I915_READ_NO_FORCEWAKE_NOTRACE. Which is ugly, so Chris
suggested the often-used (in the linux kernel) __ prefix for "raw
interface, you better know what you're doing". I admit that I like that
one quite a bit.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread
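
Note on the naming discussion in this sub-thread: the sketch below is a
userspace model, not the driver's code, of the three access flavours being
argued about: a traced accessor, an untraced accessor that still does the
forcewake dance, and a raw accessor that does neither. All names (fake_dev,
FAKE_READ, FAKE_READ_NOTRACE, RAW_FAKE_READ, FAKE_GT_LIMIT) are hypothetical
stand-ins, and the counters merely record where the real code would call the
forcewake helpers or the tracepoint.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_REG_COUNT 16
#define FAKE_GT_LIMIT   8   /* offsets below this pretend to need forcewake */

struct fake_dev {
	uint32_t regs[FAKE_REG_COUNT];
	int forcewake_gets;     /* times the wake dance would have run */
	int trace_events;       /* times a tracepoint would have fired */
};

static bool needs_force_wake(uint32_t reg)
{
	return reg < FAKE_GT_LIMIT;
}

/* Raw access: no forcewake, no tracing. Caller must know it is safe. */
static uint32_t raw_read(struct fake_dev *dev, uint32_t reg)
{
	return dev->regs[reg];
}

/* Full accessor: forcewake when the register needs it, trace on request. */
static uint32_t full_read(struct fake_dev *dev, uint32_t reg, bool trace)
{
	uint32_t val;

	if (needs_force_wake(reg))
		dev->forcewake_gets++;  /* stands in for force_wake_get/put */
	val = dev->regs[reg];
	if (trace)
		dev->trace_events++;    /* stands in for the reg_rw tracepoint */
	return val;
}

#define FAKE_READ(dev, reg)         full_read((dev), (reg), true)
#define FAKE_READ_NOTRACE(dev, reg) full_read((dev), (reg), false)
#define RAW_FAKE_READ(dev, reg)     raw_read((dev), (reg))
```

The point of the split is that "no tracing" and "no forcewake" are separate
properties: only the raw variant is suitable inside the forcewake code
itself, where taking forcewake again would be wrong.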

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2011-12-14 12:57 ` [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock Daniel Vetter
@ 2012-01-03 18:51   ` Keith Packard
  2012-01-03 19:12     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Keith Packard @ 2012-01-03 18:51 UTC (permalink / raw)
  Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 1161 bytes --]

On Wed, 14 Dec 2011 13:57:03 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> The problem this patch solves is that the forcewake accounting
> necessary for register reads is protected by dev->struct_mutex. But the
> hangcheck and error_capture code need to access registers without
> grabbing this mutex because we hold it while waiting for the gpu.
> So a new lock is required. Because currently the error_state capture
> is called from the error irq handler and the hangcheck code runs from
> a timer, it needs to be an irqsafe spinlock (note that the irq handler,
> neglecting the error handling part, only uses registers that don't need
> the forcewake dance).

I think this description is wrong -- the only difference between using
atomic objects and using a spinlock is that with the spinlock the call
to ->force_wake_get is correctly serialized so that no register access
can occur without the chip being awoken. Without a spinlock, a second
thread can pass right through gen6_gt_force_wake_get and then go touch
registers while the first thread is busy waking the chip up.

-- 
keith.packard@intel.com

[-- Attachment #1.2: Type: application/pgp-signature, Size: 827 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread
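
Note on the serialization argument in this message: a minimal
single-threaded model, not driver code, of a refcounted forcewake get/put in
which the lock covers both the count update and the hardware wake. The
fake_lock type is a hypothetical stand-in for an irq-safe spinlock and only
records whether it is held; the wakes_outside_lock counter exists purely so
the structure can be checked.

```c
#include <stdbool.h>

/* Stand-in for a spinlock: only tracks whether it is currently held. */
struct fake_lock {
	bool held;
};

struct fake_gt {
	struct fake_lock lock;
	unsigned forcewake_count;
	bool awake;
	int wakes_outside_lock; /* hardware wakes issued without the lock */
};

static void fake_lock_acquire(struct fake_lock *l) { l->held = true; }
static void fake_lock_release(struct fake_lock *l) { l->held = false; }

static void hw_wake(struct fake_gt *gt)
{
	if (!gt->lock.held)
		gt->wakes_outside_lock++;
	gt->awake = true;
}

static void hw_sleep(struct fake_gt *gt)
{
	gt->awake = false;
}

/* The first user wakes the hardware while still holding the lock, so a
 * second caller cannot return from here before the wake has finished. */
static void force_wake_get(struct fake_gt *gt)
{
	fake_lock_acquire(&gt->lock);
	if (gt->forcewake_count++ == 0)
		hw_wake(gt);
	fake_lock_release(&gt->lock);
}

/* The last user lets the hardware go back to sleep, again under the lock. */
static void force_wake_put(struct fake_gt *gt)
{
	fake_lock_acquire(&gt->lock);
	if (--gt->forcewake_count == 0)
		hw_sleep(gt);
	fake_lock_release(&gt->lock);
}
```

If only the counter were protected (for example with an atomic increment)
and the wake ran outside the critical section, a second caller could see a
non-zero count and proceed to touch registers before the wake completed.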

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-03 18:51   ` Keith Packard
@ 2012-01-03 19:12     ` Daniel Vetter
  2012-01-03 21:13       ` Keith Packard
  0 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2012-01-03 19:12 UTC (permalink / raw)
  To: Keith Packard; +Cc: intel-gfx

On Tue, Jan 3, 2012 at 19:51, Keith Packard <keithp@keithp.com> wrote:
> On Wed, 14 Dec 2011 13:57:03 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
>> The problem this patch solves is that the forcewake accounting
>> necessary for register reads is protected by dev->struct_mutex. But the
>> hangcheck and error_capture code need to access registers without
>> grabbing this mutex because we hold it while waiting for the gpu.
>> So a new lock is required. Because currently the error_state capture
>> is called from the error irq handler and the hangcheck code runs from
>> a timer, it needs to be an irqsafe spinlock (note that the irq handler,
>> neglecting the error handling part, only uses registers that don't need
>> the forcewake dance).
>
> I think this description is wrong -- the only difference between using
> atomic objects and using a spinlock is that with the spinlock the call
> to ->force_wake_get is correctly serialized so that no register access
> can occur without the chip being awoken. Without a spinlock, a second
> thread can pass right through gen6_gt_force_wake_get and then go touch
> registers while the first thread is busy waking the chip up.

I'm a bit confused by this. With the current code forcewake is
protected by dev->struct_mutex. Which doesn't work out because we need
to access registers that require the forcewake dance from non-process
context.

Afaik the atomic ops stuff is just ducttape for paranoia reasons.
-Daniel
-- 
Daniel Vetter
daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-03 19:12     ` Daniel Vetter
@ 2012-01-03 21:13       ` Keith Packard
  2012-01-03 21:49         ` Ben Widawsky
  2012-01-03 21:49         ` Daniel Vetter
  0 siblings, 2 replies; 115+ messages in thread
From: Keith Packard @ 2012-01-03 21:13 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 1530 bytes --]

On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> I'm a bit confused by this. With the current code forcewake is
> protected by dev->struct_mutex. Which doesn't work out because we need
> to access registers that require the forcewake dance from non-process
> context.

Right, I like adding a spinlock around this to allow it to be called
without needing to be able to lock the struct_mutex. (I remember
suggesting that a spinlock would be necessary when the force wake code
first showed up...)

However, the commit message talks about the error capture and
hang check code, but doesn't appear to change them at all.

I think all this patch does is replace the locking for forcewake_count
from struct_mutex to a new irq-safe spinlock, the commit message makes
it sound like it's actually fixing stuff, which it isn't; it just
enables fixing stuff in future patches, right?

Reading through this a bit more, I think your patch opens up a hole in
i915_reset. i915_reset takes struct_mutex, then resets the chip and
restores the forcewake status. If we aren't relying on struct_mutex to
protect the forcewake bits, then there's nothing preventing a thread
from accessing the registers with the chip sleeping between the reset
and the force wake reset.

> Afaik the atomic ops stuff is just ducttape for paranoia reasons.

The atomic ops stuff would allow reading of the value without holding
struct_mutex, if that were actually useful.

-- 
keith.packard@intel.com

[-- Attachment #1.2: Type: application/pgp-signature, Size: 827 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-03 21:13       ` Keith Packard
@ 2012-01-03 21:49         ` Ben Widawsky
  2012-01-03 22:23           ` Chris Wilson
  2012-01-03 21:49         ` Daniel Vetter
  1 sibling, 1 reply; 115+ messages in thread
From: Ben Widawsky @ 2012-01-03 21:49 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On 01/03/2012 01:13 PM, Keith Packard wrote:
> On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
>> I'm a bit confused by this. With the current code forcewake is
>> protected by dev->struct_mutex. Which doesn't work out because we need
>> to access registers that require the forcewake dance from non-process
>> context.
> 
> Right, I like adding a spinlock around this to allow it to be called
> without needing to be able to lock the struct_mutex. (I remember
> suggesting that a spinlock would be necessary when the force wake code
> first showed up...)
> 
> However, the commit message talks about the error capture and
> hang check code, but doesn't appear to change them at all.
> 
> I think all this patch does is replace the locking for forcewake_count
> from struct_mutex to a new irq-safe spinlock, the commit message makes
> it sound like it's actually fixing stuff, which it isn't; it just
> enables fixing stuff in future patches, right?

As Daniel mentioned in the commit message, it fixes existing bugs simply
by using a spinlock. In the timer, we do not grab struct_mutex and there
is currently a race there (which we've known about since day 1).

>> Afaik the atomic ops stuff is just ducttape for paranoia reasons.
> 
> The atomic ops stuff would allow reading of the value without holding
> struct_mutex, if that were actually useful.

The atomic ops stuff was simply there to help reduce the races (even if
we don't have the lock, we can still safely increment the variable). It
should be safe to get rid of with the spinlock in place.

My only gripe here is Chris shot down my earlier version of this patch
many moons ago :(

Ben

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-03 21:13       ` Keith Packard
  2012-01-03 21:49         ` Ben Widawsky
@ 2012-01-03 21:49         ` Daniel Vetter
  2012-01-03 23:33           ` Keith Packard
  1 sibling, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2012-01-03 21:49 UTC (permalink / raw)
  To: Keith Packard; +Cc: intel-gfx

On Tue, Jan 3, 2012 at 22:13, Keith Packard <keithp@keithp.com> wrote:
> On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
>> I'm a bit confused by this. With the current code forcewake is
>> protected by dev->struct_mutex. Which doesn't work out because we need
>> to access registers that require the forcewake dance from non-process
>> context.
>
> Right, I like adding a spinlock around this to allow it to be called
> without needing to be able to lock the struct_mutex. (I remember
> suggesting that a spinlock would be necessary when the force wake code
> first showed up...)
>
> However, the commit message talks about the error capture and
> hang check code, but doesn't appear to change them at all.
>
> I think all this patch does is replace the locking for forcewake_count
> from struct_mutex to a new irq-safe spinlock, the commit message makes
> it sound like it's actually fixing stuff, which it isn't; it just
> enables fixing stuff in future patches, right?

Nope, current hangcheck blows up, and we have an i-g-t testcase for it
(which the commit msg clearly states). There are also numerous bug
reports where a dying gpu results in tons of
WARN_ON(!mutex_locked(dev->struct_mutex)) noise in dmesg (which drowns
out the gpu hang warning). The locking change fixes this.

> Reading through this a bit more, I think your patch opens up a hole in
> i915_reset. i915_reset takes struct_mutex, then resets the chip and
> restores the forcewake status. If we aren't relying on struct_mutex to
> protect the forcewake bits, then there's nothing preventing a thread
> from accessing the registers with the chip sleeping between the reset
> and the force wake reset.

The patch adds the required locking to i915_reset.

>> Afaik the atomic ops stuff is just ducttape for paranoia reasons.
>
> The atomic ops stuff would allow reading of the value without holding
> struct_mutex, if that were actually useful.

... but is currently unused and inherently racy. Which is why the
patch drops it.
-- 
Daniel Vetter
daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-03 21:49         ` Ben Widawsky
@ 2012-01-03 22:23           ` Chris Wilson
  0 siblings, 0 replies; 115+ messages in thread
From: Chris Wilson @ 2012-01-03 22:23 UTC (permalink / raw)
  To: Ben Widawsky, Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Tue, 03 Jan 2012 13:49:36 -0800, Ben Widawsky <ben@bwidawsk.net> wrote:
> The atomic ops stuff was simply there to help reduce the races (even if
> we don't have the lock, we can still safely increment the variable). It
> should be safe to get rid of with the spinlock in place.
> 
> My only gripe here is Chris shot down my earlier version of this patch
> many moons ago :(

The other way of tackling it would be not to take the forcewake during
hangcheck at all, and engineer the hangcheck not to rely on the
ring reads. For example, use seqno as the primary activity monitor,
which only leaves the case of trying not to fire spuriously during a
long batchbuffer. To counter that, you could optimistically do a raw
read of ACTHD or simply rely on long timeouts. Any error recovery should
be moved to the error handling workqueue, so that we never attempt to
write a register or modify the struct without the struct_mutex.

Reducing the granularity of struct_mutex and solving the contention
with mode_config.lock over register access is the ultimate goal when
reviewing the locking mess.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-03 21:49         ` Daniel Vetter
@ 2012-01-03 23:33           ` Keith Packard
  2012-01-04 17:11             ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Keith Packard @ 2012-01-03 23:33 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 927 bytes --]

On Tue, 3 Jan 2012 22:49:52 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Nope, current hangcheck blows up, and we have an i-g-t testcase for it
> (which the commit msg clearly states). There are also numerous bug
> reports where a dying gpu results in tons of
> WARN_ON(!mutex_locked(dev->struct_mutex)) noise in dmesg (which drowns
> out the gpu hang warning). The locking change fixes this.

Ah, ok, that makes sense. Of course, hangcheck *could* have just taken
struct_mutex were it run in a suitable context.

> The patch adds the required locking to i915_reset.

No, the spinlock protects the forcewake_count access and not the actual
register access, which leaves all kinds of potential for races in
threads not also holding struct_mutex while accessing registers.

If you want a spinlock to protect the register access, it must surround
the whole operation.

-- 
keith.packard@intel.com

[-- Attachment #1.2: Type: application/pgp-signature, Size: 827 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-03 23:33           ` Keith Packard
@ 2012-01-04 17:11             ` Daniel Vetter
  2012-01-04 17:54               ` Keith Packard
  0 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2012-01-04 17:11 UTC (permalink / raw)
  To: Keith Packard; +Cc: intel-gfx

On Wed, Jan 4, 2012 at 00:33, Keith Packard <keithp@keithp.com> wrote:
> On Tue, 3 Jan 2012 22:49:52 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
>> Nope, current hangcheck blows up, and we have an i-g-t testcase for it
>> (which the commit msg clearly states). There are also numerous bug
>> reports where a dying gpu results in tons of
>> WARN_ON(!mutex_locked(dev->struct_mutex)) noise in dmesg (which drowns
>> out the gpu hang warning). The locking change fixes this.
>
> Ah, ok, that makes sense. Of course, hangcheck *could* have just taken
> struct_mutex were it run in a suitable context.

Nope, we cannot move the hangcheck into process context by using a
delayed work item and then grabbing struct_mutex. If the gpu is dead,
we usually have a task stuck waiting for it and already holding
struct_mutex. It is *absolutely* imperative that the hangcheck and error
state capture code do not block on anything that the i915 gem code
might hold onto.

>> The patch adds the required locking to i915_reset.
>
> No, the spinlock protects the forcewake_count access and not the actual
> register access, which leaves all kinds of potential for races in
> threads not also holding struct_mutex while accessing registers.

Ah, I think I see your concern: Between the time we reset the gpu
and the time we fix up the forcewake state somebody might sneak in and
see an inconsistency between our tracking and the actual hw state, hence
reading garbage. Correct?

> If you want a spinlock to protect the register access, it must surround
> the whole operation.

Between the time the hangcheck declares the gpu dead and the time we
deem it officially resurrected at the end of i915_reset there's no
issue with returning garbage from register reads - after all, the gpu
just went down.

The only thing we have to take care of is that we don't leave behind
an inconsistent state after i915_reset, which the current locking in
my patch takes care of.

Hence I think that no further protection is required.
-Daniel
-- 
Daniel Vetter
daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-04 17:11             ` Daniel Vetter
@ 2012-01-04 17:54               ` Keith Packard
  2012-01-04 18:12                 ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Keith Packard @ 2012-01-04 17:54 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 643 bytes --]

On Wed, 4 Jan 2012 18:11:18 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Ah, I think I see your concern: Between the time we reset the gpu
> and the time we fix up the forcewake state somebody might sneak in and
> see an inconsistency between our tracking and the actual hw state, hence
> reading garbage. Correct?

Indeed. Plus, holding the spinlock across the whole operation also means
only taking it once, rather than twice. Spinlocks aren't free.

If we change the locking from struct_mutex to the spinlock, we should
actually make it work, independent of what access we have today.

-- 
keith.packard@intel.com

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-04 17:54               ` Keith Packard
@ 2012-01-04 18:12                 ` Daniel Vetter
  2012-01-05  2:22                   ` Keith Packard
  0 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2012-01-04 18:12 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, Jan 04, 2012 at 09:54:08AM -0800, Keith Packard wrote:
> On Wed, 4 Jan 2012 18:11:18 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
> > Ah, I think I see your concern: Between the time we reset the gpu
> > and the time we fix up the forcewake state somebody might sneak in and
> > see an inconsistency between our tracking and the actual hw state, hence
> > reading garbage. Correct?
> 
> Indeed. Plus, holding the spinlock across the whole operation also means
> only taking it once, rather than twice. Spinlocks aren't free.
> 
> If we change the locking from struct_mutex to the spinlock, we should
> actually make it work, independent of what access we have today.

The "Correct?" was just to check my understanding of your concern, I still
think it's invalid. You've cut away the second part of my mail where I
explain why and I don't see you responding to that here. Also
micro-optimizing the gpu reset code sounds a bit strange.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-04 18:12                 ` Daniel Vetter
@ 2012-01-05  2:22                   ` Keith Packard
  2012-01-05 11:29                     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Keith Packard @ 2012-01-05  2:22 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 2683 bytes --]

On Wed, 4 Jan 2012 19:12:57 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:

> The "Correct?" was just to check my understanding of your concern, I still
> think it's invalid. You've cut away the second part of my mail where I
> explain why and I don't see you responding to that here. Also
> micro-optimizing the gpu reset code sounds a bit strange.

Sorry, I didn't explain things very well.

Right now, our register access looks like:

        get(struct_mutex);
        if (++forcewake_count == 1)
                force_wake_get()

        value = read32(reg)     or      write32(reg, val)

        if (--forcewake_count == 0)
                force_wake_put();

        /* more register accesses may follow ... */
        put(struct_mutex);

All very sensible, the whole register sequence is covered by
struct_mutex, which ensures that the forcewake is set across the
register access.

The patch does:

        get(spin_lock)
        if (++forcewake_count == 1)
                force_wake_get()
        put(spin_lock)
        value = read32(reg)     or     write32(reg, val)
        get(spin_lock)
        if (--forcewake_count == 0)
                force_wake_put()
        put(spin_lock)

I realize that all current users hold the struct_mutex across this whole
sequence, aside from the reset path, but we're removing that requirement
explicitly (the patch removes the WARN_ON calls).

Without a lock held across the whole sequence, it's easy to see how a
race could occur. We're also dropping and re-acquiring a spinlock with a
single instruction between, which seems wasteful. Instead, we should be
doing:

        get(spin_lock)
        force_wake_get();
        value = read32(reg)     or      write32(reg,val)
        force_wake_put();
        put(spin_lock);
        
No need here to deal with the wake lock at reset time; the whole
operation is atomic wrt interrupts. It's more efficient *and* correct,
without depending on the old struct_mutex locking.

If you want to continue to expose the user-mode wake lock stuff, you
could add:

        get(spin_lock);
        if (!forcewake_count)
                force_wake_get();
        value = read32(reg)     or      write32(reg,val)
        if (!forcewake_count)
                force_wake_put();
        put(spin_lock);

This would require that the reset code also deal with the
forcewake_count to restore the user-mode force wake.

A further optimization would hold the force_wake active for 'a while' to
avoid the extra bus cycles required, but we'd want to see a performance
problem from this before doing that.
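
A rough userspace model of the "one lock around the whole operation"
variant above. Everything here is a stand-in: fake_gt, GARBAGE and the
helper names are hypothetical, a C11 atomic_flag plays the role of the
spinlock, and a bool plays the role of the hardware forcewake bit:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy GT: the register only returns real data while the chip is awake. */
struct fake_gt {
	atomic_flag lock;	/* stand-in for the spinlock */
	bool awake;		/* models the hardware forcewake bit */
	unsigned int wake_ops;	/* wake/sleep transitions ("bus cycles") */
	uint32_t reg;		/* one pretend register */
};

#define GARBAGE 0xdeadbeefu

static void fake_gt_init(struct fake_gt *gt, uint32_t reg)
{
	atomic_flag_clear(&gt->lock);
	gt->awake = false;
	gt->wake_ops = 0;
	gt->reg = reg;
}

static void gt_lock(struct fake_gt *gt)
{
	while (atomic_flag_test_and_set_explicit(&gt->lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void gt_unlock(struct fake_gt *gt)
{
	atomic_flag_clear_explicit(&gt->lock, memory_order_release);
}

/* Raw access: a sleeping chip hands back garbage. */
static uint32_t raw_read32(const struct fake_gt *gt)
{
	return gt->awake ? gt->reg : GARBAGE;
}

/* Wake, read and sleep all happen under a single lock acquisition. */
static uint32_t locked_read32(struct fake_gt *gt)
{
	uint32_t value;

	gt_lock(gt);
	gt->awake = true;	/* force_wake_get() */
	gt->wake_ops++;
	value = raw_read32(gt);
	gt->awake = false;	/* force_wake_put() */
	gt->wake_ops++;
	gt_unlock(gt);

	return value;
}
```

The wake_ops counter makes the cost visible: every read pays for one
wake and one sleep, which is what the "hold it for a while" idea would
try to amortize.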


-- 
keith.packard@intel.com

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-05  2:22                   ` Keith Packard
@ 2012-01-05 11:29                     ` Daniel Vetter
  2012-01-05 15:49                       ` Keith Packard
  0 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2012-01-05 11:29 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, Jan 04, 2012 at 06:22:41PM -0800, Keith Packard wrote:
> On Wed, 4 Jan 2012 19:12:57 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > The "Correct?" was just to check my understanding of your concern, I still
> > think it's invalid. You've cut away the second part of my mail where I
> > explain why and I don't see you responding to that here. Also
> > micro-optimizing the gpu reset code sounds a bit strange.
> 
> Sorry, I didn't explain things very well.
> 
> Right now, our register access looks like:
> 
>         get(struct_mutex);
>         if (++forcewake_count == 1)
>                 force_wake_get()
> 
>         value = read32(reg)     or      write32(reg, val)
> 
>         if (--forcewake_count == 0)
>                 force_wake_put();
> 
>         /* more register accesses may follow ... */
>         put(struct_mutex);
> 
> All very sensible, the whole register sequence is covered by
> struct_mutex, which ensures that the forcewake is set across the
> register access.
> 
> The patch does:
> 
>         get(spin_lock)
>         if (++forcewake_count == 1)
>                 force_wake_get()
>         put(spin_lock)
>         value = read32(reg)     or     write32(reg, val)
>         get(spin_lock)
>         if (--forcewake_count == 0)
>                 force_wake_put()
>         put(spin_lock)
> 
> I realize that all current users hold the struct_mutex across this whole
> sequence, aside from the reset path, but we're removing that requirement
> explicitly (the patch removes the WARN_ON calls).
> 
> Without a lock held across the whole sequence, it's easy to see how a
> race could occur. We're also dropping and re-acquiring a spinlock with a
> single instruction between, which seems wasteful. Instead, we should be
> doing:
> 
>         get(spin_lock)
>         force_wake_get();
>         value = read32(reg)     or      write32(reg,val)
>         force_wake_put();
>         put(spin_lock);
>         
> No need here to deal with the wake lock at reset time; the whole
> operation is atomic wrt interrupts. It's more efficient *and* correct,
> without depending on the old struct_mutex locking.
> 
> If you want to continue to expose the user-mode wake lock stuff, you
> could add:
> 
>         get(spin_lock);
>         if (!forcewake_count)
>                 force_wake_get();
>         value = read32(reg)     or      write32(reg,val)
>         if (!forcewake_count)
>                 force_wake_put();
>         put(spin_lock);
> 
> This would require that the reset code also deal with the
> forcewake_count to restore the user-mode force wake.
> 
> A further optimization would hold the force_wake active for 'a while' to
> avoid the extra bus cycles required, but we'd want to see a performance
> problem from this before doing that.

I think you've lost me ... A few random comments first, it looks like I
need more coffee:

- The reset code (running from a workqueue) does hold struct_mutex. It's
  the hangcheck and error state capture code running from softirq/timer
  context causing issues.

- Forcewake works like a reference-counted resource. As long as all
  _get/_put calls are properly balanced, I don't see how somebody could
  sneak in in between (when we don't hold the spinlock) and cause havoc.
  For paranoia we might want to drop a WARN_ON in the _put call to check
  whether it ever drops below 0. But all current users are clearly
  balanced, so I didn't bother with it.

- My missed IRQ w/a actually relies on this by grabbing a forcewake ref in
  get_irq and dropping it again in put_irq. In between there's usually a
  schedule().

- I've pondered with Chris whether we should do your proposed optimization
  but we then noticed that the gem code doesn't actually read from any
  forcewake protected registers in normal execution (I don't consider
  running the hangcheck timer normal ;-). With my missed irq w/a that now
  changes, so we might need to reconsider this. But imo that's material
  for a separate patch.
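
The reference-counting argument in the second point can be sketched in
userspace as follows. This is an illustration under assumptions, not the
patch itself: fw_ref and its helpers are hypothetical names, a C11
atomic_flag stands in for gt_lock, and an underflow counter stands in
for the WARN_ON mentioned above:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a refcounted forcewake; names are not the driver's. */
struct fw_ref {
	atomic_flag lock;		/* stand-in for gt_lock */
	unsigned int count;		/* outstanding forcewake references */
	bool hw_awake;			/* models the hardware forcewake bit */
	unsigned int underflows;	/* what a WARN_ON would have reported */
};

static void fw_init(struct fw_ref *fw)
{
	atomic_flag_clear(&fw->lock);
	fw->count = 0;
	fw->hw_awake = false;
	fw->underflows = 0;
}

static void fw_lock(struct fw_ref *fw)
{
	while (atomic_flag_test_and_set_explicit(&fw->lock,
						 memory_order_acquire))
		;	/* spin */
}

static void fw_unlock(struct fw_ref *fw)
{
	atomic_flag_clear_explicit(&fw->lock, memory_order_release);
}

/* The first reference wakes the chip. */
static void fw_get(struct fw_ref *fw)
{
	fw_lock(fw);
	if (fw->count++ == 0)
		fw->hw_awake = true;
	fw_unlock(fw);
}

/* The last reference lets the chip sleep; an unbalanced put is recorded. */
static void fw_put(struct fw_ref *fw)
{
	fw_lock(fw);
	if (fw->count == 0)
		fw->underflows++;
	else if (--fw->count == 0)
		fw->hw_awake = false;
	fw_unlock(fw);
}
```

Between a balanced fw_get() and fw_put() the count is at least one, so
another caller's put cannot put the chip to sleep even though the lock
is not held in between.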

Yours, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-05 11:29                     ` Daniel Vetter
@ 2012-01-05 15:49                       ` Keith Packard
  2012-01-05 16:59                         ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Keith Packard @ 2012-01-05 15:49 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 2004 bytes --]

On Thu, 5 Jan 2012 12:29:08 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:

> - The reset code (running from a workqueue) does hold struct_mutex. It's
>   the hangcheck and error state capture code running from softirq/timer
>   context causing issues.

Right, I mis-wrote; I meant the hangcheck timer (which I always think of
as part of the reset code).

> - Forcewake works like a reference-counted resource. As long as all
>   _get/_put calls are properly balanced, I don't see how somebody could
>   sneak in in between (when we don't hold the spinlock) and cause havoc.
>   For paranoia we might want to drop a WARN_ON in the _put call to check
>   whether it ever drops below 0. But all current users are clearly
>   balanced, so I didn't bother with it.

Right, I was just confused somehow. Still seems weird to me to drop a
spinlock, execute a single instruction, and then immediately re-acquire
it, along with bumping forcewake_count twice.

> - My missed IRQ w/a actually relies on this by grabbing a forcewake ref in
>   get_irq and dropping it again in put_irq. In between there's usually a
>   schedule().

This is essentially the same as the user-level forcewake and would be
handled in the same way -- keep forcewake_count, but use it only for
long-term values.

> - I've pondered with Chris whether we should do your proposed optimization
>   but we then noticed that the gem code doesn't actually read from any
>   forcewake protected registers in normal execution (I don't consider
>   running the hangcheck timer normal ;-). With my missed irq w/a that now
>   changes, so we might need to reconsider this. But imo that's material
>   for a separate patch.

Yeah, all sounds reasonable. That separate patch can actually use
per-chip functions to read/write from the chip so we can also avoid
checking the forcewake stuff on register reads for older generation
hardware.

Make it work, then make it work faster.

-- 
keith.packard@intel.com

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-05 15:49                       ` Keith Packard
@ 2012-01-05 16:59                         ` Daniel Vetter
  2012-01-06  0:29                           ` Keith Packard
  0 siblings, 1 reply; 115+ messages in thread
From: Daniel Vetter @ 2012-01-05 16:59 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

Looks like we managed to clear up our mutual confusion here ;-)

On Thu, Jan 05, 2012 at 07:49:12AM -0800, Keith Packard wrote:
> On Thu, 5 Jan 2012 12:29:08 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > - The reset code (running from a workqueue) does hold struct_mutex. It's
> >   the hangcheck and error state capture code running from softirq/timer
> >   context causing issues.
> 
> Right, I mis-wrote; I meant the hangcheck timer (which I always think of
> as part of the reset code).
> 
> > - Forcewake works like a reference-counted resource. As long as all
> >   _get/_put calls are properly balanced, I don't see how somebody could
> >   sneak in in between (when we don't hold the spinlock) and cause havoc.
> >   For paranoia we might want to drop a WARN_ON in the _put call to check
> >   whether it ever drops below 0. But all current users are clearly
> >   balanced, so I didn't bother with it.
> 
> Right, I was just confused somehow. Still seems weird to me to drop a
> spinlock, execute a single instruction, and then immediately re-acquire
> it, along with bumping forcewake_count twice.

Absolutely agreed, it's really ugly. But especially for locking changes
I'd like a patch to do one thing, and one thing only. And I didn't see the
upside of a separate patch to fix things up, also because the current
I915_WRITE|READ macro maze is a bit hellish.

> > - My missed IRQ w/a actually relies on this by grabbing a forcewake ref in
> >   get_irq and dropping it again in put_irq. In between there's usually a
> >   schedule().
> 
> This is essentially the same as the user-level forcewake and would be
> handled in the same way -- keep forcewake_count, but use it only for
> long-term values.
> 
> > - I've pondered with Chris whether we should do your proposed optimization
> >   but we then noticed that the gem code doesn't actually read from any
> >   forcewake protected registers in normal execution (I don't consider
> >   running the hangcheck timer normal ;-). With my missed irq w/a that now
> >   changes, so we might need to reconsider this. But imo that's material
> >   for a separate patch.
> 
> Yeah, all sounds reasonable. That separate patch can actually use
> per-chip functions to read/write from the chip so we can also avoid
> checking the forcewake stuff on register reads for older generation
> hardware.
> 
> Make it work, then make it work faster.

Absolutely agreed, maybe with the addendum to only try to make things
faster if it's actually a problem and shows up in a fast-path we care
about.

Cheers, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-05 16:59                         ` Daniel Vetter
@ 2012-01-06  0:29                           ` Keith Packard
  2012-01-06  5:41                             ` Keith Packard
  0 siblings, 1 reply; 115+ messages in thread
From: Keith Packard @ 2012-01-06  0:29 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 2913 bytes --]

On Thu, 5 Jan 2012 17:59:47 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:

> Absolutely agreed, maybe with the addendum to only try to make things
> faster if it's actually a problem and shows up in a fast-path we care
> about.

Here's a longer series that does a bunch of cleanup before trying to fix
things. Patches marked with '***' fix bugs. The patch marked with '...'
is the optimization to inline the spinlocks.

The following changes since commit d8e70a254d8f2da141006e496a51502b79115e80:

  drm/i915: only set the intel_crtc DPMS mode to on if the mode set succeeded (2012-01-03 14:55:52 -0800)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux forcewake-spinlock

Keith Packard (9):

      drm/i915: Split register access functions out from display functions

        The forcewake functions are invoked unconditionally on >= gen6
        hardware from the register read/write functions. Having these
        initialized as a side-effect of display initialization seems
        wrong to me. I've moved the functions out of the display
        structure and into a separate structure, and moved the
        initialization to driver load time.

      drm/i915: Access registers through function pointers

        This makes register access go through function pointers,
        following similar changes in many other parts of the driver.

      drm/i915: Split out reg read/write for pre/post gen6 hardware

        Taking advantage of the previous indirection, this actually
        creates separate register read/write functions for pre-gen6 and
        post-gen6 hardware.

      drm/i915: Move forcewake_count to reg_access structure

        Just moves the count into the new structure to keep things
        together.

      drm/i915: hide forcewake_count behind i915_forcewake_count

        Create a function to hide getting the forcewake_count value

      drm/i915: Switch forcewake from atomic to using a spinlock

        This changes the type from atomic to u32 and wraps all users in
        a new spinlock. The spinlock is held across calls to
        ->force_wake_put and ->force_wake_get.

***   drm/i915: Hold forcewake spinlock across reset process

        Changes the reset process to hold the spinlock -- this will
        ensure that all register operations will be correct wrt the
        spinlock, even if the hardware gets reset.

***   drm/i915: Hold forcewake spinlock during register write operations

        This protects the gt_fifo_count value under the spinlock and
        keeps modifications to that tied to the actual register write.

...   drm/i915: inline spin_lock usage in register read macros

        Here's the optimization I mentioned -- inlines the spinlocks
        inside the register read operations.
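
For readers unfamiliar with the pattern, the "function pointers" and
"pre/post gen6" patches above amount to something like the following
sketch. It is a guess at the shape only: fake_dev, reg_access and the
forcewake_calls counter are invented here and do not come from the
series:

```c
#include <stdint.h>

struct fake_dev;

/* Hypothetical per-generation register access table. */
struct reg_access {
	uint32_t (*read32)(struct fake_dev *dev, uint32_t reg);
};

struct fake_dev {
	const struct reg_access *ops;	/* chosen once at load time */
	uint32_t mmio[4];		/* pretend register file */
	unsigned int forcewake_calls;	/* counts get/put invocations */
};

/* Older hardware: plain read, no forcewake check at all. */
static uint32_t pre_gen6_read32(struct fake_dev *dev, uint32_t reg)
{
	return dev->mmio[reg];
}

/* gen6 and later: bracket the access with forcewake get/put. */
static uint32_t gen6_read32(struct fake_dev *dev, uint32_t reg)
{
	uint32_t value;

	dev->forcewake_calls++;		/* stands in for force_wake_get() */
	value = dev->mmio[reg];
	dev->forcewake_calls++;		/* stands in for force_wake_put() */

	return value;
}

static const struct reg_access pre_gen6_ops = { .read32 = pre_gen6_read32 };
static const struct reg_access gen6_ops = { .read32 = gen6_read32 };

static void fake_dev_init(struct fake_dev *dev, int gen)
{
	dev->ops = gen >= 6 ? &gen6_ops : &pre_gen6_ops;
	dev->forcewake_calls = 0;
	for (unsigned int i = 0; i < 4; i++)
		dev->mmio[i] = 0x100 + i;
}

/* Callers never test the hardware generation themselves. */
static uint32_t reg_read32(struct fake_dev *dev, uint32_t reg)
{
	return dev->ops->read32(dev, reg);
}
```

The trade-off is an indirect call on every access in exchange for
dropping the generation check from the older-hardware path.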

-- 
keith.packard@intel.com

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-06  0:29                           ` Keith Packard
@ 2012-01-06  5:41                             ` Keith Packard
  2012-01-06 20:43                               ` Keith Packard
  0 siblings, 1 reply; 115+ messages in thread
From: Keith Packard @ 2012-01-06  5:41 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 592 bytes --]

On Thu, 05 Jan 2012 16:29:43 -0800, Keith Packard <keithp@keithp.com> wrote:

> Here's a longer series that does a bunch of cleanup before trying to fix
> things. Patches marked with '***' fix bugs. The patch marked with '...'
> is the optimization to inline the spinlocks.

I talked with Eric about this and we decided that the whole splitting
out of the i/o functions just doesn't make any sense. That makes this
series very similar to Daniel's patches, so I'll rebase my bug fixes on
top of those changes and send out a (shorter) series tomorrow.

-- 
keith.packard@intel.com

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
  2012-01-06  5:41                             ` Keith Packard
@ 2012-01-06 20:43                               ` Keith Packard
  0 siblings, 0 replies; 115+ messages in thread
From: Keith Packard @ 2012-01-06 20:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 2155 bytes --]

On Thu, 05 Jan 2012 21:41:34 -0800, Keith Packard <keithp@keithp.com> wrote:

> I talked with Eric about this and we decided that the whole splitting
> out of the i/o functions just doesn't make any sense. That makes this
> series very similar to Daniel's patches, so I'll rebase my bug fixes on
> top of those changes and send out a (shorter) series tomorrow.

And here's an updated version of my patches built on top of Daniel's:

The following changes since commit d06d2756a21a0c666f167ae9e4f13ef5f07f67d9:

  acpi/video: Don't restore backlight to 0 at boot time (2012-01-06 11:10:25 -0800)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux forcewake-spinlock

Daniel Vetter (1):
      drm/i915: protect force_wake_(get|put) with the gt_lock

                Daniel's patch (v3)


Keith Packard (3):
      drm/i915: Move reset forcewake processing to gen6_do_reset

                This moves the forcewake code inside gen6_do_reset, at
                the same time it changes from unconditionally calling
                __gen6_gt_force_wake_get to using
                dev_priv->display.force_wake_get. That could be broken
                out as a separate patch -- it's just a bug.

      drm/i915: Hold gt_lock during reset
      drm/i915: Hold gt_lock across forcewake register reads

                These two patches eliminate a race between chip reset
                and other read operations. By holding the gt_lock during
                all read operations, as well as across reset, we can
                ensure that forcewake is active for all register
                reads. Otherwise, right after chip reset, forcewake can
                be inactive, but the internal forcewake_count may be
                non-zero.

                As a nice side-effect, this eliminates taking the
                gt_lock twice during all register reads.
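
The inconsistency described above (forcewake inactive in hardware while
forcewake_count is non-zero) can be shown with a small single-threaded
model. The struct and function names are hypothetical, and the lock is
omitted: in the driver the restore step would run with gt_lock held so
that nobody can observe the intermediate state:

```c
#include <stdbool.h>

/* Hypothetical model: software refcount versus hardware forcewake bit. */
struct gt_state {
	unsigned int forcewake_count;	/* software view */
	bool hw_awake;			/* hardware view */
};

/* A chip reset clears the hardware bit and knows nothing of the count. */
static void hw_reset(struct gt_state *gt)
{
	gt->hw_awake = false;
}

/* Reset, then re-assert forcewake for any outstanding references. */
static void reset_and_restore(struct gt_state *gt)
{
	hw_reset(gt);
	if (gt->forcewake_count)
		gt->hw_awake = true;
}

/* Software and hardware agree when "count > 0" matches "awake". */
static bool state_consistent(const struct gt_state *gt)
{
	return (gt->forcewake_count > 0) == gt->hw_awake;
}
```

A bare hw_reset() with one reference outstanding leaves the two views
disagreeing, which is the window the two patches are meant to close.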

Please take a look and see if these are all reasonable additions to the
original patch and when it's ready, I'll push the whole sequence to
drm-intel-fixes.

-- 
keith.packard@intel.com

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 03/43] drm/i915: switch ring->id to be a real id
  2011-12-14 18:42   ` Eugeni Dodonov
@ 2012-01-29 16:40     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 16:40 UTC (permalink / raw)
  To: Eugeni Dodonov; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 04:42:23PM -0200, Eugeni Dodonov wrote:
> On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
> > ... and add a helper function for the places where we want a flag.
> >
> > This way we can use ring->id to index into arrays.
> >
> > v2: Resurrect the missing beautification-space Chris Wilson noted.
> > I'm moving this space around because I'll reuse ring_str in the next
> > patch.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Queued for -next, thanks for the review.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 04/43] drm/i915: refactor ring error state capture to use arrays
  2011-12-14 18:43   ` Eugeni Dodonov
@ 2012-01-29 16:44     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 16:44 UTC (permalink / raw)
  To: Eugeni Dodonov; +Cc: Daniel Vetter, intel-gfx, Ben Widawsky

On Wed, Dec 14, 2011 at 04:43:08PM -0200, Eugeni Dodonov wrote:
> On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
> > The code already got unwieldy and we want to dump more per-ring
> > registers.
> >
> > Only functional change is that we now also capture the video
> > ring registers on ilk.
> >
> > v2: fixup a refactor fumble spotted by Chris Wilson.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >
> 
> 
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Queued for -next, thanks for the review.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 05/43] drm/i915: collect more per ring error state
  2011-12-14 18:43   ` Eugeni Dodonov
@ 2012-01-29 16:48     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 16:48 UTC (permalink / raw)
  To: Eugeni Dodonov; +Cc: Daniel Vetter, intel-gfx, Ben Widawsky

On Wed, Dec 14, 2011 at 04:43:40PM -0200, Eugeni Dodonov wrote:
> On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
> > Based on a patch by Ben Widawsky, but with different colors
> > for the bikeshed.
> >
> > In contrast to Ben's patch this one doesn't add the fault regs.
> > Afaics they're for the optional page fault support which
> > - we're not enabling
> > - and which seems to be unsupported by the hw team. Recent bspec
> >  lacks tons of information about this that the public docs released
> >  half a year back still contain.
> >
> > Also dump ring HEAD/TAIL registers - I've recently seen a few
> > error_state where just guessing these is not good enough.
> >
> > v2: Also dump INSTPM for every ring.
> >
> > v3: Fix a few really silly goof-ups spotted by Chris Wilson.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Queued for -next, thanks for the review.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 10/43] drm/i915/ringbuffer: kill snb blt workaround
  2011-12-14 12:57 ` [PATCH 10/43] drm/i915/ringbuffer: kill snb blt workaround Daniel Vetter
@ 2012-01-29 16:52   ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 16:52 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 01:57:07PM +0100, Daniel Vetter wrote:
> This was just to facilitate product enablement with pre-production hw.
> Allows us to kill quite a bit of cruft.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
> Reviewed-by: Eric Anholt <eric@anholt.net>
I've picked this one up for -next.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 11/43] drm/i915: Separate fence pin counting from normal bind pin counting
  2011-12-14 12:57 ` [PATCH 11/43] drm/i915: Separate fence pin counting from normal bind pin counting Daniel Vetter
@ 2012-01-29 16:56   ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 16:56 UTC (permalink / raw)
  To: Keith Packard; +Cc: intel-gfx

On Wed, Dec 14, 2011 at 01:57:08PM +0100, Daniel Vetter wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> In order to correctly account for reserving space in the GTT and fences
> for a batch buffer, we need to independently track whether the fence is
> pinned due to a fenced GPU access in the batch or whether the buffer is
> pinned in the aperture. Currently we count the fenced as pinned if the
> buffer has already been seen in the execbuffer. This leads to a false
> accounting of available fence registers, causing frequent mass evictions.
> Worse, if coupled with the change to make i915_gem_object_get_fence()
> report EDEADLK upon fence starvation, the batchbuffer can fail with only
> one fence required...
> 
> Fixes intel-gpu-tools/tests/gem_fenced_exec_thrash
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38735
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Tested-by: Paul Neumann <paul104x@yahoo.de>
Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 12/43] drm/i915: don't trash the gtt when running out of fences
  2011-12-14 15:09   ` Chris Wilson
@ 2012-01-29 16:57     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 16:57 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 03:09:24PM +0000, Chris Wilson wrote:
> On Wed, 14 Dec 2011 13:57:09 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > With the fence accounting fixed up in the previous commit not finding
> > enough fences is a fatal error and userspace bug. Trashing the entire
> > gtt is not gonna turn up that missing fence, so don't do this by
> > returning another error than ENOSPC.
> > 
> > This has the added benefit that it's easier to distinguish fence
> > accounting errors from gtt space accounting issues.
> > 
> > TTM serves as precedent for the EDEADLK error code - it returns it
> > when the reservation code needs resources already blocked by the
> > current reservation.
> > 
> > Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Queued for -next, thanks for the review.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 13/43] drm/i915: refactor debugfs open function
  2011-12-14 12:57 ` [PATCH 13/43] drm/i915: refactor debugfs open function Daniel Vetter
  2011-12-14 15:10   ` Chris Wilson
  2011-12-14 18:36   ` Eugeni Dodonov
@ 2012-01-29 17:28   ` Daniel Vetter
  2 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 17:28 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 01:57:10PM +0100, Daniel Vetter wrote:
> Only forcewake has an open with special semantics, the other r/w
> debugfs only assign the file private pointer.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Queued for -next, thanks for the review.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 29/43] drm/i915: remove the i915_batchbuffer_info debugfs file
  2011-12-14 12:57 ` [PATCH 29/43] drm/i915: remove the i915_batchbuffer_info debugfs file Daniel Vetter
@ 2012-01-29 17:35   ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 17:35 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 01:57:26PM +0100, Daniel Vetter wrote:
> With the error_state facility in place, this has outlived its
> usefulness. It also oopses with the latest llc-reloc patches because
> it directly accesses objects through the gtt without any checks.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
I've picked this up for -next.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 18/43] drm/i915: fix swizzle detection for gen3
  2011-12-14 12:57 ` [PATCH 18/43] drm/i915: fix swizzle detection for gen3 Daniel Vetter
@ 2012-01-29 17:36   ` Chris Wilson
  2012-01-30 20:20     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Chris Wilson @ 2012-01-29 17:36 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx, stable

On Wed, 14 Dec 2011 13:57:15 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> It looks like the desktop variants of i915 and i945 also have the DCC
> register to control dram channel interleave and cpu side bit6
> swizzling.
> 
> Unfortunately internal Cspec/ConfigDB documentation for these ancient chips
> has already been dropped and there seem to be no archives. Also
> somebody thought the swizzling behaviour is surely a worthy secret to
> keep and redacted any mention of these fields from the published Intel
> datasheets.
> 
> I suspect the hw engineers were really proud of the page coloring
> they've achieved in their first dual channel dram controller with
> bit17 - after all Bspec explains in great length the optimal layout of
> page frame numbers modulo 4 for the color and depth buffers, too.
> Later on when they've started to work on VT-d they shamefully
> discovered their stupidity and tried to cover the tracks ...
> 
> Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch> (i915g)
> Tested-by: Pavel Ondračka <pavel.ondracka@email.cz> (i945g)
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42625
> Cc: stable@kernel.org
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

It doesn't appear to have broken any of my machines, but I'm going to
qualify that to only the laptops I've looked at the output of as well as
having run i-g-t!
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> (no changes on 855gm, pnv, gm45, ilk, snb)
-Chris


-- 
Chris Wilson, Intel Open Source Technology Centre


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 24/43] drm/i915: capture error_state also for stuck rings
  2011-12-14 12:57 ` [PATCH 24/43] drm/i915: capture error_state also for stuck rings Daniel Vetter
@ 2012-01-29 17:36   ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 17:36 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 01:57:21PM +0100, Daniel Vetter wrote:
> Since quite a while we also capture the basic output configuration in the
> error_state, so it should contain enough information to diagnose
> these MI_WAIT hangs.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
I've picked this up for -next.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 19/43] drm/i915: add debugfs file for swizzling information
  2011-12-14 12:57 ` [PATCH 19/43] drm/i915: add debugfs file for swizzling information Daniel Vetter
@ 2012-01-29 17:37   ` Chris Wilson
  2012-01-30 20:22     ` Daniel Vetter
  0 siblings, 1 reply; 115+ messages in thread
From: Chris Wilson @ 2012-01-29 17:37 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, 14 Dec 2011 13:57:16 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> This will also come in handy for the gen6+ swizzling support, where the
> driver is supposed to control swizzling depending upon dram
> configuration.
> 
> v2: CxDRB3 are 16 bit regs! Noticed by Chris Wilson.

The output isn't quite the colour I prefer (where are the polka dots?),
nevertheless:
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 30/43] drm/i915: reject GTT domain in relocations
  2011-12-14 12:57 ` [PATCH 30/43] drm/i915: reject GTT domain in relocations Daniel Vetter
@ 2012-01-29 17:38   ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 17:38 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 01:57:27PM +0100, Daniel Vetter wrote:
> This confuses our domain tracking and can (for gtt write domains) lead
> to a subsequent oops.
> 
> Tested by tests/gem_exec_bad_domains from i-g-t.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Eric Anholt <eric@anholt.net>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
I've queued this up for -next.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 20/43] drm/i915: swizzling support for snb/ivb
  2011-12-14 12:57 ` [PATCH 20/43] drm/i915: swizzling support for snb/ivb Daniel Vetter
@ 2012-01-29 18:34   ` Chris Wilson
  2012-01-31  7:44   ` Ben Widawsky
  1 sibling, 0 replies; 115+ messages in thread
From: Chris Wilson @ 2012-01-29 18:34 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, 14 Dec 2011 13:57:17 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> We have to do this manually. Somebody had a Great Idea.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>

Speedups
========
 xlib             grads-heat-map  211.10 (226.71 5.67%) -> 162.57 (181.59 7.12%):  1.30x speedup
 xlib              midori-zoomed  684.28 (750.01 4.69%) -> 557.63 (747.08 14.49%):  1.23x speedup
 xlib       firefox-planet-gnome  3214.07 (3387.56 2.40%) -> 2693.18 (2869.28 4.26%):  1.19x speedup
 xlib             poppler-reseau  696.29 (726.12 5.28%) -> 597.91 (604.47 1.52%):  1.16x speedup
 xlib                       gvim  1456.78 (1529.64 4.20%) -> 1311.99 (1335.88 1.49%):  1.11x speedup
 xlib       gnome-system-monitor  1463.42 (1523.61 2.00%) -> 1334.18 (1370.66 1.69%):  1.10x speedup
 xlib          firefox-talos-gfx  2839.91 (3292.59 4.95%) -> 2633.07 (2667.22 0.53%):  1.08x speedup
Slowdowns
=========
 xlib       firefox-canvas-alpha  11689.83 (11937.09 1.28%) -> 12667.80 (14981.67 7.24%):  1.08x slowdown
 xlib             swfdec-youtube  1438.36 (1476.82 1.31%) -> 1600.76 (1640.50 1.39%):  1.11x slowdown

I'll take the 10-20% faster firefox whilst browsing, thanks.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 42/43] drm/i915: per-ring fault reg
  2011-12-14 19:00   ` Eugeni Dodonov
@ 2012-01-29 22:20     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-29 22:20 UTC (permalink / raw)
  To: Eugeni Dodonov; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 05:00:13PM -0200, Eugeni Dodonov wrote:
> On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
> > v2: Chris Wilson suggested to allocate the error_state with kzalloc
> > for better paranoia. Also kill existing spurious clears of the
> > error_state while at it.
> >
> > Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

I've picked this one up for -next and improved the commit message quite a
bit - it was too sparse imo.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 18/43] drm/i915: fix swizzle detection for gen3
  2012-01-29 17:36   ` Chris Wilson
@ 2012-01-30 20:20     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-30 20:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx, stable

On Sun, Jan 29, 2012 at 05:36:04PM +0000, Chris Wilson wrote:
> On Wed, 14 Dec 2011 13:57:15 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > It looks like the desktop variants of i915 and i945 also have the DCC
> > register to control dram channel interleave and cpu side bit6
> > swizzling.
> > 
> > Unfortunately internal Cspec/ConfigDB documentation for these ancient chips
> > has already been dropped and there seem to be no archives. Also
> > somebody thought the swizzling behaviour is surely a worthy secret to
> > keep and redacted any mention of these fields from the published Intel
> > datasheets.
> > 
> > I suspect the hw engineers were really proud of the page coloring
> > they've achieved in their first dual channel dram controller with
> > bit17 - after all Bspec explains in great length the optimal layout of
> > page frame numbers modulo 4 for the color and depth buffers, too.
> > Later on when they've started to work on VT-d they shamefully
> > discovered their stupidity and tried to cover the tracks ...
> > 
> > Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch> (i915g)
> > Tested-by: Pavel Ondračka <pavel.ondracka@email.cz> (i945g)
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42625
> > Cc: stable@kernel.org
> > Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> It doesn't appear to have broken any of my machines, but I'm going to
> qualify that to only the laptops I've looked at the output of as well as
> having run i-g-t!
> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> (no changes on 855gm, pnv, gm45, ilk, snb)
I've picked this one here up for -next. If it blows up we just need to
revert it because afaik all the documentation about this disappeared into
secret vaults.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 19/43] drm/i915: add debugfs file for swizzling information
  2012-01-29 17:37   ` Chris Wilson
@ 2012-01-30 20:22     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-30 20:22 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On Sun, Jan 29, 2012 at 05:37:45PM +0000, Chris Wilson wrote:
> On Wed, 14 Dec 2011 13:57:16 +0100, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > This will also come in handy for the gen6+ swizzling support, where the
> > driver is supposed to control swizzling depending upon dram
> > configuration.
> > 
> > v2: CxDRB3 are 16 bit regs! Noticed by Chris Wilson.
> 
> The output isn't quite the colour I prefer (where are the polka dots?),
> nevertheless:
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Queued for -next, thanks for the review.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 35/43] drm/i915: rewrite shmem_pread_slow to use copy_to_user
  2011-12-14 12:57 ` [PATCH 35/43] drm/i915: rewrite shmem_pread_slow to use copy_to_user Daniel Vetter
@ 2012-01-30 22:37   ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-30 22:37 UTC (permalink / raw)
  To: Keith Packard; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 01:57:32PM +0100, Daniel Vetter wrote:
> Like for shmem_pwrite_slow. The only difference is that because we
> read data, we can leave the fetched cachelines in the cpu: In the case
> that the object isn't in the cpu read domain anymore, the clflush for
> the next cpu read domain invalidation will simply drop these
> cachelines.
> 
> slow_shmem_bit17_copy is now unused, so kill it.
> 
> With this patch tests/gem_mmap_gtt now actually works.
> 
> v2: add __ to copy_to_user_swizzled as suggested by Chris Wilson.
> 
> v3: Fixup the swizzling logic, it swizzled the wrong pages.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38115
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
I've picked up these 3 patches to finally fix the pwrite/pread -EFAULT
bug. Assuming nothing pops up in testing this week and no one raises any
complaints about the currently queued patches, I'll push out a new -next
after fosdem with the current set of patches in -next-queued.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 20/43] drm/i915: swizzling support for snb/ivb
  2011-12-14 12:57 ` [PATCH 20/43] drm/i915: swizzling support for snb/ivb Daniel Vetter
  2012-01-29 18:34   ` Chris Wilson
@ 2012-01-31  7:44   ` Ben Widawsky
  2012-01-31  8:42     ` Daniel Vetter
  1 sibling, 1 reply; 115+ messages in thread
From: Ben Widawsky @ 2012-01-31  7:44 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Wed, Dec 14, 2011 at 01:57:17PM +0100, Daniel Vetter wrote:
> We have to do this manually. Somebody had a Great Idea.
> 
> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Okay, my configdb access isn't working, so please forgive me if my
inline comments are not correct... I'll try to review it again when my
access works again.

> ---
>  drivers/gpu/drm/i915/i915_dma.c        |    2 +-
>  drivers/gpu/drm/i915/i915_drv.c        |    4 ++-
>  drivers/gpu/drm/i915/i915_drv.h        |    3 +-
>  drivers/gpu/drm/i915/i915_gem.c        |   23 ++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_gem_tiling.c |   16 +++++++++++++-
>  drivers/gpu/drm/i915/i915_reg.h        |   33 ++++++++++++++++++++++++++++++++
>  6 files changed, 74 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 448d5b1..4c21c67 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1202,7 +1202,7 @@ static int i915_load_gem_init(struct drm_device *dev)
>  	i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
>  
>  	mutex_lock(&dev->struct_mutex);
> -	ret = i915_gem_init_ringbuffer(dev);
> +	ret = i915_gem_init_hw(dev);
>  	mutex_unlock(&dev->struct_mutex);
>  	if (ret)
>  		return ret;
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 6dd219b..f12b43e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -494,7 +494,7 @@ static int i915_drm_thaw(struct drm_device *dev)
>  		mutex_lock(&dev->struct_mutex);
>  		dev_priv->mm.suspended = 0;
>  
> -		error = i915_gem_init_ringbuffer(dev);
> +		error = i915_gem_init_hw(dev);
>  		mutex_unlock(&dev->struct_mutex);
>  
>  		if (HAS_PCH_SPLIT(dev))
> @@ -690,6 +690,8 @@ int i915_reset(struct drm_device *dev, u8 flags)
>  			!dev_priv->mm.suspended) {
>  		dev_priv->mm.suspended = 0;
>  
> +		i915_gem_init_swizzling(dev);
> +
>  		dev_priv->ring[RCS].init(&dev_priv->ring[RCS]);
>  		if (HAS_BSD(dev))
>  		    dev_priv->ring[VCS].init(&dev_priv->ring[VCS]);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b46fac5..311a4e1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1206,7 +1206,8 @@ int __must_check i915_gem_object_set_domain(struct drm_i915_gem_object *obj,
>  					    uint32_t read_domains,
>  					    uint32_t write_domain);
>  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
> -int __must_check i915_gem_init_ringbuffer(struct drm_device *dev);
> +int __must_check i915_gem_init_hw(struct drm_device *dev);
> +void i915_gem_init_swizzling(struct drm_device *dev);
>  void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>  void i915_gem_do_init(struct drm_device *dev,
>  		      unsigned long start,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index e995248..39459d2 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3744,12 +3744,31 @@ i915_gem_idle(struct drm_device *dev)
>  	return 0;
>  }
>  
> +void i915_gem_init_swizzling(struct drm_device *dev)
> +{
> +	drm_i915_private_t *dev_priv = dev->dev_private;
> +
> +	if (INTEL_INFO(dev)->gen < 6 ||
> +	    dev_priv->mm.bit_6_swizzle_x == I915_BIT_6_SWIZZLE_NONE)
> +		return;
> +
> +	I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_SWZCTL);
> +	if (IS_GEN6(dev))
> +		I915_WRITE(ARB_MODE, ARB_MODE_ENABLE(ARB_MODE_SWIZZLE_SNB));
> +	else
> +		I915_WRITE(ARB_MODE, ARB_MODE_ENABLE(ARB_MODE_SWIZZLE_IVB));
> +	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
> +				 DISP_TILE_SURFACE_SWIZZLING);
> +
> +}
>  int
> -i915_gem_init_ringbuffer(struct drm_device *dev)
> +i915_gem_init_hw(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	int ret;
>  
> +	i915_gem_init_swizzling(dev);
> +
>  	ret = intel_init_render_ring_buffer(dev);
>  	if (ret)
>  		return ret;
> @@ -3805,7 +3824,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
>  	mutex_lock(&dev->struct_mutex);
>  	dev_priv->mm.suspended = 0;
>  
> -	ret = i915_gem_init_ringbuffer(dev);
> +	ret = i915_gem_init_hw(dev);
>  	if (ret != 0) {
>  		mutex_unlock(&dev->struct_mutex);
>  		return ret;
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 861223b..af0a2fc 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -93,8 +93,20 @@ i915_gem_detect_bit_6_swizzle(struct drm_device *dev)
>  	uint32_t swizzle_y = I915_BIT_6_SWIZZLE_UNKNOWN;
>  
>  	if (INTEL_INFO(dev)->gen >= 6) {
> -		swizzle_x = I915_BIT_6_SWIZZLE_NONE;
> -		swizzle_y = I915_BIT_6_SWIZZLE_NONE;
> +		uint32_t dimm_c0, dimm_c1;
> +		dimm_c0 = I915_READ(MAD_DIMM_C0);
> +		dimm_c1 = I915_READ(MAD_DIMM_C1);
> +		dimm_c0 &= MAD_DIMM_A_SIZE_MASK | MAD_DIMM_A_SIZE_MASK;
> +		dimm_c1 &= MAD_DIMM_A_SIZE_MASK | MAD_DIMM_A_SIZE_MASK;

That doesn't look right to me. Presumably you meant
MAD_DIMM_B_SIZE_MASK. Is it safe to ignore the other bits too, like
dual rank, and width? Also without configdb, I'm somewhat confused why
you don't read c2.
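
For illustration, a minimal standalone sketch of the comparison as it was
presumably intended, with the B mask OR'ed in alongside the A mask. The mask
values are taken from the i915_reg.h hunk of this patch; channels_match() is
a hypothetical helper for this example only, not a function in the driver,
and it takes raw register values rather than calling I915_READ(). Whether the
rank and width bits should be compared as well is left open here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask values as defined in the i915_reg.h hunk of this patch. */
#define MAD_DIMM_B_SIZE_MASK	(0xffu << 8)	/* in multiples of 256mb */
#define MAD_DIMM_A_SIZE_MASK	(0xffu << 0)	/* in multiples of 256mb */

/*
 * Hypothetical helper (not part of the driver): returns true when both
 * channels report the same DIMM A and DIMM B sizes. The arguments are raw
 * MAD_DIMM_C0 / MAD_DIMM_C1 register values; all other bits are ignored.
 */
static bool channels_match(uint32_t dimm_c0, uint32_t dimm_c1)
{
	const uint32_t size_mask = MAD_DIMM_A_SIZE_MASK | MAD_DIMM_B_SIZE_MASK;

	return (dimm_c0 & size_mask) == (dimm_c1 & size_mask);
}
```

With A | A as the mask, two channels that differ only in the DIMM B size
would compare equal; with A | B they do not.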

> +		/* Enable swizzling when the channels are populated with
> +		 * identically sized dimms. */
> +		if (dimm_c0 == dimm_c1) {
> +			swizzle_x = I915_BIT_6_SWIZZLE_9_10;
> +			swizzle_y = I915_BIT_6_SWIZZLE_9;
> +		} else {
> +			swizzle_x = I915_BIT_6_SWIZZLE_NONE;
> +			swizzle_y = I915_BIT_6_SWIZZLE_NONE;
> +		}
>  	} else if (IS_GEN5(dev)) {
>  		/* On Ironlake whatever DRAM config, GPU always do
>  		 * same swizzling setup.
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 8a9f113..e810723 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -295,6 +295,12 @@
>  #define FENCE_REG_SANDYBRIDGE_0		0x100000
>  #define   SANDYBRIDGE_FENCE_PITCH_SHIFT	32
>  
> +/* control register for cpu gtt access */
> +#define TILECTL				0x101000
> +#define   TILECTL_SWZCTL			(1 << 0)
> +#define   TILECTL_TLB_PREFETCH_DIS	(1 << 2)
> +#define   TILECTL_BACKSNOOP_DIS		(1 << 3)
> +
>  /*
>   * Instruction and interrupt control regs
>   */
> @@ -318,6 +324,11 @@
>  #define RING_MAX_IDLE(base)	((base)+0x54)
>  #define RING_HWS_PGA(base)	((base)+0x80)
>  #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
> +#define ARB_MODE		0x04030
> +#define   ARB_MODE_SWIZZLE_SNB	(1<<4)
> +#define   ARB_MODE_SWIZZLE_IVB	(1<<5)
> +#define   ARB_MODE_ENABLE(x)	GFX_MODE_ENABLE(x)
> +#define   ARB_MODE_DISABLE(x)	GFX_MODE_DISABLE(x)

Hurray for the designers reusing bits again!

>  #define RENDER_HWS_PGA_GEN7	(0x04080)
>  #define BSD_HWS_PGA_GEN7	(0x04180)
>  #define BLT_HWS_PGA_GEN7	(0x04280)
> @@ -1034,6 +1045,28 @@
>  #define C0DRB3			0x10206
>  #define C1DRB3			0x10606
>  
> +/** snb MCH registers for reading the DRAM channel configuration */
> +#define MAD_DIMM_C0			(MCHBAR_MIRROR_BASE_SNB + 0x5004)
> +#define   MAD_DIMM_C1			(MCHBAR_MIRROR_BASE_SNB + 0x5008)
> +#define   MAD_DIMM_C2			(MCHBAR_MIRROR_BASE_SNB + 0x500C)
> +#define   MAD_DIMM_ECC_MASK		(0x3 << 24)
> +#define   MAD_DIMM_ECC_OFF		(0x0 << 24)
> +#define   MAD_DIMM_ECC_IO_ON_LOGIC_OFF	(0x1 << 24)
> +#define   MAD_DIMM_ECC_IO_OFF_LOGIC_ON	(0x2 << 24)
> +#define   MAD_DIMM_ECC_ON		(0x3 << 24)
> +#define   MAD_DIMM_ENH_INTERLEAVE	(0x1 << 22)
> +#define   MAD_DIMM_RANK_INTERLEAVE	(0x1 << 21)
> +#define   MAD_DIMM_B_WIDTH_X16		(0x1 << 20) /* X8 chips if unset */
> +#define   MAD_DIMM_A_WIDTH_X16		(0x1 << 19) /* X8 chips if unset */
> +#define   MAD_DIMM_B_DUAL_RANK		(0x1 << 18)
> +#define   MAD_DIMM_A_DUAL_RANK		(0x1 << 17)
> +#define   MAD_DIMM_A_SELECT		(0x1 << 16)
> +#define   MAD_DIMM_B_SIZE_MASK		(0xff << 8) /* in multiples of 256mb */
> +#define   MAD_DIMM_B_SIZE_SHIFT		8
> +#define   MAD_DIMM_A_SIZE_MASK		(0xff << 0) /* in multiples of 256mb */
> +#define   MAD_DIMM_A_SIZE_SHIFT		8
> +
> +

This also looks less than correct. Seems like MAD_DIMM_A_SIZE_SHIFT
should be 0, and you should use those defines in the MASK definitions.
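
Roughly what I have in mind, as a standalone sketch: the shift values assume
the field layout implied by the masks in this hunk (A size in bits 7:0, B size
in bits 15:8), and the two accessor functions are hypothetical helpers added
only to show the shifts in use, not something the patch contains.

```c
#include <stdint.h>

/* Size fields, with each mask derived from its shift. */
#define MAD_DIMM_B_SIZE_SHIFT	8
#define MAD_DIMM_B_SIZE_MASK	(0xffu << MAD_DIMM_B_SIZE_SHIFT)
#define MAD_DIMM_A_SIZE_SHIFT	0
#define MAD_DIMM_A_SIZE_MASK	(0xffu << MAD_DIMM_A_SIZE_SHIFT)

/* Hypothetical accessor: DIMM A size in multiples of 256mb. */
static inline uint32_t mad_dimm_a_size(uint32_t reg)
{
	return (reg & MAD_DIMM_A_SIZE_MASK) >> MAD_DIMM_A_SIZE_SHIFT;
}

/* Hypothetical accessor: DIMM B size in multiples of 256mb. */
static inline uint32_t mad_dimm_b_size(uint32_t reg)
{
	return (reg & MAD_DIMM_B_SIZE_MASK) >> MAD_DIMM_B_SIZE_SHIFT;
}
```

Deriving the masks from the shifts keeps the two from drifting apart, which is
how a shift of 8 next to a mask of (0xff << 0) would get caught.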


>  /* Clocking configuration register */
>  #define CLKCFG			0x10c00
>  #define CLKCFG_FSB_400					(5 << 0)	/* hrawclk 100 */

As I said, I'll get back to this when configdb works again.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 20/43] drm/i915: swizzling support for snb/ivb
  2012-01-31  7:44   ` Ben Widawsky
@ 2012-01-31  8:42     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-31  8:42 UTC (permalink / raw)
  To: Daniel Vetter, Keith Packard, intel-gfx

On Mon, Jan 30, 2012 at 11:44:35PM -0800, Ben Widawsky wrote:
> On Wed, Dec 14, 2011 at 01:57:17PM +0100, Daniel Vetter wrote:
> > We have to do this manually. Somebody had a Great Idea.
> > 
> > Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> Okay, my configdb access isn't working, so please forgive me if my
> inline comments are not correct... I'll try to review it again when my
> access works again.
> 
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c        |    2 +-
> >  drivers/gpu/drm/i915/i915_drv.c        |    4 ++-
> >  drivers/gpu/drm/i915/i915_drv.h        |    3 +-
> >  drivers/gpu/drm/i915/i915_gem.c        |   23 ++++++++++++++++++++-
> >  drivers/gpu/drm/i915/i915_gem_tiling.c |   16 +++++++++++++-
> >  drivers/gpu/drm/i915/i915_reg.h        |   33 ++++++++++++++++++++++++++++++++
> >  6 files changed, 74 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 448d5b1..4c21c67 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1202,7 +1202,7 @@ static int i915_load_gem_init(struct drm_device *dev)
> >  	i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
> >  
> >  	mutex_lock(&dev->struct_mutex);
> > -	ret = i915_gem_init_ringbuffer(dev);
> > +	ret = i915_gem_init_hw(dev);
> >  	mutex_unlock(&dev->struct_mutex);
> >  	if (ret)
> >  		return ret;
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 6dd219b..f12b43e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -494,7 +494,7 @@ static int i915_drm_thaw(struct drm_device *dev)
> >  		mutex_lock(&dev->struct_mutex);
> >  		dev_priv->mm.suspended = 0;
> >  
> > -		error = i915_gem_init_ringbuffer(dev);
> > +		error = i915_gem_init_hw(dev);
> >  		mutex_unlock(&dev->struct_mutex);
> >  
> >  		if (HAS_PCH_SPLIT(dev))
> > @@ -690,6 +690,8 @@ int i915_reset(struct drm_device *dev, u8 flags)
> >  			!dev_priv->mm.suspended) {
> >  		dev_priv->mm.suspended = 0;
> >  
> > +		i915_gem_init_swizzling(dev);
> > +
> >  		dev_priv->ring[RCS].init(&dev_priv->ring[RCS]);
> >  		if (HAS_BSD(dev))
> >  		    dev_priv->ring[VCS].init(&dev_priv->ring[VCS]);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index b46fac5..311a4e1 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1206,7 +1206,8 @@ int __must_check i915_gem_object_set_domain(struct drm_i915_gem_object *obj,
> >  					    uint32_t read_domains,
> >  					    uint32_t write_domain);
> >  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
> > -int __must_check i915_gem_init_ringbuffer(struct drm_device *dev);
> > +int __must_check i915_gem_init_hw(struct drm_device *dev);
> > +void i915_gem_init_swizzling(struct drm_device *dev);
> >  void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
> >  void i915_gem_do_init(struct drm_device *dev,
> >  		      unsigned long start,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index e995248..39459d2 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3744,12 +3744,31 @@ i915_gem_idle(struct drm_device *dev)
> >  	return 0;
> >  }
> >  
> > +void i915_gem_init_swizzling(struct drm_device *dev)
> > +{
> > +	drm_i915_private_t *dev_priv = dev->dev_private;
> > +
> > +	if (INTEL_INFO(dev)->gen < 6 ||
> > +	    dev_priv->mm.bit_6_swizzle_x == I915_BIT_6_SWIZZLE_NONE)
> > +		return;
> > +
> > +	I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_SWZCTL);
> > +	if (IS_GEN6(dev))
> > +		I915_WRITE(ARB_MODE, ARB_MODE_ENABLE(ARB_MODE_SWIZZLE_SNB));
> > +	else
> > +		I915_WRITE(ARB_MODE, ARB_MODE_ENABLE(ARB_MODE_SWIZZLE_IVB));
> > +	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
> > +				 DISP_TILE_SURFACE_SWIZZLING);
> > +
> > +}
> >  int
> > -i915_gem_init_ringbuffer(struct drm_device *dev)
> > +i915_gem_init_hw(struct drm_device *dev)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> >  	int ret;
> >  
> > +	i915_gem_init_swizzling(dev);
> > +
> >  	ret = intel_init_render_ring_buffer(dev);
> >  	if (ret)
> >  		return ret;
> > @@ -3805,7 +3824,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
> >  	mutex_lock(&dev->struct_mutex);
> >  	dev_priv->mm.suspended = 0;
> >  
> > -	ret = i915_gem_init_ringbuffer(dev);
> > +	ret = i915_gem_init_hw(dev);
> >  	if (ret != 0) {
> >  		mutex_unlock(&dev->struct_mutex);
> >  		return ret;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > index 861223b..af0a2fc 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > @@ -93,8 +93,20 @@ i915_gem_detect_bit_6_swizzle(struct drm_device *dev)
> >  	uint32_t swizzle_y = I915_BIT_6_SWIZZLE_UNKNOWN;
> >  
> >  	if (INTEL_INFO(dev)->gen >= 6) {
> > -		swizzle_x = I915_BIT_6_SWIZZLE_NONE;
> > -		swizzle_y = I915_BIT_6_SWIZZLE_NONE;
> > +		uint32_t dimm_c0, dimm_c1;
> > +		dimm_c0 = I915_READ(MAD_DIMM_C0);
> > +		dimm_c1 = I915_READ(MAD_DIMM_C1);
> > +		dimm_c0 &= MAD_DIMM_A_SIZE_MASK | MAD_DIMM_A_SIZE_MASK;
> > +		dimm_c1 &= MAD_DIMM_A_SIZE_MASK | MAD_DIMM_A_SIZE_MASK;
> 
> That doesn't look right to me. Presumably you meant
> MAD_DIMM_B_SIZE_MASK. Is it safe to ignore the other bits too, like
> dual rank, and width? Also without configdb, I'm somewhat confused why
> you don't read c2.

Yep to dimm_B instead of A. For the other bits I'm under the impression
that they're just used to select the best swizzling/interleaving within a
dimm, so shouldn't really matter for our purpose of selecting swizzling
between channels. For that I think we just want equally-sized channels.

I don't check for channel 2 because that only works on high-end 3-channel
cpus that don't ship with a gpu. All the swizzling patterns presume that
we have 2 channels only, anyway.

> 
> > +		/* Enable swizzling when the channels are populated with
> > +		 * identically sized dimms. */
> > +		if (dimm_c0 == dimm_c1) {
> > +			swizzle_x = I915_BIT_6_SWIZZLE_9_10;
> > +			swizzle_y = I915_BIT_6_SWIZZLE_9;
> > +		} else {
> > +			swizzle_x = I915_BIT_6_SWIZZLE_NONE;
> > +			swizzle_y = I915_BIT_6_SWIZZLE_NONE;
> > +		}
> >  	} else if (IS_GEN5(dev)) {
> >  		/* On Ironlake whatever DRAM config, GPU always do
> >  		 * same swizzling setup.
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 8a9f113..e810723 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -295,6 +295,12 @@
> >  #define FENCE_REG_SANDYBRIDGE_0		0x100000
> >  #define   SANDYBRIDGE_FENCE_PITCH_SHIFT	32
> >  
> > +/* control register for cpu gtt access */
> > +#define TILECTL				0x101000
> > +#define   TILECTL_SWZCTL			(1 << 0)
> > +#define   TILECTL_TLB_PREFETCH_DIS	(1 << 2)
> > +#define   TILECTL_BACKSNOOP_DIS		(1 << 3)
> > +
> >  /*
> >   * Instruction and interrupt control regs
> >   */
> > @@ -318,6 +324,11 @@
> >  #define RING_MAX_IDLE(base)	((base)+0x54)
> >  #define RING_HWS_PGA(base)	((base)+0x80)
> >  #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
> > +#define ARB_MODE		0x04030
> > +#define   ARB_MODE_SWIZZLE_SNB	(1<<4)
> > +#define   ARB_MODE_SWIZZLE_IVB	(1<<5)
> > +#define   ARB_MODE_ENABLE(x)	GFX_MODE_ENABLE(x)
> > +#define   ARB_MODE_DISABLE(x)	GFX_MODE_DISABLE(x)
> 
> Hurray for the designers reusing bits again!
> 
> >  #define RENDER_HWS_PGA_GEN7	(0x04080)
> >  #define BSD_HWS_PGA_GEN7	(0x04180)
> >  #define BLT_HWS_PGA_GEN7	(0x04280)
> > @@ -1034,6 +1045,28 @@
> >  #define C0DRB3			0x10206
> >  #define C1DRB3			0x10606
> >  
> > +/** snb MCH registers for reading the DRAM channel configuration */
> > +#define MAD_DIMM_C0			(MCHBAR_MIRROR_BASE_SNB + 0x5004)
> > +#define   MAD_DIMM_C1			(MCHBAR_MIRROR_BASE_SNB + 0x5008)
> > +#define   MAD_DIMM_C2			(MCHBAR_MIRROR_BASE_SNB + 0x500C)
> > +#define   MAD_DIMM_ECC_MASK		(0x3 << 24)
> > +#define   MAD_DIMM_ECC_OFF		(0x0 << 24)
> > +#define   MAD_DIMM_ECC_IO_ON_LOGIC_OFF	(0x1 << 24)
> > +#define   MAD_DIMM_ECC_IO_OFF_LOGIC_ON	(0x2 << 24)
> > +#define   MAD_DIMM_ECC_ON		(0x3 << 24)
> > +#define   MAD_DIMM_ENH_INTERLEAVE	(0x1 << 22)
> > +#define   MAD_DIMM_RANK_INTERLEAVE	(0x1 << 21)
> > +#define   MAD_DIMM_B_WIDTH_X16		(0x1 << 20) /* X8 chips if unset */
> > +#define   MAD_DIMM_A_WIDTH_X16		(0x1 << 19) /* X8 chips if unset */
> > +#define   MAD_DIMM_B_DUAL_RANK		(0x1 << 18)
> > +#define   MAD_DIMM_A_DUAL_RANK		(0x1 << 17)
> > +#define   MAD_DIMM_A_SELECT		(0x1 << 16)
> > +#define   MAD_DIMM_B_SIZE_MASK		(0xff << 8) /* in multiples of 256mb */
> > +#define   MAD_DIMM_B_SIZE_SHIFT		8
> > +#define   MAD_DIMM_A_SIZE_MASK		(0xff << 0) /* in multiples of 256mb */
> > +#define   MAD_DIMM_A_SIZE_SHIFT		8
> > +
> > +
> 
> This also looks less than correct. Seems like MAD_DIMM_A_SIZE_SHIFT
> should be 0, and you should use those defines in the MASK definitions.

Agreed, I'll fix this up.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 28/43] drm/i915: Handle unmappable buffers during error state capture
  2011-12-14 18:46   ` Eugeni Dodonov
@ 2012-01-31 19:32     ` Daniel Vetter
  0 siblings, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-01-31 19:32 UTC (permalink / raw)
  To: Eugeni Dodonov; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 14, 2011 at 04:46:48PM -0200, Eugeni Dodonov wrote:
> On Wed, Dec 14, 2011 at 10:57, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > As the buffer is not necessarily accessible through the GTT at the time
> > of a GPU hang, and capturing some of its contents is far more valuable
> > than skipping it, provide a clflushed fallback read path. We still
> > prefer to read through the GTT as that is more consistent with the GPU
> > access of the same buffer. For example, it will demonstrate any erroneous
> > tiling or swizzling of the command buffer as seen by the GPU.
> >
> > This becomes necessary with use of CPU relocations and lazy GTT binding,
> > but could potentially happen anyway as a result of a pathological error.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

I've got fed up with my kernels all dying while testing cs prefetch
behaviour, so I've included this patch in the current pile. Should also
make qa's job a bit easier because the test is already included in i-g-t,
but the fix won't be ready for the next -next cycle.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 26/43] drm/i915: Only clear the GPU domains upon a successful finish
  2011-12-14 12:57 ` [PATCH 26/43] drm/i915: Only clear the GPU domains upon a successful finish Daniel Vetter
  2011-12-16 20:07   ` Eric Anholt
@ 2012-03-01 20:40   ` Daniel Vetter
  1 sibling, 0 replies; 115+ messages in thread
From: Daniel Vetter @ 2012-03-01 20:40 UTC (permalink / raw)
  To: Keith Packard; +Cc: intel-gfx, stable

On Wed, Dec 14, 2011 at 01:57:23PM +0100, Daniel Vetter wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> By clearing the GPU read domains before waiting upon the buffer, we run
> the risk of the wait being interrupted and the domains prematurely
> cleared. The next time we attempt to wait upon the buffer (after
> userspace handles the signal), we believe that the buffer is idle and so
> skip the wait.
> 
> There are a number of bugs across all generations which show signs of an
> overly hasty reuse of active buffers.
> 
> Such as:
> 
>   https://bugs.freedesktop.org/show_bug.cgi?id=29046
>   https://bugs.freedesktop.org/show_bug.cgi?id=35863
>   https://bugs.freedesktop.org/show_bug.cgi?id=38952
>   https://bugs.freedesktop.org/show_bug.cgi?id=40282
>   https://bugs.freedesktop.org/show_bug.cgi?id=41098
>   https://bugs.freedesktop.org/show_bug.cgi?id=41102
>   https://bugs.freedesktop.org/show_bug.cgi?id=41284
>   https://bugs.freedesktop.org/show_bug.cgi?id=42141
> 
> A couple of those pre-date i915_gem_object_finish_gpu(), so may be
> unrelated (such as a wild write from a userspace command buffer), but
> this does look like a convincing cause for most of those bugs.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: stable@kernel.org
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>

I really hate it that we have neither a solid testcase nor a decent
explanation for how exactly this fixes issues. But there have been too
many reports from people that this patch here at least improves matters.

/me grumpily merges this for -next.

-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 115+ messages in thread

end of thread, other threads:[~2012-03-01 20:39 UTC | newest]

Thread overview: 115+ messages
2011-12-14 12:56 [PATCH 01/43] drm/i915: kicking rings stuck on semaphores considered harmful Daniel Vetter
2011-12-14 12:56 ` [PATCH 02/43] drm/i915: don't bail out of intel_wait_ring_buffer too early Daniel Vetter
2011-12-14 18:39   ` Eugeni Dodonov
2011-12-14 12:57 ` [PATCH 03/43] drm/i915: switch ring->id to be a real id Daniel Vetter
2011-12-14 18:42   ` Eugeni Dodonov
2012-01-29 16:40     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 04/43] drm/i915: refactor ring error state capture to use arrays Daniel Vetter
2011-12-14 18:43   ` Eugeni Dodonov
2012-01-29 16:44     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 05/43] drm/i915: collect more per ring error state Daniel Vetter
2011-12-14 18:43   ` Eugeni Dodonov
2012-01-29 16:48     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock Daniel Vetter
2012-01-03 18:51   ` Keith Packard
2012-01-03 19:12     ` Daniel Vetter
2012-01-03 21:13       ` Keith Packard
2012-01-03 21:49         ` Ben Widawsky
2012-01-03 22:23           ` Chris Wilson
2012-01-03 21:49         ` Daniel Vetter
2012-01-03 23:33           ` Keith Packard
2012-01-04 17:11             ` Daniel Vetter
2012-01-04 17:54               ` Keith Packard
2012-01-04 18:12                 ` Daniel Vetter
2012-01-05  2:22                   ` Keith Packard
2012-01-05 11:29                     ` Daniel Vetter
2012-01-05 15:49                       ` Keith Packard
2012-01-05 16:59                         ` Daniel Vetter
2012-01-06  0:29                           ` Keith Packard
2012-01-06  5:41                             ` Keith Packard
2012-01-06 20:43                               ` Keith Packard
2011-12-14 12:57 ` [PATCH 07/43] drm/i915: convert force_wake_get to func pointer in the gpu reset code Daniel Vetter
2011-12-14 12:57 ` [PATCH 08/43] drm/i915: drop register special-casing in forcewake Daniel Vetter
2011-12-14 15:05   ` Chris Wilson
2011-12-15 10:21     ` Daniel Vetter
2011-12-15 10:44       ` Chris Wilson
2011-12-22  0:28         ` [PATCH] drm/i915: clear up I915_(READ|WRITE)_NOTRACE confusion Daniel Vetter
2011-12-22 17:54           ` Keith Packard
2011-12-22 18:16             ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 09/43] drm/i915: introduce a vtable for gpu core functions Daniel Vetter
2011-12-14 15:06   ` Chris Wilson
2011-12-21 20:38     ` Daniel Vetter
2011-12-14 18:58   ` Kenneth Graunke
2011-12-14 12:57 ` [PATCH 10/43] drm/i915/ringbuffer: kill snb blt workaround Daniel Vetter
2012-01-29 16:52   ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 11/43] drm/i915: Separate fence pin counting from normal bind pin counting Daniel Vetter
2012-01-29 16:56   ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 12/43] drm/i915: don't trash the gtt when running out of fences Daniel Vetter
2011-12-14 15:09   ` Chris Wilson
2012-01-29 16:57     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 13/43] drm/i915: refactor debugfs open function Daniel Vetter
2011-12-14 15:10   ` Chris Wilson
2011-12-14 18:36   ` Eugeni Dodonov
2012-01-29 17:28   ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 14/43] drm/i915: refactor debugfs create functions Daniel Vetter
2011-12-14 18:44   ` Eugeni Dodonov
2011-12-14 12:57 ` [PATCH 15/43] drm/i915: add interface to simulate gpu hangs Daniel Vetter
2011-12-14 19:00   ` Eugeni Dodonov
2011-12-14 12:57 ` [PATCH 16/43] drm/i915: rework dev->first_error locking Daniel Vetter
2011-12-14 15:13   ` Chris Wilson
2011-12-14 18:37   ` Eugeni Dodonov
2011-12-14 12:57 ` [PATCH 17/43] drm/i915: destroy existing error_state when simulating a gpu hang Daniel Vetter
2011-12-14 18:45   ` Eugeni Dodonov
2011-12-14 12:57 ` [PATCH 18/43] drm/i915: fix swizzle detection for gen3 Daniel Vetter
2012-01-29 17:36   ` Chris Wilson
2012-01-30 20:20     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 19/43] drm/i915: add debugfs file for swizzling information Daniel Vetter
2012-01-29 17:37   ` Chris Wilson
2012-01-30 20:22     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 20/43] drm/i915: swizzling support for snb/ivb Daniel Vetter
2012-01-29 18:34   ` Chris Wilson
2012-01-31  7:44   ` Ben Widawsky
2012-01-31  8:42     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 21/43] drm/i915: add gen6+ registers to i915_swizzle_info Daniel Vetter
2011-12-14 12:57 ` [PATCH 22/43] drm/i915: prevent division by zero when asking for chipset power Daniel Vetter
2011-12-14 19:05   ` Kenneth Graunke
2011-12-14 12:57 ` [PATCH 23/43] drm/i915: multithreaded forcewake is an ivb+ feature Daniel Vetter
2011-12-14 21:07   ` Eric Anholt
2011-12-14 12:57 ` [PATCH 24/43] drm/i915: capture error_state also for stuck rings Daniel Vetter
2012-01-29 17:36   ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 25/43] drm/i915: properly flush the wc buffer in pwrites to phys objects Daniel Vetter
2011-12-14 15:23   ` Chris Wilson
2011-12-14 12:57 ` [PATCH 26/43] drm/i915: Only clear the GPU domains upon a successful finish Daniel Vetter
2011-12-16 20:07   ` Eric Anholt
2012-03-01 20:40   ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 27/43] drm/i915: flush overlay regfile writes Daniel Vetter
2011-12-14 15:24   ` Chris Wilson
2011-12-21 20:41     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 28/43] drm/i915: Handle unmappable buffers during error state capture Daniel Vetter
2011-12-14 18:46   ` Eugeni Dodonov
2012-01-31 19:32     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 29/43] drm/i915: remove the i915_batchbuffer_info debugfs file Daniel Vetter
2012-01-29 17:35   ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 30/43] drm/i915: reject GTT domain in relocations Daniel Vetter
2012-01-29 17:38   ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 31/43] drm/i915: Use kcalloc instead of kzalloc to allocate array Daniel Vetter
2011-12-14 18:48   ` Eugeni Dodonov
2011-12-14 12:57 ` [PATCH 32/43] drm/i915: Avoid using mappable space for relocation processing through the CPU Daniel Vetter
2011-12-14 12:57 ` [PATCH 33/43] drm/i915: fall through pwrite_gtt_slow to the shmem slow path Daniel Vetter
2011-12-14 12:57 ` [PATCH 34/43] drm/i915: rewrite shmem_pwrite_slow to use copy_from_user Daniel Vetter
2011-12-14 12:57 ` [PATCH 35/43] drm/i915: rewrite shmem_pread_slow to use copy_to_user Daniel Vetter
2012-01-30 22:37   ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 36/43] agp/intel-gtt: export the scratch page dma address Daniel Vetter
2011-12-14 12:57 ` [PATCH 37/43] agp/intel-gtt: export the gtt pagetable iomapping Daniel Vetter
2011-12-14 12:57 ` [PATCH 38/43] drm/i915: initialization/teardown for the aliasing ppgtt Daniel Vetter
2011-12-14 12:57 ` [PATCH 39/43] drm/i915: ppgtt binding/unbinding support Daniel Vetter
2011-12-14 12:57 ` [PATCH 40/43] drm/i915: ppgtt register definitions Daniel Vetter
2011-12-14 18:58   ` Eugeni Dodonov
2011-12-14 19:01     ` Eugeni Dodonov
2011-12-14 12:57 ` [PATCH 41/43] drm/i915: ppgtt debugfs info Daniel Vetter
2011-12-14 12:57 ` [PATCH 42/43] drm/i915: per-ring fault reg Daniel Vetter
2011-12-14 19:00   ` Eugeni Dodonov
2012-01-29 22:20     ` Daniel Vetter
2011-12-14 12:57 ` [PATCH 43/43] drm/i915: enable ppgtt Daniel Vetter
2011-12-14 15:34   ` Chris Wilson
2011-12-21 20:46     ` Daniel Vetter
