* [PATCH 00/50] Execlists v2
From: oscar.mateo @ 2014-05-09 12:08 UTC
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

For a description of this patchset, please check the previous cover letter [1].

Together with this patchset, I'm also submitting an IGT test: gem_execlist [2].

v2:
- Use the same context struct for all the different engines (suggested by Brad Volkin).
- Rename write_tail to submit (suggested by Brad).
- Simplify hardware submission id creation by using LRCA[31:11] as hwCtxId[18:0] (see the
  sketch after this list).
- Non-render contexts are only two pages long (suggested by Damien Lespiau).
- Disable HWSTAM, as no one listens to it anyway (suggested by Damien).
- Do not write PDPs in the context every time, doing it at context creation time is enough.
- Various kmalloc changes in gen8_switch_context_queue (suggested by Damien).
- Module parameter to disable Execlists (as per Damien's patches).
- Update the HW read pointer in CONTEXT_STATUS_PTR (suggested by Damien).
- Fixed gpu reset and basic error reporting (verified by the new gem_error_capture test).
- Fix for unexpected full preemption in some scenarios (instead of lite restore).
- Ack the context switch interrupts as soon as possible (fix by Bob Beckett).
- Move default context backing object creation to intel_init_ring.
- Take into account the second BSD ring.
- Help out the ctx switch interrupt handler by sharing the burden of squashing requests
  together.
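
A minimal sketch of the submission id trick above (illustrative only; the helper name is an
assumption, not code from the patches): the LRC backing object is page-aligned in the GGTT,
so the low bits of the LRCA are always zero and the high bits can double as a unique
hardware context id.

	/* illustrative helper, not from the patchset */
	static inline u32 lrca_to_hw_ctx_id(u32 lrca)
	{
		/* hwCtxId[18:0] taken from LRCA[31:11]; LRCA[10:0] == 0 */
		return lrca >> 11;
	}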

What I haven't done in this release:

- Get the context sizes from the CXT_SIZE registers, as suggested by Damien: the BSpec is full
  of holes with regard to the various CXT_SIZE registers, but the hardcoded values seem pretty
  clear.
- Allocate the ringbuffer together with the context, as suggested by Damien: now that every
  context has NUM_RINGS ringbuffers on it, the advantage of this is not clear anymore.
- Damien pointed out that we are missing the RS context restore, but I don't see any RS values
  that are needed on the first execution (the first save should take care of these).
- I have added a comment to clarify how the context population takes place (MI_LOAD_REGISTER_IMM
  plus <reg,value> pairs; see the sketch after this list) but I haven't provided names for each
  position (as Jeff McGee suggested) or created an OUT_BATCH_REG_WRITE(reg, value) (as Daniel
  Vetter suggested).
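
A rough sketch of the population scheme mentioned above (assumed offsets and register choices
for illustration only, not the full gen8 context layout):

	#define MI_LOAD_REGISTER_IMM(x)	((0x22 << 23) | (2*(x)-1))

	/* illustration only, not the patchset's actual population code */
	static void populate_lr_context_sketch(u32 *reg_state, u32 mmio_base)
	{
		u32 *cs = reg_state;

		*cs++ = MI_LOAD_REGISTER_IMM(2);
		*cs++ = mmio_base + 0x34;	/* reg: RING_HEAD */
		*cs++ = 0;			/* value restored on ctx load */
		*cs++ = mmio_base + 0x3c;	/* reg: RING_CTL */
		*cs++ = 0;			/* value filled in at ring init */
	}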

[1] http://lists.freedesktop.org/archives/intel-gfx/2014-March/042563.html
[2] http://lists.freedesktop.org/archives/intel-gfx/2014-May/044846.html

Ben Widawsky (13):
  drm/i915: s/for_each_ring/for_each_active_ring
  drm/i915: for_each_ring
  drm/i915: Extract trivial parts of ring init (early init)
  drm/i915/bdw: Macro and module parameter for LRCs (Logical Ring
    Contexts)
  drm/i915/bdw: Rework init code for Logical Ring Contexts
  drm/i915/bdw: A bit more advanced context init/fini
  drm/i915/bdw: Populate LR contexts (somewhat)
  drm/i915/bdw: Status page for LR contexts
  drm/i915/bdw: Enable execlists in the hardware
  drm/i915/bdw: LR context ring init
  drm/i915/bdw: GEN8 new ring flush
  drm/i915/bdw: Implement context switching (somewhat)
  drm/i915/bdw: Print context state in debugfs

Michel Thierry (1):
  drm/i915/bdw: Get prepared for a two-stage execlist submit process

Oscar Mateo (33):
  drm/i915: Simplify a couple of functions thanks to for_each_ring
  drm/i915: Extract ringbuffer destroy, make destroy & alloc outside
    accessible
  drm/i915: s/intel_ring_buffer/intel_engine
  drm/i915: Split the ringbuffers and the rings
  drm/i915: Rename functions that mention ringbuffers (meaning rings)
  drm/i915: Plumb the context everywhere in the execbuffer path
  drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit
  drm/i915: Write a new set of context-aware ringbuffer management
    functions
  drm/i915: Final touches to ringbuffer and context plumbing and
    refactoring
  drm/i915: s/write_tail/submit
  drm/i915: Introduce one context backing object per engine
  drm/i915: Make i915_gem_create_context outside accessible
  drm/i915: Option to skip backing object allocation during context
    creation
  drm/i915: Extract context backing object allocation
  drm/i915/bdw: New file for Logical Ring Contexts and Execlists
  drm/i915/bdw: Allocate ringbuffer backing objects for default global
    LRC
  drm/i915/bdw: Allocate ringbuffer for user-created LRCs
  drm/i915/bdw: Deferred creation of user-created LRCs
  drm/i915/bdw: Allow non-default, non-render, user-created LRCs
  drm/i915/bdw: Execlists ring tail writing
  drm/i915/bdw: Set the request context information correctly in the LRC
    case
  drm/i915/bdw: Always write seqno to default context
  drm/i915/bdw: Write the tail pointer, LRC style
  drm/i915/bdw: Don't write PDP in the legacy way when using LRCs
  drm/i915/bdw: Start queueing contexts to be submitted
  drm/i915/bdw: Display execlists info in debugfs
  drm/i915/bdw: Display context backing obj & ringbuffer info in debugfs
  drm/i915/bdw: Document execlists and logical ring contexts
  drm/i915/bdw: Avoid non-lite-restore preemptions
  drm/i915/bdw: Make sure gpu reset still works with Execlists
  drm/i915/bdw: Make sure error capture keeps working with Execlists
  drm/i915/bdw: Help out the ctx switch interrupt handler
  drm/i915/bdw: Enable logical ring contexts

Thomas Daniel (3):
  drm/i915/bdw: Add forcewake lock around ELSP writes
  drm/i915/bdw: LR context switch interrupts
  drm/i915/bdw: Handle context switch events

 drivers/gpu/drm/i915/Makefile              |   1 +
 drivers/gpu/drm/i915/i915_cmd_parser.c     |  16 +-
 drivers/gpu/drm/i915/i915_debugfs.c        | 180 ++++++-
 drivers/gpu/drm/i915/i915_dma.c            |  48 +-
 drivers/gpu/drm/i915/i915_drv.h            |  97 +++-
 drivers/gpu/drm/i915/i915_gem.c            | 172 ++++---
 drivers/gpu/drm/i915/i915_gem_context.c    | 220 +++++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h        |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  19 +-
 drivers/gpu/drm/i915/i915_irq.c            | 102 ++--
 drivers/gpu/drm/i915/i915_params.c         |   6 +
 drivers/gpu/drm/i915/i915_reg.h            |  11 +
 drivers/gpu/drm/i915/i915_trace.h          |  26 +-
 drivers/gpu/drm/i915/intel_display.c       |  26 +-
 drivers/gpu/drm/i915/intel_drv.h           |   4 +-
 drivers/gpu/drm/i915/intel_lrc.c           | 729 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_overlay.c       |  12 +-
 drivers/gpu/drm/i915/intel_pm.c            |  18 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 792 +++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 196 ++++---
 22 files changed, 2107 insertions(+), 696 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_lrc.c

-- 
1.9.0

* [PATCH 01/50] drm/i915: s/for_each_ring/for_each_active_ring
From: oscar.mateo @ 2014-05-09 12:08 UTC
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

The name "active" was recommended by Chris.

With the ordering change of how we initialize things, it is desirable to
be able to address each ring, whether initialized or not.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Several rebases.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 12 ++++++------
 drivers/gpu/drm/i915/i915_drv.h         |  2 +-
 drivers/gpu/drm/i915/i915_gem.c         | 16 ++++++++--------
 drivers/gpu/drm/i915/i915_gem_context.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c     | 10 +++++-----
 drivers/gpu/drm/i915/i915_irq.c         | 10 +++++-----
 drivers/gpu/drm/i915/intel_pm.c         |  8 ++++----
 drivers/gpu/drm/i915/intel_ringbuffer.c |  2 +-
 8 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 18b3565..103e62c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -571,7 +571,7 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 		return ret;
 
 	count = 0;
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		if (list_empty(&ring->request_list))
 			continue;
 
@@ -615,7 +615,7 @@ static int i915_gem_seqno_info(struct seq_file *m, void *data)
 		return ret;
 	intel_runtime_pm_get(dev_priv);
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		i915_ring_seqno_info(m, ring);
 
 	intel_runtime_pm_put(dev_priv);
@@ -752,7 +752,7 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 		seq_printf(m, "Graphics Interrupt mask:		%08x\n",
 			   I915_READ(GTIMR));
 	}
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		if (INTEL_INFO(dev)->gen >= 6) {
 			seq_printf(m,
 				   "Graphics Interrupt mask (%s):	%08x\n",
@@ -1703,7 +1703,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 
 		seq_puts(m, "HW context ");
 		describe_ctx(m, ctx);
-		for_each_ring(ring, dev_priv, i)
+		for_each_active_ring(ring, dev_priv, i)
 			if (ring->default_context == ctx)
 				seq_printf(m, "(default context %s) ", ring->name);
 
@@ -1835,7 +1835,7 @@ static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 
 	seq_printf(m, "Page directories: %d\n", ppgtt->num_pd_pages);
 	seq_printf(m, "Page tables: %d\n", ppgtt->num_pd_entries);
-	for_each_ring(ring, dev_priv, unused) {
+	for_each_active_ring(ring, dev_priv, unused) {
 		seq_printf(m, "%s\n", ring->name);
 		for (i = 0; i < 4; i++) {
 			u32 offset = 0x270 + i * 8;
@@ -1857,7 +1857,7 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen == 6)
 		seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(GFX_MODE));
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		seq_printf(m, "%s\n", ring->name);
 		if (INTEL_INFO(dev)->gen == 7)
 			seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(RING_MODE_GEN7(ring)));
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 26253c0..a53a028 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1545,7 +1545,7 @@ static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
 }
 
 /* Iterate over initialised rings */
-#define for_each_ring(ring__, dev_priv__, i__) \
+#define for_each_active_ring(ring__, dev_priv__, i__) \
 	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
 		if (((ring__) = &(dev_priv__)->ring[(i__)]), intel_ring_initialized((ring__)))
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8fd1824..ce941cf 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2108,7 +2108,7 @@ i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
 	int ret, i, j;
 
 	/* Carefully retire all requests without writing to the rings */
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		ret = intel_ring_idle(ring);
 		if (ret)
 			return ret;
@@ -2116,7 +2116,7 @@ i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
 	i915_gem_retire_requests(dev);
 
 	/* Finally reset hw state */
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		intel_ring_init_seqno(ring, seqno);
 
 		for (j = 0; j < ARRAY_SIZE(ring->semaphore.sync_seqno); j++)
@@ -2434,10 +2434,10 @@ void i915_gem_reset(struct drm_device *dev)
 	 * them for finding the guilty party. As the requests only borrow
 	 * their reference to the objects, the inspection must be done first.
 	 */
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		i915_gem_reset_ring_status(dev_priv, ring);
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		i915_gem_reset_ring_cleanup(dev_priv, ring);
 
 	i915_gem_context_reset(dev);
@@ -2516,7 +2516,7 @@ i915_gem_retire_requests(struct drm_device *dev)
 	bool idle = true;
 	int i;
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		i915_gem_retire_requests_ring(ring);
 		idle &= list_empty(&ring->request_list);
 	}
@@ -2804,7 +2804,7 @@ int i915_gpu_idle(struct drm_device *dev)
 	int ret, i;
 
 	/* Flush everything onto the inactive list. */
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		ret = i915_switch_context(ring, ring->default_context);
 		if (ret)
 			return ret;
@@ -4261,7 +4261,7 @@ i915_gem_stop_ringbuffers(struct drm_device *dev)
 	struct intel_ring_buffer *ring;
 	int i;
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		intel_stop_ring_buffer(ring);
 }
 
@@ -4533,7 +4533,7 @@ i915_gem_cleanup_ringbuffer(struct drm_device *dev)
 	struct intel_ring_buffer *ring;
 	int i;
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		intel_cleanup_ring_buffer(ring);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f77b4c1..948df20 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -483,7 +483,7 @@ int i915_gem_context_enable(struct drm_i915_private *dev_priv)
 
 	BUG_ON(!dev_priv->ring[RCS].default_context);
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		ret = i915_switch_context(ring, ring->default_context);
 		if (ret)
 			return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 72b1bf8..1dff805 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -835,7 +835,7 @@ static int gen8_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 	struct intel_ring_buffer *ring;
 	int j, ret;
 
-	for_each_ring(ring, dev_priv, j) {
+	for_each_active_ring(ring, dev_priv, j) {
 		I915_WRITE(RING_MODE_GEN7(ring),
 			   _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
 
@@ -852,7 +852,7 @@ static int gen8_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 	return 0;
 
 err_out:
-	for_each_ring(ring, dev_priv, j)
+	for_each_active_ring(ring, dev_priv, j)
 		I915_WRITE(RING_MODE_GEN7(ring),
 			   _MASKED_BIT_DISABLE(GFX_PPGTT_ENABLE));
 	return ret;
@@ -878,7 +878,7 @@ static int gen7_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 	}
 	I915_WRITE(GAM_ECOCHK, ecochk);
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		int ret;
 		/* GFX_MODE is per-ring on gen7+ */
 		I915_WRITE(RING_MODE_GEN7(ring),
@@ -917,7 +917,7 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 
 	I915_WRITE(GFX_MODE, _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		int ret = ppgtt->switch_mm(ppgtt, ring, true);
 		if (ret)
 			return ret;
@@ -1275,7 +1275,7 @@ void i915_check_and_clear_faults(struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen < 6)
 		return;
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		u32 fault_reg;
 		fault_reg = I915_READ(RING_FAULT_REG(ring));
 		if (fault_reg & RING_FAULT_VALID) {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2d76183..4a8e8cb 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2122,7 +2122,7 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 	 */
 
 	/* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		wake_up_all(&ring->irq_queue);
 
 	/* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
@@ -2591,7 +2591,7 @@ semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
 	} else {
 		u32 sync_bits = ipehr & MI_SEMAPHORE_SYNC_MASK;
 
-		for_each_ring(signaller, dev_priv, i) {
+		for_each_active_ring(signaller, dev_priv, i) {
 			if(ring == signaller)
 				continue;
 
@@ -2674,7 +2674,7 @@ static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
 	struct intel_ring_buffer *ring;
 	int i;
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		ring->hangcheck.deadlock = false;
 }
 
@@ -2746,7 +2746,7 @@ static void i915_hangcheck_elapsed(unsigned long data)
 	if (!i915.enable_hangcheck)
 		return;
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		u64 acthd;
 		u32 seqno;
 		bool busy = true;
@@ -2825,7 +2825,7 @@ static void i915_hangcheck_elapsed(unsigned long data)
 		busy_count += busy;
 	}
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		if (ring->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG) {
 			DRM_INFO("%s on %s\n",
 				 stuck[i] ? "stuck" : "no progress",
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 834c49c..acfded3 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3400,7 +3400,7 @@ static void gen8_enable_rps(struct drm_device *dev)
 	I915_WRITE(GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16);
 	I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */
 	I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */
-	for_each_ring(ring, dev_priv, unused)
+	for_each_active_ring(ring, dev_priv, unused)
 		I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
 	I915_WRITE(GEN6_RC_SLEEP, 0);
 	I915_WRITE(GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */
@@ -3494,7 +3494,7 @@ static void gen6_enable_rps(struct drm_device *dev)
 	I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000);
 	I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25);
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
 
 	I915_WRITE(GEN6_RC_SLEEP, 0);
@@ -3819,7 +3819,7 @@ static void valleyview_enable_rps(struct drm_device *dev)
 	I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000);
 	I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25);
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
 
 	I915_WRITE(GEN6_RC6_THRESHOLD, 0x557);
@@ -4435,7 +4435,7 @@ bool i915_gpu_busy(void)
 		goto out_unlock;
 	dev_priv = i915_mch_dev;
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		ret |= !list_empty(&ring->request_list);
 
 out_unlock:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 40a7aa4..a112971 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -685,7 +685,7 @@ static int gen6_signal(struct intel_ring_buffer *signaller,
 		return ret;
 #undef MBOX_UPDATE_DWORDS
 
-	for_each_ring(useless, dev_priv, i) {
+	for_each_active_ring(useless, dev_priv, i) {
 		u32 mbox_reg = signaller->semaphore.mbox.signal[i];
 		if (mbox_reg != GEN6_NOSYNC) {
 			intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
-- 
1.9.0

* [PATCH 02/50] drm/i915: for_each_ring
From: oscar.mateo @ 2014-05-09 12:08 UTC
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

for_each_ring() iterates over all rings supported by the hardware, not
just those which have been initialized, as in for_each_active_ring().
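
Illustrative usage of the two iterators (not code from this patch; ctx and
dev_priv are assumed from the surrounding code):

	struct intel_ring_buffer *ring;
	int i;

	/* every ring the hardware supports, initialized or not */
	for_each_ring(ring, dev_priv, i)
		ring->default_context = ctx;

	/* only rings whose ringbuffers have been set up */
	for_each_active_ring(ring, dev_priv, i)
		i915_gem_retire_requests_ring(ring);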

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a53a028..b1725c6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1544,6 +1544,17 @@ static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
 	return dev->dev_private;
 }
 
+/* NB: Typically you want to use for_each_ring in init code, before the
+ * ringbuffers are set up, or in debug code. for_each_active_ring is more
+ * suited for code which dynamically handles active rings, i.e. normal
+ * code. In most cases (currently all, except on pre-production hardware)
+ * for_each_ring will work even when it's a bad idea to use it - so be careful.
+ */
+#define for_each_ring(ring__, dev_priv__, i__) \
+	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
+		if (((ring__) = &(dev_priv__)->ring[(i__)]), \
+		    INTEL_INFO((dev_priv__)->dev)->ring_mask & (1<<(i__)))
+
 /* Iterate over initialised rings */
 #define for_each_active_ring(ring__, dev_priv__, i__) \
 	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
-- 
1.9.0

* [PATCH 03/50] drm/i915: Simplify a couple of functions thanks to for_each_ring
From: oscar.mateo @ 2014-05-09 12:08 UTC
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

This patch should have no functional changes.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 948df20..014fb8f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -359,12 +359,12 @@ err_destroy:
 void i915_gem_context_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ring_buffer *ring;
 	int i;
 
 	/* Prevent the hardware from restoring the last context (which hung) on
 	 * the next switch */
-	for (i = 0; i < I915_NUM_RINGS; i++) {
-		struct intel_ring_buffer *ring = &dev_priv->ring[i];
+	for_each_ring(ring, dev_priv, i) {
 		struct i915_hw_context *dctx = ring->default_context;
 
 		/* Do a fake switch to the default context */
@@ -392,7 +392,8 @@ int i915_gem_context_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *ctx;
-	int i;
+	struct intel_ring_buffer *ring;
+	int unused;
 
 	/* Init should only be called once per module load. Eventually the
 	 * restriction on the context_disabled check can be loosened. */
@@ -416,8 +417,8 @@ int i915_gem_context_init(struct drm_device *dev)
 	}
 
 	/* NB: RCS will hold a ref for all rings */
-	for (i = 0; i < I915_NUM_RINGS; i++)
-		dev_priv->ring[i].default_context = ctx;
+	for_each_ring(ring, dev_priv, unused)
+		ring->default_context = ctx;
 
 	DRM_DEBUG_DRIVER("%s context support initialized\n", dev_priv->hw_context_size ? "HW" : "fake");
 	return 0;
@@ -427,7 +428,8 @@ void i915_gem_context_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *dctx = dev_priv->ring[RCS].default_context;
-	int i;
+	struct intel_ring_buffer *ring;
+	int unused;
 
 	if (dctx->obj) {
 		/* The only known way to stop the gpu from accessing the hw context is
@@ -451,9 +453,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 		}
 	}
 
-	for (i = 0; i < I915_NUM_RINGS; i++) {
-		struct intel_ring_buffer *ring = &dev_priv->ring[i];
-
+	for_each_ring(ring, dev_priv, unused) {
 		if (ring->last_context)
 			i915_gem_context_unreference(ring->last_context);
 
-- 
1.9.0

* [PATCH 04/50] drm/i915: Extract trivial parts of ring init (early init)
From: oscar.mateo @ 2014-05-09 12:08 UTC
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

It's beneficial to be able to get a name, base, and id before we've
actually initialized the rings. This ability was effectively destroyed
in the ringbuffer fire which Daniel started.

With the simple early init function, that ability is restored.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: The Full PPGTT series have moved things around a little bit.
Also, don't forget the VEBOX.

v3: Checking ring->dev is not a good way to test if a ring is
initialized...
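
A sketch of the reasoning (assuming intel_ring_initialized() checks the
ringbuffer backing object in this tree): once intel_init_rings_early() sets
ring->dev for every ring, only the backing object tells an initialized ring
apart.

	/* assumed shape of the existing helper, shown for context */
	static inline bool intel_ring_initialized(struct intel_ring_buffer *ring)
	{
		return ring->obj != NULL;
	}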

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |  2 ++
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 60 ++++++++++++++++++---------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 4 files changed, 37 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ce941cf..6ef53bd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4502,6 +4502,8 @@ int i915_gem_init(struct drm_device *dev)
 
 	i915_gem_init_global_gtt(dev);
 
+	intel_init_rings_early(dev);
+
 	ret = i915_gem_context_init(dev);
 	if (ret) {
 		mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2d81985..8f37238 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -886,7 +886,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 	for (i = 0; i < I915_NUM_RINGS; i++) {
 		struct intel_ring_buffer *ring = &dev_priv->ring[i];
 
-		if (ring->dev == NULL)
+		if (!intel_ring_initialized(ring))
 			continue;
 
 		error->ring[i].valid = true;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index a112971..fc737c8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1417,7 +1417,6 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 {
 	int ret;
 
-	ring->dev = dev;
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 	ring->size = 32 * PAGE_SIZE;
@@ -1908,10 +1907,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
 
-	ring->name = "render ring";
-	ring->id = RCS;
-	ring->mmio_base = RENDER_RING_BASE;
-
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
@@ -2019,10 +2014,6 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
 	int ret;
 
-	ring->name = "render ring";
-	ring->id = RCS;
-	ring->mmio_base = RENDER_RING_BASE;
-
 	if (INTEL_INFO(dev)->gen >= 6) {
 		/* non-kms not supported on gen6+ */
 		return -ENODEV;
@@ -2056,7 +2047,6 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 	ring->init = init_render_ring;
 	ring->cleanup = render_ring_cleanup;
 
-	ring->dev = dev;
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 
@@ -2086,12 +2076,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring = &dev_priv->ring[VCS];
 
-	ring->name = "bsd ring";
-	ring->id = VCS;
-
 	ring->write_tail = ring_write_tail;
 	if (INTEL_INFO(dev)->gen >= 6) {
-		ring->mmio_base = GEN6_BSD_RING_BASE;
 		/* gen6 bsd needs a special wa for tail updates */
 		if (IS_GEN6(dev))
 			ring->write_tail = gen6_bsd_ring_write_tail;
@@ -2132,7 +2118,6 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 		ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
 		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	} else {
-		ring->mmio_base = BSD_RING_BASE;
 		ring->flush = bsd_ring_flush;
 		ring->add_request = i9xx_add_request;
 		ring->get_seqno = ring_get_seqno;
@@ -2167,11 +2152,7 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 		return -EINVAL;
 	}
 
-	ring->name = "bds2_ring";
-	ring->id = VCS2;
-
 	ring->write_tail = ring_write_tail;
-	ring->mmio_base = GEN8_BSD2_RING_BASE;
 	ring->flush = gen6_bsd_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
@@ -2210,10 +2191,6 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring = &dev_priv->ring[BCS];
 
-	ring->name = "blitter ring";
-	ring->id = BCS;
-
-	ring->mmio_base = BLT_RING_BASE;
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
@@ -2259,10 +2236,6 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring = &dev_priv->ring[VECS];
 
-	ring->name = "video enhancement ring";
-	ring->id = VECS;
-
-	ring->mmio_base = VEBOX_RING_BASE;
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
@@ -2351,3 +2324,36 @@ intel_stop_ring_buffer(struct intel_ring_buffer *ring)
 
 	stop_ring(ring);
 }
+
+void intel_init_rings_early(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	dev_priv->ring[RCS].name = "render ring";
+	dev_priv->ring[RCS].id = RCS;
+	dev_priv->ring[RCS].mmio_base = RENDER_RING_BASE;
+	dev_priv->ring[RCS].dev = dev;
+
+	dev_priv->ring[BCS].name = "blitter ring";
+	dev_priv->ring[BCS].id = BCS;
+	dev_priv->ring[BCS].mmio_base = BLT_RING_BASE;
+	dev_priv->ring[BCS].dev = dev;
+
+	dev_priv->ring[VCS].name = "bsd ring";
+	dev_priv->ring[VCS].id = VCS;
+	if (INTEL_INFO(dev)->gen >= 6)
+		dev_priv->ring[VCS].mmio_base = GEN6_BSD_RING_BASE;
+	else
+		dev_priv->ring[VCS].mmio_base = BSD_RING_BASE;
+	dev_priv->ring[VCS].dev = dev;
+
+	dev_priv->ring[VCS2].name = "bsd2_ring";
+	dev_priv->ring[VCS2].id = VCS2;
+	dev_priv->ring[VCS2].mmio_base = GEN8_BSD2_RING_BASE;
+	dev_priv->ring[VCS2].dev = dev;
+
+	dev_priv->ring[VECS].name = "video enhancement ring";
+	dev_priv->ring[VECS].id = VECS;
+	dev_priv->ring[VECS].mmio_base = VEBOX_RING_BASE;
+	dev_priv->ring[VECS].dev = dev;
+}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 72c3c15..b1bf767 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -297,6 +297,7 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno);
 int intel_ring_flush_all_caches(struct intel_ring_buffer *ring);
 int intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring);
 
+void intel_init_rings_early(struct drm_device *dev);
 int intel_init_render_ring_buffer(struct drm_device *dev);
 int intel_init_bsd_ring_buffer(struct drm_device *dev);
 int intel_init_bsd2_ring_buffer(struct drm_device *dev);
-- 
1.9.0

* [PATCH 05/50] drm/i915: Extract ringbuffer destroy, make destroy & alloc outside accessible
From: oscar.mateo @ 2014-05-09 12:08 UTC
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

No functional changes.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 21 ++++++++++++++-------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  3 +++
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index fc737c8..5d61923 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1368,7 +1368,18 @@ static int init_phys_status_page(struct intel_ring_buffer *ring)
 	return 0;
 }
 
-static int allocate_ring_buffer(struct intel_ring_buffer *ring)
+void intel_destroy_ring_buffer(struct intel_ring_buffer *ring)
+{
+	if (!ring->obj)
+		return;
+
+	iounmap(ring->virtual_start);
+	i915_gem_object_ggtt_unpin(ring->obj);
+	drm_gem_object_unreference(&ring->obj->base);
+	ring->obj = NULL;
+}
+
+int intel_allocate_ring_buffer(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -1435,7 +1446,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 			return ret;
 	}
 
-	ret = allocate_ring_buffer(ring);
+	ret = intel_allocate_ring_buffer(ring);
 	if (ret) {
 		DRM_ERROR("Failed to allocate ringbuffer %s: %d\n", ring->name, ret);
 		return ret;
@@ -1464,11 +1475,7 @@ void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
 	intel_stop_ring_buffer(ring);
 	WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0);
 
-	iounmap(ring->virtual_start);
-
-	i915_gem_object_ggtt_unpin(ring->obj);
-	drm_gem_object_unreference(&ring->obj->base);
-	ring->obj = NULL;
+	intel_destroy_ring_buffer(ring);
 	ring->preallocated_lazy_request = NULL;
 	ring->outstanding_lazy_seqno = 0;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index b1bf767..680e451 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -307,6 +307,9 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev);
 u64 intel_ring_get_active_head(struct intel_ring_buffer *ring);
 void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
 
+void intel_destroy_ring_buffer(struct intel_ring_buffer *ring);
+int intel_allocate_ring_buffer(struct intel_ring_buffer *ring);
+
 static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
 {
 	return ring->tail;
-- 
1.9.0

* [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
From: oscar.mateo @ 2014-05-09 12:08 UTC
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

In the upcoming patches, we plan to break the correlation between
engines (a.k.a. rings) and ringbuffers, so it makes sense to
refactor the code and make the change obvious.

No functional changes.
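
For orientation, a rough sketch of where this is headed (assumed shapes; the
actual split lands in the following patches):

	/* assumed shapes, for illustration only */
	struct intel_ringbuffer {
		struct drm_i915_gem_object *obj;
		void __iomem *virtual_start;
		u32 head, tail, size;
	};

	struct intel_engine {
		const char *name;
		int id;
		u32 mmio_base;
		struct intel_ringbuffer *buffer;
	};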

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_cmd_parser.c     |  16 +--
 drivers/gpu/drm/i915/i915_debugfs.c        |  16 +--
 drivers/gpu/drm/i915/i915_dma.c            |  10 +-
 drivers/gpu/drm/i915/i915_drv.h            |  32 +++---
 drivers/gpu/drm/i915/i915_gem.c            |  58 +++++------
 drivers/gpu/drm/i915/i915_gem_context.c    |  14 +--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  18 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  18 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.h        |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |   6 +-
 drivers/gpu/drm/i915/i915_irq.c            |  28 ++---
 drivers/gpu/drm/i915/i915_trace.h          |  26 ++---
 drivers/gpu/drm/i915/intel_display.c       |  14 +--
 drivers/gpu/drm/i915/intel_drv.h           |   4 +-
 drivers/gpu/drm/i915/intel_overlay.c       |  12 +--
 drivers/gpu/drm/i915/intel_pm.c            |  10 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 158 ++++++++++++++---------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  76 +++++++-------
 18 files changed, 259 insertions(+), 259 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 69d34e4..3234d36 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -498,7 +498,7 @@ static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header)
 	return 0;
 }
 
-static bool validate_cmds_sorted(struct intel_ring_buffer *ring)
+static bool validate_cmds_sorted(struct intel_engine *ring)
 {
 	int i;
 	bool ret = true;
@@ -550,7 +550,7 @@ static bool check_sorted(int ring_id, const u32 *reg_table, int reg_count)
 	return ret;
 }
 
-static bool validate_regs_sorted(struct intel_ring_buffer *ring)
+static bool validate_regs_sorted(struct intel_engine *ring)
 {
 	return check_sorted(ring->id, ring->reg_table, ring->reg_count) &&
 		check_sorted(ring->id, ring->master_reg_table,
@@ -562,10 +562,10 @@ static bool validate_regs_sorted(struct intel_ring_buffer *ring)
  * @ring: the ringbuffer to initialize
  *
  * Optionally initializes fields related to batch buffer command parsing in the
- * struct intel_ring_buffer based on whether the platform requires software
+ * struct intel_engine based on whether the platform requires software
  * command parsing.
  */
-void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring)
+void i915_cmd_parser_init_ring(struct intel_engine *ring)
 {
 	if (!IS_GEN7(ring->dev))
 		return;
@@ -664,7 +664,7 @@ find_cmd_in_table(const struct drm_i915_cmd_table *table,
  * ring's default length encoding and returns default_desc.
  */
 static const struct drm_i915_cmd_descriptor*
-find_cmd(struct intel_ring_buffer *ring,
+find_cmd(struct intel_engine *ring,
 	 u32 cmd_header,
 	 struct drm_i915_cmd_descriptor *default_desc)
 {
@@ -744,7 +744,7 @@ finish:
  *
  * Return: true if the ring requires software command parsing
  */
-bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
+bool i915_needs_cmd_parser(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
@@ -763,7 +763,7 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
 	return (i915.enable_cmd_parser == 1);
 }
 
-static bool check_cmd(const struct intel_ring_buffer *ring,
+static bool check_cmd(const struct intel_engine *ring,
 		      const struct drm_i915_cmd_descriptor *desc,
 		      const u32 *cmd,
 		      const bool is_master,
@@ -865,7 +865,7 @@ static bool check_cmd(const struct intel_ring_buffer *ring,
  *
  * Return: non-zero if the parser finds violations or otherwise fails
  */
-int i915_parse_cmds(struct intel_ring_buffer *ring,
+int i915_parse_cmds(struct intel_engine *ring,
 		    struct drm_i915_gem_object *batch_obj,
 		    u32 batch_start_offset,
 		    bool is_master)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 103e62c..0052460 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -562,7 +562,7 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct drm_i915_gem_request *gem_request;
 	int ret, count, i;
 
@@ -594,7 +594,7 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 }
 
 static void i915_ring_seqno_info(struct seq_file *m,
-				 struct intel_ring_buffer *ring)
+				 struct intel_engine *ring)
 {
 	if (ring->get_seqno) {
 		seq_printf(m, "Current sequence (%s): %u\n",
@@ -607,7 +607,7 @@ static int i915_gem_seqno_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -630,7 +630,7 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i, pipe;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -800,7 +800,7 @@ static int i915_hws_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	const u32 *hws;
 	int i;
 
@@ -1677,7 +1677,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct i915_hw_context *ctx;
 	int ret, i;
 
@@ -1826,7 +1826,7 @@ static int per_file_ctx(int id, void *ptr, void *data)
 static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
 	int unused, i;
 
@@ -1850,7 +1850,7 @@ static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct drm_file *file;
 	int i;
 
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index d02c8de..5263d63 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -119,7 +119,7 @@ static void i915_write_hws_pga(struct drm_device *dev)
 static void i915_free_hws(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
+	struct intel_engine *ring = LP_RING(dev_priv);
 
 	if (dev_priv->status_page_dmah) {
 		drm_pci_free(dev, dev_priv->status_page_dmah);
@@ -139,7 +139,7 @@ void i915_kernel_lost_context(struct drm_device * dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_master_private *master_priv;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
+	struct intel_engine *ring = LP_RING(dev_priv);
 
 	/*
 	 * We should never lose context on the ring with modesetting
@@ -234,7 +234,7 @@ static int i915_initialize(struct drm_device * dev, drm_i915_init_t * init)
 static int i915_dma_resume(struct drm_device * dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
+	struct intel_engine *ring = LP_RING(dev_priv);
 
 	DRM_DEBUG_DRIVER("%s\n", __func__);
 
@@ -782,7 +782,7 @@ static int i915_wait_irq(struct drm_device * dev, int irq_nr)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_master_private *master_priv = dev->primary->master->driver_priv;
 	int ret = 0;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
+	struct intel_engine *ring = LP_RING(dev_priv);
 
 	DRM_DEBUG_DRIVER("irq_nr=%d breadcrumb=%d\n", irq_nr,
 		  READ_BREADCRUMB(dev_priv));
@@ -1073,7 +1073,7 @@ static int i915_set_status_page(struct drm_device *dev, void *data,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	drm_i915_hws_addr_t *hws = data;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 
 	if (drm_core_check_feature(dev, DRIVER_MODESET))
 		return -ENODEV;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b1725c6..3b7a36f9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -594,7 +594,7 @@ struct i915_hw_context {
 	bool is_initialized;
 	uint8_t remap_slice;
 	struct drm_i915_file_private *file_priv;
-	struct intel_ring_buffer *last_ring;
+	struct intel_engine *last_ring;
 	struct drm_i915_gem_object *obj;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_address_space *vm;
@@ -1354,7 +1354,7 @@ struct drm_i915_private {
 	wait_queue_head_t gmbus_wait_queue;
 
 	struct pci_dev *bridge_dev;
-	struct intel_ring_buffer ring[I915_NUM_RINGS];
+	struct intel_engine ring[I915_NUM_RINGS];
 	uint32_t last_seqno, next_seqno;
 
 	drm_dma_handle_t *status_page_dmah;
@@ -1675,7 +1675,7 @@ struct drm_i915_gem_object {
 	void *dma_buf_vmapping;
 	int vmapping_count;
 
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 
 	/** Breadcrumb of last rendering to the buffer. */
 	uint32_t last_read_seqno;
@@ -1714,7 +1714,7 @@ struct drm_i915_gem_object {
  */
 struct drm_i915_gem_request {
 	/** On Which ring this request was generated */
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 
 	/** GEM sequence number associated with this request. */
 	uint32_t seqno;
@@ -1755,7 +1755,7 @@ struct drm_i915_file_private {
 
 	struct i915_hw_context *private_default_ctx;
 	atomic_t rps_wait_boost;
-	struct  intel_ring_buffer *bsd_ring;
+	struct  intel_engine *bsd_ring;
 };
 
 /*
@@ -2182,9 +2182,9 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
-			 struct intel_ring_buffer *to);
+			 struct intel_engine *to);
 void i915_vma_move_to_active(struct i915_vma *vma,
-			     struct intel_ring_buffer *ring);
+			     struct intel_engine *ring);
 int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_device *dev,
 			 struct drm_mode_create_dumb *args);
@@ -2226,7 +2226,7 @@ i915_gem_object_unpin_fence(struct drm_i915_gem_object *obj)
 }
 
 struct drm_i915_gem_request *
-i915_gem_find_active_request(struct intel_ring_buffer *ring);
+i915_gem_find_active_request(struct intel_engine *ring);
 
 bool i915_gem_retire_requests(struct drm_device *dev);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
@@ -2264,18 +2264,18 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_init(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
-int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice);
+int i915_gem_l3_remap(struct intel_engine *ring, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
-int __i915_add_request(struct intel_ring_buffer *ring,
+int __i915_add_request(struct intel_engine *ring,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *batch_obj,
 		       u32 *seqno);
 #define i915_add_request(ring, seqno) \
 	__i915_add_request(ring, NULL, NULL, seqno)
-int __must_check i915_wait_seqno(struct intel_ring_buffer *ring,
+int __must_check i915_wait_seqno(struct intel_engine *ring,
 				 uint32_t seqno);
 int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
 int __must_check
@@ -2286,7 +2286,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
 int __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
-				     struct intel_ring_buffer *pipelined);
+				     struct intel_engine *pipelined);
 void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj);
 int i915_gem_attach_phys_object(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
@@ -2388,7 +2388,7 @@ void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
 int i915_gem_context_enable(struct drm_i915_private *dev_priv);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
-int i915_switch_context(struct intel_ring_buffer *ring,
+int i915_switch_context(struct intel_engine *ring,
 			struct i915_hw_context *to);
 struct i915_hw_context *
 i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id);
@@ -2497,9 +2497,9 @@ const char *i915_cache_level_str(int type);
 
 /* i915_cmd_parser.c */
 int i915_cmd_parser_get_version(void);
-void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring);
-bool i915_needs_cmd_parser(struct intel_ring_buffer *ring);
-int i915_parse_cmds(struct intel_ring_buffer *ring,
+void i915_cmd_parser_init_ring(struct intel_engine *ring);
+bool i915_needs_cmd_parser(struct intel_engine *ring);
+int i915_parse_cmds(struct intel_engine *ring,
 		    struct drm_i915_gem_object *batch_obj,
 		    u32 batch_start_offset,
 		    bool is_master);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6ef53bd..a3b697b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -64,7 +64,7 @@ static unsigned long i915_gem_inactive_scan(struct shrinker *shrinker,
 static unsigned long i915_gem_purge(struct drm_i915_private *dev_priv, long target);
 static unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv);
 static void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
-static void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
+static void i915_gem_retire_requests_ring(struct intel_engine *ring);
 
 static bool cpu_cache_is_coherent(struct drm_device *dev,
 				  enum i915_cache_level level)
@@ -977,7 +977,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * equal.
  */
 static int
-i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
+i915_gem_check_olr(struct intel_engine *ring, u32 seqno)
 {
 	int ret;
 
@@ -996,7 +996,7 @@ static void fake_irq(unsigned long data)
 }
 
 static bool missed_irq(struct drm_i915_private *dev_priv,
-		       struct intel_ring_buffer *ring)
+		       struct intel_engine *ring)
 {
 	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
 }
@@ -1027,7 +1027,7 @@ static bool can_wait_boost(struct drm_i915_file_private *file_priv)
  * Returns 0 if the seqno was found within the alloted time. Else returns the
  * errno with remaining time filled in timeout argument.
  */
-static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
+static int __wait_seqno(struct intel_engine *ring, u32 seqno,
 			unsigned reset_counter,
 			bool interruptible,
 			struct timespec *timeout,
@@ -1134,7 +1134,7 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
  * request and object lists appropriately for that event.
  */
 int
-i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno)
+i915_wait_seqno(struct intel_engine *ring, uint32_t seqno)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1159,7 +1159,7 @@ i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno)
 
 static int
 i915_gem_object_wait_rendering__tail(struct drm_i915_gem_object *obj,
-				     struct intel_ring_buffer *ring)
+				     struct intel_engine *ring)
 {
 	if (!obj->active)
 		return 0;
@@ -1184,7 +1184,7 @@ static __must_check int
 i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 			       bool readonly)
 {
-	struct intel_ring_buffer *ring = obj->ring;
+	struct intel_engine *ring = obj->ring;
 	u32 seqno;
 	int ret;
 
@@ -1209,7 +1209,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = obj->ring;
+	struct intel_engine *ring = obj->ring;
 	unsigned reset_counter;
 	u32 seqno;
 	int ret;
@@ -2011,7 +2011,7 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 
 static void
 i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
-			       struct intel_ring_buffer *ring)
+			       struct intel_engine *ring)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -2049,7 +2049,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 }
 
 void i915_vma_move_to_active(struct i915_vma *vma,
-			     struct intel_ring_buffer *ring)
+			     struct intel_engine *ring)
 {
 	list_move_tail(&vma->mm_list, &vma->vm->active_list);
 	return i915_gem_object_move_to_active(vma->obj, ring);
@@ -2090,7 +2090,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 static void
 i915_gem_object_retire(struct drm_i915_gem_object *obj)
 {
-	struct intel_ring_buffer *ring = obj->ring;
+	struct intel_engine *ring = obj->ring;
 
 	if (ring == NULL)
 		return;
@@ -2104,7 +2104,7 @@ static int
 i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i, j;
 
 	/* Carefully retire all requests without writing to the rings */
@@ -2170,7 +2170,7 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
 	return 0;
 }
 
-int __i915_add_request(struct intel_ring_buffer *ring,
+int __i915_add_request(struct intel_engine *ring,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *obj,
 		       u32 *out_seqno)
@@ -2330,7 +2330,7 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request)
 }
 
 struct drm_i915_gem_request *
-i915_gem_find_active_request(struct intel_ring_buffer *ring)
+i915_gem_find_active_request(struct intel_engine *ring)
 {
 	struct drm_i915_gem_request *request;
 	u32 completed_seqno;
@@ -2348,7 +2348,7 @@ i915_gem_find_active_request(struct intel_ring_buffer *ring)
 }
 
 static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv,
-				       struct intel_ring_buffer *ring)
+				       struct intel_engine *ring)
 {
 	struct drm_i915_gem_request *request;
 	bool ring_hung;
@@ -2367,7 +2367,7 @@ static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv,
 }
 
 static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
-					struct intel_ring_buffer *ring)
+					struct intel_engine *ring)
 {
 	while (!list_empty(&ring->active_list)) {
 		struct drm_i915_gem_object *obj;
@@ -2426,7 +2426,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	/*
@@ -2449,7 +2449,7 @@ void i915_gem_reset(struct drm_device *dev)
  * This function clears the request list as sequence numbers are passed.
  */
 static void
-i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
+i915_gem_retire_requests_ring(struct intel_engine *ring)
 {
 	uint32_t seqno;
 
@@ -2512,7 +2512,7 @@ bool
 i915_gem_retire_requests(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	bool idle = true;
 	int i;
 
@@ -2606,7 +2606,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_wait *args = data;
 	struct drm_i915_gem_object *obj;
-	struct intel_ring_buffer *ring = NULL;
+	struct intel_engine *ring = NULL;
 	struct timespec timeout_stack, *timeout = NULL;
 	unsigned reset_counter;
 	u32 seqno = 0;
@@ -2677,9 +2677,9 @@ out:
  */
 int
 i915_gem_object_sync(struct drm_i915_gem_object *obj,
-		     struct intel_ring_buffer *to)
+		     struct intel_engine *to)
 {
-	struct intel_ring_buffer *from = obj->ring;
+	struct intel_engine *from = obj->ring;
 	u32 seqno;
 	int ret, idx;
 
@@ -2800,7 +2800,7 @@ int i915_vma_unbind(struct i915_vma *vma)
 int i915_gpu_idle(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i;
 
 	/* Flush everything onto the inactive list. */
@@ -3659,7 +3659,7 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
 int
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
-				     struct intel_ring_buffer *pipelined)
+				     struct intel_engine *pipelined)
 {
 	u32 old_read_domains, old_write_domain;
 	int ret;
@@ -3812,7 +3812,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	unsigned long recent_enough = jiffies - msecs_to_jiffies(20);
 	struct drm_i915_gem_request *request;
-	struct intel_ring_buffer *ring = NULL;
+	struct intel_engine *ring = NULL;
 	unsigned reset_counter;
 	u32 seqno = 0;
 	int ret;
@@ -4258,7 +4258,7 @@ static void
 i915_gem_stop_ringbuffers(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	for_each_active_ring(ring, dev_priv, i)
@@ -4307,7 +4307,7 @@ err:
 	return ret;
 }
 
-int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
+int i915_gem_l3_remap(struct intel_engine *ring, int slice)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -4532,7 +4532,7 @@ void
 i915_gem_cleanup_ringbuffer(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	for_each_active_ring(ring, dev_priv, i)
@@ -4608,7 +4608,7 @@ i915_gem_lastclose(struct drm_device *dev)
 }
 
 static void
-init_ring_lists(struct intel_ring_buffer *ring)
+init_ring_lists(struct intel_engine *ring)
 {
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 014fb8f..4d37e20 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -359,7 +359,7 @@ err_destroy:
 void i915_gem_context_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	/* Prevent the hardware from restoring the last context (which hung) on
@@ -392,7 +392,7 @@ int i915_gem_context_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *ctx;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int unused;
 
 	/* Init should only be called once per module load. Eventually the
@@ -428,7 +428,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *dctx = dev_priv->ring[RCS].default_context;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int unused;
 
 	if (dctx->obj) {
@@ -467,7 +467,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 
 int i915_gem_context_enable(struct drm_i915_private *dev_priv)
 {
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i;
 
 	/* This is the only place the aliasing PPGTT gets enabled, which means
@@ -546,7 +546,7 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 }
 
 static inline int
-mi_set_context(struct intel_ring_buffer *ring,
+mi_set_context(struct intel_engine *ring,
 	       struct i915_hw_context *new_context,
 	       u32 hw_flags)
 {
@@ -596,7 +596,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 	return ret;
 }
 
-static int do_switch(struct intel_ring_buffer *ring,
+static int do_switch(struct intel_engine *ring,
 		     struct i915_hw_context *to)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -726,7 +726,7 @@ unpin_out:
  * it will have a refcount > 1. This allows us to destroy the context abstract
  * object while letting the normal object tracking destroy the backing BO.
  */
-int i915_switch_context(struct intel_ring_buffer *ring,
+int i915_switch_context(struct intel_engine *ring,
 			struct i915_hw_context *to)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 47fe8ec..95e797e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -541,7 +541,7 @@ need_reloc_mappable(struct i915_vma *vma)
 
 static int
 i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
-				struct intel_ring_buffer *ring,
+				struct intel_engine *ring,
 				bool *need_reloc)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
@@ -596,7 +596,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 }
 
 static int
-i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
+i915_gem_execbuffer_reserve(struct intel_engine *ring,
 			    struct list_head *vmas,
 			    bool *need_relocs)
 {
@@ -711,7 +711,7 @@ static int
 i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 				  struct drm_i915_gem_execbuffer2 *args,
 				  struct drm_file *file,
-				  struct intel_ring_buffer *ring,
+				  struct intel_engine *ring,
 				  struct eb_vmas *eb,
 				  struct drm_i915_gem_exec_object2 *exec)
 {
@@ -827,7 +827,7 @@ err:
 }
 
 static int
-i915_gem_execbuffer_move_to_gpu(struct intel_ring_buffer *ring,
+i915_gem_execbuffer_move_to_gpu(struct intel_engine *ring,
 				struct list_head *vmas)
 {
 	struct i915_vma *vma;
@@ -912,7 +912,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
 
 static struct i915_hw_context *
 i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
-			  struct intel_ring_buffer *ring, const u32 ctx_id)
+			  struct intel_engine *ring, const u32 ctx_id)
 {
 	struct i915_hw_context *ctx = NULL;
 	struct i915_ctx_hang_stats *hs;
@@ -935,7 +935,7 @@ i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
 
 static void
 i915_gem_execbuffer_move_to_active(struct list_head *vmas,
-				   struct intel_ring_buffer *ring)
+				   struct intel_engine *ring)
 {
 	struct i915_vma *vma;
 
@@ -970,7 +970,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 static void
 i915_gem_execbuffer_retire_commands(struct drm_device *dev,
 				    struct drm_file *file,
-				    struct intel_ring_buffer *ring,
+				    struct intel_engine *ring,
 				    struct drm_i915_gem_object *obj)
 {
 	/* Unconditionally force add_request to emit a full flush. */
@@ -982,7 +982,7 @@ i915_gem_execbuffer_retire_commands(struct drm_device *dev,
 
 static int
 i915_reset_gen7_sol_offsets(struct drm_device *dev,
-			    struct intel_ring_buffer *ring)
+			    struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret, i;
@@ -1048,7 +1048,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct eb_vmas *eb;
 	struct drm_i915_gem_object *batch_obj;
 	struct drm_clip_rect *cliprects = NULL;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct i915_hw_context *ctx;
 	struct i915_address_space *vm;
 	const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1dff805..31b58ee 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -207,7 +207,7 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr,
 }
 
 /* Broadwell Page Directory Pointer Descriptors */
-static int gen8_write_pdp(struct intel_ring_buffer *ring, unsigned entry,
+static int gen8_write_pdp(struct intel_engine *ring, unsigned entry,
 			   uint64_t val, bool synchronous)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -237,7 +237,7 @@ static int gen8_write_pdp(struct intel_ring_buffer *ring, unsigned entry,
 }
 
 static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_ring_buffer *ring,
+			  struct intel_engine *ring,
 			  bool synchronous)
 {
 	int i, ret;
@@ -716,7 +716,7 @@ static uint32_t get_pd_offset(struct i915_hw_ppgtt *ppgtt)
 }
 
 static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			 struct intel_ring_buffer *ring,
+			 struct intel_engine *ring,
 			 bool synchronous)
 {
 	struct drm_device *dev = ppgtt->base.dev;
@@ -760,7 +760,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_ring_buffer *ring,
+			  struct intel_engine *ring,
 			  bool synchronous)
 {
 	struct drm_device *dev = ppgtt->base.dev;
@@ -811,7 +811,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static int gen6_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_ring_buffer *ring,
+			  struct intel_engine *ring,
 			  bool synchronous)
 {
 	struct drm_device *dev = ppgtt->base.dev;
@@ -832,7 +832,7 @@ static int gen8_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int j, ret;
 
 	for_each_active_ring(ring, dev_priv, j) {
@@ -862,7 +862,7 @@ static int gen7_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	uint32_t ecochk, ecobits;
 	int i;
 
@@ -901,7 +901,7 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	uint32_t ecochk, gab_ctl, ecobits;
 	int i;
 
@@ -1269,7 +1269,7 @@ static void undo_idling(struct drm_i915_private *dev_priv, bool interruptible)
 void i915_check_and_clear_faults(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	if (INTEL_INFO(dev)->gen < 6)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index cfca023..0775662 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -261,7 +261,7 @@ struct i915_hw_ppgtt {
 
 	int (*enable)(struct i915_hw_ppgtt *ppgtt);
 	int (*switch_mm)(struct i915_hw_ppgtt *ppgtt,
-			 struct intel_ring_buffer *ring,
+			 struct intel_engine *ring,
 			 bool synchronous);
 	void (*debug_dump)(struct i915_hw_ppgtt *ppgtt, struct seq_file *m);
 };
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 8f37238..0853db3 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -745,7 +745,7 @@ static void i915_gem_record_fences(struct drm_device *dev,
 }
 
 static void i915_record_ring_state(struct drm_device *dev,
-				   struct intel_ring_buffer *ring,
+				   struct intel_engine *ring,
 				   struct drm_i915_error_ring *ering)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -857,7 +857,7 @@ static void i915_record_ring_state(struct drm_device *dev,
 }
 
 
-static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
+static void i915_gem_record_active_context(struct intel_engine *ring,
 					   struct drm_i915_error_state *error,
 					   struct drm_i915_error_ring *ering)
 {
@@ -884,7 +884,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 	int i, count;
 
 	for (i = 0; i < I915_NUM_RINGS; i++) {
-		struct intel_ring_buffer *ring = &dev_priv->ring[i];
+		struct intel_engine *ring = &dev_priv->ring[i];
 
 		if (!intel_ring_initialized(ring))
 			continue;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4a8e8cb..58c8812 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1077,7 +1077,7 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 }
 
 static void notify_ring(struct drm_device *dev,
-			struct intel_ring_buffer *ring)
+			struct intel_engine *ring)
 {
 	if (ring->obj == NULL)
 		return;
@@ -2111,7 +2111,7 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 			       bool reset_completed)
 {
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	/*
@@ -2544,14 +2544,14 @@ static void gen8_disable_vblank(struct drm_device *dev, int pipe)
 }
 
 static u32
-ring_last_seqno(struct intel_ring_buffer *ring)
+ring_last_seqno(struct intel_engine *ring)
 {
 	return list_entry(ring->request_list.prev,
 			  struct drm_i915_gem_request, list)->seqno;
 }
 
 static bool
-ring_idle(struct intel_ring_buffer *ring, u32 seqno)
+ring_idle(struct intel_engine *ring, u32 seqno)
 {
 	return (list_empty(&ring->request_list) ||
 		i915_seqno_passed(seqno, ring_last_seqno(ring)));
@@ -2574,11 +2574,11 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
 	}
 }
 
-static struct intel_ring_buffer *
-semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
+static struct intel_engine *
+semaphore_wait_to_signaller_ring(struct intel_engine *ring, u32 ipehr)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct intel_ring_buffer *signaller;
+	struct intel_engine *signaller;
 	int i;
 
 	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
@@ -2606,8 +2606,8 @@ semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
 	return NULL;
 }
 
-static struct intel_ring_buffer *
-semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
+static struct intel_engine *
+semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u32 cmd, ipehr, head;
@@ -2649,10 +2649,10 @@ semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
 	return semaphore_wait_to_signaller_ring(ring, ipehr);
 }
 
-static int semaphore_passed(struct intel_ring_buffer *ring)
+static int semaphore_passed(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct intel_ring_buffer *signaller;
+	struct intel_engine *signaller;
 	u32 seqno, ctl;
 
 	ring->hangcheck.deadlock = true;
@@ -2671,7 +2671,7 @@ static int semaphore_passed(struct intel_ring_buffer *ring)
 
 static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
 {
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	for_each_active_ring(ring, dev_priv, i)
@@ -2679,7 +2679,7 @@ static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
 }
 
 static enum intel_ring_hangcheck_action
-ring_stuck(struct intel_ring_buffer *ring, u64 acthd)
+ring_stuck(struct intel_engine *ring, u64 acthd)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -2735,7 +2735,7 @@ static void i915_hangcheck_elapsed(unsigned long data)
 {
 	struct drm_device *dev = (struct drm_device *)data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 	int busy_count = 0, rings_hung = 0;
 	bool stuck[I915_NUM_RINGS] = { 0 };
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index b29d7b1..a4f9e62 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -326,8 +326,8 @@ TRACE_EVENT(i915_gem_evict_vm,
 );
 
 TRACE_EVENT(i915_gem_ring_sync_to,
-	    TP_PROTO(struct intel_ring_buffer *from,
-		     struct intel_ring_buffer *to,
+	    TP_PROTO(struct intel_engine *from,
+		     struct intel_engine *to,
 		     u32 seqno),
 	    TP_ARGS(from, to, seqno),
 
@@ -352,7 +352,7 @@ TRACE_EVENT(i915_gem_ring_sync_to,
 );
 
 TRACE_EVENT(i915_gem_ring_dispatch,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno, u32 flags),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno, u32 flags),
 	    TP_ARGS(ring, seqno, flags),
 
 	    TP_STRUCT__entry(
@@ -375,7 +375,7 @@ TRACE_EVENT(i915_gem_ring_dispatch,
 );
 
 TRACE_EVENT(i915_gem_ring_flush,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 invalidate, u32 flush),
+	    TP_PROTO(struct intel_engine *ring, u32 invalidate, u32 flush),
 	    TP_ARGS(ring, invalidate, flush),
 
 	    TP_STRUCT__entry(
@@ -398,7 +398,7 @@ TRACE_EVENT(i915_gem_ring_flush,
 );
 
 DECLARE_EVENT_CLASS(i915_gem_request,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno),
 
 	    TP_STRUCT__entry(
@@ -418,12 +418,12 @@ DECLARE_EVENT_CLASS(i915_gem_request,
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno)
 );
 
 TRACE_EVENT(i915_gem_request_complete,
-	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_PROTO(struct intel_engine *ring),
 	    TP_ARGS(ring),
 
 	    TP_STRUCT__entry(
@@ -443,12 +443,12 @@ TRACE_EVENT(i915_gem_request_complete,
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_retire,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno)
 );
 
 TRACE_EVENT(i915_gem_request_wait_begin,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno),
 
 	    TP_STRUCT__entry(
@@ -477,12 +477,12 @@ TRACE_EVENT(i915_gem_request_wait_begin,
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_wait_end,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno)
 );
 
 DECLARE_EVENT_CLASS(i915_ring,
-	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_PROTO(struct intel_engine *ring),
 	    TP_ARGS(ring),
 
 	    TP_STRUCT__entry(
@@ -499,12 +499,12 @@ DECLARE_EVENT_CLASS(i915_ring,
 );
 
 DEFINE_EVENT(i915_ring, i915_ring_wait_begin,
-	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_PROTO(struct intel_engine *ring),
 	    TP_ARGS(ring)
 );
 
 DEFINE_EVENT(i915_ring, i915_ring_wait_end,
-	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_PROTO(struct intel_engine *ring),
 	    TP_ARGS(ring)
 );
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index c65e7f7..f821147 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1944,7 +1944,7 @@ static int intel_align_height(struct drm_device *dev, int height, bool tiled)
 int
 intel_pin_and_fence_fb_obj(struct drm_device *dev,
 			   struct drm_i915_gem_object *obj,
-			   struct intel_ring_buffer *pipelined)
+			   struct intel_engine *pipelined)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 alignment;
@@ -8424,7 +8424,7 @@ out:
 }
 
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
-			struct intel_ring_buffer *ring)
+			struct intel_engine *ring)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_crtc *crtc;
@@ -8582,7 +8582,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	u32 flip_mask;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
@@ -8627,7 +8627,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	u32 flip_mask;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
@@ -8669,7 +8669,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	uint32_t pf, pipesrc;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
@@ -8717,7 +8717,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	uint32_t pf, pipesrc;
 	int ret;
 
@@ -8762,7 +8762,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	uint32_t plane_bit = 0;
 	int len, ret;
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index d8b540b..23b5abf 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -694,7 +694,7 @@ int intel_pch_rawclk(struct drm_device *dev);
 int valleyview_cur_cdclk(struct drm_i915_private *dev_priv);
 void intel_mark_busy(struct drm_device *dev);
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
-			struct intel_ring_buffer *ring);
+			struct intel_engine *ring);
 void intel_mark_idle(struct drm_device *dev);
 void intel_crtc_restore_mode(struct drm_crtc *crtc);
 void intel_crtc_update_dpms(struct drm_crtc *crtc);
@@ -726,7 +726,7 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
 				    struct intel_load_detect_pipe *old);
 int intel_pin_and_fence_fb_obj(struct drm_device *dev,
 			       struct drm_i915_gem_object *obj,
-			       struct intel_ring_buffer *pipelined);
+			       struct intel_engine *pipelined);
 void intel_unpin_fb_obj(struct drm_i915_gem_object *obj);
 struct drm_framebuffer *
 __intel_framebuffer_create(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index d8adc91..965eec1 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -213,7 +213,7 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
 {
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	BUG_ON(overlay->last_flip_req);
@@ -236,7 +236,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
 {
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	BUG_ON(overlay->active);
@@ -263,7 +263,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
 {
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	u32 flip_addr = overlay->flip_addr;
 	u32 tmp;
 	int ret;
@@ -320,7 +320,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
 {
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	u32 flip_addr = overlay->flip_addr;
 	int ret;
 
@@ -363,7 +363,7 @@ static int intel_overlay_recover_from_interrupt(struct intel_overlay *overlay)
 {
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	if (overlay->last_flip_req == 0)
@@ -389,7 +389,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
 {
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	/* Only wait if there is actually an old frame to release to
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index acfded3..17f636e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3379,7 +3379,7 @@ static void parse_rp_state_cap(struct drm_i915_private *dev_priv, u32 rp_state_c
 static void gen8_enable_rps(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	uint32_t rc6_mask = 0, rp_state_cap;
 	int unused;
 
@@ -3454,7 +3454,7 @@ static void gen8_enable_rps(struct drm_device *dev)
 static void gen6_enable_rps(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	u32 rp_state_cap;
 	u32 gt_perf_status;
 	u32 rc6vids, pcu_mbox = 0, rc6_mask = 0;
@@ -3783,7 +3783,7 @@ static void valleyview_cleanup_gt_powersave(struct drm_device *dev)
 static void valleyview_enable_rps(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	u32 gtfifodbg, val, rc6_mode = 0;
 	int i;
 
@@ -3914,7 +3914,7 @@ static int ironlake_setup_rc6(struct drm_device *dev)
 static void ironlake_enable_rc6(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	bool was_interruptible;
 	int ret;
 
@@ -4426,7 +4426,7 @@ EXPORT_SYMBOL_GPL(i915_gpu_lower);
 bool i915_gpu_busy(void)
 {
 	struct drm_i915_private *dev_priv;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	bool ret = false;
 	int i;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5d61923..4c3cc44 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -40,7 +40,7 @@
  */
 #define CACHELINE_BYTES 64
 
-static inline int ring_space(struct intel_ring_buffer *ring)
+static inline int ring_space(struct intel_engine *ring)
 {
 	int space = (ring->head & HEAD_ADDR) - (ring->tail + I915_RING_FREE_SPACE);
 	if (space < 0)
@@ -48,13 +48,13 @@ static inline int ring_space(struct intel_ring_buffer *ring)
 	return space;
 }
 
-static bool intel_ring_stopped(struct intel_ring_buffer *ring)
+static bool intel_ring_stopped(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	return dev_priv->gpu_error.stop_rings & intel_ring_flag(ring);
 }
 
-void __intel_ring_advance(struct intel_ring_buffer *ring)
+void __intel_ring_advance(struct intel_engine *ring)
 {
 	ring->tail &= ring->size - 1;
 	if (intel_ring_stopped(ring))
@@ -63,7 +63,7 @@ void __intel_ring_advance(struct intel_ring_buffer *ring)
 }
 
 static int
-gen2_render_ring_flush(struct intel_ring_buffer *ring,
+gen2_render_ring_flush(struct intel_engine *ring,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -89,7 +89,7 @@ gen2_render_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-gen4_render_ring_flush(struct intel_ring_buffer *ring,
+gen4_render_ring_flush(struct intel_engine *ring,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -184,7 +184,7 @@ gen4_render_ring_flush(struct intel_ring_buffer *ring,
  * really our business.  That leaves only stall at scoreboard.
  */
 static int
-intel_emit_post_sync_nonzero_flush(struct intel_ring_buffer *ring)
+intel_emit_post_sync_nonzero_flush(struct intel_engine *ring)
 {
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
@@ -219,7 +219,7 @@ intel_emit_post_sync_nonzero_flush(struct intel_ring_buffer *ring)
 }
 
 static int
-gen6_render_ring_flush(struct intel_ring_buffer *ring,
+gen6_render_ring_flush(struct intel_engine *ring,
                          u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -271,7 +271,7 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
+gen7_render_ring_cs_stall_wa(struct intel_engine *ring)
 {
 	int ret;
 
@@ -289,7 +289,7 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
-static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
+static int gen7_ring_fbc_flush(struct intel_engine *ring, u32 value)
 {
 	int ret;
 
@@ -313,7 +313,7 @@ static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
 }
 
 static int
-gen7_render_ring_flush(struct intel_ring_buffer *ring,
+gen7_render_ring_flush(struct intel_engine *ring,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -374,7 +374,7 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-gen8_render_ring_flush(struct intel_ring_buffer *ring,
+gen8_render_ring_flush(struct intel_engine *ring,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -414,14 +414,14 @@ gen8_render_ring_flush(struct intel_ring_buffer *ring,
 
 }
 
-static void ring_write_tail(struct intel_ring_buffer *ring,
+static void ring_write_tail(struct intel_engine *ring,
 			    u32 value)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	I915_WRITE_TAIL(ring, value);
 }
 
-u64 intel_ring_get_active_head(struct intel_ring_buffer *ring)
+u64 intel_ring_get_active_head(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u64 acthd;
@@ -437,7 +437,7 @@ u64 intel_ring_get_active_head(struct intel_ring_buffer *ring)
 	return acthd;
 }
 
-static void ring_setup_phys_status_page(struct intel_ring_buffer *ring)
+static void ring_setup_phys_status_page(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u32 addr;
@@ -448,7 +448,7 @@ static void ring_setup_phys_status_page(struct intel_ring_buffer *ring)
 	I915_WRITE(HWS_PGA, addr);
 }
 
-static bool stop_ring(struct intel_ring_buffer *ring)
+static bool stop_ring(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = to_i915(ring->dev);
 
@@ -472,7 +472,7 @@ static bool stop_ring(struct intel_ring_buffer *ring)
 	return (I915_READ_HEAD(ring) & HEAD_ADDR) == 0;
 }
 
-static int init_ring_common(struct intel_ring_buffer *ring)
+static int init_ring_common(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -550,7 +550,7 @@ out:
 }
 
 static int
-init_pipe_control(struct intel_ring_buffer *ring)
+init_pipe_control(struct intel_engine *ring)
 {
 	int ret;
 
@@ -591,7 +591,7 @@ err:
 	return ret;
 }
 
-static int init_render_ring(struct intel_ring_buffer *ring)
+static int init_render_ring(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -647,7 +647,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 	return ret;
 }
 
-static void render_ring_cleanup(struct intel_ring_buffer *ring)
+static void render_ring_cleanup(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 
@@ -663,12 +663,12 @@ static void render_ring_cleanup(struct intel_ring_buffer *ring)
 	ring->scratch.obj = NULL;
 }
 
-static int gen6_signal(struct intel_ring_buffer *signaller,
+static int gen6_signal(struct intel_engine *signaller,
 		       unsigned int num_dwords)
 {
 	struct drm_device *dev = signaller->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *useless;
+	struct intel_engine *useless;
 	int i, ret;
 
 	/* NB: In order to be able to do semaphore MBOX updates for varying
@@ -713,7 +713,7 @@ static int gen6_signal(struct intel_ring_buffer *signaller,
  * This acts like a signal in the canonical semaphore.
  */
 static int
-gen6_add_request(struct intel_ring_buffer *ring)
+gen6_add_request(struct intel_engine *ring)
 {
 	int ret;
 
@@ -745,8 +745,8 @@ static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
  * @seqno - seqno which the waiter will block on
  */
 static int
-gen6_ring_sync(struct intel_ring_buffer *waiter,
-	       struct intel_ring_buffer *signaller,
+gen6_ring_sync(struct intel_engine *waiter,
+	       struct intel_engine *signaller,
 	       u32 seqno)
 {
 	u32 dw1 = MI_SEMAPHORE_MBOX |
@@ -794,7 +794,7 @@ do {									\
 } while (0)
 
 static int
-pc_render_add_request(struct intel_ring_buffer *ring)
+pc_render_add_request(struct intel_engine *ring)
 {
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
@@ -842,7 +842,7 @@ pc_render_add_request(struct intel_ring_buffer *ring)
 }
 
 static u32
-gen6_ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
+gen6_ring_get_seqno(struct intel_engine *ring, bool lazy_coherency)
 {
 	/* Workaround to force correct ordering between irq and seqno writes on
 	 * ivb (and maybe also on snb) by reading from a CS register (like
@@ -856,31 +856,31 @@ gen6_ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
 }
 
 static u32
-ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
+ring_get_seqno(struct intel_engine *ring, bool lazy_coherency)
 {
 	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
 static void
-ring_set_seqno(struct intel_ring_buffer *ring, u32 seqno)
+ring_set_seqno(struct intel_engine *ring, u32 seqno)
 {
 	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
 }
 
 static u32
-pc_render_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
+pc_render_get_seqno(struct intel_engine *ring, bool lazy_coherency)
 {
 	return ring->scratch.cpu_page[0];
 }
 
 static void
-pc_render_set_seqno(struct intel_ring_buffer *ring, u32 seqno)
+pc_render_set_seqno(struct intel_engine *ring, u32 seqno)
 {
 	ring->scratch.cpu_page[0] = seqno;
 }
 
 static bool
-gen5_ring_get_irq(struct intel_ring_buffer *ring)
+gen5_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -898,7 +898,7 @@ gen5_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-gen5_ring_put_irq(struct intel_ring_buffer *ring)
+gen5_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -911,7 +911,7 @@ gen5_ring_put_irq(struct intel_ring_buffer *ring)
 }
 
 static bool
-i9xx_ring_get_irq(struct intel_ring_buffer *ring)
+i9xx_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -932,7 +932,7 @@ i9xx_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-i9xx_ring_put_irq(struct intel_ring_buffer *ring)
+i9xx_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -948,7 +948,7 @@ i9xx_ring_put_irq(struct intel_ring_buffer *ring)
 }
 
 static bool
-i8xx_ring_get_irq(struct intel_ring_buffer *ring)
+i8xx_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -969,7 +969,7 @@ i8xx_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-i8xx_ring_put_irq(struct intel_ring_buffer *ring)
+i8xx_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -984,7 +984,7 @@ i8xx_ring_put_irq(struct intel_ring_buffer *ring)
 	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
 }
 
-void intel_ring_setup_status_page(struct intel_ring_buffer *ring)
+void intel_ring_setup_status_page(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -1047,7 +1047,7 @@ void intel_ring_setup_status_page(struct intel_ring_buffer *ring)
 }
 
 static int
-bsd_ring_flush(struct intel_ring_buffer *ring,
+bsd_ring_flush(struct intel_engine *ring,
 	       u32     invalidate_domains,
 	       u32     flush_domains)
 {
@@ -1064,7 +1064,7 @@ bsd_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-i9xx_add_request(struct intel_ring_buffer *ring)
+i9xx_add_request(struct intel_engine *ring)
 {
 	int ret;
 
@@ -1082,7 +1082,7 @@ i9xx_add_request(struct intel_ring_buffer *ring)
 }
 
 static bool
-gen6_ring_get_irq(struct intel_ring_buffer *ring)
+gen6_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1107,7 +1107,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-gen6_ring_put_irq(struct intel_ring_buffer *ring)
+gen6_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1125,7 +1125,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
 }
 
 static bool
-hsw_vebox_get_irq(struct intel_ring_buffer *ring)
+hsw_vebox_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1145,7 +1145,7 @@ hsw_vebox_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-hsw_vebox_put_irq(struct intel_ring_buffer *ring)
+hsw_vebox_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1163,7 +1163,7 @@ hsw_vebox_put_irq(struct intel_ring_buffer *ring)
 }
 
 static bool
-gen8_ring_get_irq(struct intel_ring_buffer *ring)
+gen8_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1189,7 +1189,7 @@ gen8_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-gen8_ring_put_irq(struct intel_ring_buffer *ring)
+gen8_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1209,7 +1209,7 @@ gen8_ring_put_irq(struct intel_ring_buffer *ring)
 }
 
 static int
-i965_dispatch_execbuffer(struct intel_ring_buffer *ring,
+i965_dispatch_execbuffer(struct intel_engine *ring,
 			 u64 offset, u32 length,
 			 unsigned flags)
 {
@@ -1232,7 +1232,7 @@ i965_dispatch_execbuffer(struct intel_ring_buffer *ring,
 /* Just userspace ABI convention to limit the wa batch bo to a reasonable size */
 #define I830_BATCH_LIMIT (256*1024)
 static int
-i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
+i830_dispatch_execbuffer(struct intel_engine *ring,
 				u64 offset, u32 len,
 				unsigned flags)
 {
@@ -1283,7 +1283,7 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
 }
 
 static int
-i915_dispatch_execbuffer(struct intel_ring_buffer *ring,
+i915_dispatch_execbuffer(struct intel_engine *ring,
 			 u64 offset, u32 len,
 			 unsigned flags)
 {
@@ -1300,7 +1300,7 @@ i915_dispatch_execbuffer(struct intel_ring_buffer *ring,
 	return 0;
 }
 
-static void cleanup_status_page(struct intel_ring_buffer *ring)
+static void cleanup_status_page(struct intel_engine *ring)
 {
 	struct drm_i915_gem_object *obj;
 
@@ -1314,7 +1314,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
 	ring->status_page.obj = NULL;
 }
 
-static int init_status_page(struct intel_ring_buffer *ring)
+static int init_status_page(struct intel_engine *ring)
 {
 	struct drm_i915_gem_object *obj;
 
@@ -1351,7 +1351,7 @@ err_unref:
 	return 0;
 }
 
-static int init_phys_status_page(struct intel_ring_buffer *ring)
+static int init_phys_status_page(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
@@ -1368,7 +1368,7 @@ static int init_phys_status_page(struct intel_ring_buffer *ring)
 	return 0;
 }
 
-void intel_destroy_ring_buffer(struct intel_ring_buffer *ring)
+void intel_destroy_ring_buffer(struct intel_engine *ring)
 {
 	if (!ring->obj)
 		return;
@@ -1379,7 +1379,7 @@ void intel_destroy_ring_buffer(struct intel_ring_buffer *ring)
 	ring->obj = NULL;
 }
 
-int intel_allocate_ring_buffer(struct intel_ring_buffer *ring)
+int intel_allocate_ring_buffer(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -1424,7 +1424,7 @@ err_unref:
 }
 
 static int intel_init_ring_buffer(struct drm_device *dev,
-				  struct intel_ring_buffer *ring)
+				  struct intel_engine *ring)
 {
 	int ret;
 
@@ -1465,7 +1465,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	return ring->init(ring);
 }
 
-void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
+void intel_cleanup_ring_buffer(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = to_i915(ring->dev);
 
@@ -1485,7 +1485,7 @@ void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
 	cleanup_status_page(ring);
 }
 
-static int intel_ring_wait_request(struct intel_ring_buffer *ring, int n)
+static int intel_ring_wait_request(struct intel_engine *ring, int n)
 {
 	struct drm_i915_gem_request *request;
 	u32 seqno = 0, tail;
@@ -1538,7 +1538,7 @@ static int intel_ring_wait_request(struct intel_ring_buffer *ring, int n)
 	return 0;
 }
 
-static int ring_wait_for_space(struct intel_ring_buffer *ring, int n)
+static int ring_wait_for_space(struct intel_engine *ring, int n)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1586,7 +1586,7 @@ static int ring_wait_for_space(struct intel_ring_buffer *ring, int n)
 	return -EBUSY;
 }
 
-static int intel_wrap_ring_buffer(struct intel_ring_buffer *ring)
+static int intel_wrap_ring_buffer(struct intel_engine *ring)
 {
 	uint32_t __iomem *virt;
 	int rem = ring->size - ring->tail;
@@ -1608,7 +1608,7 @@ static int intel_wrap_ring_buffer(struct intel_ring_buffer *ring)
 	return 0;
 }
 
-int intel_ring_idle(struct intel_ring_buffer *ring)
+int intel_ring_idle(struct intel_engine *ring)
 {
 	u32 seqno;
 	int ret;
@@ -1632,7 +1632,7 @@ int intel_ring_idle(struct intel_ring_buffer *ring)
 }
 
 static int
-intel_ring_alloc_seqno(struct intel_ring_buffer *ring)
+intel_ring_alloc_seqno(struct intel_engine *ring)
 {
 	if (ring->outstanding_lazy_seqno)
 		return 0;
@@ -1650,7 +1650,7 @@ intel_ring_alloc_seqno(struct intel_ring_buffer *ring)
 	return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno);
 }
 
-static int __intel_ring_prepare(struct intel_ring_buffer *ring,
+static int __intel_ring_prepare(struct intel_engine *ring,
 				int bytes)
 {
 	int ret;
@@ -1670,7 +1670,7 @@ static int __intel_ring_prepare(struct intel_ring_buffer *ring,
 	return 0;
 }
 
-int intel_ring_begin(struct intel_ring_buffer *ring,
+int intel_ring_begin(struct intel_engine *ring,
 		     int num_dwords)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -1695,7 +1695,7 @@ int intel_ring_begin(struct intel_ring_buffer *ring,
 }
 
 /* Align the ring tail to a cacheline boundary */
-int intel_ring_cacheline_align(struct intel_ring_buffer *ring)
+int intel_ring_cacheline_align(struct intel_engine *ring)
 {
 	int num_dwords = (ring->tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t);
 	int ret;
@@ -1716,7 +1716,7 @@ int intel_ring_cacheline_align(struct intel_ring_buffer *ring)
 	return 0;
 }
 
-void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno)
+void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
@@ -1733,7 +1733,7 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno)
 	ring->hangcheck.seqno = seqno;
 }
 
-static void gen6_bsd_ring_write_tail(struct intel_ring_buffer *ring,
+static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
 				     u32 value)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -1766,7 +1766,7 @@ static void gen6_bsd_ring_write_tail(struct intel_ring_buffer *ring,
 		   _MASKED_BIT_DISABLE(GEN6_BSD_SLEEP_MSG_DISABLE));
 }
 
-static int gen6_bsd_ring_flush(struct intel_ring_buffer *ring,
+static int gen6_bsd_ring_flush(struct intel_engine *ring,
 			       u32 invalidate, u32 flush)
 {
 	uint32_t cmd;
@@ -1802,7 +1802,7 @@ static int gen6_bsd_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-gen8_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
+gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 			      u64 offset, u32 len,
 			      unsigned flags)
 {
@@ -1826,7 +1826,7 @@ gen8_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 }
 
 static int
-hsw_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
+hsw_ring_dispatch_execbuffer(struct intel_engine *ring,
 			      u64 offset, u32 len,
 			      unsigned flags)
 {
@@ -1847,7 +1847,7 @@ hsw_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 }
 
 static int
-gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
+gen6_ring_dispatch_execbuffer(struct intel_engine *ring,
 			      u64 offset, u32 len,
 			      unsigned flags)
 {
@@ -1869,7 +1869,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 
 /* Blitter support (SandyBridge+) */
 
-static int gen6_ring_flush(struct intel_ring_buffer *ring,
+static int gen6_ring_flush(struct intel_engine *ring,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
@@ -1912,7 +1912,7 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
 int intel_init_render_ring_buffer(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
@@ -2018,7 +2018,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	if (INTEL_INFO(dev)->gen >= 6) {
@@ -2081,7 +2081,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 int intel_init_bsd_ring_buffer(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[VCS];
+	struct intel_engine *ring = &dev_priv->ring[VCS];
 
 	ring->write_tail = ring_write_tail;
 	if (INTEL_INFO(dev)->gen >= 6) {
@@ -2152,7 +2152,7 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[VCS2];
+	struct intel_engine *ring = &dev_priv->ring[VCS2];
 
 	if (INTEL_INFO(dev)->gen != 8) {
 		DRM_ERROR("No dual-BSD ring on non-BDW machine\n");
@@ -2196,7 +2196,7 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 int intel_init_blt_ring_buffer(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[BCS];
+	struct intel_engine *ring = &dev_priv->ring[BCS];
 
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
@@ -2241,7 +2241,7 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 int intel_init_vebox_ring_buffer(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[VECS];
+	struct intel_engine *ring = &dev_priv->ring[VECS];
 
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
@@ -2279,7 +2279,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 }
 
 int
-intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
+intel_ring_flush_all_caches(struct intel_engine *ring)
 {
 	int ret;
 
@@ -2297,7 +2297,7 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
 }
 
 int
-intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
+intel_ring_invalidate_all_caches(struct intel_engine *ring)
 {
 	uint32_t flush_domains;
 	int ret;
@@ -2317,7 +2317,7 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
 }
 
 void
-intel_stop_ring_buffer(struct intel_ring_buffer *ring)
+intel_stop_ring_buffer(struct intel_engine *ring)
 {
 	int ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 680e451..50cc525 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -54,7 +54,7 @@ struct intel_ring_hangcheck {
 	bool deadlock;
 };
 
-struct  intel_ring_buffer {
+struct intel_engine {
 	const char	*name;
 	enum intel_ring_id {
 		RCS = 0x0,
@@ -90,33 +90,33 @@ struct  intel_ring_buffer {
 	unsigned irq_refcount; /* protected by dev_priv->irq_lock */
 	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
 	u32		trace_irq_seqno;
-	bool __must_check (*irq_get)(struct intel_ring_buffer *ring);
-	void		(*irq_put)(struct intel_ring_buffer *ring);
+	bool __must_check (*irq_get)(struct intel_engine *ring);
+	void		(*irq_put)(struct intel_engine *ring);
 
-	int		(*init)(struct intel_ring_buffer *ring);
+	int		(*init)(struct intel_engine *ring);
 
-	void		(*write_tail)(struct intel_ring_buffer *ring,
+	void		(*write_tail)(struct intel_engine *ring,
 				      u32 value);
-	int __must_check (*flush)(struct intel_ring_buffer *ring,
+	int __must_check (*flush)(struct intel_engine *ring,
 				  u32	invalidate_domains,
 				  u32	flush_domains);
-	int		(*add_request)(struct intel_ring_buffer *ring);
+	int		(*add_request)(struct intel_engine *ring);
 	/* Some chipsets are not quite as coherent as advertised and need
 	 * an expensive kick to force a true read of the up-to-date seqno.
 	 * However, the up-to-date seqno is not always required and the last
 	 * seen value is good enough. Note that the seqno will always be
 	 * monotonic, even if not coherent.
 	 */
-	u32		(*get_seqno)(struct intel_ring_buffer *ring,
+	u32		(*get_seqno)(struct intel_engine *ring,
 				     bool lazy_coherency);
-	void		(*set_seqno)(struct intel_ring_buffer *ring,
+	void		(*set_seqno)(struct intel_engine *ring,
 				     u32 seqno);
-	int		(*dispatch_execbuffer)(struct intel_ring_buffer *ring,
+	int		(*dispatch_execbuffer)(struct intel_engine *ring,
 					       u64 offset, u32 length,
 					       unsigned flags);
 #define I915_DISPATCH_SECURE 0x1
 #define I915_DISPATCH_PINNED 0x2
-	void		(*cleanup)(struct intel_ring_buffer *ring);
+	void		(*cleanup)(struct intel_engine *ring);
 
 	struct {
 		u32	sync_seqno[I915_NUM_RINGS-1];
@@ -129,10 +129,10 @@ struct  intel_ring_buffer {
 		} mbox;
 
 		/* AKA wait() */
-		int	(*sync_to)(struct intel_ring_buffer *ring,
-				   struct intel_ring_buffer *to,
+		int	(*sync_to)(struct intel_engine *ring,
+				   struct intel_engine *to,
 				   u32 seqno);
-		int	(*signal)(struct intel_ring_buffer *signaller,
+		int	(*signal)(struct intel_engine *signaller,
 				  /* num_dwords needed by caller */
 				  unsigned int num_dwords);
 	} semaphore;
@@ -210,20 +210,20 @@ struct  intel_ring_buffer {
 };
 
 static inline bool
-intel_ring_initialized(struct intel_ring_buffer *ring)
+intel_ring_initialized(struct intel_engine *ring)
 {
 	return ring->obj != NULL;
 }
 
 static inline unsigned
-intel_ring_flag(struct intel_ring_buffer *ring)
+intel_ring_flag(struct intel_engine *ring)
 {
 	return 1 << ring->id;
 }
 
 static inline u32
-intel_ring_sync_index(struct intel_ring_buffer *ring,
-		      struct intel_ring_buffer *other)
+intel_ring_sync_index(struct intel_engine *ring,
+		      struct intel_engine *other)
 {
 	int idx;
 
@@ -241,7 +241,7 @@ intel_ring_sync_index(struct intel_ring_buffer *ring,
 }
 
 static inline u32
-intel_read_status_page(struct intel_ring_buffer *ring,
+intel_read_status_page(struct intel_engine *ring,
 		       int reg)
 {
 	/* Ensure that the compiler doesn't optimize away the load. */
@@ -250,7 +250,7 @@ intel_read_status_page(struct intel_ring_buffer *ring,
 }
 
 static inline void
-intel_write_status_page(struct intel_ring_buffer *ring,
+intel_write_status_page(struct intel_engine *ring,
 			int reg, u32 value)
 {
 	ring->status_page.page_addr[reg] = value;
@@ -275,27 +275,27 @@ intel_write_status_page(struct intel_ring_buffer *ring,
 #define I915_GEM_HWS_SCRATCH_INDEX	0x30
 #define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
 
-void intel_stop_ring_buffer(struct intel_ring_buffer *ring);
-void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring);
+void intel_stop_ring_buffer(struct intel_engine *ring);
+void intel_cleanup_ring_buffer(struct intel_engine *ring);
 
-int __must_check intel_ring_begin(struct intel_ring_buffer *ring, int n);
-int __must_check intel_ring_cacheline_align(struct intel_ring_buffer *ring);
-static inline void intel_ring_emit(struct intel_ring_buffer *ring,
+int __must_check intel_ring_begin(struct intel_engine *ring, int n);
+int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
+static inline void intel_ring_emit(struct intel_engine *ring,
 				   u32 data)
 {
 	iowrite32(data, ring->virtual_start + ring->tail);
 	ring->tail += 4;
 }
-static inline void intel_ring_advance(struct intel_ring_buffer *ring)
+static inline void intel_ring_advance(struct intel_engine *ring)
 {
 	ring->tail &= ring->size - 1;
 }
-void __intel_ring_advance(struct intel_ring_buffer *ring);
+void __intel_ring_advance(struct intel_engine *ring);
 
-int __must_check intel_ring_idle(struct intel_ring_buffer *ring);
-void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno);
-int intel_ring_flush_all_caches(struct intel_ring_buffer *ring);
-int intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring);
+int __must_check intel_ring_idle(struct intel_engine *ring);
+void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
+int intel_ring_flush_all_caches(struct intel_engine *ring);
+int intel_ring_invalidate_all_caches(struct intel_engine *ring);
 
 void intel_init_rings_early(struct drm_device *dev);
 int intel_init_render_ring_buffer(struct drm_device *dev);
@@ -304,24 +304,24 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev);
 int intel_init_blt_ring_buffer(struct drm_device *dev);
 int intel_init_vebox_ring_buffer(struct drm_device *dev);
 
-u64 intel_ring_get_active_head(struct intel_ring_buffer *ring);
-void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
+u64 intel_ring_get_active_head(struct intel_engine *ring);
+void intel_ring_setup_status_page(struct intel_engine *ring);
 
-void intel_destroy_ring_buffer(struct intel_ring_buffer *ring);
-int intel_allocate_ring_buffer(struct intel_ring_buffer *ring);
+void intel_destroy_ring_buffer(struct intel_engine *ring);
+int intel_allocate_ring_buffer(struct intel_engine *ring);
 
-static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
+static inline u32 intel_ring_get_tail(struct intel_engine *ring)
 {
 	return ring->tail;
 }
 
-static inline u32 intel_ring_get_seqno(struct intel_ring_buffer *ring)
+static inline u32 intel_ring_get_seqno(struct intel_engine *ring)
 {
 	BUG_ON(ring->outstanding_lazy_seqno == 0);
 	return ring->outstanding_lazy_seqno;
 }
 
-static inline void i915_trace_irq_get(struct intel_ring_buffer *ring, u32 seqno)
+static inline void i915_trace_irq_get(struct intel_engine *ring, u32 seqno)
 {
 	if (ring->trace_irq_seqno == 0 && ring->irq_get(ring))
 		ring->trace_irq_seqno = seqno;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 07/50] drm/i915: Split the ringbuffers and the rings
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (5 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 08/50] drm/i915: Rename functions that mention ringbuffers (meaning rings) oscar.mateo
                   ` (44 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Following the logic behind the previous patch, the ringbuffers and the rings
belong in different structs. For the time being, we will keep the relationship
between the two via the default_ringbuf living inside each ring.

This commit should not introduce any functional changes (unless I made an
error, that is).
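
For illustration only, here is a minimal sketch of the resulting layout
(the helper and fields are the ones introduced in this patch; the call
site itself is hypothetical):

	struct intel_engine *ring = &dev_priv->ring[RCS];
	/* For now, every engine embeds exactly one ringbuffer... */
	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
	/* ...and all CPU-side buffer state moves into it: */
	ringbuf->tail += 4;
	ringbuf->tail &= ringbuf->size - 1;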

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  25 ++++---
 drivers/gpu/drm/i915/i915_gem.c         |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c   |   6 +-
 drivers/gpu/drm/i915/i915_irq.c         |   9 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 123 ++++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  61 ++++++++++------
 6 files changed, 131 insertions(+), 95 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 5263d63..8ec8963 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -47,6 +47,8 @@
 
 #define LP_RING(d) (&((struct drm_i915_private *)(d))->ring[RCS])
 
+#define LP_RINGBUF(d) (&((struct drm_i915_private *)(d))->ring[RCS].default_ringbuf)
+
 #define BEGIN_LP_RING(n) \
 	intel_ring_begin(LP_RING(dev_priv), (n))
 
@@ -63,7 +65,7 @@
  * has access to the ring.
  */
 #define RING_LOCK_TEST_WITH_RETURN(dev, file) do {			\
-	if (LP_RING(dev->dev_private)->obj == NULL)			\
+	if (LP_RINGBUF(dev->dev_private)->obj == NULL)			\
 		LOCK_TEST_WITH_RETURN(dev, file);			\
 } while (0)
 
@@ -140,6 +142,7 @@ void i915_kernel_lost_context(struct drm_device * dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_master_private *master_priv;
 	struct intel_engine *ring = LP_RING(dev_priv);
+	struct intel_ringbuffer *ringbuf = LP_RINGBUF(dev_priv);
 
 	/*
 	 * We should never lose context on the ring with modesetting
@@ -148,17 +151,17 @@ void i915_kernel_lost_context(struct drm_device * dev)
 	if (drm_core_check_feature(dev, DRIVER_MODESET))
 		return;
 
-	ring->head = I915_READ_HEAD(ring) & HEAD_ADDR;
-	ring->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
-	ring->space = ring->head - (ring->tail + I915_RING_FREE_SPACE);
-	if (ring->space < 0)
-		ring->space += ring->size;
+	ringbuf->head = I915_READ_HEAD(ring) & HEAD_ADDR;
+	ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
+	ringbuf->space = ringbuf->head - (ringbuf->tail + I915_RING_FREE_SPACE);
+	if (ringbuf->space < 0)
+		ringbuf->space += ringbuf->size;
 
 	if (!dev->primary->master)
 		return;
 
 	master_priv = dev->primary->master->driver_priv;
-	if (ring->head == ring->tail && master_priv->sarea_priv)
+	if (ringbuf->head == ringbuf->tail && master_priv->sarea_priv)
 		master_priv->sarea_priv->perf_boxes |= I915_BOX_RING_EMPTY;
 }
 
@@ -201,7 +204,7 @@ static int i915_initialize(struct drm_device * dev, drm_i915_init_t * init)
 	}
 
 	if (init->ring_size != 0) {
-		if (LP_RING(dev_priv)->obj != NULL) {
+		if (LP_RINGBUF(dev_priv)->obj != NULL) {
 			i915_dma_cleanup(dev);
 			DRM_ERROR("Client tried to initialize ringbuffer in "
 				  "GEM mode\n");
@@ -238,7 +241,7 @@ static int i915_dma_resume(struct drm_device * dev)
 
 	DRM_DEBUG_DRIVER("%s\n", __func__);
 
-	if (ring->virtual_start == NULL) {
+	if (__get_ringbuf(ring)->virtual_start == NULL) {
 		DRM_ERROR("can not ioremap virtual address for"
 			  " ring buffer\n");
 		return -ENOMEM;
@@ -360,7 +363,7 @@ static int i915_emit_cmds(struct drm_device * dev, int *buffer, int dwords)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int i, ret;
 
-	if ((dwords+1) * sizeof(int) >= LP_RING(dev_priv)->size - 8)
+	if ((dwords+1) * sizeof(int) >= LP_RINGBUF(dev_priv)->size - 8)
 		return -EINVAL;
 
 	for (i = 0; i < dwords;) {
@@ -823,7 +826,7 @@ static int i915_irq_emit(struct drm_device *dev, void *data,
 	if (drm_core_check_feature(dev, DRIVER_MODESET))
 		return -ENODEV;
 
-	if (!dev_priv || !LP_RING(dev_priv)->virtual_start) {
+	if (!dev_priv || !LP_RINGBUF(dev_priv)->virtual_start) {
 		DRM_ERROR("called with no initialization\n");
 		return -EINVAL;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a3b697b..d9253c4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2494,7 +2494,7 @@ i915_gem_retire_requests_ring(struct intel_engine *ring)
 		 * of tail of the request to update the last known position
 		 * of the GPU head.
 		 */
-		ring->last_retired_head = request->tail;
+		__get_ringbuf(ring)->last_retired_head = request->tail;
 
 		i915_gem_free_request(request);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0853db3..a7b165f 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -823,8 +823,8 @@ static void i915_record_ring_state(struct drm_device *dev,
 		ering->hws = I915_READ(mmio);
 	}
 
-	ering->cpu_ring_head = ring->head;
-	ering->cpu_ring_tail = ring->tail;
+	ering->cpu_ring_head = __get_ringbuf(ring)->head;
+	ering->cpu_ring_tail = __get_ringbuf(ring)->tail;
 
 	ering->hangcheck_score = ring->hangcheck.score;
 	ering->hangcheck_action = ring->hangcheck.action;
@@ -928,7 +928,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 		}
 
 		error->ring[i].ringbuffer =
-			i915_error_ggtt_object_create(dev_priv, ring->obj);
+			i915_error_ggtt_object_create(dev_priv, __get_ringbuf(ring)->obj);
 
 		if (ring->status_page.obj)
 			error->ring[i].hws_page =
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 58c8812..e0c3a01 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1079,7 +1079,7 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 static void notify_ring(struct drm_device *dev,
 			struct intel_engine *ring)
 {
-	if (ring->obj == NULL)
+	if (!intel_ring_initialized(ring))
 		return;
 
 	trace_i915_gem_request_complete(ring);
@@ -2610,6 +2610,7 @@ static struct intel_engine *
 semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	u32 cmd, ipehr, head;
 	int i;
 
@@ -2632,10 +2633,10 @@ semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 		 * our ring is smaller than what the hardware (and hence
 		 * HEAD_ADDR) allows. Also handles wrap-around.
 		 */
-		head &= ring->size - 1;
+		head &= ringbuf->size - 1;
 
 		/* This here seems to blow up */
-		cmd = ioread32(ring->virtual_start + head);
+		cmd = ioread32(ringbuf->virtual_start + head);
 		if (cmd == ipehr)
 			break;
 
@@ -2645,7 +2646,7 @@ semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 	if (!i)
 		return NULL;
 
-	*seqno = ioread32(ring->virtual_start + head + 4) + 1;
+	*seqno = ioread32(ringbuf->virtual_start + head + 4) + 1;
 	return semaphore_wait_to_signaller_ring(ring, ipehr);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4c3cc44..f02c21e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -42,9 +42,11 @@
 
 static inline int ring_space(struct intel_engine *ring)
 {
-	int space = (ring->head & HEAD_ADDR) - (ring->tail + I915_RING_FREE_SPACE);
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	int space = (ringbuf->head & HEAD_ADDR) - (ringbuf->tail + I915_RING_FREE_SPACE);
 	if (space < 0)
-		space += ring->size;
+		space += ringbuf->size;
 	return space;
 }
 
@@ -56,10 +58,12 @@ static bool intel_ring_stopped(struct intel_engine *ring)
 
 void __intel_ring_advance(struct intel_engine *ring)
 {
-	ring->tail &= ring->size - 1;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	ringbuf->tail &= ringbuf->size - 1;
 	if (intel_ring_stopped(ring))
 		return;
-	ring->write_tail(ring, ring->tail);
+	ring->write_tail(ring, ringbuf->tail);
 }
 
 static int
@@ -476,7 +480,8 @@ static int init_ring_common(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *obj = ring->obj;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct drm_i915_gem_object *obj = ringbuf->obj;
 	int ret = 0;
 
 	gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
@@ -515,7 +520,7 @@ static int init_ring_common(struct intel_engine *ring)
 	 * register values. */
 	I915_WRITE_START(ring, i915_gem_obj_ggtt_offset(obj));
 	I915_WRITE_CTL(ring,
-			((ring->size - PAGE_SIZE) & RING_NR_PAGES)
+			((ringbuf->size - PAGE_SIZE) & RING_NR_PAGES)
 			| RING_VALID);
 
 	/* If the head is still not zero, the ring is dead */
@@ -535,10 +540,10 @@ static int init_ring_common(struct intel_engine *ring)
 	if (!drm_core_check_feature(ring->dev, DRIVER_MODESET))
 		i915_kernel_lost_context(ring->dev);
 	else {
-		ring->head = I915_READ_HEAD(ring);
-		ring->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
-		ring->space = ring_space(ring);
-		ring->last_retired_head = -1;
+		ringbuf->head = I915_READ_HEAD(ring);
+		ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
+		ringbuf->space = ring_space(ring);
+		ringbuf->last_retired_head = -1;
 	}
 
 	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
@@ -1370,13 +1375,15 @@ static int init_phys_status_page(struct intel_engine *ring)
 
 void intel_destroy_ring_buffer(struct intel_engine *ring)
 {
-	if (!ring->obj)
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	if (!ringbuf->obj)
 		return;
 
-	iounmap(ring->virtual_start);
-	i915_gem_object_ggtt_unpin(ring->obj);
-	drm_gem_object_unreference(&ring->obj->base);
-	ring->obj = NULL;
+	iounmap(ringbuf->virtual_start);
+	i915_gem_object_ggtt_unpin(ringbuf->obj);
+	drm_gem_object_unreference(&ringbuf->obj->base);
+	ringbuf->obj = NULL;
 }
 
 int intel_allocate_ring_buffer(struct intel_engine *ring)
@@ -1384,16 +1391,17 @@ int intel_allocate_ring_buffer(struct intel_engine *ring)
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_object *obj;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
-	if (ring->obj)
+	if (ringbuf->obj)
 		return 0;
 
 	obj = NULL;
 	if (!HAS_LLC(dev))
-		obj = i915_gem_object_create_stolen(dev, ring->size);
+		obj = i915_gem_object_create_stolen(dev, ringbuf->size);
 	if (obj == NULL)
-		obj = i915_gem_alloc_object(dev, ring->size);
+		obj = i915_gem_alloc_object(dev, ringbuf->size);
 	if (obj == NULL)
 		return -ENOMEM;
 
@@ -1405,15 +1413,15 @@ int intel_allocate_ring_buffer(struct intel_engine *ring)
 	if (ret)
 		goto err_unpin;
 
-	ring->virtual_start =
+	ringbuf->virtual_start =
 		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj),
-			   ring->size);
-	if (ring->virtual_start == NULL) {
+			   ringbuf->size);
+	if (ringbuf->virtual_start == NULL) {
 		ret = -EINVAL;
 		goto err_unpin;
 	}
 
-	ring->obj = obj;
+	ringbuf->obj = obj;
 	return 0;
 
 err_unpin:
@@ -1426,11 +1434,12 @@ err_unref:
 static int intel_init_ring_buffer(struct drm_device *dev,
 				  struct intel_engine *ring)
 {
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
-	ring->size = 32 * PAGE_SIZE;
+	ringbuf->size = 32 * PAGE_SIZE;
 	memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
 
 	init_waitqueue_head(&ring->irq_queue);
@@ -1456,9 +1465,9 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	 * the TAIL pointer points to within the last 2 cachelines
 	 * of the buffer.
 	 */
-	ring->effective_size = ring->size;
+	ringbuf->effective_size = ringbuf->size;
 	if (IS_I830(dev) || IS_845G(dev))
-		ring->effective_size -= 2 * CACHELINE_BYTES;
+		ringbuf->effective_size -= 2 * CACHELINE_BYTES;
 
 	i915_cmd_parser_init_ring(ring);
 
@@ -1468,8 +1477,9 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 void intel_cleanup_ring_buffer(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 
-	if (ring->obj == NULL)
+	if (ringbuf->obj == NULL)
 		return;
 
 	intel_stop_ring_buffer(ring);
@@ -1488,15 +1498,16 @@ void intel_cleanup_ring_buffer(struct intel_engine *ring)
 static int intel_ring_wait_request(struct intel_engine *ring, int n)
 {
 	struct drm_i915_gem_request *request;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	u32 seqno = 0, tail;
 	int ret;
 
-	if (ring->last_retired_head != -1) {
-		ring->head = ring->last_retired_head;
-		ring->last_retired_head = -1;
+	if (ringbuf->last_retired_head != -1) {
+		ringbuf->head = ringbuf->last_retired_head;
+		ringbuf->last_retired_head = -1;
 
-		ring->space = ring_space(ring);
-		if (ring->space >= n)
+		ringbuf->space = ring_space(ring);
+		if (ringbuf->space >= n)
 			return 0;
 	}
 
@@ -1506,9 +1517,9 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 		if (request->tail == -1)
 			continue;
 
-		space = request->tail - (ring->tail + I915_RING_FREE_SPACE);
+		space = request->tail - (ringbuf->tail + I915_RING_FREE_SPACE);
 		if (space < 0)
-			space += ring->size;
+			space += ringbuf->size;
 		if (space >= n) {
 			seqno = request->seqno;
 			tail = request->tail;
@@ -1530,9 +1541,9 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 	if (ret)
 		return ret;
 
-	ring->head = tail;
-	ring->space = ring_space(ring);
-	if (WARN_ON(ring->space < n))
+	ringbuf->head = tail;
+	ringbuf->space = ring_space(ring);
+	if (WARN_ON(ringbuf->space < n))
 		return -ENOSPC;
 
 	return 0;
@@ -1542,6 +1553,7 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	unsigned long end;
 	int ret;
 
@@ -1561,9 +1573,9 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 	end = jiffies + 60 * HZ;
 
 	do {
-		ring->head = I915_READ_HEAD(ring);
-		ring->space = ring_space(ring);
-		if (ring->space >= n) {
+		ringbuf->head = I915_READ_HEAD(ring);
+		ringbuf->space = ring_space(ring);
+		if (ringbuf->space >= n) {
 			trace_i915_ring_wait_end(ring);
 			return 0;
 		}
@@ -1589,21 +1601,22 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 static int intel_wrap_ring_buffer(struct intel_engine *ring)
 {
 	uint32_t __iomem *virt;
-	int rem = ring->size - ring->tail;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	int rem = ringbuf->size - ringbuf->tail;
 
-	if (ring->space < rem) {
+	if (ringbuf->space < rem) {
 		int ret = ring_wait_for_space(ring, rem);
 		if (ret)
 			return ret;
 	}
 
-	virt = ring->virtual_start + ring->tail;
+	virt = ringbuf->virtual_start + ringbuf->tail;
 	rem /= 4;
 	while (rem--)
 		iowrite32(MI_NOOP, virt++);
 
-	ring->tail = 0;
-	ring->space = ring_space(ring);
+	ringbuf->tail = 0;
+	ringbuf->space = ring_space(ring);
 
 	return 0;
 }
@@ -1653,15 +1666,16 @@ intel_ring_alloc_seqno(struct intel_engine *ring)
 static int __intel_ring_prepare(struct intel_engine *ring,
 				int bytes)
 {
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
-	if (unlikely(ring->tail + bytes > ring->effective_size)) {
+	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
 		ret = intel_wrap_ring_buffer(ring);
 		if (unlikely(ret))
 			return ret;
 	}
 
-	if (unlikely(ring->space < bytes)) {
+	if (unlikely(ringbuf->space < bytes)) {
 		ret = ring_wait_for_space(ring, bytes);
 		if (unlikely(ret))
 			return ret;
@@ -1674,6 +1688,7 @@ int intel_ring_begin(struct intel_engine *ring,
 		     int num_dwords)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
@@ -1690,14 +1705,15 @@ int intel_ring_begin(struct intel_engine *ring,
 	if (ret)
 		return ret;
 
-	ring->space -= num_dwords * sizeof(uint32_t);
+	ringbuf->space -= num_dwords * sizeof(uint32_t);
 	return 0;
 }
 
 /* Align the ring tail to a cacheline boundary */
 int intel_ring_cacheline_align(struct intel_engine *ring)
 {
-	int num_dwords = (ring->tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t);
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	int num_dwords = (ringbuf->tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t);
 	int ret;
 
 	if (num_dwords == 0)
@@ -2019,6 +2035,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[RCS];
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	if (INTEL_INFO(dev)->gen >= 6) {
@@ -2057,13 +2074,13 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 
-	ring->size = size;
-	ring->effective_size = ring->size;
+	ringbuf->size = size;
+	ringbuf->effective_size = ringbuf->size;
 	if (IS_I830(ring->dev) || IS_845G(ring->dev))
-		ring->effective_size -= 2 * CACHELINE_BYTES;
+		ringbuf->effective_size -= 2 * CACHELINE_BYTES;
 
-	ring->virtual_start = ioremap_wc(start, size);
-	if (ring->virtual_start == NULL) {
+	ringbuf->virtual_start = ioremap_wc(start, size);
+	if (ringbuf->virtual_start == NULL) {
 		DRM_ERROR("can not ioremap virtual address for"
 			  " ring buffer\n");
 		return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 50cc525..7299bff 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -54,6 +54,27 @@ struct intel_ring_hangcheck {
 	bool deadlock;
 };
 
+struct intel_ringbuffer {
+	struct drm_i915_gem_object *obj;
+	void __iomem *virtual_start;
+
+	u32 head;
+	u32 tail;
+	int space;
+	int size;
+	int effective_size;
+
+	/** We track the position of the requests in the ring buffer, and
+	 * when each is retired we increment last_retired_head as the GPU
+	 * must have finished processing the request and so we know we
+	 * can advance the ringbuffer up to that position.
+	 *
+	 * last_retired_head is set to -1 after the value is consumed so
+	 * we can detect new retirements.
+	 */
+	u32 last_retired_head;
+};
+
 struct intel_engine {
 	const char	*name;
 	enum intel_ring_id {
@@ -66,27 +87,11 @@ struct intel_engine {
 #define I915_NUM_RINGS 5
 #define LAST_USER_RING (VECS + 1)
 	u32		mmio_base;
-	void		__iomem *virtual_start;
 	struct		drm_device *dev;
-	struct		drm_i915_gem_object *obj;
+	struct intel_ringbuffer default_ringbuf;
 
-	u32		head;
-	u32		tail;
-	int		space;
-	int		size;
-	int		effective_size;
 	struct intel_hw_status_page status_page;
 
-	/** We track the position of the requests in the ring buffer, and
-	 * when each is retired we increment last_retired_head as the GPU
-	 * must have finished processing the request and so we know we
-	 * can advance the ringbuffer up to that position.
-	 *
-	 * last_retired_head is set to -1 after the value is consumed so
-	 * we can detect new retirements.
-	 */
-	u32		last_retired_head;
-
 	unsigned irq_refcount; /* protected by dev_priv->irq_lock */
 	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
 	u32		trace_irq_seqno;
@@ -139,7 +144,7 @@ struct intel_engine {
 
 	/**
 	 * List of objects currently involved in rendering from the
-	 * ringbuffer.
+	 * engine.
 	 *
 	 * Includes buffers having the contents of their GPU caches
 	 * flushed, not necessarily primitives.  last_rendering_seqno
@@ -209,10 +214,16 @@ struct intel_engine {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
+/* This is a temporary define to help us transition to per-context ringbuffers */
+static inline struct intel_ringbuffer *__get_ringbuf(struct intel_engine *ring)
+{
+	return &ring->default_ringbuf;
+}
+
 static inline bool
 intel_ring_initialized(struct intel_engine *ring)
 {
-	return ring->obj != NULL;
+	return __get_ringbuf(ring)->obj != NULL;
 }
 
 static inline unsigned
@@ -283,12 +294,16 @@ int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
 static inline void intel_ring_emit(struct intel_engine *ring,
 				   u32 data)
 {
-	iowrite32(data, ring->virtual_start + ring->tail);
-	ring->tail += 4;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	iowrite32(data, ringbuf->virtual_start + ringbuf->tail);
+	ringbuf->tail += 4;
 }
 static inline void intel_ring_advance(struct intel_engine *ring)
 {
-	ring->tail &= ring->size - 1;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	ringbuf->tail &= ringbuf->size - 1;
 }
 void __intel_ring_advance(struct intel_engine *ring);
 
@@ -312,7 +327,7 @@ int intel_allocate_ring_buffer(struct intel_engine *ring);
 
 static inline u32 intel_ring_get_tail(struct intel_engine *ring)
 {
-	return ring->tail;
+	return __get_ringbuf(ring)->tail;
 }
 
 static inline u32 intel_ring_get_seqno(struct intel_engine *ring)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 08/50] drm/i915: Rename functions that mention ringbuffers (meaning rings)
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (6 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 07/50] drm/i915: Split the ringbuffers and the rings oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 09/50] drm/i915: Plumb the context everywhere in the execbuffer path oscar.mateo
                   ` (43 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Continue with the refactoring: do not init or clean a "ringbuffer" when
you actually mean a "ring", because they are not the same thing anymore.

Again, no functional changes.
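
The renames all follow a single pattern; for example (taken from the
hunks below):

	intel_stop_ring_buffer()        becomes  intel_stop_ring()
	intel_cleanup_ring_buffer()     becomes  intel_cleanup_ring()
	intel_init_render_ring_buffer() becomes  intel_init_render_ring()
	i915_gem_cleanup_ringbuffer()   becomes  i915_gem_cleanup_ring()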

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  6 +++---
 drivers/gpu/drm/i915/i915_drv.h         |  2 +-
 drivers/gpu/drm/i915/i915_gem.c         | 30 +++++++++++++++---------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 30 +++++++++++++++---------------
 drivers/gpu/drm/i915/intel_ringbuffer.h | 14 +++++++-------
 5 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 8ec8963..eb3ce6d 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -179,7 +179,7 @@ static int i915_dma_cleanup(struct drm_device * dev)
 
 	mutex_lock(&dev->struct_mutex);
 	for (i = 0; i < I915_NUM_RINGS; i++)
-		intel_cleanup_ring_buffer(&dev_priv->ring[i]);
+		intel_cleanup_ring(&dev_priv->ring[i]);
 	mutex_unlock(&dev->struct_mutex);
 
 	/* Clear the HWS virtual address at teardown */
@@ -1383,7 +1383,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 
 cleanup_gem:
 	mutex_lock(&dev->struct_mutex);
-	i915_gem_cleanup_ringbuffer(dev);
+	i915_gem_cleanup_ring(dev);
 	i915_gem_context_fini(dev);
 	mutex_unlock(&dev->struct_mutex);
 	WARN_ON(dev_priv->mm.aliasing_ppgtt);
@@ -1839,7 +1839,7 @@ int i915_driver_unload(struct drm_device *dev)
 
 		mutex_lock(&dev->struct_mutex);
 		i915_gem_free_all_phys_object(dev);
-		i915_gem_cleanup_ringbuffer(dev);
+		i915_gem_cleanup_ring(dev);
 		i915_gem_context_fini(dev);
 		WARN_ON(dev_priv->mm.aliasing_ppgtt);
 		mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3b7a36f9..ee27ce8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2266,7 +2266,7 @@ int __must_check i915_gem_init(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
 int i915_gem_l3_remap(struct intel_engine *ring, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
-void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
+void i915_gem_cleanup_ring(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 int __i915_add_request(struct intel_engine *ring,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d9253c4..e7565d9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4262,7 +4262,7 @@ i915_gem_stop_ringbuffers(struct drm_device *dev)
 	int i;
 
 	for_each_active_ring(ring, dev_priv, i)
-		intel_stop_ring_buffer(ring);
+		intel_stop_ring(ring);
 }
 
 int
@@ -4384,30 +4384,30 @@ static int i915_gem_init_rings(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
-	ret = intel_init_render_ring_buffer(dev);
+	ret = intel_init_render_ring(dev);
 	if (ret)
 		return ret;
 
 	if (HAS_BSD(dev)) {
-		ret = intel_init_bsd_ring_buffer(dev);
+		ret = intel_init_bsd_ring(dev);
 		if (ret)
 			goto cleanup_render_ring;
 	}
 
 	if (intel_enable_blt(dev)) {
-		ret = intel_init_blt_ring_buffer(dev);
+		ret = intel_init_blt_ring(dev);
 		if (ret)
 			goto cleanup_bsd_ring;
 	}
 
 	if (HAS_VEBOX(dev)) {
-		ret = intel_init_vebox_ring_buffer(dev);
+		ret = intel_init_vebox_ring(dev);
 		if (ret)
 			goto cleanup_blt_ring;
 	}
 
 	if (HAS_BSD2(dev)) {
-		ret = intel_init_bsd2_ring_buffer(dev);
+		ret = intel_init_bsd2_ring(dev);
 		if (ret)
 			goto cleanup_vebox_ring;
 	}
@@ -4419,15 +4419,15 @@ static int i915_gem_init_rings(struct drm_device *dev)
 	return 0;
 
 cleanup_bsd2_ring:
-	intel_cleanup_ring_buffer(&dev_priv->ring[VCS2]);
+	intel_cleanup_ring(&dev_priv->ring[VCS2]);
 cleanup_vebox_ring:
-	intel_cleanup_ring_buffer(&dev_priv->ring[VECS]);
+	intel_cleanup_ring(&dev_priv->ring[VECS]);
 cleanup_blt_ring:
-	intel_cleanup_ring_buffer(&dev_priv->ring[BCS]);
+	intel_cleanup_ring(&dev_priv->ring[BCS]);
 cleanup_bsd_ring:
-	intel_cleanup_ring_buffer(&dev_priv->ring[VCS]);
+	intel_cleanup_ring(&dev_priv->ring[VCS]);
 cleanup_render_ring:
-	intel_cleanup_ring_buffer(&dev_priv->ring[RCS]);
+	intel_cleanup_ring(&dev_priv->ring[RCS]);
 
 	return ret;
 }
@@ -4479,7 +4479,7 @@ i915_gem_init_hw(struct drm_device *dev)
 	ret = i915_gem_context_enable(dev_priv);
 	if (ret && ret != -EIO) {
 		DRM_ERROR("Context enable failed %d\n", ret);
-		i915_gem_cleanup_ringbuffer(dev);
+		i915_gem_cleanup_ring(dev);
 	}
 
 	return ret;
@@ -4529,14 +4529,14 @@ int i915_gem_init(struct drm_device *dev)
 }
 
 void
-i915_gem_cleanup_ringbuffer(struct drm_device *dev)
+i915_gem_cleanup_ring(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring;
 	int i;
 
 	for_each_active_ring(ring, dev_priv, i)
-		intel_cleanup_ring_buffer(ring);
+		intel_cleanup_ring(ring);
 }
 
 int
@@ -4573,7 +4573,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
 	return 0;
 
 cleanup_ringbuffer:
-	i915_gem_cleanup_ringbuffer(dev);
+	i915_gem_cleanup_ring(dev);
 	dev_priv->ums.mm_suspended = 1;
 	mutex_unlock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index f02c21e..6d14dcb 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1431,8 +1431,8 @@ err_unref:
 	return ret;
 }
 
-static int intel_init_ring_buffer(struct drm_device *dev,
-				  struct intel_engine *ring)
+static int intel_init_ring(struct drm_device *dev,
+			   struct intel_engine *ring)
 {
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
@@ -1474,7 +1474,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	return ring->init(ring);
 }
 
-void intel_cleanup_ring_buffer(struct intel_engine *ring)
+void intel_cleanup_ring(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = to_i915(ring->dev);
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
@@ -1482,7 +1482,7 @@ void intel_cleanup_ring_buffer(struct intel_engine *ring)
 	if (ringbuf->obj == NULL)
 		return;
 
-	intel_stop_ring_buffer(ring);
+	intel_stop_ring(ring);
 	WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0);
 
 	intel_destroy_ring_buffer(ring);
@@ -1925,7 +1925,7 @@ static int gen6_ring_flush(struct intel_engine *ring,
 	return 0;
 }
 
-int intel_init_render_ring_buffer(struct drm_device *dev)
+int intel_init_render_ring(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[RCS];
@@ -2028,7 +2028,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->scratch.gtt_offset = i915_gem_obj_ggtt_offset(obj);
 	}
 
-	return intel_init_ring_buffer(dev, ring);
+	return intel_init_ring(dev, ring);
 }
 
 int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
@@ -2095,7 +2095,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 	return 0;
 }
 
-int intel_init_bsd_ring_buffer(struct drm_device *dev)
+int intel_init_bsd_ring(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[VCS];
@@ -2159,14 +2159,14 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 	}
 	ring->init = init_ring_common;
 
-	return intel_init_ring_buffer(dev, ring);
+	return intel_init_ring(dev, ring);
 }
 
 /**
  * Initialize the second BSD ring for Broadwell GT3.
  * It is noted that this only exists on Broadwell GT3.
  */
-int intel_init_bsd2_ring_buffer(struct drm_device *dev)
+int intel_init_bsd2_ring(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[VCS2];
@@ -2207,10 +2207,10 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 
 	ring->init = init_ring_common;
 
-	return intel_init_ring_buffer(dev, ring);
+	return intel_init_ring(dev, ring);
 }
 
-int intel_init_blt_ring_buffer(struct drm_device *dev)
+int intel_init_blt_ring(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[BCS];
@@ -2252,10 +2252,10 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	ring->init = init_ring_common;
 
-	return intel_init_ring_buffer(dev, ring);
+	return intel_init_ring(dev, ring);
 }
 
-int intel_init_vebox_ring_buffer(struct drm_device *dev)
+int intel_init_vebox_ring(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[VECS];
@@ -2292,7 +2292,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	ring->init = init_ring_common;
 
-	return intel_init_ring_buffer(dev, ring);
+	return intel_init_ring(dev, ring);
 }
 
 int
@@ -2334,7 +2334,7 @@ intel_ring_invalidate_all_caches(struct intel_engine *ring)
 }
 
 void
-intel_stop_ring_buffer(struct intel_engine *ring)
+intel_stop_ring(struct intel_engine *ring)
 {
 	int ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 7299bff..c9328fd 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -286,8 +286,8 @@ intel_write_status_page(struct intel_engine *ring,
 #define I915_GEM_HWS_SCRATCH_INDEX	0x30
 #define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
 
-void intel_stop_ring_buffer(struct intel_engine *ring);
-void intel_cleanup_ring_buffer(struct intel_engine *ring);
+void intel_stop_ring(struct intel_engine *ring);
+void intel_cleanup_ring(struct intel_engine *ring);
 
 int __must_check intel_ring_begin(struct intel_engine *ring, int n);
 int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
@@ -313,11 +313,11 @@ int intel_ring_flush_all_caches(struct intel_engine *ring);
 int intel_ring_invalidate_all_caches(struct intel_engine *ring);
 
 void intel_init_rings_early(struct drm_device *dev);
-int intel_init_render_ring_buffer(struct drm_device *dev);
-int intel_init_bsd_ring_buffer(struct drm_device *dev);
-int intel_init_bsd2_ring_buffer(struct drm_device *dev);
-int intel_init_blt_ring_buffer(struct drm_device *dev);
-int intel_init_vebox_ring_buffer(struct drm_device *dev);
+int intel_init_render_ring(struct drm_device *dev);
+int intel_init_bsd_ring(struct drm_device *dev);
+int intel_init_bsd2_ring(struct drm_device *dev);
+int intel_init_blt_ring(struct drm_device *dev);
+int intel_init_vebox_ring(struct drm_device *dev);
 
 u64 intel_ring_get_active_head(struct intel_engine *ring);
 void intel_ring_setup_status_page(struct intel_engine *ring);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 09/50] drm/i915: Plumb the context everywhere in the execbuffer path
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (7 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 08/50] drm/i915: Rename functions that mention ringbuffers (meaning rings) oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-16 11:04   ` Chris Wilson
  2014-05-09 12:08 ` [PATCH 10/50] drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit oscar.mateo
                   ` (42 subsequent siblings)
  51 siblings, 1 reply; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Contexts are going to become very important pretty soon, and
we need to be able to access them in a number of places inside
the command submission path. The idea is that, when we need to
place commands inside a ringbuffer or update the tail register,
we know which context we are working with.

We keep intel_ring_begin() as a macro to quickly adapt legacy
code, and introduce intel_ringbuffer_begin() as the first of a set
of new functions for ringbuffer manipulation (the rest will come
in subsequent patches).

No functional changes.

v2: Do not set the context to NULL. In legacy code, set it to
the default ring context (even if it doesn't get used later on).
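
As a sketch of the two entry points after this patch (the macro is
defined in the header hunk below; these call sites are illustrative):

	/* Legacy callers: implicitly use the engine's default context */
	ret = intel_ring_begin(ring, 4);

	/* New callers: state explicitly which context they emit commands for */
	ret = intel_ringbuffer_begin(ring, ctx, 4);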

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c            |  2 +-
 drivers/gpu/drm/i915/i915_drv.h            |  3 +-
 drivers/gpu/drm/i915/i915_gem.c            |  5 +-
 drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 23 +++++----
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  9 ++--
 drivers/gpu/drm/i915/intel_display.c       | 10 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 80 +++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 23 ++++++---
 9 files changed, 100 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index eb3ce6d..a582a64 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -56,7 +56,7 @@
 	intel_ring_emit(LP_RING(dev_priv), x)
 
 #define ADVANCE_LP_RING() \
-	__intel_ring_advance(LP_RING(dev_priv))
+	__intel_ring_advance(LP_RING(dev_priv), LP_RING(dev_priv)->default_context)
 
 /**
  * Lock test for when it's just for synchronization of ring access.
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ee27ce8..35b2ae4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2270,11 +2270,12 @@ void i915_gem_cleanup_ring(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 int __i915_add_request(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *batch_obj,
 		       u32 *seqno);
 #define i915_add_request(ring, seqno) \
-	__i915_add_request(ring, NULL, NULL, seqno)
+	__i915_add_request(ring, ring->default_context, NULL, NULL, seqno)
 int __must_check i915_wait_seqno(struct intel_engine *ring,
 				 uint32_t seqno);
 int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e7565d9..774151c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2171,6 +2171,7 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
 }
 
 int __i915_add_request(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *obj,
 		       u32 *out_seqno)
@@ -2188,7 +2189,7 @@ int __i915_add_request(struct intel_engine *ring,
 	 * is that the flush _must_ happen before the next request, no matter
 	 * what.
 	 */
-	ret = intel_ring_flush_all_caches(ring);
+	ret = intel_ring_flush_all_caches(ring, ctx);
 	if (ret)
 		return ret;
 
@@ -2203,7 +2204,7 @@ int __i915_add_request(struct intel_engine *ring,
 	 */
 	request_ring_position = intel_ring_get_tail(ring);
 
-	ret = ring->add_request(ring);
+	ret = ring->add_request(ring, ctx);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4d37e20..50337ae 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -558,7 +558,7 @@ mi_set_context(struct intel_engine *ring,
 	 * itlb_before_ctx_switch.
 	 */
 	if (IS_GEN6(ring->dev)) {
-		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, 0);
+		ret = ring->flush(ring, new_context, I915_GEM_GPU_DOMAINS, 0);
 		if (ret)
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 95e797e..c93941d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -828,6 +828,7 @@ err:
 
 static int
 i915_gem_execbuffer_move_to_gpu(struct intel_engine *ring,
+				struct i915_hw_context *ctx,
 				struct list_head *vmas)
 {
 	struct i915_vma *vma;
@@ -856,7 +857,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine *ring,
 	/* Unconditionally invalidate gpu caches and ensure that we do flush
 	 * any residual writes from the previous batch.
 	 */
-	return intel_ring_invalidate_all_caches(ring);
+	return intel_ring_invalidate_all_caches(ring, ctx);
 }
 
 static bool
@@ -971,18 +972,20 @@ static void
 i915_gem_execbuffer_retire_commands(struct drm_device *dev,
 				    struct drm_file *file,
 				    struct intel_engine *ring,
+				    struct i915_hw_context *ctx,
 				    struct drm_i915_gem_object *obj)
 {
 	/* Unconditionally force add_request to emit a full flush. */
 	ring->gpu_caches_dirty = true;
 
 	/* Add a breadcrumb for the completion of the batch buffer */
-	(void)__i915_add_request(ring, file, obj, NULL);
+	(void)__i915_add_request(ring, ctx, file, obj, NULL);
 }
 
 static int
 i915_reset_gen7_sol_offsets(struct drm_device *dev,
-			    struct intel_engine *ring)
+			    struct intel_engine *ring,
+			    struct i915_hw_context *ctx)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret, i;
@@ -992,7 +995,7 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 		return -EINVAL;
 	}
 
-	ret = intel_ring_begin(ring, 4 * 3);
+	ret = intel_ringbuffer_begin(ring, ctx, 4 * 3);
 	if (ret)
 		return ret;
 
@@ -1277,7 +1280,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	else
 		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
-	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
+	ret = i915_gem_execbuffer_move_to_gpu(ring, ctx, &eb->vmas);
 	if (ret)
 		goto err;
 
@@ -1287,7 +1290,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	if (ring == &dev_priv->ring[RCS] &&
 	    mode != dev_priv->relative_constants_mode) {
-		ret = intel_ring_begin(ring, 4);
+		ret = intel_ringbuffer_begin(ring, ctx, 4);
 		if (ret)
 				goto err;
 
@@ -1301,7 +1304,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	}
 
 	if (args->flags & I915_EXEC_GEN7_SOL_RESET) {
-		ret = i915_reset_gen7_sol_offsets(dev, ring);
+		ret = i915_reset_gen7_sol_offsets(dev, ring, ctx);
 		if (ret)
 			goto err;
 	}
@@ -1315,14 +1318,14 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			if (ret)
 				goto err;
 
-			ret = ring->dispatch_execbuffer(ring,
+			ret = ring->dispatch_execbuffer(ring, ctx,
 							exec_start, exec_len,
 							flags);
 			if (ret)
 				goto err;
 		}
 	} else {
-		ret = ring->dispatch_execbuffer(ring,
+		ret = ring->dispatch_execbuffer(ring, ctx,
 						exec_start, exec_len,
 						flags);
 		if (ret)
@@ -1332,7 +1335,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 
 	i915_gem_execbuffer_move_to_active(&eb->vmas, ring);
-	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
+	i915_gem_execbuffer_retire_commands(dev, file, ring, ctx, batch_obj);
 
 err:
 	/* the request owns the ref now */
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 31b58ee..a0993c0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -740,7 +740,8 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	}
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(ring, ring->default_context,
+			I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -784,7 +785,8 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	}
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(ring, ring->default_context,
+			I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -802,7 +804,8 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 
 	/* XXX: RCS is the only one to auto invalidate the TLBs? */
 	if (ring->id != RCS) {
-		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+		ret = ring->flush(ring, ring->default_context,
+				I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 		if (ret)
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f821147..d7c6ce5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8609,7 +8609,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8651,7 +8651,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8700,7 +8700,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8745,7 +8745,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8855,7 +8855,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6d14dcb..3b43070 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -56,18 +56,21 @@ static bool intel_ring_stopped(struct intel_engine *ring)
 	return dev_priv->gpu_error.stop_rings & intel_ring_flag(ring);
 }
 
-void __intel_ring_advance(struct intel_engine *ring)
+void __intel_ring_advance(struct intel_engine *ring,
+			  struct i915_hw_context *ctx)
 {
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 
 	ringbuf->tail &= ringbuf->size - 1;
 	if (intel_ring_stopped(ring))
 		return;
-	ring->write_tail(ring, ringbuf->tail);
+
+	ring->write_tail(ring, ctx, ringbuf->tail);
 }
 
 static int
 gen2_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -94,6 +97,7 @@ gen2_render_ring_flush(struct intel_engine *ring,
 
 static int
 gen4_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -224,7 +228,8 @@ intel_emit_post_sync_nonzero_flush(struct intel_engine *ring)
 
 static int
 gen6_render_ring_flush(struct intel_engine *ring,
-                         u32 invalidate_domains, u32 flush_domains)
+		       struct i915_hw_context *ctx,
+		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
@@ -318,6 +323,7 @@ static int gen7_ring_fbc_flush(struct intel_engine *ring, u32 value)
 
 static int
 gen7_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -379,6 +385,7 @@ gen7_render_ring_flush(struct intel_engine *ring,
 
 static int
 gen8_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -402,7 +409,7 @@ gen8_render_ring_flush(struct intel_engine *ring,
 		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
 	}
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ringbuffer_begin(ring, ctx, 6);
 	if (ret)
 		return ret;
 
@@ -419,7 +426,7 @@ gen8_render_ring_flush(struct intel_engine *ring,
 }
 
 static void ring_write_tail(struct intel_engine *ring,
-			    u32 value)
+			    struct i915_hw_context *ctx, u32 value)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	I915_WRITE_TAIL(ring, value);
@@ -466,7 +473,7 @@ static bool stop_ring(struct intel_engine *ring)
 
 	I915_WRITE_CTL(ring, 0);
 	I915_WRITE_HEAD(ring, 0);
-	ring->write_tail(ring, 0);
+	ring->write_tail(ring, ring->default_context, 0);
 
 	if (!IS_GEN2(ring->dev)) {
 		(void)I915_READ_CTL(ring);
@@ -718,7 +725,8 @@ static int gen6_signal(struct intel_engine *signaller,
  * This acts like a signal in the canonical semaphore.
  */
 static int
-gen6_add_request(struct intel_engine *ring)
+gen6_add_request(struct intel_engine *ring,
+		 struct i915_hw_context *ctx)
 {
 	int ret;
 
@@ -730,7 +738,7 @@ gen6_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
@@ -799,7 +807,8 @@ do {									\
 } while (0)
 
 static int
-pc_render_add_request(struct intel_engine *ring)
+pc_render_add_request(struct intel_engine *ring,
+		      struct i915_hw_context *ctx)
 {
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
@@ -841,7 +850,7 @@ pc_render_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, 0);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
@@ -1053,6 +1062,7 @@ void intel_ring_setup_status_page(struct intel_engine *ring)
 
 static int
 bsd_ring_flush(struct intel_engine *ring,
+	       struct i915_hw_context *ctx,
 	       u32     invalidate_domains,
 	       u32     flush_domains)
 {
@@ -1069,7 +1079,8 @@ bsd_ring_flush(struct intel_engine *ring,
 }
 
 static int
-i9xx_add_request(struct intel_engine *ring)
+i9xx_add_request(struct intel_engine *ring,
+		 struct i915_hw_context *ctx)
 {
 	int ret;
 
@@ -1081,7 +1092,7 @@ i9xx_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
@@ -1215,6 +1226,7 @@ gen8_ring_put_irq(struct intel_engine *ring)
 
 static int
 i965_dispatch_execbuffer(struct intel_engine *ring,
+			 struct i915_hw_context *ctx,
 			 u64 offset, u32 length,
 			 unsigned flags)
 {
@@ -1238,6 +1250,7 @@ i965_dispatch_execbuffer(struct intel_engine *ring,
 #define I830_BATCH_LIMIT (256*1024)
 static int
 i830_dispatch_execbuffer(struct intel_engine *ring,
+				struct i915_hw_context *ctx,
 				u64 offset, u32 len,
 				unsigned flags)
 {
@@ -1289,6 +1302,7 @@ i830_dispatch_execbuffer(struct intel_engine *ring,
 
 static int
 i915_dispatch_execbuffer(struct intel_engine *ring,
+			 struct i915_hw_context *ctx,
 			 u64 offset, u32 len,
 			 unsigned flags)
 {
@@ -1549,7 +1563,8 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 	return 0;
 }
 
-static int ring_wait_for_space(struct intel_engine *ring, int n)
+static int ring_wait_for_space(struct intel_engine *ring,
+			       struct i915_hw_context *ctx, int n)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1562,7 +1577,7 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 		return ret;
 
 	/* force the tail write in case we have been skipping them */
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	trace_i915_ring_wait_begin(ring);
 	/* With GEM the hangcheck timer should kick us out of the loop,
@@ -1598,14 +1613,15 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 	return -EBUSY;
 }
 
-static int intel_wrap_ring_buffer(struct intel_engine *ring)
+static int intel_wrap_ring_buffer(struct intel_engine *ring,
+				  struct i915_hw_context *ctx)
 {
 	uint32_t __iomem *virt;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int rem = ringbuf->size - ringbuf->tail;
 
 	if (ringbuf->space < rem) {
-		int ret = ring_wait_for_space(ring, rem);
+		int ret = ring_wait_for_space(ring, ctx, rem);
 		if (ret)
 			return ret;
 	}
@@ -1664,19 +1680,19 @@ intel_ring_alloc_seqno(struct intel_engine *ring)
 }
 
 static int __intel_ring_prepare(struct intel_engine *ring,
-				int bytes)
+				struct i915_hw_context *ctx, int bytes)
 {
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
-		ret = intel_wrap_ring_buffer(ring);
+		ret = intel_wrap_ring_buffer(ring, ctx);
 		if (unlikely(ret))
 			return ret;
 	}
 
 	if (unlikely(ringbuf->space < bytes)) {
-		ret = ring_wait_for_space(ring, bytes);
+		ret = ring_wait_for_space(ring, ctx, bytes);
 		if (unlikely(ret))
 			return ret;
 	}
@@ -1684,8 +1700,9 @@ static int __intel_ring_prepare(struct intel_engine *ring,
 	return 0;
 }
 
-int intel_ring_begin(struct intel_engine *ring,
-		     int num_dwords)
+int intel_ringbuffer_begin(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
+			      int num_dwords)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
@@ -1696,7 +1713,7 @@ int intel_ring_begin(struct intel_engine *ring,
 	if (ret)
 		return ret;
 
-	ret = __intel_ring_prepare(ring, num_dwords * sizeof(uint32_t));
+	ret = __intel_ring_prepare(ring, ctx, num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
 
@@ -1750,7 +1767,7 @@ void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno)
 }
 
 static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
-				     u32 value)
+				     struct i915_hw_context *ctx, u32 value)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
@@ -1783,6 +1800,7 @@ static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
 }
 
 static int gen6_bsd_ring_flush(struct intel_engine *ring,
+			       struct i915_hw_context *ctx,
 			       u32 invalidate, u32 flush)
 {
 	uint32_t cmd;
@@ -1819,6 +1837,7 @@ static int gen6_bsd_ring_flush(struct intel_engine *ring,
 
 static int
 gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
 			      u64 offset, u32 len,
 			      unsigned flags)
 {
@@ -1827,7 +1846,7 @@ gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 		!(flags & I915_DISPATCH_SECURE);
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ringbuffer_begin(ring, ctx, 4);
 	if (ret)
 		return ret;
 
@@ -1843,6 +1862,7 @@ gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 
 static int
 hsw_ring_dispatch_execbuffer(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
 			      u64 offset, u32 len,
 			      unsigned flags)
 {
@@ -1864,6 +1884,7 @@ hsw_ring_dispatch_execbuffer(struct intel_engine *ring,
 
 static int
 gen6_ring_dispatch_execbuffer(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
 			      u64 offset, u32 len,
 			      unsigned flags)
 {
@@ -1886,6 +1907,7 @@ gen6_ring_dispatch_execbuffer(struct intel_engine *ring,
 /* Blitter support (SandyBridge+) */
 
 static int gen6_ring_flush(struct intel_engine *ring,
+			   struct i915_hw_context *ctx,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
@@ -2296,14 +2318,15 @@ int intel_init_vebox_ring(struct drm_device *dev)
 }
 
 int
-intel_ring_flush_all_caches(struct intel_engine *ring)
+intel_ring_flush_all_caches(struct intel_engine *ring,
+			    struct i915_hw_context *ctx)
 {
 	int ret;
 
 	if (!ring->gpu_caches_dirty)
 		return 0;
 
-	ret = ring->flush(ring, 0, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(ring, ctx, 0, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -2314,7 +2337,8 @@ intel_ring_flush_all_caches(struct intel_engine *ring)
 }
 
 int
-intel_ring_invalidate_all_caches(struct intel_engine *ring)
+intel_ring_invalidate_all_caches(struct intel_engine *ring,
+				 struct i915_hw_context *ctx)
 {
 	uint32_t flush_domains;
 	int ret;
@@ -2323,7 +2347,7 @@ intel_ring_invalidate_all_caches(struct intel_engine *ring)
 	if (ring->gpu_caches_dirty)
 		flush_domains = I915_GEM_GPU_DOMAINS;
 
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, flush_domains);
+	ret = ring->flush(ring, ctx, I915_GEM_GPU_DOMAINS, flush_domains);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c9328fd..4ed68b4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -75,6 +75,8 @@ struct intel_ringbuffer {
 	u32 last_retired_head;
 };
 
+struct i915_hw_context;
+
 struct intel_engine {
 	const char	*name;
 	enum intel_ring_id {
@@ -101,11 +103,13 @@ struct intel_engine {
 	int		(*init)(struct intel_engine *ring);
 
 	void		(*write_tail)(struct intel_engine *ring,
-				      u32 value);
+				      struct i915_hw_context *ctx, u32 value);
 	int __must_check (*flush)(struct intel_engine *ring,
+				  struct i915_hw_context *ctx,
 				  u32	invalidate_domains,
 				  u32	flush_domains);
-	int		(*add_request)(struct intel_engine *ring);
+	int		(*add_request)(struct intel_engine *ring,
+				       struct i915_hw_context *ctx);
 	/* Some chipsets are not quite as coherent as advertised and need
 	 * an expensive kick to force a true read of the up-to-date seqno.
 	 * However, the up-to-date seqno is not always required and the last
@@ -117,6 +121,7 @@ struct intel_engine {
 	void		(*set_seqno)(struct intel_engine *ring,
 				     u32 seqno);
 	int		(*dispatch_execbuffer)(struct intel_engine *ring,
+					       struct i915_hw_context *ctx,
 					       u64 offset, u32 length,
 					       unsigned flags);
 #define I915_DISPATCH_SECURE 0x1
@@ -289,7 +294,10 @@ intel_write_status_page(struct intel_engine *ring,
 void intel_stop_ring(struct intel_engine *ring);
 void intel_cleanup_ring(struct intel_engine *ring);
 
-int __must_check intel_ring_begin(struct intel_engine *ring, int n);
+int __must_check intel_ringbuffer_begin(struct intel_engine *ring,
+					   struct i915_hw_context *ctx, int n);
+#define intel_ring_begin(ring, n) \
+		intel_ringbuffer_begin(ring, ring->default_context, n)
 int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
 static inline void intel_ring_emit(struct intel_engine *ring,
 				   u32 data)
@@ -305,12 +313,15 @@ static inline void intel_ring_advance(struct intel_engine *ring)
 
 	ringbuf->tail &= ringbuf->size - 1;
 }
-void __intel_ring_advance(struct intel_engine *ring);
+void __intel_ring_advance(struct intel_engine *ring,
+			  struct i915_hw_context *ctx);
 
 int __must_check intel_ring_idle(struct intel_engine *ring);
 void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
-int intel_ring_flush_all_caches(struct intel_engine *ring);
-int intel_ring_invalidate_all_caches(struct intel_engine *ring);
+int intel_ring_flush_all_caches(struct intel_engine *ring,
+				struct i915_hw_context *ctx);
+int intel_ring_invalidate_all_caches(struct intel_engine *ring,
+				     struct i915_hw_context *ctx);
 
 void intel_init_rings_early(struct drm_device *dev);
 int intel_init_render_ring(struct drm_device *dev);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 10/50] drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (8 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 09/50] drm/i915: Plumb the context everywhere in the execbuffer path oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 11/50] drm/i915: Write a new set of context-aware ringbuffer management functions oscar.mateo
                   ` (41 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

The "__" name is too confusing, specially given the refactoring patch
that comes soon with more new ringbuffer management functions.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  3 ++-
 drivers/gpu/drm/i915/intel_display.c    | 10 +++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.c | 10 +++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++--
 4 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a582a64..166fbdf 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -56,7 +56,8 @@
 	intel_ring_emit(LP_RING(dev_priv), x)
 
 #define ADVANCE_LP_RING() \
-	__intel_ring_advance(LP_RING(dev_priv), LP_RING(dev_priv)->default_context)
+	intel_ringbuffer_advance_and_submit(LP_RING(dev_priv), \
+			LP_RING(dev_priv)->default_context)
 
 /**
  * Lock test for when it's just for synchronization of ring access.
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index d7c6ce5..e0550c6 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8609,7 +8609,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8651,7 +8651,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8700,7 +8700,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8745,7 +8745,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8855,7 +8855,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3b43070..0f4d3b6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -56,7 +56,7 @@ static bool intel_ring_stopped(struct intel_engine *ring)
 	return dev_priv->gpu_error.stop_rings & intel_ring_flag(ring);
 }
 
-void __intel_ring_advance(struct intel_engine *ring,
+void intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
 			  struct i915_hw_context *ctx)
 {
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
@@ -738,7 +738,7 @@ gen6_add_request(struct intel_engine *ring,
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring, ctx);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	return 0;
 }
@@ -850,7 +850,7 @@ pc_render_add_request(struct intel_engine *ring,
 	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, 0);
-	__intel_ring_advance(ring, ctx);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	return 0;
 }
@@ -1092,7 +1092,7 @@ i9xx_add_request(struct intel_engine *ring,
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring, ctx);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	return 0;
 }
@@ -1577,7 +1577,7 @@ static int ring_wait_for_space(struct intel_engine *ring,
 		return ret;
 
 	/* force the tail write in case we have been skipping them */
-	__intel_ring_advance(ring, ctx);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	trace_i915_ring_wait_begin(ring);
 	/* With GEM the hangcheck timer should kick us out of the loop,
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 4ed68b4..b2dcad4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -313,8 +313,8 @@ static inline void intel_ring_advance(struct intel_engine *ring)
 
 	ringbuf->tail &= ringbuf->size - 1;
 }
-void __intel_ring_advance(struct intel_engine *ring,
-			  struct i915_hw_context *ctx);
+void intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
+					 struct i915_hw_context *ctx);
 
 int __must_check intel_ring_idle(struct intel_engine *ring);
 void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 11/50] drm/i915: Write a new set of context-aware ringbuffer management functions
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (9 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 10/50] drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 12/50] drm/i915: Final touches to ringbuffer and context plumbing and refactoring oscar.mateo
                   ` (40 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

We need functions that are aware that the ringbuffer might live somewhere
other than inside the engine. After this commit and some of the previous
ones, the new ringbuffer functions finally are:

intel_ringbuffer_get
intel_ringbuffer_begin
intel_ringbuffer_cacheline_align
intel_ringbuffer_emit
intel_ringbuffer_advance
intel_ringbuffer_advance_and_submit
intel_ringbuffer_get_tail

Some of the old ones remain after the refactoring as deprecated functions that
simply call the set above to manipulate the engine's default ringbuffer (see
the migration sketch after the list):

intel_ring_begin
intel_ring_emit
intel_ring_advance
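
As a rough migration sketch (illustrative only, not part of the diff; the
helper name emit_noops_example is made up here), a caller that used to check
an int return from intel_ring_begin() now handles the ERR_PTR-style return
of intel_ringbuffer_begin() and emits through the returned ringbuffer:

	static int emit_noops_example(struct intel_engine *ring,
				      struct i915_hw_context *ctx)
	{
		struct intel_ringbuffer *ringbuf;
		int i;

		/* begin() now hands back the context-resolved ringbuffer */
		ringbuf = intel_ringbuffer_begin(ring, ctx, 4);
		if (IS_ERR_OR_NULL(ringbuf))
			return PTR_ERR(ringbuf);

		/* emit/advance operate on the ringbuffer, not the engine */
		for (i = 0; i < 4; i++)
			intel_ringbuffer_emit(ringbuf, MI_NOOP);
		intel_ringbuffer_advance(ringbuf);

		return 0;
	}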

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c            |  4 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 19 ++++++---
 drivers/gpu/drm/i915/intel_display.c       |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 37 ++++++++--------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 68 ++++++++++++++++++++++--------
 5 files changed, 86 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 774151c..26bd68f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2181,7 +2181,7 @@ int __i915_add_request(struct intel_engine *ring,
 	u32 request_ring_position, request_start;
 	int ret;
 
-	request_start = intel_ring_get_tail(ring);
+	request_start = intel_ringbuffer_get_tail(ring, ctx);
 	/*
 	 * Emit any outstanding flushes - execbuf can fail to emit the flush
 	 * after having emitted the batchbuffer command. Hence we need to fix
@@ -2202,7 +2202,7 @@ int __i915_add_request(struct intel_engine *ring,
 	 * GPU processing the request, we never over-estimate the
 	 * position of the head.
 	 */
-	request_ring_position = intel_ring_get_tail(ring);
+	request_ring_position = intel_ringbuffer_get_tail(ring, ctx);
 
 	ret = ring->add_request(ring, ctx);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c93941d..e78ed94 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -988,16 +988,17 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 			    struct i915_hw_context *ctx)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret, i;
+	struct intel_ringbuffer *ringbuf;
+	int i;
 
 	if (!IS_GEN7(dev) || ring != &dev_priv->ring[RCS]) {
 		DRM_DEBUG("sol reset is gen7/rcs only\n");
 		return -EINVAL;
 	}
 
-	ret = intel_ringbuffer_begin(ring, ctx, 4 * 3);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 4 * 3);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return PTR_ERR(ringbuf);
 
 	for (i = 0; i < 4; i++) {
 		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
@@ -1290,9 +1291,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	if (ring == &dev_priv->ring[RCS] &&
 	    mode != dev_priv->relative_constants_mode) {
-		ret = intel_ringbuffer_begin(ring, ctx, 4);
-		if (ret)
-				goto err;
+		struct intel_ringbuffer *ringbuf;
+
+		ringbuf = intel_ringbuffer_begin(ring, ctx, 4);
+		if (IS_ERR_OR_NULL(ringbuf)) {
+			ret = PTR_ERR(ringbuf);
+			goto err;
+		}
 
 		intel_ring_emit(ring, MI_NOOP);
 		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index e0550c6..24e2e3f 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8812,7 +8812,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	 * then do the cacheline alignment, and finally emit the
 	 * MI_DISPLAY_FLIP.
 	 */
-	ret = intel_ring_cacheline_align(ring);
+	ret = intel_ringbuffer_cacheline_align(ring, ring->default_context);
 	if (ret)
 		goto err_unpin;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0f4d3b6..6292e75 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -388,9 +388,9 @@ gen8_render_ring_flush(struct intel_engine *ring,
 		       struct i915_hw_context *ctx,
 		       u32 invalidate_domains, u32 flush_domains)
 {
+	struct intel_ringbuffer *ringbuf;
 	u32 flags = 0;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
-	int ret;
 
 	flags |= PIPE_CONTROL_CS_STALL;
 
@@ -409,9 +409,9 @@ gen8_render_ring_flush(struct intel_engine *ring,
 		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
 	}
 
-	ret = intel_ringbuffer_begin(ring, ctx, 6);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 6);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return PTR_ERR(ringbuf);
 
 	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(6));
 	intel_ring_emit(ring, flags);
@@ -1700,34 +1700,37 @@ static int __intel_ring_prepare(struct intel_engine *ring,
 	return 0;
 }
 
-int intel_ringbuffer_begin(struct intel_engine *ring,
-			      struct i915_hw_context *ctx,
-			      int num_dwords)
+struct intel_ringbuffer *
+intel_ringbuffer_begin(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
+		       int num_dwords)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	int ret;
 
 	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
 				   dev_priv->mm.interruptible);
 	if (ret)
-		return ret;
+		return ERR_PTR(ret);
 
 	ret = __intel_ring_prepare(ring, ctx, num_dwords * sizeof(uint32_t));
 	if (ret)
-		return ret;
+		return ERR_PTR(ret);
 
 	/* Preallocate the olr before touching the ring */
 	ret = intel_ring_alloc_seqno(ring);
 	if (ret)
-		return ret;
+		return ERR_PTR(ret);
 
 	ringbuf->space -= num_dwords * sizeof(uint32_t);
-	return 0;
+
+	return ringbuf;
 }
 
 /* Align the ring tail to a cacheline boundary */
-int intel_ring_cacheline_align(struct intel_engine *ring)
+int intel_ringbuffer_cacheline_align(struct intel_engine *ring,
+				     struct i915_hw_context *ctx)
 {
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int num_dwords = (ringbuf->tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t);
@@ -1844,11 +1847,11 @@ gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	bool ppgtt = dev_priv->mm.aliasing_ppgtt != NULL &&
 		!(flags & I915_DISPATCH_SECURE);
-	int ret;
+	struct intel_ringbuffer *ringbuf;
 
-	ret = intel_ringbuffer_begin(ring, ctx, 4);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 4);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return PTR_ERR(ringbuf);
 
 	/* FIXME(BDW): Address space and security selectors. */
 	intel_ring_emit(ring, MI_BATCH_BUFFER_START_GEN8 | (ppgtt<<8));
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index b2dcad4..59280b2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -294,27 +294,66 @@ intel_write_status_page(struct intel_engine *ring,
 void intel_stop_ring(struct intel_engine *ring);
 void intel_cleanup_ring(struct intel_engine *ring);
 
-int __must_check intel_ringbuffer_begin(struct intel_engine *ring,
-					   struct i915_hw_context *ctx, int n);
-#define intel_ring_begin(ring, n) \
-		intel_ringbuffer_begin(ring, ring->default_context, n)
-int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
-static inline void intel_ring_emit(struct intel_engine *ring,
-				   u32 data)
+struct intel_ringbuffer *
+intel_ringbuffer_get(struct intel_engine *ring,
+		struct i915_hw_context *ctx);
+
+struct intel_ringbuffer *
+intel_ringbuffer_begin(struct intel_engine *ring,
+		struct i915_hw_context *ctx, int n);
+
+static inline int __must_check
+intel_ring_begin(struct intel_engine *ring, u32 data)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf;
+
+	ringbuf = intel_ringbuffer_begin(ring, ring->default_context, data);
+	if (IS_ERR(ringbuf))
+		return PTR_ERR(ringbuf);
+
+	return 0;
+}
+
+int __must_check
+intel_ringbuffer_cacheline_align(struct intel_engine *ring,
+				struct i915_hw_context *ctx);
 
+static inline void
+intel_ringbuffer_emit(struct intel_ringbuffer *ringbuf, u32 data)
+{
 	iowrite32(data, ringbuf->virtual_start + ringbuf->tail);
 	ringbuf->tail += 4;
 }
-static inline void intel_ring_advance(struct intel_engine *ring)
+
+static inline void
+intel_ring_emit(struct intel_engine *ring, u32 data)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	intel_ringbuffer_emit(&ring->default_ringbuf, data);
+}
 
+static inline void
+intel_ringbuffer_advance(struct intel_ringbuffer *ringbuf)
+{
 	ringbuf->tail &= ringbuf->size - 1;
 }
-void intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
-					 struct i915_hw_context *ctx);
+
+static inline void
+intel_ring_advance(struct intel_engine *ring)
+{
+	intel_ringbuffer_advance(&ring->default_ringbuf);
+}
+
+void
+intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
+				struct i915_hw_context *ctx);
+
+static inline u32
+intel_ringbuffer_get_tail(struct intel_engine *ring,
+			struct i915_hw_context *ctx)
+{
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
+	return ringbuf->tail;
+}
 
 int __must_check intel_ring_idle(struct intel_engine *ring);
 void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
@@ -336,11 +375,6 @@ void intel_ring_setup_status_page(struct intel_engine *ring);
 void intel_destroy_ring_buffer(struct intel_engine *ring);
 int intel_allocate_ring_buffer(struct intel_engine *ring);
 
-static inline u32 intel_ring_get_tail(struct intel_engine *ring)
-{
-	return __get_ringbuf(ring)->tail;
-}
-
 static inline u32 intel_ring_get_seqno(struct intel_engine *ring)
 {
 	BUG_ON(ring->outstanding_lazy_seqno == 0);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 12/50] drm/i915: Final touches to ringbuffer and context plumbing and refactoring
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (10 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 11/50] drm/i915: Write a new set of context-aware ringbuffer management functions oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 13/50] drm/i915: s/write_tail/submit oscar.mateo
                   ` (39 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Thanks to the previous functions and intel_ringbuffer_get(), every function
that needs to be context-aware can get the ringbuffer from the appropriate
place. The others (either pre-GEN8, or ones that clearly manipulate the
engine's default ringbuffer) get it directly from the engine.
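
A small sketch of the two paths (illustrative, not from the diff; for the
time being intel_ringbuffer_get() still resolves to the default ringbuffer):

	/* Context-aware path: resolve the ringbuffer through the context */
	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);

	/* Legacy / clearly-default path: the engine's embedded ringbuffer */
	struct intel_ringbuffer *def_ringbuf = &ring->default_ringbuf;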

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c            |  2 +-
 drivers/gpu/drm/i915/i915_gem.c            |  6 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 18 +++---
 drivers/gpu/drm/i915/i915_gpu_error.c      |  6 +-
 drivers/gpu/drm/i915/i915_irq.c            |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 92 +++++++++++++++---------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 13 ++---
 7 files changed, 70 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 166fbdf..7bdb9be 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -242,7 +242,7 @@ static int i915_dma_resume(struct drm_device * dev)
 
 	DRM_DEBUG_DRIVER("%s\n", __func__);
 
-	if (__get_ringbuf(ring)->virtual_start == NULL) {
+	if (ring->default_ringbuf.virtual_start == NULL) {
 		DRM_ERROR("can not ioremap virtual address for"
 			  " ring buffer\n");
 		return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 26bd68f..4a22560 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2481,6 +2481,7 @@ i915_gem_retire_requests_ring(struct intel_engine *ring)
 
 	while (!list_empty(&ring->request_list)) {
 		struct drm_i915_gem_request *request;
+		struct intel_ringbuffer *ringbuf;
 
 		request = list_first_entry(&ring->request_list,
 					   struct drm_i915_gem_request,
@@ -2490,12 +2491,15 @@ i915_gem_retire_requests_ring(struct intel_engine *ring)
 			break;
 
 		trace_i915_gem_request_retire(ring, request->seqno);
+
+		ringbuf = intel_ringbuffer_get(ring, request->ctx);
+
 		/* We know the GPU must have read the request to have
 		 * sent us the seqno + interrupt, so use the position
 		 * of tail of the request to update the last known position
 		 * of the GPU head.
 		 */
-		__get_ringbuf(ring)->last_retired_head = request->tail;
+		ringbuf->last_retired_head = request->tail;
 
 		i915_gem_free_request(request);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e78ed94..823ad3d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1001,12 +1001,12 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 		return PTR_ERR(ringbuf);
 
 	for (i = 0; i < 4; i++) {
-		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
-		intel_ring_emit(ring, GEN7_SO_WRITE_OFFSET(i));
-		intel_ring_emit(ring, 0);
+		intel_ringbuffer_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+		intel_ringbuffer_emit(ringbuf, GEN7_SO_WRITE_OFFSET(i));
+		intel_ringbuffer_emit(ringbuf, 0);
 	}
 
-	intel_ring_advance(ring);
+	intel_ringbuffer_advance(ringbuf);
 
 	return 0;
 }
@@ -1299,11 +1299,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 		}
 
-		intel_ring_emit(ring, MI_NOOP);
-		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
-		intel_ring_emit(ring, INSTPM);
-		intel_ring_emit(ring, mask << 16 | mode);
-		intel_ring_advance(ring);
+		intel_ringbuffer_emit(ringbuf, MI_NOOP);
+		intel_ringbuffer_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+		intel_ringbuffer_emit(ringbuf, INSTPM);
+		intel_ringbuffer_emit(ringbuf, mask << 16 | mode);
+		intel_ringbuffer_advance(ringbuf);
 
 		dev_priv->relative_constants_mode = mode;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index a7b165f..6724e32 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -823,8 +823,8 @@ static void i915_record_ring_state(struct drm_device *dev,
 		ering->hws = I915_READ(mmio);
 	}
 
-	ering->cpu_ring_head = __get_ringbuf(ring)->head;
-	ering->cpu_ring_tail = __get_ringbuf(ring)->tail;
+	ering->cpu_ring_head = ring->default_ringbuf.head;
+	ering->cpu_ring_tail = ring->default_ringbuf.tail;
 
 	ering->hangcheck_score = ring->hangcheck.score;
 	ering->hangcheck_action = ring->hangcheck.action;
@@ -928,7 +928,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 		}
 
 		error->ring[i].ringbuffer =
-			i915_error_ggtt_object_create(dev_priv, __get_ringbuf(ring)->obj);
+			i915_error_ggtt_object_create(dev_priv, ring->default_ringbuf.obj);
 
 		if (ring->status_page.obj)
 			error->ring[i].hws_page =
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e0c3a01..873ae50 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2610,7 +2610,7 @@ static struct intel_engine *
 semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	u32 cmd, ipehr, head;
 	int i;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6292e75..f18bfb2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -40,10 +40,8 @@
  */
 #define CACHELINE_BYTES 64
 
-static inline int ring_space(struct intel_engine *ring)
+static inline int ring_space(struct intel_ringbuffer *ringbuf)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
-
 	int space = (ringbuf->head & HEAD_ADDR) - (ringbuf->tail + I915_RING_FREE_SPACE);
 	if (space < 0)
 		space += ringbuf->size;
@@ -59,7 +57,7 @@ static bool intel_ring_stopped(struct intel_engine *ring)
 void intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
 			  struct i915_hw_context *ctx)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 
 	ringbuf->tail &= ringbuf->size - 1;
 	if (intel_ring_stopped(ring))
@@ -413,13 +411,13 @@ gen8_render_ring_flush(struct intel_engine *ring,
 	if (IS_ERR_OR_NULL(ringbuf))
 		return PTR_ERR(ringbuf);
 
-	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(6));
-	intel_ring_emit(ring, flags);
-	intel_ring_emit(ring, scratch_addr);
-	intel_ring_emit(ring, 0);
-	intel_ring_emit(ring, 0);
-	intel_ring_emit(ring, 0);
-	intel_ring_advance(ring);
+	intel_ringbuffer_emit(ringbuf, GFX_OP_PIPE_CONTROL(6));
+	intel_ringbuffer_emit(ringbuf, flags);
+	intel_ringbuffer_emit(ringbuf, scratch_addr);
+	intel_ringbuffer_emit(ringbuf, 0);
+	intel_ringbuffer_emit(ringbuf, 0);
+	intel_ringbuffer_emit(ringbuf, 0);
+	intel_ringbuffer_advance(ringbuf);
 
 	return 0;
 
@@ -487,7 +485,7 @@ static int init_ring_common(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	struct drm_i915_gem_object *obj = ringbuf->obj;
 	int ret = 0;
 
@@ -549,7 +547,7 @@ static int init_ring_common(struct intel_engine *ring)
 	else {
 		ringbuf->head = I915_READ_HEAD(ring);
 		ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
-		ringbuf->space = ring_space(ring);
+		ringbuf->space = ring_space(ringbuf);
 		ringbuf->last_retired_head = -1;
 	}
 
@@ -1387,10 +1385,8 @@ static int init_phys_status_page(struct intel_engine *ring)
 	return 0;
 }
 
-void intel_destroy_ring_buffer(struct intel_engine *ring)
+void intel_destroy_ring_buffer(struct intel_ringbuffer *ringbuf)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
-
 	if (!ringbuf->obj)
 		return;
 
@@ -1400,12 +1396,11 @@ void intel_destroy_ring_buffer(struct intel_engine *ring)
 	ringbuf->obj = NULL;
 }
 
-int intel_allocate_ring_buffer(struct intel_engine *ring)
+int intel_allocate_ring_buffer(struct drm_device *dev,
+		struct intel_ringbuffer *ringbuf)
 {
-	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_object *obj;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	if (ringbuf->obj)
@@ -1448,7 +1443,7 @@ err_unref:
 static int intel_init_ring(struct drm_device *dev,
 			   struct intel_engine *ring)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	int ret;
 
 	INIT_LIST_HEAD(&ring->active_list);
@@ -1469,7 +1464,7 @@ static int intel_init_ring(struct drm_device *dev,
 			return ret;
 	}
 
-	ret = intel_allocate_ring_buffer(ring);
+	ret = intel_allocate_ring_buffer(dev, &ring->default_ringbuf);
 	if (ret) {
 		DRM_ERROR("Failed to allocate ringbuffer %s: %d\n", ring->name, ret);
 		return ret;
@@ -1491,7 +1486,7 @@ static int intel_init_ring(struct drm_device *dev,
 void intel_cleanup_ring(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = to_i915(ring->dev);
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 
 	if (ringbuf->obj == NULL)
 		return;
@@ -1499,7 +1494,7 @@ void intel_cleanup_ring(struct intel_engine *ring)
 	intel_stop_ring(ring);
 	WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0);
 
-	intel_destroy_ring_buffer(ring);
+	intel_destroy_ring_buffer(&ring->default_ringbuf);
 	ring->preallocated_lazy_request = NULL;
 	ring->outstanding_lazy_seqno = 0;
 
@@ -1509,10 +1504,11 @@ void intel_cleanup_ring(struct intel_engine *ring)
 	cleanup_status_page(ring);
 }
 
-static int intel_ring_wait_request(struct intel_engine *ring, int n)
+static int intel_ring_wait_request(struct intel_engine *ring,
+				   struct i915_hw_context *ctx, int n)
 {
 	struct drm_i915_gem_request *request;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	u32 seqno = 0, tail;
 	int ret;
 
@@ -1520,7 +1516,7 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 		ringbuf->head = ringbuf->last_retired_head;
 		ringbuf->last_retired_head = -1;
 
-		ringbuf->space = ring_space(ring);
+		ringbuf->space = ring_space(ringbuf);
 		if (ringbuf->space >= n)
 			return 0;
 	}
@@ -1556,7 +1552,7 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 		return ret;
 
 	ringbuf->head = tail;
-	ringbuf->space = ring_space(ring);
+	ringbuf->space = ring_space(ringbuf);
 	if (WARN_ON(ringbuf->space < n))
 		return -ENOSPC;
 
@@ -1568,11 +1564,11 @@ static int ring_wait_for_space(struct intel_engine *ring,
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	unsigned long end;
 	int ret;
 
-	ret = intel_ring_wait_request(ring, n);
+	ret = intel_ring_wait_request(ring, ctx, n);
 	if (ret != -ENOSPC)
 		return ret;
 
@@ -1589,7 +1585,7 @@ static int ring_wait_for_space(struct intel_engine *ring,
 
 	do {
 		ringbuf->head = I915_READ_HEAD(ring);
-		ringbuf->space = ring_space(ring);
+		ringbuf->space = ring_space(ringbuf);
 		if (ringbuf->space >= n) {
 			trace_i915_ring_wait_end(ring);
 			return 0;
@@ -1617,7 +1613,7 @@ static int intel_wrap_ring_buffer(struct intel_engine *ring,
 				  struct i915_hw_context *ctx)
 {
 	uint32_t __iomem *virt;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	int rem = ringbuf->size - ringbuf->tail;
 
 	if (ringbuf->space < rem) {
@@ -1632,7 +1628,7 @@ static int intel_wrap_ring_buffer(struct intel_engine *ring,
 		iowrite32(MI_NOOP, virt++);
 
 	ringbuf->tail = 0;
-	ringbuf->space = ring_space(ring);
+	ringbuf->space = ring_space(ringbuf);
 
 	return 0;
 }
@@ -1682,7 +1678,7 @@ intel_ring_alloc_seqno(struct intel_engine *ring)
 static int __intel_ring_prepare(struct intel_engine *ring,
 				struct i915_hw_context *ctx, int bytes)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	int ret;
 
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
@@ -1701,6 +1697,13 @@ static int __intel_ring_prepare(struct intel_engine *ring,
 }
 
 struct intel_ringbuffer *
+intel_ringbuffer_get(struct intel_engine *ring, struct i915_hw_context *ctx)
+{
+	/* For the time being, the only ringbuffer is in the engine */
+	return &ring->default_ringbuf;
+}
+
+struct intel_ringbuffer *
 intel_ringbuffer_begin(struct intel_engine *ring,
 		       struct i915_hw_context *ctx,
 		       int num_dwords)
@@ -1732,22 +1735,21 @@ intel_ringbuffer_begin(struct intel_engine *ring,
 int intel_ringbuffer_cacheline_align(struct intel_engine *ring,
 				     struct i915_hw_context *ctx)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	int num_dwords = (ringbuf->tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t);
-	int ret;
 
 	if (num_dwords == 0)
 		return 0;
 
 	num_dwords = CACHELINE_BYTES / sizeof(uint32_t) - num_dwords;
-	ret = intel_ring_begin(ring, num_dwords);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, num_dwords);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return PTR_ERR(ringbuf);
 
 	while (num_dwords--)
 		intel_ring_emit(ring, MI_NOOP);
 
-	intel_ring_advance(ring);
+	intel_ringbuffer_advance(ringbuf);
 
 	return 0;
 }
@@ -1854,11 +1856,11 @@ gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 		return PTR_ERR(ringbuf);
 
 	/* FIXME(BDW): Address space and security selectors. */
-	intel_ring_emit(ring, MI_BATCH_BUFFER_START_GEN8 | (ppgtt<<8));
-	intel_ring_emit(ring, lower_32_bits(offset));
-	intel_ring_emit(ring, upper_32_bits(offset));
-	intel_ring_emit(ring, MI_NOOP);
-	intel_ring_advance(ring);
+	intel_ringbuffer_emit(ringbuf, MI_BATCH_BUFFER_START_GEN8 | (ppgtt<<8));
+	intel_ringbuffer_emit(ringbuf, lower_32_bits(offset));
+	intel_ringbuffer_emit(ringbuf, upper_32_bits(offset));
+	intel_ringbuffer_emit(ringbuf, MI_NOOP);
+	intel_ringbuffer_advance(ringbuf);
 
 	return 0;
 }
@@ -2060,7 +2062,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[RCS];
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	int ret;
 
 	if (INTEL_INFO(dev)->gen >= 6) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 59280b2..dd85a2b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -219,16 +219,10 @@ struct intel_engine {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
-/* This is a temporary define to help us transition to per-context ringbuffers */
-static inline struct intel_ringbuffer *__get_ringbuf(struct intel_engine *ring)
-{
-	return &ring->default_ringbuf;
-}
-
 static inline bool
 intel_ring_initialized(struct intel_engine *ring)
 {
-	return __get_ringbuf(ring)->obj != NULL;
+	return ring->default_ringbuf.obj != NULL;
 }
 
 static inline unsigned
@@ -372,8 +366,9 @@ int intel_init_vebox_ring(struct drm_device *dev);
 u64 intel_ring_get_active_head(struct intel_engine *ring);
 void intel_ring_setup_status_page(struct intel_engine *ring);
 
-void intel_destroy_ring_buffer(struct intel_engine *ring);
-int intel_allocate_ring_buffer(struct intel_engine *ring);
+void intel_destroy_ring_buffer(struct intel_ringbuffer *ringbuf);
+int intel_allocate_ring_buffer(struct drm_device *dev,
+		struct intel_ringbuffer *ringbuf);
 
 static inline u32 intel_ring_get_seqno(struct intel_engine *ring)
 {
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 13/50] drm/i915: s/write_tail/submit
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (11 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 12/50] drm/i915: Final touches to ringbuffer and context plumbing and refactoring oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 14/50] drm/i915: Introduce one context backing object per engine oscar.mateo
                   ` (38 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

In Execlists we don't really write to the tail register to submit
a workload to the GPU, so the name will soon stop being accurate.

Change and name suggested by Brad.
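
Sketch of the intent (the Execlists assignment and function name below are
hypothetical placeholders, not from this patch):

	/* Legacy ringbuffer mode: submission is still a tail write */
	ring->submit = ring_write_tail;

	/* Execlists (later in the series) would hook an ELSP-based
	 * submission here instead, e.g.:
	 * ring->submit = gen8_submit_execlist;	hypothetical name
	 */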

Cc: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 18 +++++++++---------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++--
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index f18bfb2..6225123 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -63,7 +63,7 @@ void intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
 	if (intel_ring_stopped(ring))
 		return;
 
-	ring->write_tail(ring, ctx, ringbuf->tail);
+	ring->submit(ring, ctx, ringbuf->tail);
 }
 
 static int
@@ -471,7 +471,7 @@ static bool stop_ring(struct intel_engine *ring)
 
 	I915_WRITE_CTL(ring, 0);
 	I915_WRITE_HEAD(ring, 0);
-	ring->write_tail(ring, ring->default_context, 0);
+	ring->submit(ring, ring->default_context, 0);
 
 	if (!IS_GEN2(ring->dev)) {
 		(void)I915_READ_CTL(ring);
@@ -2017,7 +2017,7 @@ int intel_init_render_ring(struct drm_device *dev)
 		}
 		ring->irq_enable_mask = I915_USER_INTERRUPT;
 	}
-	ring->write_tail = ring_write_tail;
+	ring->submit = ring_write_tail;
 	if (IS_HASWELL(dev))
 		ring->dispatch_execbuffer = hsw_ring_dispatch_execbuffer;
 	else if (IS_GEN8(dev))
@@ -2088,7 +2088,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 		ring->irq_put = i9xx_ring_put_irq;
 	}
 	ring->irq_enable_mask = I915_USER_INTERRUPT;
-	ring->write_tail = ring_write_tail;
+	ring->submit = ring_write_tail;
 	if (INTEL_INFO(dev)->gen >= 4)
 		ring->dispatch_execbuffer = i965_dispatch_execbuffer;
 	else if (IS_I830(dev) || IS_845G(dev))
@@ -2127,11 +2127,11 @@ int intel_init_bsd_ring(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[VCS];
 
-	ring->write_tail = ring_write_tail;
+	ring->submit = ring_write_tail;
 	if (INTEL_INFO(dev)->gen >= 6) {
 		/* gen6 bsd needs a special wa for tail updates */
 		if (IS_GEN6(dev))
-			ring->write_tail = gen6_bsd_ring_write_tail;
+			ring->submit = gen6_bsd_ring_write_tail;
 		ring->flush = gen6_bsd_ring_flush;
 		ring->add_request = gen6_add_request;
 		ring->get_seqno = gen6_ring_get_seqno;
@@ -2203,7 +2203,7 @@ int intel_init_bsd2_ring(struct drm_device *dev)
 		return -EINVAL;
 	}
 
-	ring->write_tail = ring_write_tail;
+	ring->submit = ring_write_tail;
 	ring->flush = gen6_bsd_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
@@ -2242,7 +2242,7 @@ int intel_init_blt_ring(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[BCS];
 
-	ring->write_tail = ring_write_tail;
+	ring->submit = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
@@ -2287,7 +2287,7 @@ int intel_init_vebox_ring(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[VECS];
 
-	ring->write_tail = ring_write_tail;
+	ring->submit = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index dd85a2b..c224fdc 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -102,8 +102,8 @@ struct intel_engine {
 
 	int		(*init)(struct intel_engine *ring);
 
-	void		(*write_tail)(struct intel_engine *ring,
-				      struct i915_hw_context *ctx, u32 value);
+	void		(*submit)(struct intel_engine *ring,
+				    struct i915_hw_context *ctx, u32 value);
 	int __must_check (*flush)(struct intel_engine *ring,
 				  struct i915_hw_context *ctx,
 				  u32	invalidate_domains,
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 14/50] drm/i915: Introduce one context backing object per engine
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (12 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 13/50] drm/i915: s/write_tail/submit oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 15/50] drm/i915: Make i915_gem_create_context outside accessible oscar.mateo
                   ` (37 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

When we start using Execlists, a context backing object only makes
sense for a given engine (because it will hold state data specific to
that engine), so multiplex the context struct to contain <no-of-engines>
objects.
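
In other words (a sketch, not part of the diff), accesses move from a single
object to a per-engine array; indexing by ring->id is how an Execlists path
might pick its object:

	/* Before: one backing object per context */
	struct drm_i915_gem_object *obj = ctx->obj;

	/* After: one backing object per context and engine */
	struct drm_i915_gem_object *rcs_obj = ctx->engine[RCS].obj;
	struct drm_i915_gem_object *eng_obj = ctx->engine[ring->id].obj;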

In legacy ringbuffer submission mode, the only MI_SET_CONTEXT we really
perform is for the render engine, so the RCS backing object is the one
to use throughout the HW context code.

Originally, I handled this by instantiating one new context for every
engine I wanted to use, but this change suggested by Brad makes it
more elegant.

No functional changes.

Cc: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  2 +-
 drivers/gpu/drm/i915/i915_drv.h         |  4 +-
 drivers/gpu/drm/i915/i915_gem_context.c | 75 ++++++++++++++++++---------------
 3 files changed, 46 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0052460..65a740e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1707,7 +1707,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 			if (ring->default_context == ctx)
 				seq_printf(m, "(default context %s) ", ring->name);
 
-		describe_obj(m, ctx->obj);
+		describe_obj(m, ctx->engine[RCS].obj);
 		seq_putc(m, '\n');
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 35b2ae4..5be09a0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -595,7 +595,9 @@ struct i915_hw_context {
 	uint8_t remap_slice;
 	struct drm_i915_file_private *file_priv;
 	struct intel_engine *last_ring;
-	struct drm_i915_gem_object *obj;
+	struct {
+		struct drm_i915_gem_object *obj;
+	} engine[I915_NUM_RINGS];
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_address_space *vm;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 50337ae..f92cba9 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -181,15 +181,16 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	struct i915_hw_context *ctx = container_of(ctx_ref,
 						   typeof(*ctx), ref);
 	struct i915_hw_ppgtt *ppgtt = NULL;
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[RCS].obj;
 
-	if (ctx->obj) {
+	if (ctx_obj) {
 		/* We refcount even the aliasing PPGTT to keep the code symmetric */
-		if (USES_PPGTT(ctx->obj->base.dev))
+		if (USES_PPGTT(ctx_obj->base.dev))
 			ppgtt = ctx_to_ppgtt(ctx);
 
 		/* XXX: Free up the object before tearing down the address space, in
 		 * case we're bound in the PPGTT */
-		drm_gem_object_unreference(&ctx->obj->base);
+		drm_gem_object_unreference(&ctx_obj->base);
 	}
 
 	if (ppgtt)
@@ -224,6 +225,7 @@ __create_hw_context(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *ctx;
+	struct drm_i915_gem_object *ctx_obj;
 	int ret;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -234,8 +236,9 @@ __create_hw_context(struct drm_device *dev,
 	list_add_tail(&ctx->link, &dev_priv->context_list);
 
 	if (dev_priv->hw_context_size) {
-		ctx->obj = i915_gem_alloc_object(dev, dev_priv->hw_context_size);
-		if (ctx->obj == NULL) {
+		ctx->engine[RCS].obj = ctx_obj = i915_gem_alloc_object(dev,
+							dev_priv->hw_context_size);
+		if (ctx_obj == NULL) {
 			ret = -ENOMEM;
 			goto err_out;
 		}
@@ -249,7 +252,7 @@ __create_hw_context(struct drm_device *dev,
 		 * negative performance impact.
 		 */
 		if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev)) {
-			ret = i915_gem_object_set_cache_level(ctx->obj,
+			ret = i915_gem_object_set_cache_level(ctx_obj,
 							      I915_CACHE_L3_LLC);
 			/* Failure shouldn't ever happen this early */
 			if (WARN_ON(ret))
@@ -293,6 +296,7 @@ i915_gem_create_context(struct drm_device *dev,
 	const bool is_global_default_ctx = file_priv == NULL;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *ctx;
+	struct drm_i915_gem_object *ctx_obj;
 	int ret = 0;
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -301,7 +305,8 @@ i915_gem_create_context(struct drm_device *dev,
 	if (IS_ERR(ctx))
 		return ctx;
 
-	if (is_global_default_ctx && ctx->obj) {
+	ctx_obj = ctx->engine[RCS].obj;
+	if (is_global_default_ctx && ctx_obj) {
 		/* We may need to do things with the shrinker which
 		 * require us to immediately switch back to the default
 		 * context. This can cause a problem as pinning the
@@ -309,7 +314,7 @@ i915_gem_create_context(struct drm_device *dev,
 		 * be available. To avoid this we always pin the default
 		 * context.
 		 */
-		ret = i915_gem_obj_ggtt_pin(ctx->obj,
+		ret = i915_gem_obj_ggtt_pin(ctx_obj,
 					    get_context_alignment(dev), 0);
 		if (ret) {
 			DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
@@ -349,8 +354,8 @@ i915_gem_create_context(struct drm_device *dev,
 	return ctx;
 
 err_unpin:
-	if (is_global_default_ctx && ctx->obj)
-		i915_gem_object_ggtt_unpin(ctx->obj);
+	if (is_global_default_ctx && ctx_obj)
+		i915_gem_object_ggtt_unpin(ctx_obj);
 err_destroy:
 	i915_gem_context_unreference(ctx);
 	return ERR_PTR(ret);
@@ -366,6 +371,7 @@ void i915_gem_context_reset(struct drm_device *dev)
 	 * the next switch */
 	for_each_ring(ring, dev_priv, i) {
 		struct i915_hw_context *dctx = ring->default_context;
+		struct drm_i915_gem_object *dctx_obj = dctx->engine[RCS].obj;
 
 		/* Do a fake switch to the default context */
 		if (ring->last_context == dctx)
@@ -374,12 +380,12 @@ void i915_gem_context_reset(struct drm_device *dev)
 		if (!ring->last_context)
 			continue;
 
-		if (dctx->obj && i == RCS) {
-			WARN_ON(i915_gem_obj_ggtt_pin(dctx->obj,
+		if (dctx_obj && i == RCS) {
+			WARN_ON(i915_gem_obj_ggtt_pin(dctx_obj,
 						      get_context_alignment(dev), 0));
 			/* Fake a finish/inactive */
-			dctx->obj->base.write_domain = 0;
-			dctx->obj->active = 0;
+			dctx_obj->base.write_domain = 0;
+			dctx_obj->active = 0;
 		}
 
 		i915_gem_context_unreference(ring->last_context);
@@ -428,10 +434,11 @@ void i915_gem_context_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *dctx = dev_priv->ring[RCS].default_context;
+	struct drm_i915_gem_object *dctx_obj = dctx->engine[RCS].obj;
 	struct intel_engine *ring;
 	int unused;
 
-	if (dctx->obj) {
+	if (dctx_obj) {
 		/* The only known way to stop the gpu from accessing the hw context is
 		 * to reset it. Do this as the very last operation to avoid confusing
 		 * other code, leading to spurious errors. */
@@ -446,8 +453,8 @@ void i915_gem_context_fini(struct drm_device *dev)
 		WARN_ON(!dev_priv->ring[RCS].last_context);
 		if (dev_priv->ring[RCS].last_context == dctx) {
 			/* Fake switch to NULL context */
-			WARN_ON(dctx->obj->active);
-			i915_gem_object_ggtt_unpin(dctx->obj);
+			WARN_ON(dctx->engine[RCS].obj->active);
+			i915_gem_object_ggtt_unpin(dctx_obj);
 			i915_gem_context_unreference(dctx);
 			dev_priv->ring[RCS].last_context = NULL;
 		}
@@ -461,7 +468,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 		ring->last_context = NULL;
 	}
 
-	i915_gem_object_ggtt_unpin(dctx->obj);
+	i915_gem_object_ggtt_unpin(dctx_obj);
 	i915_gem_context_unreference(dctx);
 }
 
@@ -575,7 +582,7 @@ mi_set_context(struct intel_engine *ring,
 
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_emit(ring, MI_SET_CONTEXT);
-	intel_ring_emit(ring, i915_gem_obj_ggtt_offset(new_context->obj) |
+	intel_ring_emit(ring, i915_gem_obj_ggtt_offset(new_context->engine[RCS].obj) |
 			MI_MM_SPACE_GTT |
 			MI_SAVE_EXT_STATE_EN |
 			MI_RESTORE_EXT_STATE_EN |
@@ -602,12 +609,14 @@ static int do_switch(struct intel_engine *ring,
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	struct i915_hw_ppgtt *ppgtt = ctx_to_ppgtt(to);
+	struct drm_i915_gem_object *to_obj = to->engine[RCS].obj;
+	struct drm_i915_gem_object *from_obj = from ? from->engine[RCS].obj : NULL;
 	u32 hw_flags = 0;
 	int ret, i;
 
 	if (from != NULL && ring == &dev_priv->ring[RCS]) {
-		BUG_ON(from->obj == NULL);
-		BUG_ON(!i915_gem_obj_is_pinned(from->obj));
+		BUG_ON(from_obj == NULL);
+		BUG_ON(!i915_gem_obj_is_pinned(from_obj));
 	}
 
 	if (from == to && from->last_ring == ring && !to->remap_slice)
@@ -615,7 +624,7 @@ static int do_switch(struct intel_engine *ring,
 
 	/* Trying to pin first makes error handling easier. */
 	if (ring == &dev_priv->ring[RCS]) {
-		ret = i915_gem_obj_ggtt_pin(to->obj,
+		ret = i915_gem_obj_ggtt_pin(to_obj,
 					    get_context_alignment(ring->dev), 0);
 		if (ret)
 			return ret;
@@ -648,14 +657,14 @@ static int do_switch(struct intel_engine *ring,
 	 *
 	 * XXX: We need a real interface to do this instead of trickery.
 	 */
-	ret = i915_gem_object_set_to_gtt_domain(to->obj, false);
+	ret = i915_gem_object_set_to_gtt_domain(to_obj, false);
 	if (ret)
 		goto unpin_out;
 
-	if (!to->obj->has_global_gtt_mapping) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
+	if (!to_obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(to_obj,
 							   &dev_priv->gtt.base);
-		vma->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+		vma->bind_vma(vma, to_obj->cache_level, GLOBAL_BIND);
 	}
 
 	if (!to->is_initialized || i915_gem_context_is_default(to))
@@ -684,8 +693,8 @@ static int do_switch(struct intel_engine *ring,
 	 * MI_SET_CONTEXT instead of when the next seqno has completed.
 	 */
 	if (from != NULL) {
-		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-		i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->obj), ring);
+		from_obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
+		i915_vma_move_to_active(i915_gem_obj_to_ggtt(from_obj), ring);
 		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
 		 * whole damn pipeline, we don't need to explicitly mark the
 		 * object dirty. The only exception is that the context must be
@@ -693,11 +702,11 @@ static int do_switch(struct intel_engine *ring,
 		 * able to defer doing this until we know the object would be
 		 * swapped, but there is no way to do that yet.
 		 */
-		from->obj->dirty = 1;
-		BUG_ON(from->obj->ring != ring);
+		from_obj->dirty = 1;
+		BUG_ON(from_obj->ring != ring);
 
 		/* obj is kept alive until the next request by its active ref */
-		i915_gem_object_ggtt_unpin(from->obj);
+		i915_gem_object_ggtt_unpin(from_obj);
 		i915_gem_context_unreference(from);
 	}
 
@@ -712,7 +721,7 @@ done:
 
 unpin_out:
 	if (ring->id == RCS)
-		i915_gem_object_ggtt_unpin(to->obj);
+		i915_gem_object_ggtt_unpin(to_obj);
 	return ret;
 }
 
@@ -733,7 +742,7 @@ int i915_switch_context(struct intel_engine *ring,
 
 	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
-	if (to->obj == NULL) { /* We have the fake context */
+	if (to->engine[RCS].obj == NULL) { /* We have the fake context */
 		if (to != ring->last_context) {
 			i915_gem_context_reference(to);
 			if (ring->last_context)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 15/50] drm/i915: Make i915_gem_create_context outside accessible
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (13 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 14/50] drm/i915: Introduce one context backing object per engine oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 16/50] drm/i915: Option to skip backing object allocation during context creation oscar.mateo
                   ` (36 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

We are going to reuse it during logical ring context creation.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 2 ++
 drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5be09a0..2d5a65d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2387,6 +2387,8 @@ void i915_gem_object_ggtt_unpin(struct drm_i915_gem_object *obj);
 #define ctx_to_ppgtt(ctx) container_of((ctx)->vm, struct i915_hw_ppgtt, base)
 int __must_check i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
+struct i915_hw_context *i915_gem_create_context(struct drm_device *dev,
+		struct drm_i915_file_private *file_priv, bool create_vm);
 void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
 int i915_gem_context_enable(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f92cba9..b2abe9a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -288,7 +288,7 @@ err_out:
  * context state of the GPU for applications that don't utilize HW contexts, as
  * well as an idle case.
  */
-static struct i915_hw_context *
+struct i915_hw_context *
 i915_gem_create_context(struct drm_device *dev,
 			struct drm_i915_file_private *file_priv,
 			bool create_vm)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 16/50] drm/i915: Option to skip backing object allocation during context creation
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (14 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 15/50] drm/i915: Make i915_gem_create_context outside accessible oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 17/50] drm/i915: Extract context backing object allocation oscar.mateo
                   ` (35 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Allocating only the RCS backing object won't be enough for LRCs, so
give callers the opportunity to skip it.
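
For example (hypothetical callers, assuming the new signature), legacy HW
context creation keeps passing true, while the upcoming LRC code could pass
false and allocate per-engine backing objects on its own:

	/* Legacy HW contexts: allocate the RCS backing object as before */
	ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev), true);

	/* LRCs (later in the series): skip the RCS object here */
	ctx = i915_gem_create_context(dev, NULL, USES_PPGTT(dev), false);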

No functional changes.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  3 ++-
 drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++-------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2d5a65d..6f45bf0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2388,7 +2388,8 @@ void i915_gem_object_ggtt_unpin(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
 struct i915_hw_context *i915_gem_create_context(struct drm_device *dev,
-		struct drm_i915_file_private *file_priv, bool create_vm);
+		struct drm_i915_file_private *file_priv,
+		bool create_vm, bool create_obj);
 void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
 int i915_gem_context_enable(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index b2abe9a..2d47532 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -221,7 +221,8 @@ create_vm_for_ctx(struct drm_device *dev, struct i915_hw_context *ctx)
 
 static struct i915_hw_context *
 __create_hw_context(struct drm_device *dev,
-		  struct drm_i915_file_private *file_priv)
+		struct drm_i915_file_private *file_priv,
+		bool create_obj)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *ctx;
@@ -235,7 +236,7 @@ __create_hw_context(struct drm_device *dev,
 	kref_init(&ctx->ref);
 	list_add_tail(&ctx->link, &dev_priv->context_list);
 
-	if (dev_priv->hw_context_size) {
+	if (dev_priv->hw_context_size && create_obj) {
 		ctx->engine[RCS].obj = ctx_obj = i915_gem_alloc_object(dev,
 							dev_priv->hw_context_size);
 		if (ctx_obj == NULL) {
@@ -291,7 +292,7 @@ err_out:
 struct i915_hw_context *
 i915_gem_create_context(struct drm_device *dev,
 			struct drm_i915_file_private *file_priv,
-			bool create_vm)
+			bool create_vm, bool create_obj)
 {
 	const bool is_global_default_ctx = file_priv == NULL;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -301,7 +302,7 @@ i915_gem_create_context(struct drm_device *dev,
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 
-	ctx = __create_hw_context(dev, file_priv);
+	ctx = __create_hw_context(dev, file_priv, create_obj);
 	if (IS_ERR(ctx))
 		return ctx;
 
@@ -415,7 +416,7 @@ int i915_gem_context_init(struct drm_device *dev)
 		}
 	}
 
-	ctx = i915_gem_create_context(dev, NULL, USES_PPGTT(dev));
+	ctx = i915_gem_create_context(dev, NULL, USES_PPGTT(dev), true);
 	if (IS_ERR(ctx)) {
 		DRM_ERROR("Failed to create default global context (error %ld)\n",
 			  PTR_ERR(ctx));
@@ -519,7 +520,7 @@ int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
 
 	mutex_lock(&dev->struct_mutex);
 	file_priv->private_default_ctx =
-		i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev));
+		i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev), true);
 	mutex_unlock(&dev->struct_mutex);
 
 	if (IS_ERR(file_priv->private_default_ctx)) {
@@ -775,7 +776,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
-	ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev));
+	ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev), true);
 	mutex_unlock(&dev->struct_mutex);
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 17/50] drm/i915: Extract context backing object allocation
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (15 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 16/50] drm/i915: Option to skip backing object allocation during context creation oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 18/50] drm/i915/bdw: Macro and module parameter for LRCs (Logical Ring Contexts) oscar.mateo
                   ` (34 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

We are going to use it later to allocate our own objects.

No functional changes.
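
A minimal sketch of the intended later usage (assumed here; the actual
LRC caller arrives in subsequent patches):

        obj = intel_allocate_ctx_backing_obj(dev, context_size);
        if (IS_ERR(obj))
                return PTR_ERR(obj);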

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  2 +-
 drivers/gpu/drm/i915/i915_drv.h         |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.c | 54 +++++++++++++++++++++------------
 3 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 65a740e..204b432 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1698,7 +1698,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	}
 
 	list_for_each_entry(ctx, &dev_priv->context_list, link) {
-		if (ctx->obj == NULL)
+		if (ctx->engine[RCS].obj == NULL)
 			continue;
 
 		seq_puts(m, "HW context ");
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6f45bf0..86730d5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2387,6 +2387,8 @@ void i915_gem_object_ggtt_unpin(struct drm_i915_gem_object *obj);
 #define ctx_to_ppgtt(ctx) container_of((ctx)->vm, struct i915_hw_ppgtt, base)
 int __must_check i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
+struct drm_i915_gem_object *intel_allocate_ctx_backing_obj(struct drm_device *dev,
+		size_t size);
 struct i915_hw_context *i915_gem_create_context(struct drm_device *dev,
 		struct drm_i915_file_private *file_priv,
 		bool create_vm, bool create_obj);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2d47532..708111d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -219,6 +219,36 @@ create_vm_for_ctx(struct drm_device *dev, struct i915_hw_context *ctx)
 	return ppgtt;
 }
 
+struct drm_i915_gem_object *
+intel_allocate_ctx_backing_obj(struct drm_device *dev, size_t size)
+{
+	struct drm_i915_gem_object *obj;
+	int ret;
+
+	obj = i915_gem_alloc_object(dev, size);
+	if (obj == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Try to make the context utilize L3 as well as LLC.
+	 *
+	 * On VLV we don't have L3 controls in the PTEs so we
+	 * shouldn't touch the cache level, especially as that
+	 * would make the object snooped which might have a
+	 * negative performance impact.
+	 */
+	if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev)) {
+		ret = i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC);
+		/* Failure shouldn't ever happen this early */
+		if (WARN_ON(ret)) {
+			drm_gem_object_unreference(&obj->base);
+			return ERR_PTR(ret);
+		}
+	}
+
+	return obj;
+}
+
 static struct i915_hw_context *
 __create_hw_context(struct drm_device *dev,
 		struct drm_i915_file_private *file_priv,
@@ -237,28 +267,14 @@ __create_hw_context(struct drm_device *dev,
 	list_add_tail(&ctx->link, &dev_priv->context_list);
 
 	if (dev_priv->hw_context_size && create_obj) {
-		ctx->engine[RCS].obj = ctx_obj = i915_gem_alloc_object(dev,
-							dev_priv->hw_context_size);
-		if (ctx_obj == NULL) {
-			ret = -ENOMEM;
+		ctx_obj = intel_allocate_ctx_backing_obj(dev,
+				dev_priv->hw_context_size);
+		if (IS_ERR_OR_NULL(ctx_obj)) {
+			ret = PTR_ERR(ctx_obj);
 			goto err_out;
 		}
 
-		/*
-		 * Try to make the context utilize L3 as well as LLC.
-		 *
-		 * On VLV we don't have L3 controls in the PTEs so we
-		 * shouldn't touch the cache level, especially as that
-		 * would make the object snooped which might have a
-		 * negative performance impact.
-		 */
-		if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev)) {
-			ret = i915_gem_object_set_cache_level(ctx_obj,
-							      I915_CACHE_L3_LLC);
-			/* Failure shouldn't ever happen this early */
-			if (WARN_ON(ret))
-				goto err_out;
-		}
+		ctx->engine[RCS].obj = ctx_obj;
 	}
 
 	/* Default context will never have a file_priv */
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 18/50] drm/i915/bdw: Macro and module parameter for LRCs (Logical Ring Contexts)
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (16 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 17/50] drm/i915: Extract context backing object allocation oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 19/50] drm/i915/bdw: New file for Logical Ring Contexts and Execlists oscar.mateo
                   ` (33 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

GEN8 brings an expansion of the HW contexts: "Logical Ring Contexts".
These expanded contexts enable a number of new abilities, especially
"Execlists".

In dev_priv, lrc_enabled will reflect whether we have actually managed
to initialize these new contexts properly. This helps the transition in
the code but is a candidate for removal at some point.

The macro is defined to off until everything needed for LRCs to work is
in place.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Rename "advanced contexts" to the more correct "logical ring
contexts"

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>

v3: Add a module parameter to enable execlists. Execlists are relatively
new, and so it'd be wise to be able to switch back to ring submission
to debug subtle problems that will inevitably arise.
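
A sketch of how the macro and the module parameter are meant to combine
(the actual check arrives in a later patch; this is only illustrative):

        /* Execlists need both HW support and PPGTT */
        if (i915.enable_execlists && HAS_LOGICAL_RING_CONTEXTS(dev) &&
            USES_PPGTT(dev))
                /* ...use logical ring contexts... */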

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h    | 4 ++++
 drivers/gpu/drm/i915/i915_params.c | 6 ++++++
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 86730d5..bd93dc2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1506,6 +1506,8 @@ struct drm_i915_private {
 
 	uint32_t hw_context_size;
 	struct list_head context_list;
+	/* Logical Ring Contexts */
+	bool lrc_enabled;
 
 	u32 fdi_rx_config;
 
@@ -1926,6 +1928,7 @@ struct drm_i915_cmd_table {
 #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
 
 #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
+#define HAS_LOGICAL_RING_CONTEXTS(dev)	0
 #define HAS_ALIASING_PPGTT(dev)	(INTEL_INFO(dev)->gen >= 6 && \
 				 (!IS_VALLEYVIEW(dev) || IS_CHERRYVIEW(dev)))
 #define HAS_PPGTT(dev)		(INTEL_INFO(dev)->gen >= 7 \
@@ -2013,6 +2016,7 @@ struct i915_params {
 	int enable_rc6;
 	int enable_fbc;
 	int enable_ppgtt;
+	int enable_execlists;
 	int enable_psr;
 	unsigned int preliminary_hw_support;
 	int disable_power_well;
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index d05a2af..96eb3ef 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -37,6 +37,7 @@ struct i915_params i915 __read_mostly = {
 	.enable_fbc = -1,
 	.enable_hangcheck = true,
 	.enable_ppgtt = -1,
+	.enable_execlists = -1,
 	.enable_psr = 0,
 	.preliminary_hw_support = IS_ENABLED(CONFIG_DRM_I915_PRELIMINARY_HW_SUPPORT),
 	.disable_power_well = 1,
@@ -116,6 +117,11 @@ MODULE_PARM_DESC(enable_ppgtt,
 	"Override PPGTT usage. "
 	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full)");
 
+module_param_named(enable_execlists, i915.enable_execlists, int, 0400);
+MODULE_PARM_DESC(enable_execlists,
+	"Override execlists usage. "
+	"(-1=auto, 0=disabled [default], 1=enabled)");
+
 module_param_named(enable_psr, i915.enable_psr, int, 0600);
 MODULE_PARM_DESC(enable_psr, "Enable PSR (default: false)");
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 19/50] drm/i915/bdw: New file for Logical Ring Contexts and Execlists
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (17 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 18/50] drm/i915/bdw: Macro and module parameter for LRCs (Logical Ring Contexts) oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 20/50] drm/i915/bdw: Rework init code for Logical Ring Contexts oscar.mateo
                   ` (32 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Some legacy HW context and ringbuffer code assumptions don't make sense
for this new submission method, so we will place this stuff in a separate
file.

Note for reviewers: I've carefully considered the best name for this file
and this was my best option (other possibilities were intel_lr_context.c
or intel_execlist.c), but I am open to some bikeshedding on this matter.
As for splitting execlists and logical ring contexts into separate files,
it is probably not worth it for the moment.

v2: Change to intel_lrc.c

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/Makefile    |  1 +
 drivers/gpu/drm/i915/intel_lrc.c | 42 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/intel_lrc.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b1445b7..428cff9 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -29,6 +29,7 @@ i915-y += i915_cmd_parser.o \
 	  i915_gpu_error.o \
 	  i915_irq.o \
 	  i915_trace_points.o \
+	  intel_lrc.o \
 	  intel_ringbuffer.o \
 	  intel_uncore.o
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
new file mode 100644
index 0000000..49bb6fc
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -0,0 +1,42 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Ben Widawsky <ben@bwidawsk.net>
+ *    Michel Thierry <michel.thierry@intel.com>
+ *    Thomas Daniel <thomas.daniel@intel.com>
+ *    Oscar Mateo <oscar.mateo@intel.com>
+ *
+ */
+
+/*
+ * GEN8 brings an expansion of the HW contexts: "Logical Ring Contexts".
+ * These expanded contexts enable a number of new abilities, especially
+ * "Execlists" (also implemented in this file).
+ *
+ * Execlists are the new method by which, on gen8+ hardware, workloads are
+ * submitted for execution (as opposed to the legacy, ringbuffer-based, method).
+ */
+
+#include <drm/drmP.h>
+#include <drm/i915_drm.h>
+#include "i915_drv.h"
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 20/50] drm/i915/bdw: Rework init code for Logical Ring Contexts
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (18 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 19/50] drm/i915/bdw: New file for Logical Ring Contexts and Execlists oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 21/50] drm/i915/bdw: A bit more advanced context init/fini oscar.mateo
                   ` (31 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

This modifies the init code to try to start logical ring contexts when
possible, and fall back to legacy ringbuffers when not.

Most importantly, it makes things easier if we do the context creation
before ringbuffer initialization. Upcoming patches will make it clearer
why that is. For the initial enabling of execlists, the ringbuffer code
will be reused a decent amount. As such, this code will have to change,
but it helps the enabling effort to be able to run through a good part
of the context init and still have a system that boots.

Finally, for the bikeshedders out there: I have tried merging this with
the legacy hw context init functionality, i.e. one init function with
this logic placed in the context creation. The resulting code was much
uglier, for no real gain.
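
In short, the boot-time decision implemented below is the following
(note that gen8_gem_context_init() is still a stub returning -ENOSYS,
so the legacy path is always taken for now):

        ret = -1;
        if (enable_execlists(dev))
                ret = gen8_gem_context_init(dev);
        if (ret)
                ret = i915_gem_context_init(dev);       /* legacy fallback */
        else
                dev_priv->lrc_enabled = true;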

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Rebased on top of the Full PPGTT series.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>

v3: Factor out a enable_execlists() function so it's clear what the
condition is. Use module parameter to enable.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  |  3 +++
 drivers/gpu/drm/i915/i915_gem.c  | 24 +++++++++++++++++++-----
 drivers/gpu/drm/i915/intel_lrc.c |  5 +++++
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bd93dc2..41d3f95 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2425,6 +2425,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file);
 
+/* intel_lrc.c */
+int gen8_gem_context_init(struct drm_device *dev);
+
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
 					  struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4a22560..f8acf3d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4490,10 +4490,18 @@ i915_gem_init_hw(struct drm_device *dev)
 	return ret;
 }
 
+static bool enable_execlists(struct drm_device *dev)
+{
+	if (!i915.enable_execlists)
+		return false;
+
+	return HAS_LOGICAL_RING_CONTEXTS(dev) && USES_PPGTT(dev);
+}
+
 int i915_gem_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret;
+	int ret = -1;
 
 	mutex_lock(&dev->struct_mutex);
 
@@ -4509,11 +4517,17 @@ int i915_gem_init(struct drm_device *dev)
 
 	intel_init_rings_early(dev);
 
-	ret = i915_gem_context_init(dev);
+	if (enable_execlists(dev))
+		ret = gen8_gem_context_init(dev);
+
 	if (ret) {
-		mutex_unlock(&dev->struct_mutex);
-		return ret;
-	}
+		ret = i915_gem_context_init(dev);
+		if (ret) {
+			mutex_unlock(&dev->struct_mutex);
+			return ret;
+		}
+	} else
+		dev_priv->lrc_enabled = true;
 
 	ret = i915_gem_init_hw(dev);
 	if (ret == -EIO) {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 49bb6fc..3a93e99 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -40,3 +40,8 @@
 #include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
+
+int gen8_gem_context_init(struct drm_device *dev)
+{
+	return -ENOSYS;
+}
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 21/50] drm/i915/bdw: A bit more advanced context init/fini
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (19 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 20/50] drm/i915/bdw: Rework init code for Logical Ring Contexts oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 22/50] drm/i915/bdw: Allocate ringbuffer backing objects for default global LRC oscar.mateo
                   ` (30 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

There are a few big differences between context init and fini compared
to the previous implementation of hardware contexts. One of them is
demonstrated in this patch: we must allocate a ctx backing object for
each engine.

Regarding the context size, reading the register to calculate the sizes
could work, I think; however, the docs are very clear about the actual
context sizes on GEN8, so just hardcode those and use them.
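
For example, with 4KiB pages this means 20 * 4KiB = 80KiB for the render
context image and 2 * 4KiB = 8KiB for each of the other engines.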

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Rebased on top of the Full PPGTT series. It is important to notice
that at this point we have one global default context per engine, all
of them using the aliasing PPGTT (as opposed to the single global
default context we have with legacy HW contexts).

v3:
- Go back to one single global default context, this time with multiple
  backing objects inside.
- Use different context sizes for non-render engines, as suggested by
  Damien (still hardcoded, since the information about the context size
  registers in the BSpec is, well, *lacking*).
- Render ctx size is 20 (or 19) pages, but not 21 (caught by Damien).
- Move default context backing object creation to intel_init_ring (so
  that we don't waste memory in rings that might not get initialized).

Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  4 +-
 drivers/gpu/drm/i915/i915_drv.h         |  5 +-
 drivers/gpu/drm/i915/i915_gem_context.c | 37 +++++++++-----
 drivers/gpu/drm/i915/intel_lrc.c        | 85 ++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 10 ++++
 5 files changed, 124 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 7bdb9be..e0f56f5 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1385,7 +1385,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 cleanup_gem:
 	mutex_lock(&dev->struct_mutex);
 	i915_gem_cleanup_ring(dev);
-	i915_gem_context_fini(dev);
+	i915_gem_context_fini(dev, dev_priv->lrc_enabled);
 	mutex_unlock(&dev->struct_mutex);
 	WARN_ON(dev_priv->mm.aliasing_ppgtt);
 	drm_mm_takedown(&dev_priv->gtt.base.mm);
@@ -1841,7 +1841,7 @@ int i915_driver_unload(struct drm_device *dev)
 		mutex_lock(&dev->struct_mutex);
 		i915_gem_free_all_phys_object(dev);
 		i915_gem_cleanup_ring(dev);
-		i915_gem_context_fini(dev);
+		i915_gem_context_fini(dev, dev_priv->lrc_enabled);
 		WARN_ON(dev_priv->mm.aliasing_ppgtt);
 		mutex_unlock(&dev->struct_mutex);
 		i915_gem_cleanup_stolen(dev);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 41d3f95..1cc1042 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2390,7 +2390,7 @@ void i915_gem_object_ggtt_unpin(struct drm_i915_gem_object *obj);
 /* i915_gem_context.c */
 #define ctx_to_ppgtt(ctx) container_of((ctx)->vm, struct i915_hw_ppgtt, base)
 int __must_check i915_gem_context_init(struct drm_device *dev);
-void i915_gem_context_fini(struct drm_device *dev);
+void i915_gem_context_fini(struct drm_device *dev, bool is_lrc);
 struct drm_i915_gem_object *intel_allocate_ctx_backing_obj(struct drm_device *dev,
 		size_t size);
 struct i915_hw_context *i915_gem_create_context(struct drm_device *dev,
@@ -2427,6 +2427,9 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* intel_lrc.c */
 int gen8_gem_context_init(struct drm_device *dev);
+int gen8_create_lr_context(struct i915_hw_context *ctx,
+			   struct intel_engine *ring,
+			   struct drm_i915_file_private *file_priv);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 708111d..16fc780 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -181,16 +181,22 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	struct i915_hw_context *ctx = container_of(ctx_ref,
 						   typeof(*ctx), ref);
 	struct i915_hw_ppgtt *ppgtt = NULL;
-	struct drm_i915_gem_object *ctx_obj = ctx->engine[RCS].obj;
+	int i;
 
-	if (ctx_obj) {
-		/* We refcount even the aliasing PPGTT to keep the code symmetric */
-		if (USES_PPGTT(ctx_obj->base.dev))
-			ppgtt = ctx_to_ppgtt(ctx);
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		struct drm_i915_gem_object *ctx_obj = ctx->engine[i].obj;
+		if (ctx_obj) {
+			if (i == RCS) {
+				/* We refcount even the aliasing PPGTT to keep the
+				 * code symmetric */
+				if (USES_PPGTT(ctx_obj->base.dev))
+					ppgtt = ctx_to_ppgtt(ctx);
+			}
 
-		/* XXX: Free up the object before tearing down the address space, in
-		 * case we're bound in the PPGTT */
-		drm_gem_object_unreference(&ctx_obj->base);
+			/* XXX: Free up the object before tearing down the address
+			 * space, in case we're bound in the PPGTT */
+			drm_gem_object_unreference(&ctx_obj->base);
+		}
 	}
 
 	if (ppgtt)
@@ -447,15 +453,15 @@ int i915_gem_context_init(struct drm_device *dev)
 	return 0;
 }
 
-void i915_gem_context_fini(struct drm_device *dev)
+void i915_gem_context_fini(struct drm_device *dev, bool is_lrc)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *dctx = dev_priv->ring[RCS].default_context;
 	struct drm_i915_gem_object *dctx_obj = dctx->engine[RCS].obj;
 	struct intel_engine *ring;
-	int unused;
+	int ring_id;
 
-	if (dctx_obj) {
+	if (dctx_obj && !is_lrc) {
 		/* The only known way to stop the gpu from accessing the hw context is
 		 * to reset it. Do this as the very last operation to avoid confusing
 		 * other code, leading to spurious errors. */
@@ -477,15 +483,20 @@ void i915_gem_context_fini(struct drm_device *dev)
 		}
 	}
 
-	for_each_ring(ring, dev_priv, unused) {
+	for_each_ring(ring, dev_priv, ring_id) {
 		if (ring->last_context)
 			i915_gem_context_unreference(ring->last_context);
 
 		ring->default_context = NULL;
 		ring->last_context = NULL;
+
+		dctx_obj = dctx->engine[ring_id].obj;
+		if (dctx_obj) {
+			BUG_ON(ring_id != RCS && !is_lrc);
+			i915_gem_object_ggtt_unpin(dctx_obj);
+		}
 	}
 
-	i915_gem_object_ggtt_unpin(dctx_obj);
 	i915_gem_context_unreference(dctx);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3a93e99..875d7b9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -41,7 +41,90 @@
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 
+#define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
+#define GEN8_LR_CONTEXT_OTHER_SIZE (2 * PAGE_SIZE)
+
+#define GEN8_LR_CONTEXT_ALIGN 4096
+
+static uint32_t get_lr_context_size(struct intel_engine *ring)
+{
+	int ret = 0;
+
+	BUG_ON(INTEL_INFO(ring->dev)->gen != 8);
+
+	switch (ring->id) {
+	case RCS:
+		ret = GEN8_LR_CONTEXT_RENDER_SIZE;
+		break;
+	case VCS:
+	case BCS:
+	case VECS:
+	case VCS2:
+		ret = GEN8_LR_CONTEXT_OTHER_SIZE;
+		break;
+	}
+
+	return ret;
+}
+
+int gen8_create_lr_context(struct i915_hw_context *ctx,
+			   struct intel_engine *ring,
+			   struct drm_i915_file_private *file_priv)
+{
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_gem_object *ctx_obj;
+	uint32_t context_size;
+	int ret;
+
+	if (ctx->engine[ring->id].obj)
+		return 0;
+
+	context_size = round_up(get_lr_context_size(ring), 4096);
+
+	ctx_obj = intel_allocate_ctx_backing_obj(dev, context_size);
+	if (IS_ERR_OR_NULL(ctx_obj)) {
+		ret = PTR_ERR(ctx_obj);
+		DRM_DEBUG_DRIVER("Alloc LRC backing obj failed: %d\n", ret);
+		return ret;
+	}
+
+	ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN, 0);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Pin LRC backing obj failed: %d\n", ret);
+		drm_gem_object_unreference(&ctx_obj->base);
+		return ret;
+	}
+
+	ctx->engine[ring->id].obj = ctx_obj;
+
+	return 0;
+}
+
 int gen8_gem_context_init(struct drm_device *dev)
 {
-	return -ENOSYS;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct i915_hw_context *ctx;
+	struct intel_engine *ring;
+	int ring_id;
+	int ret;
+
+	/* Any value != 0 works here */
+	dev_priv->hw_context_size = GEN8_LR_CONTEXT_RENDER_SIZE;
+
+	ctx = i915_gem_create_context(dev, NULL, true, false);
+	if (IS_ERR_OR_NULL(ctx)) {
+		ret = PTR_ERR(ctx);
+		DRM_ERROR("Init LR context failed: %d\n", ret);
+		goto err_out;
+	}
+
+	for_each_ring(ring, dev_priv, ring_id)
+		ring->default_context = ctx;
+
+	DRM_DEBUG_DRIVER("LR context support initialized\n");
+	return 0;
+
+err_out:
+	i915_gem_context_fini(dev, true);
+	return ret;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6225123..e7f61d9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1443,6 +1443,7 @@ err_unref:
 static int intel_init_ring(struct drm_device *dev,
 			   struct intel_engine *ring)
 {
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	int ret;
 
@@ -1453,6 +1454,15 @@ static int intel_init_ring(struct drm_device *dev,
 
 	init_waitqueue_head(&ring->irq_queue);
 
+	if (dev_priv->lrc_enabled) {
+		ret = gen8_create_lr_context(ring->default_context, ring, NULL);
+		if (ret) {
+			DRM_ERROR("Create LR context for %s failed: %d\n",
+					ring->name, ret);
+			return ret;
+		}
+	}
+
 	if (I915_NEED_GFX_HWS(dev)) {
 		ret = init_status_page(ring);
 		if (ret)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 22/50] drm/i915/bdw: Allocate ringbuffer backing objects for default global LRC
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (20 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 21/50] drm/i915/bdw: A bit more advanced context init/fini oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 23/50] drm/i915/bdw: Allocate ringbuffer for user-created LRCs oscar.mateo
                   ` (29 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

The default context holds pointers to every engine's default ringbuffer
and makes its own allocation of the corresponding backing objects. During
ringbuffer creation we might try to allocate them again, but that second
attempt silently does nothing (and the same goes for deallocation).
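
A minimal sketch of why the second allocation is harmless, assuming the
ringbuffer helper bails out early when the backing object already exists
(the helper itself is not part of this patch):

        /* inside intel_allocate_ring_buffer(), presumably: */
        if (ringbuf->obj)
                return 0;       /* already allocated: nothing to do */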

This patch replaces a similar patch by Ben Widawsky from the earlier
series, called "drm/i915/bdw: Allocate ringbuffer for LR contexts".

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem_context.c |  5 +++++
 drivers/gpu/drm/i915/intel_lrc.c        | 19 +++++++++++++++++++
 3 files changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1cc1042..4d58167 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -597,6 +597,7 @@ struct i915_hw_context {
 	struct intel_engine *last_ring;
 	struct {
 		struct drm_i915_gem_object *obj;
+		struct intel_ringbuffer *ringbuf;
 	} engine[I915_NUM_RINGS];
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_address_space *vm;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 16fc780..76314e3 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -185,6 +185,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 
 	for (i = 0; i < I915_NUM_RINGS; i++) {
 		struct drm_i915_gem_object *ctx_obj = ctx->engine[i].obj;
+		struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
+
 		if (ctx_obj) {
 			if (i == RCS) {
 				/* We refcount even the aliasing PPGTT to keep the
@@ -197,6 +199,9 @@ void i915_gem_context_free(struct kref *ctx_ref)
 			 * space, in case we're bound in the PPGTT */
 			drm_gem_object_unreference(&ctx_obj->base);
 		}
+
+		if (ringbuf)
+			intel_destroy_ring_buffer(ringbuf);
 	}
 
 	if (ppgtt)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 875d7b9..64d40e4 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -95,6 +95,25 @@ int gen8_create_lr_context(struct i915_hw_context *ctx,
 		return ret;
 	}
 
+	if (!file_priv) {
+		ring->default_ringbuf.size = 32 * PAGE_SIZE;
+
+		/* TODO: For now we put this in the mappable region so that we can reuse
+		 * the existing ringbuffer code which ioremaps it. When we start
+		 * creating many contexts, this will no longer work and we must switch
+		 * to a kmapish interface.
+		 */
+		ret = intel_allocate_ring_buffer(dev, &ring->default_ringbuf);
+		if (ret) {
+			DRM_DEBUG_DRIVER("Failed to allocate ringbuffer %s: %d\n",
+					ring->name, ret);
+			i915_gem_object_ggtt_unpin(ctx_obj);
+			drm_gem_object_unreference(&ctx_obj->base);
+			return ret;
+		}
+		ctx->engine[ring->id].ringbuf = &ring->default_ringbuf;
+	}
+
 	ctx->engine[ring->id].obj = ctx_obj;
 
 	return 0;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 23/50] drm/i915/bdw: Allocate ringbuffer for user-created LRCs
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (21 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 22/50] drm/i915/bdw: Allocate ringbuffer backing objects for default global LRC oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 24/50] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
                   ` (28 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

As we said earlier, logical ring contexts created by the user have their
own ringbuffer: not only the backing pages, but the whole management struct.

Since this is now ready, we use the context ringbuffer or the engine's
default ringbuffer depending on whether LRCs are enabled or not.

This patch replaces a similar patch from the earlier series, called:
"drm/i915/bdw: Prepare for user-created LR contexts"

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c |  5 +++-
 drivers/gpu/drm/i915/intel_lrc.c        | 45 +++++++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.c |  8 ++++--
 3 files changed, 42 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 76314e3..e4e616d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -200,8 +200,11 @@ void i915_gem_context_free(struct kref *ctx_ref)
 			drm_gem_object_unreference(&ctx_obj->base);
 		}
 
-		if (ringbuf)
+		if (ringbuf) {
 			intel_destroy_ring_buffer(ringbuf);
+			if (ctx->file_priv)
+				kfree(ringbuf);
+		}
 	}
 
 	if (ppgtt)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 64d40e4..0f2c5cb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -74,6 +74,7 @@ int gen8_create_lr_context(struct i915_hw_context *ctx,
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_gem_object *ctx_obj;
 	uint32_t context_size;
+	struct intel_ringbuffer *ringbuf;
 	int ret;
 
 	if (ctx->engine[ring->id].obj)
@@ -95,25 +96,43 @@ int gen8_create_lr_context(struct i915_hw_context *ctx,
 		return ret;
 	}
 
-	if (!file_priv) {
-		ring->default_ringbuf.size = 32 * PAGE_SIZE;
-
-		/* TODO: For now we put this in the mappable region so that we can reuse
-		 * the existing ringbuffer code which ioremaps it. When we start
-		 * creating many contexts, this will no longer work and we must switch
-		 * to a kmapish interface.
-		 */
-		ret = intel_allocate_ring_buffer(dev, &ring->default_ringbuf);
-		if (ret) {
-			DRM_DEBUG_DRIVER("Failed to allocate ringbuffer %s: %d\n",
-					ring->name, ret);
+	if (file_priv) {
+		ringbuf = kzalloc(sizeof(*ringbuf), GFP_KERNEL);
+		if (!ringbuf) {
+			ret = -ENOMEM;
+			DRM_DEBUG_DRIVER("Failed to allocate ringbuffer %s\n",
+					ring->name);
 			i915_gem_object_ggtt_unpin(ctx_obj);
 			drm_gem_object_unreference(&ctx_obj->base);
 			return ret;
 		}
-		ctx->engine[ring->id].ringbuf = &ring->default_ringbuf;
+	} else
+		ringbuf = &ring->default_ringbuf;
+
+	ringbuf->size = 32 * PAGE_SIZE;
+	ringbuf->effective_size = ringbuf->size;
+	ringbuf->head = 0;
+	ringbuf->tail = 0;
+	ringbuf->space = ringbuf->size;
+	ringbuf->last_retired_head = -1;
+
+	/* TODO: For now we put this in the mappable region so that we can reuse
+	 * the existing ringbuffer code which ioremaps it. When we start
+	 * creating many contexts, this will no longer work and we must switch
+	 * to a kmapish interface.
+	 */
+	ret = intel_allocate_ring_buffer(dev, ringbuf);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Failed to allocate ringbuffer obj %s: %d\n",
+				ring->name, ret);
+		if (file_priv)
+			kfree(ringbuf);
+		i915_gem_object_ggtt_unpin(ctx_obj);
+		drm_gem_object_unreference(&ctx_obj->base);
+		return ret;
 	}
 
+	ctx->engine[ring->id].ringbuf = ringbuf;
 	ctx->engine[ring->id].obj = ctx_obj;
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e7f61d9..8b0260d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1709,8 +1709,12 @@ static int __intel_ring_prepare(struct intel_engine *ring,
 struct intel_ringbuffer *
 intel_ringbuffer_get(struct intel_engine *ring, struct i915_hw_context *ctx)
 {
-	/* For the time being, the only ringbuffer is in the engine */
-	return &ring->default_ringbuf;
+	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+
+	if (dev_priv->lrc_enabled)
+		return ctx->engine[ring->id].ringbuf;
+	else
+		return &ring->default_ringbuf;
 }
 
 struct intel_ringbuffer *
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 24/50] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (22 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 23/50] drm/i915/bdw: Allocate ringbuffer for user-created LRCs oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 13:36   ` Damien Lespiau
  2014-05-12 17:00   ` [PATCH v2 " oscar.mateo
  2014-05-09 12:08 ` [PATCH 25/50] drm/i915/bdw: Deferred creation of user-created LRCs oscar.mateo
                   ` (27 subsequent siblings)
  51 siblings, 2 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

For the most part, logical ring context objects are similar to hardware
contexts in that the backing object is meant to be opaque. There are
some exceptions where we need to poke certain offsets of the object for
initialization, updating the tail pointer or updating the PDPs.

For our basic execlist implementation we'll only need our PPGTT PDs
and ringbuffer addresses in order to set up the context. With previous
patches, we have both, so start prepping the context to be loaded.

Before running a context for the first time you must populate some
fields in the context object. These fields begin at LRCA + 1 PAGE, i.e.
page 1 (in 0-based counting) of the context image. These same fields
will be read and written to as contexts are saved and restored once
the system is up and running.

Many of these fields are completely reused from previous global
registers: the ringbuffer head/tail/control, a context control that
matches some previous MI_SET_CONTEXT flags, and the page directories.
There are other fields which we don't touch yet but may want to in
the future.
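
Since the context image is essentially MI_LOAD_REGISTER_IMM commands
followed by <reg, value> pairs, a hypothetical helper (not introduced by
this patch) could make the population code less error-prone:

        /* Hypothetical convenience macro: write one <reg, value> pair */
        #define CTX_REG_WRITE(reg_state, pos, reg, val) do { \
                (reg_state)[(pos)] = (reg); \
                (reg_state)[(pos) + 1] = (val); \
        } while (0)

        CTX_REG_WRITE(reg_state, CTX_RING_TAIL, RING_TAIL(ring->mmio_base), 0);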

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11)
for other engines.

Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>

v3: Several rebases and general changes to the code.

v4: Squash with "Extract LR context object populating"
Also, Damien's review comments:
- Set the Force Posted bit on the LRI header, as the BSpec suggest we do.
- Prevent warning when compiling a 32-bits kernel without HIGHMEM64.
- Add a clarifying comment to the context population code.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h  |   1 +
 drivers/gpu/drm/i915/intel_lrc.c | 159 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 160 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 03ffc57..33d007d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -269,6 +269,7 @@
  *   address/value pairs. Don't overdue it, though, x <= 2^4 must hold!
  */
 #define MI_LOAD_REGISTER_IMM(x)	MI_INSTR(0x22, 2*(x)-1)
+#define   MI_LRI_FORCE_POSTED		(1<<12)
 #define MI_STORE_REGISTER_MEM(x) MI_INSTR(0x24, 2*(x)-1)
 #define MI_STORE_REGISTER_MEM_GEN8(x) MI_INSTR(0x24, 3*(x)-1)
 #define   MI_SRM_LRM_GLOBAL_GTT		(1<<22)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0f2c5cb..5a85496 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -46,6 +46,152 @@
 
 #define GEN8_LR_CONTEXT_ALIGN 4096
 
+#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
+#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
+
+#define CTX_LRI_HEADER_0		0x01
+#define CTX_CONTEXT_CONTROL		0x02
+#define CTX_RING_HEAD			0x04
+#define CTX_RING_TAIL			0x06
+#define CTX_RING_BUFFER_START		0x08
+#define CTX_RING_BUFFER_CONTROL	0x0a
+#define CTX_BB_HEAD_U			0x0c
+#define CTX_BB_HEAD_L			0x0e
+#define CTX_BB_STATE			0x10
+#define CTX_SECOND_BB_HEAD_U		0x12
+#define CTX_SECOND_BB_HEAD_L		0x14
+#define CTX_SECOND_BB_STATE		0x16
+#define CTX_BB_PER_CTX_PTR		0x18
+#define CTX_RCS_INDIRECT_CTX		0x1a
+#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
+#define CTX_LRI_HEADER_1		0x21
+#define CTX_CTX_TIMESTAMP		0x22
+#define CTX_PDP3_UDW			0x24
+#define CTX_PDP3_LDW			0x26
+#define CTX_PDP2_UDW			0x28
+#define CTX_PDP2_LDW			0x2a
+#define CTX_PDP1_UDW			0x2c
+#define CTX_PDP1_LDW			0x2e
+#define CTX_PDP0_UDW			0x30
+#define CTX_PDP0_LDW			0x32
+#define CTX_LRI_HEADER_2		0x41
+#define CTX_R_PWR_CLK_STATE		0x42
+#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
+
+static int
+intel_populate_lr_context(struct i915_hw_context *ctx,
+			  struct intel_engine *ring)
+{
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].obj;
+	struct drm_i915_gem_object *ring_obj = ctx->engine[ring->id].ringbuf->obj;
+	struct i915_hw_ppgtt *ppgtt;
+	struct page *page;
+	uint32_t *reg_state;
+	int ret;
+
+	ppgtt = ctx_to_ppgtt(ctx);
+
+	ret = i915_gem_object_set_to_cpu_domain(ctx_obj, true);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Could not set to CPU domain\n");
+		return ret;
+	}
+
+	ret = i915_gem_object_get_pages(ctx_obj);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Could not get object pages\n");
+		return ret;
+	}
+
+	i915_gem_object_pin_pages(ctx_obj);
+
+	/* The second page of the context object contains some fields which must
+	 * be set up prior to the first execution. */
+	page = i915_gem_object_get_page(ctx_obj, 1);
+	reg_state = kmap_atomic(page);
+
+	/* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
+	 * commands followed by (reg, value) pairs. The values we are setting here are
+	 * only for the first context restore: on a subsequent save, the GPU will
+	 * recreate this batchbuffer with new values (including all the missing
+	 * MI_LOAD_REGISTER_IMM commands that we are not initializing here). */
+	if (ring->id == RCS)
+		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
+	else
+		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
+	reg_state[CTX_LRI_HEADER_0] |= MI_LRI_FORCE_POSTED;
+	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
+	reg_state[CTX_CONTEXT_CONTROL+1] = (1<<3) | MI_RESTORE_INHIBIT;
+	reg_state[CTX_CONTEXT_CONTROL+1] |= reg_state[CTX_CONTEXT_CONTROL+1] << 16;
+	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
+	reg_state[CTX_RING_HEAD+1] = 0;
+	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
+	reg_state[CTX_RING_TAIL+1] = 0;
+	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
+	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
+	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
+	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
+	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
+	reg_state[CTX_BB_HEAD_U+1] = 0;
+	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
+	reg_state[CTX_BB_HEAD_L+1] = 0;
+	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
+	reg_state[CTX_BB_STATE+1] = (1<<5);
+	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
+	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
+	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
+	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
+	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
+	reg_state[CTX_SECOND_BB_STATE+1] = 0;
+	if (ring->id == RCS) {
+		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
+		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
+		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
+		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
+		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
+		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
+	}
+	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
+	reg_state[CTX_LRI_HEADER_1] |= MI_LRI_FORCE_POSTED;
+	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
+	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
+	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
+	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
+	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
+	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
+	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
+	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
+	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
+	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
+	reg_state[CTX_PDP3_UDW+1] = (u64)ppgtt->pd_dma_addr[3] >> 32;
+	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
+	reg_state[CTX_PDP2_UDW+1] = (u64)ppgtt->pd_dma_addr[2] >> 32;
+	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
+	reg_state[CTX_PDP1_UDW+1] = (u64)ppgtt->pd_dma_addr[1] >> 32;
+	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
+	reg_state[CTX_PDP0_UDW+1] = (u64)ppgtt->pd_dma_addr[0] >> 32;
+	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
+	if (ring->id == RCS) {
+		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
+		reg_state[CTX_LRI_HEADER_2] |= MI_LRI_FORCE_POSTED;
+		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
+		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
+#if 0
+		/* Offsets not yet defined for these */
+		reg_state[CTX_GPGPU_CSR_BASE_ADDRESS] = 0;
+		reg_state[CTX_GPGPU_CSR_BASE_ADDRESS+1] = 0;
+#endif
+	}
+
+	kunmap_atomic(reg_state);
+
+	ctx_obj->dirty = 1;
+	set_page_dirty(page);
+	i915_gem_object_unpin_pages(ctx_obj);
+
+	return 0;
+}
+
 static uint32_t get_lr_context_size(struct intel_engine *ring)
 {
 	int ret = 0;
@@ -135,6 +281,19 @@ int gen8_create_lr_context(struct i915_hw_context *ctx,
 	ctx->engine[ring->id].ringbuf = ringbuf;
 	ctx->engine[ring->id].obj = ctx_obj;
 
+	ret = intel_populate_lr_context(ctx, ring);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Failed to populate LRC: %d\n", ret);
+		ctx->engine[ring->id].ringbuf = NULL;
+		ctx->engine[ring->id].obj = NULL;
+		intel_destroy_ring_buffer(ringbuf);
+		if (file_priv)
+			kfree(ringbuf);
+		i915_gem_object_ggtt_unpin(ctx_obj);
+		drm_gem_object_unreference(&ctx_obj->base);
+		return ret;
+	}
+
 	return 0;
 }
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 25/50] drm/i915/bdw: Deferred creation of user-created LRCs
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (23 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 24/50] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 26/50] drm/i915/bdw: Allow non-default, non-render, " oscar.mateo
                   ` (26 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

The backing objects for user-created contexts (either via open fd or
create context ioctl) are actually empty until the user starts sending
execbuffers to them. We do this because, at create time, we really
don't know which engine is going to be used with the context later on.
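
For example (illustrative only): a context whose owner only ever submits
to the blitter will get a blitter-sized backing object on its first
execbuffer there, and never pays for the much larger render state.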

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  3 +++
 drivers/gpu/drm/i915/i915_gem_context.c    | 31 +++++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  5 ++++-
 drivers/gpu/drm/i915/intel_lrc.c           | 34 ++++++++++++++++++++++++++++++
 4 files changed, 61 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4d58167..7d06a66 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2431,6 +2431,9 @@ int gen8_gem_context_init(struct drm_device *dev);
 int gen8_create_lr_context(struct i915_hw_context *ctx,
 			   struct intel_engine *ring,
 			   struct drm_i915_file_private *file_priv);
+struct i915_hw_context *
+gen8_gem_validate_context(struct drm_device *dev, struct drm_file *file,
+			  struct intel_engine *ring, const u32 ctx_id);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e4e616d..d4c6863 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -181,6 +181,7 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	struct i915_hw_context *ctx = container_of(ctx_ref,
 						   typeof(*ctx), ref);
 	struct i915_hw_ppgtt *ppgtt = NULL;
+	bool ppgtt_unref = false;
 	int i;
 
 	for (i = 0; i < I915_NUM_RINGS; i++) {
@@ -188,12 +189,12 @@ void i915_gem_context_free(struct kref *ctx_ref)
 		struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
 
 		if (ctx_obj) {
-			if (i == RCS) {
-				/* We refcount even the aliasing PPGTT to keep the
-				 * code symmetric */
-				if (USES_PPGTT(ctx_obj->base.dev))
-					ppgtt = ctx_to_ppgtt(ctx);
-			}
+			struct drm_device *dev = ctx_obj->base.dev;
+			struct drm_i915_private *dev_priv = to_i915(dev);
+			ppgtt_unref = USES_PPGTT(dev);
+
+			if (dev_priv->lrc_enabled && ctx->file_priv)
+				i915_gem_object_ggtt_unpin(ctx_obj);
 
 			/* XXX: Free up the object before tearing down the address
 			 * space, in case we're bound in the PPGTT */
@@ -207,8 +208,13 @@ void i915_gem_context_free(struct kref *ctx_ref)
 		}
 	}
 
-	if (ppgtt)
-		kref_put(&ppgtt->ref, ppgtt_release);
+	if (ppgtt_unref) {
+		/* We refcount even the aliasing PPGTT to keep the
+		 * code symmetric */
+		ppgtt = ctx_to_ppgtt(ctx);
+		if (ppgtt)
+			kref_put(&ppgtt->ref, ppgtt_release);
+	}
 	list_del(&ctx->link);
 	kfree(ctx);
 }
@@ -550,12 +556,13 @@ static int context_idr_cleanup(int id, void *p, void *data)
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_private *dev_priv = to_i915(dev);
 
 	idr_init(&file_priv->context_idr);
 
 	mutex_lock(&dev->struct_mutex);
-	file_priv->private_default_ctx =
-		i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev), true);
+	file_priv->private_default_ctx = i915_gem_create_context(dev, file_priv,
+			USES_FULL_PPGTT(dev), !dev_priv->lrc_enabled);
 	mutex_unlock(&dev->struct_mutex);
 
 	if (IS_ERR(file_priv->private_default_ctx)) {
@@ -801,6 +808,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_gem_context_create *args = data;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct i915_hw_context *ctx;
 	int ret;
 
@@ -811,7 +819,8 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
-	ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev), true);
+	ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev),
+			!dev_priv->lrc_enabled);
 	mutex_unlock(&dev->struct_mutex);
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 823ad3d..f7dad8c 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1190,7 +1190,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		goto pre_mutex_err;
 	}
 
-	ctx = i915_gem_validate_context(dev, file, ring, ctx_id);
+	if (dev_priv->lrc_enabled)
+		ctx = gen8_gem_validate_context(dev, file, ring, ctx_id);
+	else
+		ctx = i915_gem_validate_context(dev, file, ring, ctx_id);
 	if (IS_ERR(ctx)) {
 		mutex_unlock(&dev->struct_mutex);
 		ret = PTR_ERR(ctx);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5a85496..a656b48 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -78,6 +78,40 @@
 #define CTX_R_PWR_CLK_STATE		0x42
 #define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
 
+struct i915_hw_context *
+gen8_gem_validate_context(struct drm_device *dev, struct drm_file *file,
+			  struct intel_engine *ring, const u32 ctx_id)
+{
+	struct i915_hw_context *ctx = NULL;
+	struct i915_ctx_hang_stats *hs;
+
+	/* There is no reason why we cannot accept non-default, non-render contexts,
+	 * other than it changes the ABI (these kind of custom contexts have not been
+	 * allowed before) */
+	if (ring->id != RCS && ctx_id != DEFAULT_CONTEXT_ID)
+		return ERR_PTR(-EINVAL);
+
+	ctx = i915_gem_context_get(file->driver_priv, ctx_id);
+	if (IS_ERR(ctx))
+		return ctx;
+
+	hs = &ctx->hang_stats;
+	if (hs->banned) {
+		DRM_DEBUG("Context %u tried to submit while banned\n", ctx_id);
+		return ERR_PTR(-EIO);
+	}
+
+	if (!ctx->engine[ring->id].obj) {
+		int ret = gen8_create_lr_context(ctx, ring, file->driver_priv);
+		if (ret) {
+			DRM_DEBUG("Could not create LRC %u\n", ctx_id);
+			return ERR_PTR(ret);
+		}
+	}
+
+	return ctx;
+}
+
 static int
 intel_populate_lr_context(struct i915_hw_context *ctx,
 			  struct intel_engine *ring)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 26/50] drm/i915/bdw: Allow non-default, non-render, user-created LRCs
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (24 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 25/50] drm/i915/bdw: Deferred creation of user-created LRCs oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-13 13:35   ` Daniel Vetter
  2014-05-09 12:08 ` [PATCH 27/50] drm/i915/bdw: Status page for LR contexts oscar.mateo
                   ` (25 subsequent siblings)
  51 siblings, 1 reply; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

This commit changes the ABI, so it is provided separately so that the
maintainer can drop it if he so wishes.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a656b48..0a944c2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -85,12 +85,6 @@ gen8_gem_validate_context(struct drm_device *dev, struct drm_file *file,
 	struct i915_hw_context *ctx = NULL;
 	struct i915_ctx_hang_stats *hs;
 
-	/* There is no reason why we cannot accept non-default, non-render contexts,
-	 * other than it changes the ABI (these kind of custom contexts have not been
-	 * allowed before) */
-	if (ring->id != RCS && ctx_id != DEFAULT_CONTEXT_ID)
-		return ERR_PTR(-EINVAL);
-
 	ctx = i915_gem_context_get(file->driver_priv, ctx_id);
 	if (IS_ERR(ctx))
 		return ctx;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 27/50] drm/i915/bdw: Status page for LR contexts
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (25 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 26/50] drm/i915/bdw: Allow non-default, non-render, " oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 28/50] drm/i915/bdw: Enable execlists in the hardware oscar.mateo
                   ` (24 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

With logical ring contexts, the status page is already included in the
context object itself, at offset 0 from the start of the object. Update
the init and cleanup functions to reflect that.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Several rebases.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8b0260d..eef7094 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1319,16 +1319,21 @@ i915_dispatch_execbuffer(struct intel_engine *ring,
 
 static void cleanup_status_page(struct intel_engine *ring)
 {
+	struct drm_i915_private *dev_priv = to_i915(ring->dev);
 	struct drm_i915_gem_object *obj;
 
 	obj = ring->status_page.obj;
 	if (obj == NULL)
 		return;
+	ring->status_page.obj = NULL;
 
 	kunmap(sg_page(obj->pages->sgl));
+
+	if (dev_priv->lrc_enabled)
+		return;
+
 	i915_gem_object_ggtt_unpin(obj);
 	drm_gem_object_unreference(&obj->base);
-	ring->status_page.obj = NULL;
 }
 
 static int init_status_page(struct intel_engine *ring)
@@ -1455,15 +1460,22 @@ static int intel_init_ring(struct drm_device *dev,
 	init_waitqueue_head(&ring->irq_queue);
 
 	if (dev_priv->lrc_enabled) {
+		struct drm_i915_gem_object *obj;
+
 		ret = gen8_create_lr_context(ring->default_context, ring, NULL);
 		if (ret) {
 			DRM_ERROR("Create LR context for %s failed: %d\n",
 					ring->name, ret);
 			return ret;
 		}
-	}
 
-	if (I915_NEED_GFX_HWS(dev)) {
+		obj = ring->default_context->engine[ring->id].obj;
+		ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(obj);
+		ring->status_page.page_addr = kmap(sg_page(obj->pages->sgl));
+		if (ring->status_page.page_addr == NULL)
+			return -ENOMEM;
+		ring->status_page.obj = obj;
+	} else if (I915_NEED_GFX_HWS(dev)) {
 		ret = init_status_page(ring);
 		if (ret)
 			return ret;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 28/50] drm/i915/bdw: Enable execlists in the hardware
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (26 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 27/50] drm/i915/bdw: Status page for LR contexts oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:08 ` [PATCH 29/50] drm/i915/bdw: Execlists ring tail writing oscar.mateo
                   ` (23 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Set Replay Mode to 0 per BSpec (Michel Thierry <michel.thierry@intel.com>).

v3: Several rebases.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0a944c2..a85f91c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -343,9 +343,17 @@ int gen8_gem_context_init(struct drm_device *dev)
 		goto err_out;
 	}
 
-	for_each_ring(ring, dev_priv, ring_id)
+	for_each_ring(ring, dev_priv, ring_id) {
 		ring->default_context = ctx;
 
+		I915_WRITE(RING_MODE_GEN7(ring),
+			_MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
+			_MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
+		POSTING_READ(RING_MODE_GEN7(ring));
+		DRM_DEBUG_DRIVER("Execlists enabled for %s\n",
+				ring->name);
+	}
+
 	DRM_DEBUG_DRIVER("LR context support initialized\n");
 	return 0;
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 29/50] drm/i915/bdw: Execlists ring tail writing
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (27 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 28/50] drm/i915/bdw: Enable execlists in the hardware oscar.mateo
@ 2014-05-09 12:08 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 30/50] drm/i915/bdw: LR context ring init oscar.mateo
                   ` (22 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

The write tail function is a very special place for execlists: all
access to the ring is mediated through requests (thanks to Chris
Wilson's "Write RING_TAIL once per-request" work) and every request
ends with a tail write, so this is the place we are going to use to
submit contexts for execution.

For the moment, just mark the place (we still need to do a lot of
preparation before execlists are ready to start submitting things).
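
For reference, a minimal sketch of where this hook is headed, assuming
the queueing interface that patch 39 introduces (until then this is an
illustration, not part of this patch):

static void gen8_submit_ctx(struct intel_engine *ring,
			    struct i915_hw_context *ctx, u32 value)
{
	/* Instead of poking RING_TAIL directly, queue the context
	 * (with its new tail value) for execlist submission. */
	if (gen8_switch_context_queue(ring, ctx, value))
		DRM_ERROR("Context queue failed for %s\n", ring->name);
}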

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index eef7094..03719b0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -430,6 +430,12 @@ static void ring_write_tail(struct intel_engine *ring,
 	I915_WRITE_TAIL(ring, value);
 }
 
+static void gen8_submit_ctx(struct intel_engine *ring,
+			    struct i915_hw_context *ctx, u32 value)
+{
+	DRM_ERROR("Execlists still not ready!\n");
+}
+
 u64 intel_ring_get_active_head(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -1983,12 +1989,15 @@ int intel_init_render_ring(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[RCS];
 
+	ring->submit = ring_write_tail;
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
 		if (INTEL_INFO(dev)->gen == 6)
 			ring->flush = gen6_render_ring_flush;
 		if (INTEL_INFO(dev)->gen >= 8) {
+			if (dev_priv->lrc_enabled)
+				ring->submit = gen8_submit_ctx;
 			ring->flush = gen8_render_ring_flush;
 			ring->irq_get = gen8_ring_get_irq;
 			ring->irq_put = gen8_ring_put_irq;
@@ -2043,7 +2052,7 @@ int intel_init_render_ring(struct drm_device *dev)
 		}
 		ring->irq_enable_mask = I915_USER_INTERRUPT;
 	}
-	ring->submit = ring_write_tail;
+
 	if (IS_HASWELL(dev))
 		ring->dispatch_execbuffer = hsw_ring_dispatch_execbuffer;
 	else if (IS_GEN8(dev))
@@ -2163,6 +2172,8 @@ int intel_init_bsd_ring(struct drm_device *dev)
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
+			if (dev_priv->lrc_enabled)
+				ring->submit = gen8_submit_ctx;
 			ring->irq_enable_mask =
 				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 			ring->irq_get = gen8_ring_get_irq;
@@ -2229,7 +2240,10 @@ int intel_init_bsd2_ring(struct drm_device *dev)
 		return -EINVAL;
 	}
 
-	ring->submit = ring_write_tail;
+	if (dev_priv->lrc_enabled)
+		ring->submit = gen8_submit_ctx;
+	else
+		ring->submit = ring_write_tail;
 	ring->flush = gen6_bsd_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
@@ -2274,6 +2288,8 @@ int intel_init_blt_ring(struct drm_device *dev)
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
+		if (dev_priv->lrc_enabled)
+			ring->submit = gen8_submit_ctx;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
@@ -2320,6 +2336,8 @@ int intel_init_vebox_ring(struct drm_device *dev)
 	ring->set_seqno = ring_set_seqno;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
+		if (dev_priv->lrc_enabled)
+			ring->submit = gen8_submit_ctx;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 30/50] drm/i915/bdw: LR context ring init
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (28 preceding siblings ...)
  2014-05-09 12:08 ` [PATCH 29/50] drm/i915/bdw: Execlists ring tail writing oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 31/50] drm/i915/bdw: Set the request context information correctly in the LRC case oscar.mateo
                   ` (21 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

Logical ring contexts do not need most of the ring init: we just need
the pipe control object for the render ring and a few workarounds (some
more stuff to be added later).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 67 +++++++++++++++++++++++++++------
 1 file changed, 55 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 03719b0..35e89c9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -565,6 +565,11 @@ out:
 	return ret;
 }
 
+static int init_ring_common_lrc(struct intel_engine *ring)
+{
+	return 0;
+}
+
 static int
 init_pipe_control(struct intel_engine *ring)
 {
@@ -663,6 +668,35 @@ static int init_render_ring(struct intel_engine *ring)
 	return ret;
 }
 
+static int init_render_ring_lrc(struct intel_engine *ring)
+{
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	int ret;
+
+	ret = init_ring_common_lrc(ring);
+	if (ret)
+		return ret;
+
+	/* We need to disable the AsyncFlip performance optimisations in order
+	 * to use MI_WAIT_FOR_EVENT within the CS. It should already be
+	 * programmed to '1' on all products.
+	 *
+	 * WaDisableAsyncFlipPerfMode:snb,ivb,hsw,vlv,bdw
+	 */
+	if (INTEL_INFO(dev)->gen >= 6)
+		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(ASYNC_FLIP_PERF_DISABLE));
+
+	ret = init_pipe_control(ring);
+	if (ret)
+		return ret;
+
+	if (INTEL_INFO(dev)->gen >= 6)
+		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
+
+	return 0;
+}
+
 static void render_ring_cleanup(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
@@ -1990,14 +2024,17 @@ int intel_init_render_ring(struct drm_device *dev)
 	struct intel_engine *ring = &dev_priv->ring[RCS];
 
 	ring->submit = ring_write_tail;
+	ring->init = init_render_ring;
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
 		if (INTEL_INFO(dev)->gen == 6)
 			ring->flush = gen6_render_ring_flush;
 		if (INTEL_INFO(dev)->gen >= 8) {
-			if (dev_priv->lrc_enabled)
+			if (dev_priv->lrc_enabled) {
 				ring->submit = gen8_submit_ctx;
+				ring->init = init_render_ring_lrc;
+			}
 			ring->flush = gen8_render_ring_flush;
 			ring->irq_get = gen8_ring_get_irq;
 			ring->irq_put = gen8_ring_put_irq;
@@ -2065,7 +2102,6 @@ int intel_init_render_ring(struct drm_device *dev)
 		ring->dispatch_execbuffer = i830_dispatch_execbuffer;
 	else
 		ring->dispatch_execbuffer = i915_dispatch_execbuffer;
-	ring->init = init_render_ring;
 	ring->cleanup = render_ring_cleanup;
 
 	/* Workaround batchbuffer to combat CS tlb bug. */
@@ -2163,6 +2199,7 @@ int intel_init_bsd_ring(struct drm_device *dev)
 	struct intel_engine *ring = &dev_priv->ring[VCS];
 
 	ring->submit = ring_write_tail;
+	ring->init = init_ring_common;
 	if (INTEL_INFO(dev)->gen >= 6) {
 		/* gen6 bsd needs a special wa for tail updates */
 		if (IS_GEN6(dev))
@@ -2172,8 +2209,10 @@ int intel_init_bsd_ring(struct drm_device *dev)
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
-			if (dev_priv->lrc_enabled)
+			if (dev_priv->lrc_enabled) {
 				ring->submit = gen8_submit_ctx;
+				ring->init = init_ring_common_lrc;
+			}
 			ring->irq_enable_mask =
 				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 			ring->irq_get = gen8_ring_get_irq;
@@ -2221,7 +2260,6 @@ int intel_init_bsd_ring(struct drm_device *dev)
 		}
 		ring->dispatch_execbuffer = i965_dispatch_execbuffer;
 	}
-	ring->init = init_ring_common;
 
 	return intel_init_ring(dev, ring);
 }
@@ -2240,10 +2278,13 @@ int intel_init_bsd2_ring(struct drm_device *dev)
 		return -EINVAL;
 	}
 
-	if (dev_priv->lrc_enabled)
+	if (dev_priv->lrc_enabled) {
 		ring->submit = gen8_submit_ctx;
-	else
+		ring->init = init_ring_common_lrc;
+	} else {
 		ring->submit = ring_write_tail;
+		ring->init = init_ring_common;
+	}
 	ring->flush = gen6_bsd_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
@@ -2272,8 +2313,6 @@ int intel_init_bsd2_ring(struct drm_device *dev)
 	ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
 	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 
-	ring->init = init_ring_common;
-
 	return intel_init_ring(dev, ring);
 }
 
@@ -2283,13 +2322,16 @@ int intel_init_blt_ring(struct drm_device *dev)
 	struct intel_engine *ring = &dev_priv->ring[BCS];
 
 	ring->submit = ring_write_tail;
+	ring->init = init_ring_common;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
-		if (dev_priv->lrc_enabled)
+		if (dev_priv->lrc_enabled) {
 			ring->submit = gen8_submit_ctx;
+			ring->init = init_ring_common_lrc;
+		}
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
@@ -2319,7 +2361,6 @@ int intel_init_blt_ring(struct drm_device *dev)
 	ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
 	ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
 	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
-	ring->init = init_ring_common;
 
 	return intel_init_ring(dev, ring);
 }
@@ -2330,14 +2371,17 @@ int intel_init_vebox_ring(struct drm_device *dev)
 	struct intel_engine *ring = &dev_priv->ring[VECS];
 
 	ring->submit = ring_write_tail;
+	ring->init = init_ring_common;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
-		if (dev_priv->lrc_enabled)
+		if (dev_priv->lrc_enabled) {
 			ring->submit = gen8_submit_ctx;
+			ring->init = init_ring_common_lrc;
+		}
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
@@ -2361,7 +2405,6 @@ int intel_init_vebox_ring(struct drm_device *dev)
 	ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
 	ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
 	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
-	ring->init = init_ring_common;
 
 	return intel_init_ring(dev, ring);
 }
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 31/50] drm/i915/bdw: Set the request context information correctly in the LRC case
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (29 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 30/50] drm/i915/bdw: LR context ring init oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 32/50] drm/i915/bdw: GEN8 new ring flush oscar.mateo
                   ` (20 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

We need it (at least) to properly update the last retired head.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f8acf3d..f9ed89e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2224,7 +2224,10 @@ int __i915_add_request(struct intel_engine *ring,
 	/* Hold a reference to the current context so that we can inspect
 	 * it later in case a hangcheck error event fires.
 	 */
-	request->ctx = ring->last_context;
+	if (dev_priv->lrc_enabled)
+		request->ctx = ctx;
+	else
+		request->ctx = ring->last_context;
 	if (request->ctx)
 		i915_gem_context_reference(request->ctx);
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 32/50] drm/i915/bdw: GEN8 new ring flush
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (30 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 31/50] drm/i915/bdw: Set the request context information correctly in the LRC case oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 33/50] drm/i915/bdw: Always write seqno to default context oscar.mateo
                   ` (19 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

The BSD invalidate bit is no longer present, and we can consolidate the
blt and bsd ring flushes into one. This helps prep the code to more
easily handle logical ring contexts.
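
For reference, a sketch of the four-dword layout that the consolidated
gen8_ring_flush() below emits (this only restates the values visible in
the diff, it adds no new behaviour):

/* Gen8 MI_FLUSH_DW (the length field grows by one for the 64b address):
 *   DW0: MI_FLUSH_DW + 1, OR'ed with MI_INVALIDATE_TLB |
 *        MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW on invalidate
 *   DW1: I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT
 *   DW2: upper 32 bits of the address (zero here)
 *   DW3: immediate value to store (zero here)
 */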

This partially reverts:
commit 65ea32ce040a0a9a907362e9a362a842fd18cb21
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Dec 13 14:57:32 2012 -0800

    drm/i915/bdw: Update MI_FLUSH_DW

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Several rebases. Do not forget the VEBOX.

v3: Due to a reorder of the patch series, make it ctx aware now.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 60 +++++++++++++++++++++------------
 1 file changed, 38 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 35e89c9..5e4e3f7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1870,6 +1870,31 @@ static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
 		   _MASKED_BIT_DISABLE(GEN6_BSD_SLEEP_MSG_DISABLE));
 }
 
+static int gen8_ring_flush(struct intel_engine *ring,
+			   struct i915_hw_context *ctx,
+			   u32 invalidate, u32 flush)
+{
+	uint32_t cmd;
+	struct intel_ringbuffer *ringbuf;
+
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 4);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return ringbuf ? PTR_ERR(ringbuf) : -ENOMEM;
+
+	cmd = MI_FLUSH_DW + 1;
+
+	if (invalidate & I915_GEM_GPU_DOMAINS)
+		cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX |
+			MI_FLUSH_DW_OP_STOREDW;
+	intel_ringbuffer_emit(ringbuf, cmd);
+	intel_ringbuffer_emit(ringbuf, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
+	intel_ringbuffer_emit(ringbuf, 0); /* upper addr */
+	intel_ringbuffer_emit(ringbuf, 0); /* value */
+	intel_ringbuffer_advance(ringbuf);
+
+	return 0;
+}
+
 static int gen6_bsd_ring_flush(struct intel_engine *ring,
 			       struct i915_hw_context *ctx,
 			       u32 invalidate, u32 flush)
@@ -1882,8 +1907,7 @@ static int gen6_bsd_ring_flush(struct intel_engine *ring,
 		return ret;
 
 	cmd = MI_FLUSH_DW;
-	if (INTEL_INFO(ring->dev)->gen >= 8)
-		cmd += 1;
+
 	/*
 	 * Bspec vol 1c.5 - video engine command streamer:
 	 * "If ENABLED, all TLBs will be invalidated once the flush
@@ -1895,13 +1919,9 @@ static int gen6_bsd_ring_flush(struct intel_engine *ring,
 			MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
 	intel_ring_emit(ring, cmd);
 	intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
-	if (INTEL_INFO(ring->dev)->gen >= 8) {
-		intel_ring_emit(ring, 0); /* upper addr */
-		intel_ring_emit(ring, 0); /* value */
-	} else  {
-		intel_ring_emit(ring, 0);
-		intel_ring_emit(ring, MI_NOOP);
-	}
+	intel_ring_emit(ring, 0);
+	intel_ring_emit(ring, MI_NOOP);
+
 	intel_ring_advance(ring);
 	return 0;
 }
@@ -1990,8 +2010,7 @@ static int gen6_ring_flush(struct intel_engine *ring,
 		return ret;
 
 	cmd = MI_FLUSH_DW;
-	if (INTEL_INFO(ring->dev)->gen >= 8)
-		cmd += 1;
+
 	/*
 	 * Bspec vol 1c.3 - blitter engine command streamer:
 	 * "If ENABLED, all TLBs will be invalidated once the flush
@@ -2003,13 +2022,7 @@ static int gen6_ring_flush(struct intel_engine *ring,
 			MI_FLUSH_DW_OP_STOREDW;
 	intel_ring_emit(ring, cmd);
 	intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
-	if (INTEL_INFO(ring->dev)->gen >= 8) {
-		intel_ring_emit(ring, 0); /* upper addr */
-		intel_ring_emit(ring, 0); /* value */
-	} else  {
-		intel_ring_emit(ring, 0);
-		intel_ring_emit(ring, MI_NOOP);
-	}
+	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
 
 	if (IS_GEN7(dev) && !invalidate && flush)
@@ -2204,7 +2217,6 @@ int intel_init_bsd_ring(struct drm_device *dev)
 		/* gen6 bsd needs a special wa for tail updates */
 		if (IS_GEN6(dev))
 			ring->submit = gen6_bsd_ring_write_tail;
-		ring->flush = gen6_bsd_ring_flush;
 		ring->add_request = gen6_add_request;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
@@ -2213,6 +2225,7 @@ int intel_init_bsd_ring(struct drm_device *dev)
 				ring->submit = gen8_submit_ctx;
 				ring->init = init_ring_common_lrc;
 			}
+			ring->flush = gen8_ring_flush;
 			ring->irq_enable_mask =
 				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 			ring->irq_get = gen8_ring_get_irq;
@@ -2220,6 +2233,7 @@ int intel_init_bsd_ring(struct drm_device *dev)
 			ring->dispatch_execbuffer =
 				gen8_ring_dispatch_execbuffer;
 		} else {
+			ring->flush = gen6_bsd_ring_flush;
 			ring->irq_enable_mask = GT_BSD_USER_INTERRUPT;
 			ring->irq_get = gen6_ring_get_irq;
 			ring->irq_put = gen6_ring_put_irq;
@@ -2285,7 +2299,7 @@ int intel_init_bsd2_ring(struct drm_device *dev)
 		ring->submit = ring_write_tail;
 		ring->init = init_ring_common;
 	}
-	ring->flush = gen6_bsd_ring_flush;
+	ring->flush = gen8_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
@@ -2323,7 +2337,6 @@ int intel_init_blt_ring(struct drm_device *dev)
 
 	ring->submit = ring_write_tail;
 	ring->init = init_ring_common;
-	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
@@ -2332,12 +2345,14 @@ int intel_init_blt_ring(struct drm_device *dev)
 			ring->submit = gen8_submit_ctx;
 			ring->init = init_ring_common_lrc;
 		}
+		ring->flush = gen8_ring_flush;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 	} else {
+		ring->flush = gen6_ring_flush;
 		ring->irq_enable_mask = GT_BLT_USER_INTERRUPT;
 		ring->irq_get = gen6_ring_get_irq;
 		ring->irq_put = gen6_ring_put_irq;
@@ -2372,7 +2387,6 @@ int intel_init_vebox_ring(struct drm_device *dev)
 
 	ring->submit = ring_write_tail;
 	ring->init = init_ring_common;
-	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
@@ -2382,12 +2396,14 @@ int intel_init_vebox_ring(struct drm_device *dev)
 			ring->submit = gen8_submit_ctx;
 			ring->init = init_ring_common_lrc;
 		}
+		ring->flush = gen8_ring_flush;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 	} else {
+		ring->flush = gen6_ring_flush;
 		ring->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
 		ring->irq_get = hsw_vebox_get_irq;
 		ring->irq_put = hsw_vebox_put_irq;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 33/50] drm/i915/bdw: Always write seqno to default context
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (31 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 32/50] drm/i915/bdw: GEN8 new ring flush oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 34/50] drm/i915/bdw: Implement context switching (somewhat) oscar.mateo
                   ` (18 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Oscar Mateo <oscar.mateo@intel.com>

Even though we have one Hardware Status Page per context, we are still
managing the seqnos per engine. Therefore, the sequence number must be
written to a consistent place for all contexts: one of the global
default contexts.

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>

v2: Since get_seqno and set_seqno now look for the seqno in the engine's
status page, they don't need to be changed.
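
A sketch of why that holds (names as used elsewhere in this series; the
gen6 accessor is assumed unchanged):

/* gen8_add_request_lrc() stores the seqno at:
 *     default context GGTT offset
 *   + (I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
 * and intel_init_ring() points ring->status_page at that very same
 * default context page, so the existing read keeps working: */
seqno = intel_read_status_page(ring, I915_GEM_HWS_INDEX);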

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c | 67 ++++++++++++++++++++++++++++++++-
 2 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 33d007d..2e76ec0 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -259,6 +259,7 @@
 #define   MI_FORCE_RESTORE		(1<<1)
 #define   MI_RESTORE_INHIBIT		(1<<0)
 #define MI_STORE_DWORD_IMM	MI_INSTR(0x20, 1)
+#define MI_STORE_DWORD_IMM_GEN8	MI_INSTR(0x20, 2)
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 965+ only */
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
 #define   MI_STORE_DWORD_INDEX_SHIFT 2
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5e4e3f7..d38d824 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -781,6 +781,66 @@ gen6_add_request(struct intel_engine *ring,
 	return 0;
 }
 
+static int
+gen8_nonrender_add_request_lrc(struct intel_engine *ring,
+			       struct i915_hw_context *ctx)
+{
+	struct intel_ringbuffer *ringbuf;
+	struct i915_hw_context *dctx = ring->default_context;
+	struct drm_i915_gem_object *obj = dctx->engine[ring->id].obj;
+	u32 cmd;
+
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 6);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return ringbuf ? PTR_ERR(ringbuf) : -ENOMEM;
+
+	cmd = MI_FLUSH_DW + 1;
+	cmd |= MI_INVALIDATE_TLB;
+	cmd |= MI_FLUSH_DW_OP_STOREDW;
+
+	intel_ringbuffer_emit(ringbuf, cmd);
+	intel_ringbuffer_emit(ringbuf,
+			((i915_gem_obj_ggtt_offset(obj)) +
+			(I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)) |
+			MI_FLUSH_DW_USE_GTT);
+	intel_ringbuffer_emit(ringbuf, 0); /* upper addr */
+	intel_ringbuffer_emit(ringbuf, ring->outstanding_lazy_seqno);
+	intel_ringbuffer_emit(ringbuf, MI_USER_INTERRUPT);
+	intel_ringbuffer_emit(ringbuf, MI_NOOP);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
+
+	return 0;
+}
+
+static int
+gen8_add_request_lrc(struct intel_engine *ring,
+		     struct i915_hw_context *ctx)
+{
+	struct intel_ringbuffer *ringbuf;
+	struct i915_hw_context *dctx = ring->default_context;
+	struct drm_i915_gem_object *obj = dctx->engine[ring->id].obj;
+	u32 cmd;
+
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 6);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return ringbuf ? PTR_ERR(ringbuf) : -ENOMEM;
+
+	cmd = MI_STORE_DWORD_IMM_GEN8;
+	cmd |= (1 << 22); /* use global GTT */
+
+	intel_ringbuffer_emit(ringbuf, cmd);
+	intel_ringbuffer_emit(ringbuf,
+			((i915_gem_obj_ggtt_offset(obj)) +
+			(I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)));
+	intel_ringbuffer_emit(ringbuf, 0); /* upper addr */
+	intel_ringbuffer_emit(ringbuf, ring->outstanding_lazy_seqno);
+	intel_ringbuffer_emit(ringbuf, MI_USER_INTERRUPT);
+	intel_ringbuffer_emit(ringbuf, MI_NOOP);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
+
+	return 0;
+}
+
 static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
 					      u32 seqno)
 {
@@ -2047,6 +2107,7 @@ int intel_init_render_ring(struct drm_device *dev)
 			if (dev_priv->lrc_enabled) {
 				ring->submit = gen8_submit_ctx;
 				ring->init = init_render_ring_lrc;
+				ring->add_request = gen8_add_request_lrc;
 			}
 			ring->flush = gen8_render_ring_flush;
 			ring->irq_get = gen8_ring_get_irq;
@@ -2224,6 +2285,7 @@ int intel_init_bsd_ring(struct drm_device *dev)
 			if (dev_priv->lrc_enabled) {
 				ring->submit = gen8_submit_ctx;
 				ring->init = init_ring_common_lrc;
+				ring->add_request = gen8_nonrender_add_request_lrc;
 			}
 			ring->flush = gen8_ring_flush;
 			ring->irq_enable_mask =
@@ -2294,13 +2356,14 @@ int intel_init_bsd2_ring(struct drm_device *dev)
 
 	if (dev_priv->lrc_enabled) {
 		ring->submit = gen8_submit_ctx;
+		ring->add_request = gen8_nonrender_add_request_lrc;
 		ring->init = init_ring_common_lrc;
 	} else {
 		ring->submit = ring_write_tail;
+		ring->add_request = gen6_add_request;
 		ring->init = init_ring_common;
 	}
 	ring->flush = gen8_ring_flush;
-	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	ring->irq_enable_mask =
@@ -2344,6 +2407,7 @@ int intel_init_blt_ring(struct drm_device *dev)
 		if (dev_priv->lrc_enabled) {
 			ring->submit = gen8_submit_ctx;
 			ring->init = init_ring_common_lrc;
+			ring->add_request = gen8_nonrender_add_request_lrc;
 		}
 		ring->flush = gen8_ring_flush;
 		ring->irq_enable_mask =
@@ -2395,6 +2459,7 @@ int intel_init_vebox_ring(struct drm_device *dev)
 		if (dev_priv->lrc_enabled) {
 			ring->submit = gen8_submit_ctx;
 			ring->init = init_ring_common_lrc;
+			ring->add_request = gen8_nonrender_add_request_lrc;
 		}
 		ring->flush = gen8_ring_flush;
 		ring->irq_enable_mask =
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 34/50] drm/i915/bdw: Implement context switching (somewhat)
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (32 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 33/50] drm/i915/bdw: Always write seqno to default context oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 35/50] drm/i915/bdw: Add forcewake lock around ELSP writes oscar.mateo
                   ` (17 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

A context switch occurs by submitting a context descriptor to the
ExecList Submission Port. Given that we can now initialize a context,
it's possible to begin implementing the context switch by creating the
descriptor and submitting it to the ELSP (actually two descriptors,
since the ELSP has two ports).

The context object must be mapped in the GGTT, which means it must exist
in the 0-4GB graphics VA range.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: This code has changed quite a lot in various rebases. Of particular
importance is that now we use the globally unique Submission ID to send
to the hardware. Also, context pages are now pinned unconditionally to
GGTT, so there is no need to bind them.

v3: Use LRCA[31:13] as hwCtxId[18:0]. This guarantees that the HW context
ID we submit to the ELSP is globally unique and != 0 (BSpec requirements
for the software use-only bits of the Context ID in the Context Descriptor
Format) without the hassle of the previous submission ID construction.
Also, re-add the ELSP posting read (it was dropped somewhere during the
rebases).
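
As a worked example, take a hypothetical LRCA of 0x00345000 (4K aligned
and below 4GB); following the get_descriptor() code in the diff below:

desc = GEN8_CTX_VALID				/* bit 0 -> 0x1       */
     | LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT	/* 1 << 3 -> 0x8      */
     | GEN8_CTX_L3LLC_COHERENT			/* bit 5 -> 0x20      */
     | GEN8_CTX_PRIVILEGE			/* bit 8 -> 0x100     */
     | 0x00345000				/* the LRCA itself    */
     | (u64)(0x00345000 >> 13) << 32;		/* hwCtxId = 0x1a2    */
/* = 0x000001a200345129 */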

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  |  9 ++++
 drivers/gpu/drm/i915/intel_lrc.c | 95 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7d06a66..208a4bd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2434,6 +2434,15 @@ int gen8_create_lr_context(struct i915_hw_context *ctx,
 struct i915_hw_context *
 gen8_gem_validate_context(struct drm_device *dev, struct drm_file *file,
 			  struct intel_engine *ring, const u32 ctx_id);
+static inline u32 intel_get_lr_contextid(struct drm_i915_gem_object *ctx_obj)
+{
+	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj);
+
+	/* LRCA is required to be 4K aligned and LRCA context image is always at
+	 * least 2 pages, so the more significant 19 bits are globally unique
+	 * (which leaves one HwCtxId bit free) */
+	return lrca >> 13;
+}
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a85f91c..2eb1c28 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -47,6 +47,7 @@
 #define GEN8_LR_CONTEXT_ALIGN 4096
 
 #define RING_ELSP(ring)			((ring)->mmio_base+0x230)
+#define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
 #define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
 
 #define CTX_LRI_HEADER_0		0x01
@@ -78,6 +79,100 @@
 #define CTX_R_PWR_CLK_STATE		0x42
 #define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
 
+#define GEN8_CTX_VALID (1<<0)
+#define GEN8_CTX_FORCE_PD_RESTORE (1<<1)
+#define GEN8_CTX_FORCE_RESTORE (1<<2)
+#define GEN8_CTX_L3LLC_COHERENT (1<<5)
+#define GEN8_CTX_PRIVILEGE (1<<8)
+enum {
+	ADVANCED_CONTEXT=0,
+	LEGACY_CONTEXT,
+	ADVANCED_AD_CONTEXT,
+	LEGACY_64B_CONTEXT
+};
+#define GEN8_CTX_MODE_SHIFT 3
+enum {
+	FAULT_AND_HANG=0,
+	FAULT_AND_HALT, /* Debug only */
+	FAULT_AND_STREAM,
+	FAULT_AND_CONTINUE /* Unsupported */
+};
+#define GEN8_CTX_FAULT_SHIFT 6
+#define GEN8_CTX_LRCA_SHIFT 12
+#define GEN8_CTX_UNUSED_SHIFT 32
+
+static inline uint64_t get_descriptor(struct drm_i915_gem_object *ctx_obj)
+{
+	uint64_t desc;
+
+	BUG_ON(i915_gem_obj_ggtt_offset(ctx_obj) & 0xFFFFFFFF00000000ULL);
+
+	desc = GEN8_CTX_VALID;
+	desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
+	desc |= i915_gem_obj_ggtt_offset(ctx_obj);
+	desc |= GEN8_CTX_L3LLC_COHERENT;
+	desc |= (u64)intel_get_lr_contextid(ctx_obj) << GEN8_CTX_UNUSED_SHIFT;
+	desc |= GEN8_CTX_PRIVILEGE;
+
+	/* TODO: WaDisableLiteRestore when we start using semaphore
+	 * signalling between Command Streamers */
+	/* desc |= GEN8_CTX_FORCE_RESTORE; */
+
+	return desc;
+}
+
+static void submit_execlist(struct intel_engine *ring,
+		struct drm_i915_gem_object *ctx_obj0,
+		struct drm_i915_gem_object *ctx_obj1)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	uint64_t temp = 0;
+	uint32_t desc[4];
+
+	/* XXX: You must always write both descriptors in the order below. */
+	if (ctx_obj1)
+		temp = get_descriptor(ctx_obj1);
+	else
+		temp = 0;
+	desc[1] = (u32)(temp >> 32);
+	desc[0] = (u32)temp;
+
+	temp = get_descriptor(ctx_obj0);
+	desc[3] = (u32)(temp >> 32);
+	desc[2] = (u32)temp;
+
+	I915_WRITE(RING_ELSP(ring), desc[1]);
+	I915_WRITE(RING_ELSP(ring), desc[0]);
+	I915_WRITE(RING_ELSP(ring), desc[3]);
+	/* The context is automatically loaded after the following */
+	I915_WRITE(RING_ELSP(ring), desc[2]);
+
+	/* ELSP is a write only register, so this serves as a posting read */
+	POSTING_READ(RING_EXECLIST_STATUS(ring));
+}
+
+static int gen8_switch_context(struct intel_engine *ring,
+		struct i915_hw_context *to0, u32 tail0,
+		struct i915_hw_context *to1, u32 tail1)
+{
+	struct drm_i915_gem_object *ctx_obj0;
+	struct drm_i915_gem_object *ctx_obj1 = NULL;
+
+	ctx_obj0 = to0->engine[ring->id].obj;
+	BUG_ON(!ctx_obj0);
+	BUG_ON(!i915_gem_obj_is_pinned(ctx_obj0));
+
+	if (to1) {
+		ctx_obj1 = to1->engine[ring->id].obj;
+		BUG_ON(!ctx_obj1);
+		BUG_ON(!i915_gem_obj_is_pinned(ctx_obj1));
+	}
+
+	submit_execlist(ring, ctx_obj0, ctx_obj1);
+
+	return 0;
+}
+
 struct i915_hw_context *
 gen8_gem_validate_context(struct drm_device *dev, struct drm_file *file,
 			  struct intel_engine *ring, const u32 ctx_id)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 35/50] drm/i915/bdw: Add forcewake lock around ELSP writes
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (33 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 34/50] drm/i915/bdw: Implement context switching (somewhat) oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 36/50] drm/i915/bdw: Write the tail pointer, LRC style oscar.mateo
                   ` (16 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Thomas Daniel <thomas.daniel@intel.com>

BSPEC says: SW must set Force Wakeup bit to prevent GT from
entering C6 while ELSP writes are in progress.

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Acked-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  |  1 +
 drivers/gpu/drm/i915/intel_lrc.c | 15 +++++++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 208a4bd..6b39fed 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2694,6 +2694,7 @@ int vlv_freq_opcode(struct drm_i915_private *dev_priv, int val);
 
 #define I915_READ(reg)		dev_priv->uncore.funcs.mmio_readl(dev_priv, (reg), true)
 #define I915_WRITE(reg, val)	dev_priv->uncore.funcs.mmio_writel(dev_priv, (reg), (val), true)
+#define I915_RAW_WRITE(reg, val)	writel(val, dev_priv->regs + (reg))
 #define I915_READ_NOTRACE(reg)		dev_priv->uncore.funcs.mmio_readl(dev_priv, (reg), false)
 #define I915_WRITE_NOTRACE(reg, val)	dev_priv->uncore.funcs.mmio_writel(dev_priv, (reg), (val), false)
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 2eb1c28..54cbb4b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -141,14 +141,21 @@ static void submit_execlist(struct intel_engine *ring,
 	desc[3] = (u32)(temp >> 32);
 	desc[2] = (u32)temp;
 
-	I915_WRITE(RING_ELSP(ring), desc[1]);
-	I915_WRITE(RING_ELSP(ring), desc[0]);
-	I915_WRITE(RING_ELSP(ring), desc[3]);
+	/* Set Force Wakeup bit to prevent GT from entering C6 while
+	 * ELSP writes are in progress */
+	gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
+
+	I915_RAW_WRITE(RING_ELSP(ring), desc[1]);
+	I915_RAW_WRITE(RING_ELSP(ring), desc[0]);
+	I915_RAW_WRITE(RING_ELSP(ring), desc[3]);
 	/* The context is automatically loaded after the following */
-	I915_WRITE(RING_ELSP(ring), desc[2]);
+	I915_RAW_WRITE(RING_ELSP(ring), desc[2]);
 
 	/* ELSP is a write only register, so this serves as a posting read */
 	POSTING_READ(RING_EXECLIST_STATUS(ring));
+
+	/* Release Force Wakeup */
+	gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
 }
 
 static int gen8_switch_context(struct intel_engine *ring,
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 36/50] drm/i915/bdw: Write the tail pointer, LRC style
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (34 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 35/50] drm/i915/bdw: Add forcewake lock around ELSP writes oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 37/50] drm/i915/bdw: Don't write PDP in the legacy way when using LRCs oscar.mateo
                   ` (15 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Each logical ring context has the tail pointer in the context object,
so update it before submission.
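
Recall that the second page of the context image is populated as
MI_LOAD_REGISTER_IMM <reg, value> pairs, which is what the "+1" in the
function below relies on (a sketch of the assumed layout):

/* reg_state[CTX_RING_TAIL + 0] = RING_TAIL register offset
 * reg_state[CTX_RING_TAIL + 1] = value the HW will load into it,
 * so updating the tail means poking index CTX_RING_TAIL + 1: */
reg_state[CTX_RING_TAIL+1] = tail;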

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 54cbb4b..b06098e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -158,6 +158,21 @@ static void submit_execlist(struct intel_engine *ring,
 	gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
 }
 
+static int lr_context_write_tail(struct drm_i915_gem_object *ctx_obj, u32 tail)
+{
+	struct page *page;
+	uint32_t *reg_state;
+
+	page = i915_gem_object_get_page(ctx_obj, 1);
+	reg_state = kmap_atomic(page);
+
+	reg_state[CTX_RING_TAIL+1] = tail;
+
+	kunmap_atomic(reg_state);
+
+	return 0;
+}
+
 static int gen8_switch_context(struct intel_engine *ring,
 		struct i915_hw_context *to0, u32 tail0,
 		struct i915_hw_context *to1, u32 tail1)
@@ -169,10 +184,14 @@ static int gen8_switch_context(struct intel_engine *ring,
 	BUG_ON(!ctx_obj0);
 	BUG_ON(!i915_gem_obj_is_pinned(ctx_obj0));
 
+	lr_context_write_tail(ctx_obj0, tail0);
+
 	if (to1) {
 		ctx_obj1 = to1->engine[ring->id].obj;
 		BUG_ON(!ctx_obj1);
 		BUG_ON(!i915_gem_obj_is_pinned(ctx_obj1));
+
+		lr_context_write_tail(ctx_obj1, tail1);
 	}
 
 	submit_execlist(ring, ctx_obj0, ctx_obj1);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 37/50] drm/i915/bdw: Don't write PDP in the legacy way when using LRCs
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (35 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 36/50] drm/i915/bdw: Write the tail pointer, LRC style oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 38/50] drm/i915/bdw: LR context switch interrupts oscar.mateo
                   ` (14 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

This is mostly for correctness, so that we know we are running the LR
context properly (that is, with the PDPs contained inside the context
object).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a0993c0..de4a982 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -240,11 +240,15 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 			  struct intel_engine *ring,
 			  bool synchronous)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	int i, ret;
 
 	/* bit of a hack to find the actual last used pd */
 	int used_pd = ppgtt->num_pd_entries / GEN8_PDES_PER_PAGE;
 
+	if (dev_priv->lrc_enabled)
+		return 0;
+
 	for (i = used_pd - 1; i >= 0; i--) {
 		dma_addr_t addr = ppgtt->pd_dma_addr[i];
 		ret = gen8_write_pdp(ring, i, addr, synchronous);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 38/50] drm/i915/bdw: LR context switch interrupts
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (36 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 37/50] drm/i915/bdw: Don't write PDP in the legacy way when using LRCs oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 39/50] drm/i915/bdw: Get prepared for a two-stage execlist submit process oscar.mateo
                   ` (13 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Thomas Daniel <thomas.daniel@intel.com>

We need to attend to context switch interrupts from all rings. Also, fix
the IMR/IER writes and program HWSTAM at ring init time.

Notice that, if they were added to irq_enable_mask, the context switch
interrupts would be incorrectly masked out whenever user interrupts are
disabled because nobody is waiting on a sequence number. Therefore, this
commit adds a bitmask of interrupts to be kept unmasked at all times.
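
A worked example of the resulting split for the blitter ring (shift and
bit values as defined in this patch and in i915_reg.h):

/* GEN8_BCS_IRQ_SHIFT == 16, GT_RENDER_USER_INTERRUPT == bit 0,
 * GEN8_GT_CONTEXT_SWITCH_INTERRUPT == bit 8:
 *   irq_enable_mask = GT_RENDER_USER_INTERRUPT << 16          -> bit 16
 *   irq_keep_mask   = GEN8_GT_CONTEXT_SWITCH_INTERRUPT << 16  -> bit 24
 *
 * gen8_ring_put_irq() writes IMR = ~irq_keep_mask, so the user
 * interrupt (bit 16) gets masked while context switch events (bit 24)
 * stay unmasked. */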

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>

v2: Disable HWSTAM, as suggested by Damien (nobody listens to these interrupts,
anyway).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c         | 27 ++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_reg.h         |  2 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 36 ++++++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 4 files changed, 45 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 873ae50..a28cf6c 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1300,7 +1300,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 				       struct drm_i915_private *dev_priv,
 				       u32 master_ctl)
 {
-	u32 rcs, bcs, vcs;
+	u32 rcs, bcs, vcs, vecs;
 	uint32_t tmp = 0;
 	irqreturn_t ret = IRQ_NONE;
 
@@ -1314,6 +1314,8 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 				notify_ring(dev, &dev_priv->ring[RCS]);
 			if (bcs & GT_RENDER_USER_INTERRUPT)
 				notify_ring(dev, &dev_priv->ring[BCS]);
+			if ((rcs | bcs) & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
 			I915_WRITE(GEN8_GT_IIR(0), tmp);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT0)!\n");
@@ -1326,9 +1328,13 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 			vcs = tmp >> GEN8_VCS1_IRQ_SHIFT;
 			if (vcs & GT_RENDER_USER_INTERRUPT)
 				notify_ring(dev, &dev_priv->ring[VCS]);
+			if (vcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
 			vcs = tmp >> GEN8_VCS2_IRQ_SHIFT;
 			if (vcs & GT_RENDER_USER_INTERRUPT)
 				notify_ring(dev, &dev_priv->ring[VCS2]);
+			if (vcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
 			I915_WRITE(GEN8_GT_IIR(1), tmp);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT1)!\n");
@@ -1338,9 +1344,11 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 		tmp = I915_READ(GEN8_GT_IIR(3));
 		if (tmp) {
 			ret = IRQ_HANDLED;
-			vcs = tmp >> GEN8_VECS_IRQ_SHIFT;
-			if (vcs & GT_RENDER_USER_INTERRUPT)
+			vecs = tmp >> GEN8_VECS_IRQ_SHIFT;
+			if (vecs & GT_RENDER_USER_INTERRUPT)
 				notify_ring(dev, &dev_priv->ring[VECS]);
+			if (vecs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
 			I915_WRITE(GEN8_GT_IIR(3), tmp);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT3)!\n");
@@ -3243,12 +3251,17 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 	/* These are interrupts we'll toggle with the ring mask register */
 	uint32_t gt_interrupts[] = {
 		GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
-			GT_RENDER_L3_PARITY_ERROR_INTERRUPT |
-			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT,
+		GT_RENDER_L3_PARITY_ERROR_INTERRUPT |
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
+		GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT |
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT,
 		GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
-			GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT,
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
+		GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT |
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT,
 		0,
-		GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT
+		GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT |
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT,
 		};
 
 	for (i = 0; i < ARRAY_SIZE(gt_interrupts); i++)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2e76ec0..97a51f8 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -823,6 +823,7 @@ enum punit_power_well {
 #define RING_ACTHD_UDW(base)	((base)+0x5c)
 #define RING_NOPID(base)	((base)+0x94)
 #define RING_IMR(base)		((base)+0xa8)
+#define RING_HWSTAM(base)	((base)+0x98)
 #define RING_TIMESTAMP(base)	((base)+0x358)
 #define   TAIL_ADDR		0x001FFFF8
 #define   HEAD_WRAP_COUNT	0xFFE00000
@@ -4269,6 +4270,7 @@ enum punit_power_well {
 #define GEN8_GT_IMR(which) (0x44304 + (0x10 * (which)))
 #define GEN8_GT_IIR(which) (0x44308 + (0x10 * (which)))
 #define GEN8_GT_IER(which) (0x4430c + (0x10 * (which)))
+#define   GEN8_GT_CONTEXT_SWITCH_INTERRUPT	(1<<8)
 
 #define GEN8_BCS_IRQ_SHIFT 16
 #define GEN8_RCS_IRQ_SHIFT 0
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d38d824..847fec5 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -567,6 +567,12 @@ out:
 
 static int init_ring_common_lrc(struct intel_engine *ring)
 {
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
+	I915_WRITE(RING_HWSTAM(ring->mmio_base), 0xffffffff);
+
 	return 0;
 }
 
@@ -1288,13 +1294,7 @@ gen8_ring_get_irq(struct intel_engine *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (ring->irq_refcount++ == 0) {
-		if (HAS_L3_DPF(dev) && ring->id == RCS) {
-			I915_WRITE_IMR(ring,
-				       ~(ring->irq_enable_mask |
-					 GT_RENDER_L3_PARITY_ERROR_INTERRUPT));
-		} else {
-			I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
-		}
+		I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
 		POSTING_READ(RING_IMR(ring->mmio_base));
 	}
 	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
@@ -1311,12 +1311,7 @@ gen8_ring_put_irq(struct intel_engine *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (--ring->irq_refcount == 0) {
-		if (HAS_L3_DPF(dev) && ring->id == RCS) {
-			I915_WRITE_IMR(ring,
-				       ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
-		} else {
-			I915_WRITE_IMR(ring, ~0);
-		}
+		I915_WRITE_IMR(ring, ~ring->irq_keep_mask);
 		POSTING_READ(RING_IMR(ring->mmio_base));
 	}
 	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
@@ -2108,15 +2103,20 @@ int intel_init_render_ring(struct drm_device *dev)
 				ring->submit = gen8_submit_ctx;
 				ring->init = init_render_ring_lrc;
 				ring->add_request = gen8_add_request_lrc;
+				ring->irq_keep_mask =
+					GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT;
 			}
 			ring->flush = gen8_render_ring_flush;
 			ring->irq_get = gen8_ring_get_irq;
 			ring->irq_put = gen8_ring_put_irq;
+			if (HAS_L3_DPF(dev))
+				ring->irq_keep_mask |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
 		} else {
 			ring->irq_get = gen6_ring_get_irq;
 			ring->irq_put = gen6_ring_put_irq;
 		}
-		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
+		ring->irq_enable_mask =
+			GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		ring->semaphore.sync_to = gen6_ring_sync;
@@ -2286,6 +2286,8 @@ int intel_init_bsd_ring(struct drm_device *dev)
 				ring->submit = gen8_submit_ctx;
 				ring->init = init_ring_common_lrc;
 				ring->add_request = gen8_nonrender_add_request_lrc;
+				ring->irq_keep_mask =
+					GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 			}
 			ring->flush = gen8_ring_flush;
 			ring->irq_enable_mask =
@@ -2358,6 +2360,8 @@ int intel_init_bsd2_ring(struct drm_device *dev)
 		ring->submit = gen8_submit_ctx;
 		ring->add_request = gen8_nonrender_add_request_lrc;
 		ring->init = init_ring_common_lrc;
+		ring->irq_keep_mask =
+			GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
 	} else {
 		ring->submit = ring_write_tail;
 		ring->add_request = gen6_add_request;
@@ -2408,6 +2412,8 @@ int intel_init_blt_ring(struct drm_device *dev)
 			ring->submit = gen8_submit_ctx;
 			ring->init = init_ring_common_lrc;
 			ring->add_request = gen8_nonrender_add_request_lrc;
+			ring->irq_keep_mask =
+				GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 		}
 		ring->flush = gen8_ring_flush;
 		ring->irq_enable_mask =
@@ -2460,6 +2466,8 @@ int intel_init_vebox_ring(struct drm_device *dev)
 			ring->submit = gen8_submit_ctx;
 			ring->init = init_ring_common_lrc;
 			ring->add_request = gen8_nonrender_add_request_lrc;
+			ring->irq_keep_mask =
+				GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 		}
 		ring->flush = gen8_ring_flush;
 		ring->irq_enable_mask =
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c224fdc..709b1f1 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -96,6 +96,7 @@ struct intel_engine {
 
 	unsigned irq_refcount; /* protected by dev_priv->irq_lock */
 	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
+	u32		irq_keep_mask;		/* bitmask for interrupts that should not be masked */
 	u32		trace_irq_seqno;
 	bool __must_check (*irq_get)(struct intel_engine *ring);
 	void		(*irq_put)(struct intel_engine *ring);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 39/50] drm/i915/bdw: Get prepared for a two-stage execlist submit process
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (37 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 38/50] drm/i915/bdw: LR context switch interrupts oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 40/50] drm/i915/bdw: Handle context switch events oscar.mateo
                   ` (12 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Michel Thierry <michel.thierry@intel.com>

A context switch (and execlist submission) should happen only when
other contexts are not active; otherwise, pre-emption occurs.

To ensure this, we place context switch requests in a queue, and those
requests are later consumed when the corresponding context switch
interrupt is received.
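
In outline, the two stages look as follows (stage 2 lands in the next
patch, so that half is a forward reference rather than code in this
one):

/* Stage 1 - submission (gen8_switch_context_queue below):
 *   append the request to ring->execlist_queue under execlist_lock;
 *   if the queue was empty, the ELSP is idle, so unqueue right away.
 *
 * Stage 2 - completion (patch 40):
 *   on a context switch interrupt, retire the completed request(s)
 *   and call gen8_switch_context_unqueue() to submit the next pair. */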

Signed-off-by: Michel Thierry <michel.thierry@intel.com>

v2: Use a spinlock, do not remove the requests on unqueue (wait for
context switch completion).

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>

v3: Several rebases and code changes. Use unique ID.

v4:
- Move the queue/lock init to the late ring initialization.
- Damien's kmalloc review comments: check return, use sizeof(*req),
do not cast.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  6 ++++
 drivers/gpu/drm/i915/intel_lrc.c        | 57 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  3 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
 4 files changed, 69 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6b39fed..f2aae6a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1745,6 +1745,9 @@ struct drm_i915_gem_request {
 	struct drm_i915_file_private *file_priv;
 	/** file_priv list entry for this request */
 	struct list_head client_list;
+
+	/** execlist queue entry for this request */
+	struct list_head execlist_link;
 };
 
 struct drm_i915_file_private {
@@ -2443,6 +2446,9 @@ static inline u32 intel_get_lr_contextid(struct drm_i915_gem_object *ctx_obj)
 	 * (which leaves one HwCtxId bit free) */
 	return lrca >> 13;
 }
+int gen8_switch_context_queue(struct intel_engine *ring,
+			      struct i915_hw_context *to,
+			      u32 tail);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b06098e..6da7db9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -199,6 +199,63 @@ static int gen8_switch_context(struct intel_engine *ring,
 	return 0;
 }
 
+static void gen8_switch_context_unqueue(struct intel_engine *ring)
+{
+	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
+	struct drm_i915_gem_request *cursor = NULL, *tmp = NULL;
+
+	if (list_empty(&ring->execlist_queue))
+		return;
+
+	/* Try to read in pairs */
+	list_for_each_entry_safe(cursor, tmp, &ring->execlist_queue, execlist_link) {
+		if (!req0)
+			req0 = cursor;
+		else if (req0->ctx == cursor->ctx) {
+			/* Same ctx: ignore first request, as second request
+			 * will update tail past first request's workload */
+			list_del(&req0->execlist_link);
+			i915_gem_context_unreference(req0->ctx);
+			kfree(req0);
+			req0 = cursor;
+		} else {
+			req1 = cursor;
+			break;
+		}
+	}
+
+	BUG_ON(gen8_switch_context(ring, req0->ctx, req0->tail,
+			req1? req1->ctx : NULL, req1? req1->tail : 0));
+}
+
+int gen8_switch_context_queue(struct intel_engine *ring,
+			      struct i915_hw_context *to,
+			      u32 tail)
+{
+	struct drm_i915_gem_request *req = NULL;
+	unsigned long flags;
+	bool was_empty;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (req == NULL)
+		return -ENOMEM;
+	req->ring = ring;
+	req->ctx = to;
+	i915_gem_context_reference(req->ctx);
+	req->tail = tail;
+
+	spin_lock_irqsave(&ring->execlist_lock, flags);
+
+	was_empty = list_empty(&ring->execlist_queue);
+	list_add_tail(&req->execlist_link, &ring->execlist_queue);
+	if (was_empty)
+		gen8_switch_context_unqueue(ring);
+
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
+
+	return 0;
+}
+
 struct i915_hw_context *
 gen8_gem_validate_context(struct drm_device *dev, struct drm_file *file,
 			  struct intel_engine *ring, const u32 ctx_id)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 847fec5..35ced7c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1554,6 +1554,9 @@ static int intel_init_ring(struct drm_device *dev,
 
 	init_waitqueue_head(&ring->irq_queue);
 
+	INIT_LIST_HEAD(&ring->execlist_queue);
+	spin_lock_init(&ring->execlist_lock);
+
 	if (dev_priv->lrc_enabled) {
 		struct drm_i915_gem_object *obj;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 709b1f1..daf91de 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -176,6 +176,9 @@ struct intel_engine {
 
 	wait_queue_head_t irq_queue;
 
+	spinlock_t execlist_lock;
+	struct list_head execlist_queue;
+
 	struct i915_hw_context *default_context;
 	struct i915_hw_context *last_context;
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 40/50] drm/i915/bdw: Handle context switch events
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (38 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 39/50] drm/i915/bdw: Get prepared for a two-stage execlist submit process oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-06-11 11:52   ` Daniel Vetter
  2014-05-09 12:09 ` [PATCH 41/50] drm/i915/bdw: Start queueing contexts to be submitted oscar.mateo
                   ` (11 subsequent siblings)
  51 siblings, 1 reply; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Thomas Daniel <thomas.daniel@intel.com>

Handle all context status events in the context status buffer on every
context switch interrupt. We only remove work from the execlist queue
after a context status buffer reports that it has completed and we only
attempt to schedule new contexts on interrupt when a previously submitted
context completes (unless no contexts are queued, which means the GPU is
free).
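
A minimal sketch of the status buffer pointer arithmetic may help (illustrative
only: the name and signature are made up, the real code is in the diff below):

static void csb_walk_sketch(u8 *next_read, u32 status_pointer)
{
	u8 read_pointer = *next_read;
	u8 write_pointer = status_pointer & 0x07;	/* HW write offset */

	/* The CSB is a six-entry ring: if the hardware write pointer is
	 * behind our read pointer, it has wrapped, so unwrap it. */
	if (read_pointer > write_pointer)
		write_pointer += 6;

	while (read_pointer < write_pointer) {
		read_pointer++;
		/* Each entry is two dwords (status, context ID), found at
		 * (read_pointer % 6) * 8 bytes from the CSB base. */
	}

	*next_read = write_pointer % 6;	/* remember where we stopped */
}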

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>

v2: Unreferencing the context when we are freeing the request might free
the backing bo, which requires the struct_mutex to be grabbed, so defer
unreferencing and freeing to a bottom half.

v3:
- Ack the interrupt immediately, before trying to handle it (fix for
missing interrupts by Bob Beckett <robert.beckett@intel.com>).
- Update the Context Status Buffer Read Pointer, just in case (spotted
by Damien Lespiau).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |   3 +
 drivers/gpu/drm/i915/i915_irq.c         |  38 +++++++-----
 drivers/gpu/drm/i915/intel_lrc.c        | 102 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.c |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
 5 files changed, 129 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f2aae6a..07b8bdc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1748,6 +1748,8 @@ struct drm_i915_gem_request {
 
 	/** execlist queue entry for this request */
 	struct list_head execlist_link;
+	/** Struct to handle this request in the bottom half of an interrupt */
+	struct work_struct work;
 };
 
 struct drm_i915_file_private {
@@ -2449,6 +2451,7 @@ static inline u32 intel_get_lr_contextid(struct drm_i915_gem_object *ctx_obj)
 int gen8_switch_context_queue(struct intel_engine *ring,
 			      struct i915_hw_context *to,
 			      u32 tail);
+void gen8_handle_context_events(struct intel_engine *ring);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index a28cf6c..fbffead 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1300,6 +1300,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 				       struct drm_i915_private *dev_priv,
 				       u32 master_ctl)
 {
+	struct intel_engine *ring;
 	u32 rcs, bcs, vcs, vecs;
 	uint32_t tmp = 0;
 	irqreturn_t ret = IRQ_NONE;
@@ -1307,16 +1308,22 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 	if (master_ctl & (GEN8_GT_RCS_IRQ | GEN8_GT_BCS_IRQ)) {
 		tmp = I915_READ(GEN8_GT_IIR(0));
 		if (tmp) {
+			I915_WRITE(GEN8_GT_IIR(0), tmp);
 			ret = IRQ_HANDLED;
+
 			rcs = tmp >> GEN8_RCS_IRQ_SHIFT;
-			bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
+			ring = &dev_priv->ring[RCS];
 			if (rcs & GT_RENDER_USER_INTERRUPT)
-				notify_ring(dev, &dev_priv->ring[RCS]);
+				notify_ring(dev, ring);
+			if (rcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+				gen8_handle_context_events(ring);
+
+			bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
+			ring = &dev_priv->ring[BCS];
 			if (bcs & GT_RENDER_USER_INTERRUPT)
-				notify_ring(dev, &dev_priv->ring[BCS]);
-			if ((rcs | bcs) & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
-			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
-			I915_WRITE(GEN8_GT_IIR(0), tmp);
+				notify_ring(dev, ring);
+			if (bcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+				gen8_handle_context_events(ring);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT0)!\n");
 	}
@@ -1324,18 +1331,20 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 	if (master_ctl & (GEN8_GT_VCS1_IRQ | GEN8_GT_VCS2_IRQ)) {
 		tmp = I915_READ(GEN8_GT_IIR(1));
 		if (tmp) {
+			I915_WRITE(GEN8_GT_IIR(1), tmp);
 			ret = IRQ_HANDLED;
 			vcs = tmp >> GEN8_VCS1_IRQ_SHIFT;
+			ring = &dev_priv->ring[VCS];
 			if (vcs & GT_RENDER_USER_INTERRUPT)
-				notify_ring(dev, &dev_priv->ring[VCS]);
+				notify_ring(dev, ring);
 			if (vcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
-			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
+			     gen8_handle_context_events(ring);
 			vcs = tmp >> GEN8_VCS2_IRQ_SHIFT;
+			ring = &dev_priv->ring[VCS2];
 			if (vcs & GT_RENDER_USER_INTERRUPT)
-				notify_ring(dev, &dev_priv->ring[VCS2]);
+				notify_ring(dev, ring);
 			if (vcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
-			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
-			I915_WRITE(GEN8_GT_IIR(1), tmp);
+			     gen8_handle_context_events(ring);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT1)!\n");
 	}
@@ -1343,13 +1352,14 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 	if (master_ctl & GEN8_GT_VECS_IRQ) {
 		tmp = I915_READ(GEN8_GT_IIR(3));
 		if (tmp) {
+			I915_WRITE(GEN8_GT_IIR(3), tmp);
 			ret = IRQ_HANDLED;
 			vecs = tmp >> GEN8_VECS_IRQ_SHIFT;
+			ring = &dev_priv->ring[VECS];
 			if (vecs & GT_RENDER_USER_INTERRUPT)
-				notify_ring(dev, &dev_priv->ring[VECS]);
+				notify_ring(dev, ring);
 			if (vecs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
-			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
-			I915_WRITE(GEN8_GT_IIR(3), tmp);
+				gen8_handle_context_events(ring);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT3)!\n");
 	}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6da7db9..1ff493a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -49,6 +49,22 @@
 #define RING_ELSP(ring)			((ring)->mmio_base+0x230)
 #define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
 #define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
+#define RING_CONTEXT_STATUS_BUF(ring)	((ring)->mmio_base+0x370)
+#define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
+
+#define RING_EXECLIST_QFULL		(1 << 0x2)
+#define RING_EXECLIST1_VALID		(1 << 0x3)
+#define RING_EXECLIST0_VALID		(1 << 0x4)
+#define RING_EXECLIST_ACTIVE_STATUS	(3 << 0xE)
+#define RING_EXECLIST1_ACTIVE		(1 << 0x11)
+#define RING_EXECLIST0_ACTIVE		(1 << 0x12)
+
+#define GEN8_CTX_STATUS_IDLE_ACTIVE	(1 << 0)
+#define GEN8_CTX_STATUS_PREEMPTED	(1 << 1)
+#define GEN8_CTX_STATUS_ELEMENT_SWITCH	(1 << 2)
+#define GEN8_CTX_STATUS_ACTIVE_IDLE	(1 << 3)
+#define GEN8_CTX_STATUS_COMPLETE	(1 << 4)
+#define GEN8_CTX_STATUS_LITE_RESTORE	(1 << 15)
 
 #define CTX_LRI_HEADER_0		0x01
 #define CTX_CONTEXT_CONTROL		0x02
@@ -203,6 +219,9 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
 {
 	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
 	struct drm_i915_gem_request *cursor = NULL, *tmp = NULL;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
+	assert_spin_locked(&ring->execlist_lock);
 
 	if (list_empty(&ring->execlist_queue))
 		return;
@@ -215,8 +234,7 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
 			/* Same ctx: ignore first request, as second request
 			 * will update tail past first request's workload */
 			list_del(&req0->execlist_link);
-			i915_gem_context_unreference(req0->ctx);
-			kfree(req0);
+			queue_work(dev_priv->wq, &req0->work);
 			req0 = cursor;
 		} else {
 			req1 = cursor;
@@ -228,6 +246,85 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
 			req1? req1->ctx : NULL, req1? req1->tail : 0));
 }
 
+static bool check_remove_request(struct intel_engine *ring, u32 request_id)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_request *head_req;
+
+	assert_spin_locked(&ring->execlist_lock);
+
+	head_req = list_first_entry_or_null(&ring->execlist_queue,
+			struct drm_i915_gem_request, execlist_link);
+	if (head_req != NULL) {
+		struct drm_i915_gem_object *ctx_obj =
+				head_req->ctx->engine[ring->id].obj;
+		if (intel_get_lr_contextid(ctx_obj) == request_id) {
+			list_del(&head_req->execlist_link);
+			queue_work(dev_priv->wq, &head_req->work);
+			return true;
+		}
+	}
+
+	return false;
+}
+
+void gen8_handle_context_events(struct intel_engine *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	u32 status_pointer;
+	u8 read_pointer;
+	u8 write_pointer;
+	u32 status;
+	u32 status_id;
+	u32 submit_contexts = 0;
+
+	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
+
+	read_pointer = ring->next_context_status_buffer;
+	write_pointer = status_pointer & 0x07;
+	if (read_pointer > write_pointer)
+		write_pointer += 6;
+
+	spin_lock(&ring->execlist_lock);
+
+	while (read_pointer < write_pointer) {
+		read_pointer++;
+		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
+				(read_pointer % 6) * 8);
+		status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
+				(read_pointer % 6) * 8 + 4);
+
+		if (status & GEN8_CTX_STATUS_COMPLETE) {
+			if (check_remove_request(ring, status_id))
+				submit_contexts++;
+		}
+	}
+
+	if (submit_contexts != 0)
+		gen8_switch_context_unqueue(ring);
+
+	spin_unlock(&ring->execlist_lock);
+
+	WARN(submit_contexts > 2, "More than two context complete events?\n");
+	ring->next_context_status_buffer = write_pointer % 6;
+
+	I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
+			((u32)ring->next_context_status_buffer & 0x07) << 8);
+}
+
+static void free_request_task(struct work_struct *work)
+{
+	struct drm_i915_gem_request *req =
+			container_of(work, struct drm_i915_gem_request, work);
+	struct drm_device *dev = req->ring->dev;
+
+	mutex_lock(&dev->struct_mutex);
+	i915_gem_context_unreference(req->ctx);
+	mutex_unlock(&dev->struct_mutex);
+
+	kfree(req);
+}
+
 int gen8_switch_context_queue(struct intel_engine *ring,
 			      struct i915_hw_context *to,
 			      u32 tail)
@@ -243,6 +340,7 @@ int gen8_switch_context_queue(struct intel_engine *ring,
 	req->ctx = to;
 	i915_gem_context_reference(req->ctx);
 	req->tail = tail;
+	INIT_WORK(&req->work, free_request_task);
 
 	spin_lock_irqsave(&ring->execlist_lock, flags);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 35ced7c..9cd6ee8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1573,6 +1573,7 @@ static int intel_init_ring(struct drm_device *dev,
 		if (ring->status_page.page_addr == NULL)
 			return -ENOMEM;
 		ring->status_page.obj = obj;
+		ring->next_context_status_buffer = 0;
 	} else if (I915_NEED_GFX_HWS(dev)) {
 		ret = init_status_page(ring);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index daf91de..f3ae547 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -178,6 +178,7 @@ struct intel_engine {
 
 	spinlock_t execlist_lock;
 	struct list_head execlist_queue;
+	u8 next_context_status_buffer;
 
 	struct i915_hw_context *default_context;
 	struct i915_hw_context *last_context;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 41/50] drm/i915/bdw: Start queueing contexts to be submitted
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (39 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 40/50] drm/i915/bdw: Handle context switch events oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 42/50] drm/i915/bdw: Display execlists info in debugfs oscar.mateo
                   ` (10 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Finally, start queueing requests on ring->submit. Also, remove the
remaining legacy context switches.
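
A minimal sketch of the dispatch this enables (illustrative: it assumes both
hooks share the same signature after the earlier write_tail -> submit rename;
the real wiring lives elsewhere in the series):

	/* With Execlists enabled, committing a request queues a context
	 * switch instead of poking the legacy RING_TAIL register. */
	if (dev_priv->lrc_enabled)
		ring->submit = gen8_submit_ctx;
	else
		ring->submit = ring_write_tail;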

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c            |  9 ++++++---
 drivers/gpu/drm/i915/i915_gem_context.c    | 10 ++++++----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  8 +++++---
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  5 ++++-
 4 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f9ed89e..e2d2edb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2813,9 +2813,12 @@ int i915_gpu_idle(struct drm_device *dev)
 
 	/* Flush everything onto the inactive list. */
 	for_each_active_ring(ring, dev_priv, i) {
-		ret = i915_switch_context(ring, ring->default_context);
-		if (ret)
-			return ret;
+		if (!dev_priv->lrc_enabled) {
+			ret = i915_switch_context(ring,
+					ring->default_context);
+			if (ret)
+				return ret;
+		}
 
 		ret = intel_ring_idle(ring);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index d4c6863..bf6264a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -532,10 +532,12 @@ int i915_gem_context_enable(struct drm_i915_private *dev_priv)
 
 	BUG_ON(!dev_priv->ring[RCS].default_context);
 
-	for_each_active_ring(ring, dev_priv, i) {
-		ret = i915_switch_context(ring, ring->default_context);
-		if (ret)
-			return ret;
+	if (!dev_priv->lrc_enabled) {
+		for_each_active_ring(ring, dev_priv, i) {
+			ret = i915_switch_context(ring, ring->default_context);
+			if (ret)
+				return ret;
+		}
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f7dad8c..9d17bd8 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1288,9 +1288,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	if (ret)
 		goto err;
 
-	ret = i915_switch_context(ring, ctx);
-	if (ret)
-		goto err;
+	if (!dev_priv->lrc_enabled) {
+		ret = i915_switch_context(ring, ctx);
+		if (ret)
+			goto err;
+	}
 
 	if (ring == &dev_priv->ring[RCS] &&
 	    mode != dev_priv->relative_constants_mode) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 9cd6ee8..94c1716 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -433,7 +433,10 @@ static void ring_write_tail(struct intel_engine *ring,
 static void gen8_submit_ctx(struct intel_engine *ring,
 			    struct i915_hw_context *ctx, u32 value)
 {
-	DRM_ERROR("Execlists still not ready!\n");
+	if (WARN_ON(ctx == NULL))
+		ctx = ring->default_context;
+
+	gen8_switch_context_queue(ring, ctx, value);
 }
 
 u64 intel_ring_get_active_head(struct intel_engine *ring)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 42/50] drm/i915/bdw: Display execlists info in debugfs
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (40 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 41/50] drm/i915/bdw: Start queueing contexts to be submitted oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 43/50] drm/i915/bdw: Display context backing obj & ringbuffer " oscar.mateo
                   ` (9 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

v2: Warn and return if LRCs are not enabled.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 74 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h     |  7 ++++
 drivers/gpu/drm/i915/intel_lrc.c    |  6 ---
 3 files changed, 81 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 204b432..cc212df 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1716,6 +1716,79 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	return 0;
 }
 
+static int i915_execlists(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine *ring;
+	u32 status_pointer;
+	u8 read_pointer;
+	u8 write_pointer;
+	u32 status;
+	u32 ctx_id;
+	struct list_head *cursor;
+	struct drm_i915_gem_request *head_req;
+	int ring_id, i;
+
+	if (!dev_priv->lrc_enabled) {
+		seq_printf(m, "Logical Ring Contexts are disabled\n");
+		return 0;
+	}
+
+	for_each_active_ring(ring, dev_priv, ring_id) {
+		int count = 0;
+
+		seq_printf(m, "%s\n", ring->name);
+
+		status = I915_READ(RING_EXECLIST_STATUS(ring));
+		ctx_id = I915_READ(RING_EXECLIST_STATUS(ring) + 4);
+		seq_printf(m, "\tExeclist status: 0x%08X, context: %u\n",
+				status, ctx_id);
+
+		status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
+		seq_printf(m, "\tStatus pointer: 0x%08X\n", status_pointer);
+
+		read_pointer = ring->next_context_status_buffer;
+		write_pointer = status_pointer & 0x07;
+		if (read_pointer > write_pointer)
+			write_pointer += 6;
+		seq_printf(m, "\tRead pointer: 0x%08X, write pointer 0x%08X\n",
+				read_pointer, write_pointer);
+
+		for (i = 0; i < 6; i++) {
+			status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) + 8*i);
+			ctx_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) + 8*i + 4);
+
+			seq_printf(m, "\tStatus buffer %d: 0x%08X, context: %u\n",
+					i, status, ctx_id);
+		}
+
+		list_for_each(cursor, &ring->execlist_queue) {
+			count++;
+		}
+		seq_printf(m, "\t%d requests in queue\n", count);
+
+		if (count > 0) {
+			struct drm_i915_gem_object *ctx_obj;
+
+			head_req = list_first_entry(&ring->execlist_queue,
+					struct drm_i915_gem_request, execlist_link);
+
+			ctx_obj = head_req->ctx->engine[ring_id].obj;
+			seq_printf(m, "\tHead request id: %u\n",
+					intel_get_lr_contextid(ctx_obj));
+			seq_printf(m, "\tHead request seqno: %u\n", head_req->seqno);
+			seq_printf(m, "\tHead request tail: %u\n", head_req->tail);
+
+		}
+
+		seq_putc(m, '\n');
+	}
+
+	return 0;
+}
+
 static int i915_gen6_forcewake_count_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
@@ -3771,6 +3844,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_opregion", i915_opregion, 0},
 	{"i915_gem_framebuffer", i915_gem_framebuffer_info, 0},
 	{"i915_context_status", i915_context_status, 0},
+	{"i915_execlists", i915_execlists, 0},
 	{"i915_gen6_forcewake_count", i915_gen6_forcewake_count_info, 0},
 	{"i915_swizzle_info", i915_swizzle_info, 0},
 	{"i915_ppgtt_info", i915_ppgtt_info, 0},
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 97a51f8..ab3a650 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -116,6 +116,13 @@
 #define GEN8_RING_PDP_UDW(ring, n)	((ring)->mmio_base+0x270 + ((n) * 8 + 4))
 #define GEN8_RING_PDP_LDW(ring, n)	((ring)->mmio_base+0x270 + (n) * 8)
 
+/* Execlists regs */
+#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
+#define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
+#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
+#define RING_CONTEXT_STATUS_BUF(ring)	((ring)->mmio_base+0x370)
+#define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
+
 #define GAM_ECOCHK			0x4090
 #define   ECOCHK_SNB_BIT		(1<<10)
 #define   HSW_ECOCHK_ARB_PRIO_SOL	(1<<6)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1ff493a..dc1ab25 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -46,12 +46,6 @@
 
 #define GEN8_LR_CONTEXT_ALIGN 4096
 
-#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
-#define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
-#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
-#define RING_CONTEXT_STATUS_BUF(ring)	((ring)->mmio_base+0x370)
-#define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
-
 #define RING_EXECLIST_QFULL		(1 << 0x2)
 #define RING_EXECLIST1_VALID		(1 << 0x3)
 #define RING_EXECLIST0_VALID		(1 << 0x4)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 43/50] drm/i915/bdw: Display context backing obj & ringbuffer info in debugfs
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (41 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 42/50] drm/i915/bdw: Display execlists info in debugfs oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 44/50] drm/i915/bdw: Print context state " oscar.mateo
                   ` (8 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index cc212df..c99a872 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1671,6 +1671,12 @@ static int i915_gem_framebuffer_info(struct seq_file *m, void *data)
 
 	return 0;
 }
+static void describe_ctx_ringbuf(struct seq_file *m, struct intel_ringbuffer *ringbuf)
+{
+	seq_printf(m, " (ringbuffer, space: %d, head: %u, tail: %u, last head: %d)",
+			ringbuf->space, ringbuf->head, ringbuf->tail,
+			ringbuf->last_retired_head);
+}
 
 static int i915_context_status(struct seq_file *m, void *unused)
 {
@@ -1698,7 +1704,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	}
 
 	list_for_each_entry(ctx, &dev_priv->context_list, link) {
-		if (ctx->engine[RCS].obj == NULL)
+		if (!dev_priv->lrc_enabled && ctx->engine[RCS].obj == NULL)
 			continue;
 
 		seq_puts(m, "HW context ");
@@ -1707,7 +1713,23 @@ static int i915_context_status(struct seq_file *m, void *unused)
 			if (ring->default_context == ctx)
 				seq_printf(m, "(default context %s) ", ring->name);
 
-		describe_obj(m, ctx->engine[RCS].obj);
+		if (dev_priv->lrc_enabled) {
+			seq_putc(m, '\n');
+			for_each_active_ring(ring, dev_priv, i) {
+				struct drm_i915_gem_object *ctx_obj = ctx->engine[i].obj;
+				struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
+
+				seq_printf(m, "%s: ", ring->name);
+				if (ctx_obj)
+					describe_obj(m, ctx_obj);
+				if (ringbuf)
+					describe_ctx_ringbuf(m, ringbuf);
+				seq_putc(m, '\n');
+			}
+		} else {
+			describe_obj(m, ctx->engine[RCS].obj);
+		}
+
 		seq_putc(m, '\n');
 	}
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 44/50] drm/i915/bdw: Print context state in debugfs
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (42 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 43/50] drm/i915/bdw: Display context backing obj & ringbuffer " oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 45/50] drm/i915/bdw: Document execlists and logical ring contexts oscar.mateo
                   ` (7 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

This has turned out to be really handy in debug so far.

Update:
Since writing this patch, I've gotten similar code upstream for error
state. I've used it quite a bit in debugfs, however, and I'd like to keep
it here at least until preemption is working.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

This patch was accidentally dropped in the first Execlists version, and
it has been very useful indeed. Put it back again, but as a standalone
debugfs file.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 52 +++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c99a872..7f661bf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1738,6 +1738,57 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	return 0;
 }
 
+static int i915_dump_lrc(struct seq_file *m, void *unused)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine *ring;
+	struct i915_hw_context *ctx;
+	int ret, i;
+
+	if (!dev_priv->lrc_enabled) {
+		seq_printf(m, "Logical Ring Contexts are disabled\n");
+		return 0;
+	}
+
+	ret = mutex_lock_interruptible(&dev->mode_config.mutex);
+	if (ret)
+		return ret;
+
+	list_for_each_entry(ctx, &dev_priv->context_list, link) {
+		for_each_active_ring(ring, dev_priv, i) {
+			struct drm_i915_gem_object *ctx_obj = ctx->engine[i].obj;
+
+			if (ring->default_context == ctx)
+				continue;
+
+			if (ctx_obj) {
+				struct page *page = i915_gem_object_get_page(ctx_obj, 1);
+				uint32_t *reg_state = kmap_atomic(page);
+				int j;
+
+				seq_printf(m, "CONTEXT: %s %u\n", ring->name,
+						intel_get_lr_contextid(ctx_obj));
+
+				for (j = 0; j < 0x600 / sizeof(u32) / 4; j += 4) {
+					seq_printf(m, "\t[0x%08lx] 0x%08x 0x%08x 0x%08x 0x%08x\n",
+					i915_gem_obj_ggtt_offset(ctx_obj) + 4096 + (j * 4),
+					reg_state[j], reg_state[j + 1],
+					reg_state[j + 2], reg_state[j + 3]);
+				}
+				kunmap_atomic(reg_state);
+
+				seq_putc(m, '\n');
+			}
+		}
+	}
+
+	mutex_unlock(&dev->mode_config.mutex);
+
+	return 0;
+}
+
 static int i915_execlists(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
@@ -3866,6 +3917,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_opregion", i915_opregion, 0},
 	{"i915_gem_framebuffer", i915_gem_framebuffer_info, 0},
 	{"i915_context_status", i915_context_status, 0},
+	{"i915_dump_lrc", i915_dump_lrc, 0},
 	{"i915_execlists", i915_execlists, 0},
 	{"i915_gen6_forcewake_count", i915_gen6_forcewake_count_info, 0},
 	{"i915_swizzle_info", i915_swizzle_info, 0},
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 45/50] drm/i915/bdw: Document execlists and logical ring contexts
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (43 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 44/50] drm/i915/bdw: Print context state " oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 46/50] drm/i915/bdw: Avoid non-lite-restore preemptions oscar.mateo
                   ` (6 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Oscar Mateo <oscar.mateo@intel.com>

Explain intel_lrc.c with some execlists notes

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>

v2: Add notes on logical ring context creation.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 67 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index dc1ab25..49f6c9d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -33,8 +33,75 @@
  * These expanded contexts enable a number of new abilities, especially
  * "Execlists" (also implemented in this file).
  *
+ * One of the main differences from the legacy HW contexts is that logical
+ * ring contexts incorporate many more things into the context's state, like
+ * PDPs or ringbuffer control registers.
+ *
+ * Regarding the creation of contexts, we have:
+ *
+ * - One global default context.
+ * - One local default context for each opened fd.
+ * - One local extra context for each context create ioctl call.
+ *
+ * Now that ringbuffers belong per-context (and not per-engine, like before)
+ * and that contexts are uniquely tied to a given engine (and not reusable,
+ * like before) we need:
+ *
+ * - One ringbuffer per-engine inside each context.
+ * - One backing object per-engine inside each context.
+ *
+ * The global default context starts its life with these new objects fully
+ * allocated and populated. Regarding non-global contexts, we don't know
+ * at creation time which engine is going to use them, so we have implemented
+ * a deferred creation of LR contexts: the local context starts its life as a
+ * hollow or blank holder, that gets populated for a given engine once we receive
+ * an execbuffer. If later on we receive another execbuffer ioctl for the same
+ * context but a different engine, we allocate/populate a new ringbuffer and
+ * context backing object and so on.
+ *
  * Execlists are the new method by which, on gen8+ hardware, workloads are
  * submitted for execution (as opposed to the legacy, ringbuffer-based, method).
+ * This method works as follows:
+ *
+ * When a request is committed, its commands (the BB start and any leading or
+ * trailing commands, like the seqno breadcrumbs) are placed in the ringbuffer
+ * for the appropriate context. The tail pointer in the hardware context is not
+ * updated at this time, but is instead kept by the driver in the ringbuffer
+ * structure. A structure representing this request is added to a request queue
+ * for the appropriate engine: this structure contains a copy of the context's
+ * tail after the request was written to the ring buffer and a pointer to the
+ * context itself.
+ *
+ * If the engine's request queue was empty before the request was added, the
+ * queue is processed immediately. Otherwise the queue will be processed during
+ * a context switch interrupt. In any case, elements on the queue will get sent
+ * (in pairs) to the GPU's ExecLists Submit Port (ELSP, for short) with a
+ * globally unique 20-bit submission ID.
+ *
+ * When execution of a request completes, the GPU updates the context status
+ * buffer with a context complete event and generates a context switch interrupt.
+ * During the interrupt handling, the driver examines the events in the buffer:
+ * for each context complete event, if the announced ID matches that on the head
+ * of the request queue then that request is retired and removed from the queue.
+ *
+ * After processing, if any requests were retired and the queue is not empty
+ * then a new execution list can be submitted. The two requests at the front of
+ * the queue are next to be submitted, but since a context may not occur twice in
+ * an execution list, if subsequent requests have the same ID as the first then
+ * the two requests must be combined. This is done simply by discarding requests
+ * at the head of the queue until either only one request is left (in which case
+ * we use a NULL second context) or the first two requests have unique IDs.
+ *
+ * By always executing the first two requests in the queue the driver ensures
+ * that the GPU is kept as busy as possible. In the case where a single context
+ * completes but a second context is still executing, the request for the second
+ * context will be at the head of the queue when we remove the first one. This
+ * request will then be resubmitted along with a new request for a different context,
+ * which will cause the hardware to continue executing the second request and queue
+ * the new request (the GPU detects the condition of a context getting preempted
+ * with the same context and optimizes the context switch flow by not doing
+ * preemption, but just sampling the new tail pointer).
+ *
  */
 
 #include <drm/drmP.h>
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 46/50] drm/i915/bdw: Avoid non-lite-restore preemptions
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (44 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 45/50] drm/i915/bdw: Document execlists and logical ring contexts oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 47/50] drm/i915/bdw: Make sure gpu reset still works with Execlists oscar.mateo
                   ` (5 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

In the current Execlists feeding mechanism, full preemption is not
supported yet: only lite-restores are allowed (this is: the GPU
simply samples a new tail pointer for the context currently in
execution).

But we have identified a scenario in which a full preemption occurs:
1) We submit two contexts for execution (A & B).
2) The GPU finishes with the first one (A), switches to the second one
(B) and informs us.
3) We submit B again (hoping to cause a lite restore) together with C,
but in the time we spend writing to the ELSP, the GPU finishes B.
4) The GPU starts executing B again (since we told it so).
5) We receive a B finished interrupt and, mistakenly, we submit C (again)
and D, causing a full preemption of B.

By keeping a better track of our submissions, we can avoid the scenario
described above.
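
The tracking reduces to a per-request submission count. A toy model of the
accounting (standalone sketch, not driver code):

#include <stdbool.h>

struct toy_req { int elsp_submitted; };

static void toy_submit(struct toy_req *req)
{
	req->elsp_submitted++;	/* sent (again) to the ELSP */
}

static bool toy_complete(struct toy_req *req)
{
	/* Only retire once every outstanding submission has completed:
	 * a lite restore means the same request will complete twice. */
	return --req->elsp_submitted <= 0;
}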

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  |  3 +++
 drivers/gpu/drm/i915/intel_lrc.c | 28 ++++++++++++++++++++++++----
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 07b8bdc..c797e63 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1750,6 +1750,9 @@ struct drm_i915_gem_request {
 	struct list_head execlist_link;
 	/** Struct to handle this request in the bottom half of an interrupt */
 	struct work_struct work;
+
+	/** No. of times this request has been sent to the ELSP */
+	int elsp_submitted;
 };
 
 struct drm_i915_file_private {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 49f6c9d..a13a570 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -294,6 +294,7 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
 		else if (req0->ctx == cursor->ctx) {
 			/* Same ctx: ignore first request, as second request
 			 * will update tail past first request's workload */
+			cursor->elsp_submitted = req0->elsp_submitted;
 			list_del(&req0->execlist_link);
 			queue_work(dev_priv->wq, &req0->work);
 			req0 = cursor;
@@ -303,8 +304,14 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
 		}
 	}
 
+	WARN_ON(req1 && req1->elsp_submitted);
+
 	BUG_ON(gen8_switch_context(ring, req0->ctx, req0->tail,
 			req1? req1->ctx : NULL, req1? req1->tail : 0));
+
+	req0->elsp_submitted++;
+	if (req1)
+		req1->elsp_submitted++;
 }
 
 static bool check_remove_request(struct intel_engine *ring, u32 request_id)
@@ -320,9 +327,13 @@ static bool check_remove_request(struct intel_engine *ring, u32 request_id)
 		struct drm_i915_gem_object *ctx_obj =
 				head_req->ctx->engine[ring->id].obj;
 		if (intel_get_lr_contextid(ctx_obj) == request_id) {
-			list_del(&head_req->execlist_link);
-			queue_work(dev_priv->wq, &head_req->work);
-			return true;
+			WARN(head_req->elsp_submitted == 0,
+					"Never submitted head request\n");
+			if (--head_req->elsp_submitted <= 0) {
+				list_del(&head_req->execlist_link);
+				queue_work(dev_priv->wq, &head_req->work);
+				return true;
+			}
 		}
 	}
 
@@ -355,7 +366,16 @@ void gen8_handle_context_events(struct intel_engine *ring)
 		status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
 				(read_pointer % 6) * 8 + 4);
 
-		if (status & GEN8_CTX_STATUS_COMPLETE) {
+		if (status & GEN8_CTX_STATUS_PREEMPTED) {
+			if (status & GEN8_CTX_STATUS_LITE_RESTORE) {
+				if (check_remove_request(ring, status_id))
+					WARN(1, "Lite Restored request removed from queue\n");
+			} else
+				WARN(1, "Preemption without Lite Restore\n");
+		}
+
+		 if ((status & GEN8_CTX_STATUS_ACTIVE_IDLE) ||
+		     (status & GEN8_CTX_STATUS_ELEMENT_SWITCH)) {
 			if (check_remove_request(ring, status_id))
 				submit_contexts++;
 		}
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 47/50] drm/i915/bdw: Make sure gpu reset still works with Execlists
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (45 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 46/50] drm/i915/bdw: Avoid non-lite-restore preemptions oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 48/50] drm/i915/bdw: Make sure error capture keeps working " oscar.mateo
                   ` (4 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

If we reset a ring after a hang, we have to make sure that we clear
out all queued Execlists requests and that we re-program the ring for
execution. Also, reset the hangcheck counters.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         | 13 +++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 10 +---------
 drivers/gpu/drm/i915/intel_ringbuffer.c |  8 ++++++++
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e2d2edb..4f1bb46 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2400,6 +2400,19 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
 		i915_gem_free_request(request);
 	}
 
+	if (dev_priv->lrc_enabled) {
+		while (!list_empty(&ring->execlist_queue)) {
+			struct drm_i915_gem_request *request;
+
+			request = list_first_entry(&ring->execlist_queue,
+						   struct drm_i915_gem_request,
+						   execlist_link);
+			list_del(&request->execlist_link);
+			i915_gem_context_unreference(request->ctx);
+			kfree(request);
+		}
+	}
+
 	/* These may not have been flush before the reset, do so now */
 	kfree(ring->preallocated_lazy_request);
 	ring->preallocated_lazy_request = NULL;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a13a570..d9edd10 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -700,17 +700,9 @@ int gen8_gem_context_init(struct drm_device *dev)
 		goto err_out;
 	}
 
-	for_each_ring(ring, dev_priv, ring_id) {
+	for_each_ring(ring, dev_priv, ring_id)
 		ring->default_context = ctx;
 
-		I915_WRITE(RING_MODE_GEN7(ring),
-			_MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
-			_MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
-		POSTING_READ(RING_MODE_GEN7(ring));
-		DRM_DEBUG_DRIVER("Execlists enabled for %s\n",
-				ring->name);
-	}
-
 	DRM_DEBUG_DRIVER("LR context support initialized\n");
 	return 0;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 94c1716..9c0deb2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -573,6 +573,14 @@ static int init_ring_common_lrc(struct intel_engine *ring)
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
+	I915_WRITE(RING_MODE_GEN7(ring),
+		_MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
+		_MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
+	POSTING_READ(RING_MODE_GEN7(ring));
+	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", ring->name);
+
+	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
+
 	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
 	I915_WRITE(RING_HWSTAM(ring->mmio_base), 0xffffffff);
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 48/50] drm/i915/bdw: Make sure error capture keeps working with Execlists
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (46 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 47/50] drm/i915/bdw: Make sure gpu reset still works with Execlists oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-09 12:09 ` [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler oscar.mateo
                   ` (3 subsequent siblings)
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Since the ringbuffer no longer belongs to the engine (it is per-context
now), we have to make sure that we are always recording the correct one.

TODO: This is only a small fix to keep basic error capture working, but
we need to add more information for it to be useful (e.g. dump the
context being executed).
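
The fix pivots on a per-context ringbuffer lookup. A plausible sketch of
that helper (its real definition comes from an earlier patch in the series,
so treat this shape as an assumption):

static struct intel_ringbuffer *
intel_ringbuffer_get(struct intel_engine *ring, struct i915_hw_context *ctx)
{
	/* Ringbuffers are per-context now: return the one backing this
	 * context on this engine, not a per-engine default. */
	return ctx->engine[ring->id].ringbuf;
}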

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 6724e32..31ff7e1 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -823,9 +823,6 @@ static void i915_record_ring_state(struct drm_device *dev,
 		ering->hws = I915_READ(mmio);
 	}
 
-	ering->cpu_ring_head = ring->default_ringbuf.head;
-	ering->cpu_ring_tail = ring->default_ringbuf.tail;
-
 	ering->hangcheck_score = ring->hangcheck.score;
 	ering->hangcheck_action = ring->hangcheck.action;
 
@@ -881,6 +878,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_request *request;
+	struct intel_ringbuffer *ringbuf;
 	int i, count;
 
 	for (i = 0; i < I915_NUM_RINGS; i++) {
@@ -927,8 +925,13 @@ static void i915_gem_record_rings(struct drm_device *dev,
 			}
 		}
 
+		ringbuf = intel_ringbuffer_get(ring,
+				request ? request->ctx : ring->default_context);
+		error->ring[i].cpu_ring_head = ringbuf->head;
+		error->ring[i].cpu_ring_tail = ringbuf->tail;
+
 		error->ring[i].ringbuffer =
-			i915_error_ggtt_object_create(dev_priv, ring->default_ringbuf.obj);
+			i915_error_ggtt_object_create(dev_priv, ringbuf->obj);
 
 		if (ring->status_page.obj)
 			error->ring[i].hws_page =
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (47 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 48/50] drm/i915/bdw: Make sure error capture keeps working " oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-06-11 11:50   ` Daniel Vetter
  2014-05-09 12:09 ` [PATCH 50/50] drm/i915/bdw: Enable logical ring contexts oscar.mateo
                   ` (2 subsequent siblings)
  51 siblings, 1 reply; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

If we receive a storm of requests for the same context (see gem_storedw_loop_*)
we might end up iterating over too many elements at interrupt time, looking
for contexts to squash together. Instead, share the burden by giving more
intelligence to the queue function. At most, the interrupt will iterate over
three elements: the head request, at most one same-context duplicate to
squash, and the next request with a different context.
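
Boiled down (a simplified sketch of the diff below, with locking and
allocation omitted), the queue function now does the squashing itself:

	/* If more than two requests are already queued and the new one
	 * uses the same context as the current tail, the tail request
	 * is superseded: drop it here, in process context, so the
	 * interrupt handler never walks a long same-context run. */
	if (num_elements > 2 && to == tail_req->ctx) {
		list_del(&tail_req->execlist_link);
		queue_work(dev_priv->wq, &tail_req->work);
	}
	list_add_tail(&req->execlist_link, &ring->execlist_queue);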

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d9edd10..0aad721 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -410,9 +410,11 @@ int gen8_switch_context_queue(struct intel_engine *ring,
 			      struct i915_hw_context *to,
 			      u32 tail)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct drm_i915_gem_request *req = NULL;
 	unsigned long flags;
-	bool was_empty;
+	struct drm_i915_gem_request *cursor;
+	int num_elements = 0;
 
 	req = kzalloc(sizeof(*req), GFP_KERNEL);
 	if (req == NULL)
@@ -425,9 +427,24 @@ int gen8_switch_context_queue(struct intel_engine *ring,
 
 	spin_lock_irqsave(&ring->execlist_lock, flags);
 
-	was_empty = list_empty(&ring->execlist_queue);
+	list_for_each_entry(cursor, &ring->execlist_queue, execlist_link)
+		if (++num_elements > 2)
+			break;
+
+	if (num_elements > 2) {
+		struct drm_i915_gem_request *tail_req =
+				list_last_entry(&ring->execlist_queue,
+					struct drm_i915_gem_request, execlist_link);
+		if (to == tail_req->ctx) {
+			WARN(tail_req->elsp_submitted != 0,
+					"More than 2 already-submitted reqs queued\n");
+			list_del(&tail_req->execlist_link);
+			queue_work(dev_priv->wq, &tail_req->work);
+		}
+	}
+
 	list_add_tail(&req->execlist_link, &ring->execlist_queue);
-	if (was_empty)
+	if (num_elements == 0)
 		gen8_switch_context_unqueue(ring);
 
 	spin_unlock_irqrestore(&ring->execlist_lock, flags);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 50/50] drm/i915/bdw: Enable logical ring contexts
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (48 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler oscar.mateo
@ 2014-05-09 12:09 ` oscar.mateo
  2014-05-12 17:04 ` [PATCH 49.1/50] drm/i915/bdw: Do not call intel_runtime_pm_get() in an interrupt oscar.mateo
  2014-05-13 13:48 ` [PATCH 00/50] Execlists v2 Daniel Vetter
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-09 12:09 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

The time has come, the Walrus said, to talk of many things.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c797e63..969962c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1937,7 +1937,7 @@ struct drm_i915_cmd_table {
 #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
 
 #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
-#define HAS_LOGICAL_RING_CONTEXTS(dev)	0
+#define HAS_LOGICAL_RING_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 8)
 #define HAS_ALIASING_PPGTT(dev)	(INTEL_INFO(dev)->gen >= 6 && \
 				 (!IS_VALLEYVIEW(dev) || IS_CHERRYVIEW(dev)))
 #define HAS_PPGTT(dev)		(INTEL_INFO(dev)->gen >= 7 \
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH 24/50] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-05-09 12:08 ` [PATCH 24/50] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
@ 2014-05-09 13:36   ` Damien Lespiau
  2014-05-12 17:00   ` [PATCH v2 " oscar.mateo
  1 sibling, 0 replies; 94+ messages in thread
From: Damien Lespiau @ 2014-05-09 13:36 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx, Ben Widawsky, Ben Widawsky

On Fri, May 09, 2014 at 01:08:54PM +0100, oscar.mateo@intel.com wrote:
> +	if (ring->id == RCS) {
> +		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> +		reg_state[CTX_LRI_HEADER_2] |= MI_LRI_FORCE_POSTED;

This header doesn't have bit 12 set in BSpec.

> +		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
> +		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
> +#if 0
> +		/* Offsets not yet defined for these */
> +		reg_state[CTX_GPGPU_CSR_BASE_ADDRESS] = 0;
> +		reg_state[CTX_GPGPU_CSR_BASE_ADDRESS+1] = 0;
> +#endif

Remove dead code?

-- 
Damien

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v2 24/50] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-05-09 12:08 ` [PATCH 24/50] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
  2014-05-09 13:36   ` Damien Lespiau
@ 2014-05-12 17:00   ` oscar.mateo
  1 sibling, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-12 17:00 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

For the most part, logical ring context objects are similar to hardware
contexts in that the backing object is meant to be opaque. There are
some exceptions where we need to poke certain offsets of the object for
initialization, updating the tail pointer or updating the PDPs.

For our basic execlist implementation we'll only need our PPGTT PDs and
ringbuffer addresses in order to set up the context. With previous
patches, we have both, so start prepping the context to be loaded.

Before running a context for the first time you must populate some
fields in the context object. These fields begin at 1 PAGE + LRCA, i.e. the
first page (in 0-based counting) of the context image. These same
fields will be read and written to as contexts are saved and restored
once the system is up and running.

Many of these fields are completely reused from previous global
registers: ringbuffer head/tail/control, context control matches some
previous MI_SET_CONTEXT flags, and page directories. There are other
fields which we don't touch which we may want in the future.
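
The register state page reads like a batch buffer: an MI_LOAD_REGISTER_IMM
header followed by (reg, value) pairs, so each field occupies two adjacent
dwords. As a sketch (this helper is illustrative; the patch writes the
dwords directly via the CTX_* offsets below):

static void ctx_write_reg(uint32_t *reg_state, int idx,
			  uint32_t reg, uint32_t val)
{
	reg_state[idx] = reg;		/* register offset */
	reg_state[idx + 1] = val;	/* value to load into it */
}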

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11)
for other engines.

Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>

v3: Several rebases and general changes to the code.

v4: Squash with "Extract LR context object populating"
Also, Damien's review comments:
- Set the Force Posted bit on the LRI header, as the BSpec suggests we do.
- Prevent a warning when compiling a 32-bit kernel without HIGHMEM64.
- Add a clarifying comment to the context population code.

v5: Damien's review comments:
- The third MI_LOAD_REGISTER_IMM in the context does not set Force Posted.
- Remove dead code.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h  |   1 +
 drivers/gpu/drm/i915/intel_lrc.c | 153 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 154 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 03ffc57..33d007d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -269,6 +269,7 @@
 *   address/value pairs. Don't overdo it, though, x <= 2^4 must hold!
  */
 #define MI_LOAD_REGISTER_IMM(x)	MI_INSTR(0x22, 2*(x)-1)
+#define   MI_LRI_FORCE_POSTED		(1<<12)
 #define MI_STORE_REGISTER_MEM(x) MI_INSTR(0x24, 2*(x)-1)
 #define MI_STORE_REGISTER_MEM_GEN8(x) MI_INSTR(0x24, 3*(x)-1)
 #define   MI_SRM_LRM_GLOBAL_GTT		(1<<22)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0f2c5cb..3bd0670 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -46,6 +46,146 @@
 
 #define GEN8_LR_CONTEXT_ALIGN 4096
 
+#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
+#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
+
+#define CTX_LRI_HEADER_0		0x01
+#define CTX_CONTEXT_CONTROL		0x02
+#define CTX_RING_HEAD			0x04
+#define CTX_RING_TAIL			0x06
+#define CTX_RING_BUFFER_START		0x08
+#define CTX_RING_BUFFER_CONTROL	0x0a
+#define CTX_BB_HEAD_U			0x0c
+#define CTX_BB_HEAD_L			0x0e
+#define CTX_BB_STATE			0x10
+#define CTX_SECOND_BB_HEAD_U		0x12
+#define CTX_SECOND_BB_HEAD_L		0x14
+#define CTX_SECOND_BB_STATE		0x16
+#define CTX_BB_PER_CTX_PTR		0x18
+#define CTX_RCS_INDIRECT_CTX		0x1a
+#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
+#define CTX_LRI_HEADER_1		0x21
+#define CTX_CTX_TIMESTAMP		0x22
+#define CTX_PDP3_UDW			0x24
+#define CTX_PDP3_LDW			0x26
+#define CTX_PDP2_UDW			0x28
+#define CTX_PDP2_LDW			0x2a
+#define CTX_PDP1_UDW			0x2c
+#define CTX_PDP1_LDW			0x2e
+#define CTX_PDP0_UDW			0x30
+#define CTX_PDP0_LDW			0x32
+#define CTX_LRI_HEADER_2		0x41
+#define CTX_R_PWR_CLK_STATE		0x42
+#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
+
+static int
+intel_populate_lr_context(struct i915_hw_context *ctx,
+			  struct intel_engine *ring)
+{
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].obj;
+	struct drm_i915_gem_object *ring_obj = ctx->engine[ring->id].ringbuf->obj;
+	struct i915_hw_ppgtt *ppgtt;
+	struct page *page;
+	uint32_t *reg_state;
+	int ret;
+
+	ppgtt = ctx_to_ppgtt(ctx);
+
+	ret = i915_gem_object_set_to_cpu_domain(ctx_obj, true);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Could not set to CPU domain\n");
+		return ret;
+	}
+
+	ret = i915_gem_object_get_pages(ctx_obj);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Could not get object pages\n");
+		return ret;
+	}
+
+	i915_gem_object_pin_pages(ctx_obj);
+
+	/* The second page of the context object contains some fields which must
+	 * be set up prior to the first execution. */
+	page = i915_gem_object_get_page(ctx_obj, 1);
+	reg_state = kmap_atomic(page);
+
+	/* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
+	 * commands followed by (reg, value) pairs. The values we are setting here are
+	 * only for the first context restore: on a subsequent save, the GPU will
+	 * recreate this batchbuffer with new values (including all the missing
+	 * MI_LOAD_REGISTER_IMM commands that we are not initializing here). */
+	if (ring->id == RCS)
+		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
+	else
+		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
+	reg_state[CTX_LRI_HEADER_0] |= MI_LRI_FORCE_POSTED;
+	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
+	reg_state[CTX_CONTEXT_CONTROL+1] = (1<<3) | MI_RESTORE_INHIBIT;
+	reg_state[CTX_CONTEXT_CONTROL+1] |= reg_state[CTX_CONTEXT_CONTROL+1] << 16;
+	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
+	reg_state[CTX_RING_HEAD+1] = 0;
+	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
+	reg_state[CTX_RING_TAIL+1] = 0;
+	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
+	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
+	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
+	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
+	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
+	reg_state[CTX_BB_HEAD_U+1] = 0;
+	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
+	reg_state[CTX_BB_HEAD_L+1] = 0;
+	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
+	reg_state[CTX_BB_STATE+1] = (1<<5);
+	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
+	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
+	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
+	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
+	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
+	reg_state[CTX_SECOND_BB_STATE+1] = 0;
+	if (ring->id == RCS) {
+		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
+		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
+		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
+		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
+		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
+		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
+	}
+	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
+	reg_state[CTX_LRI_HEADER_1] |= MI_LRI_FORCE_POSTED;
+	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
+	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
+	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
+	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
+	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
+	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
+	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
+	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
+	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
+	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
+	reg_state[CTX_PDP3_UDW+1] = (u64)ppgtt->pd_dma_addr[3] >> 32;
+	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
+	reg_state[CTX_PDP2_UDW+1] = (u64)ppgtt->pd_dma_addr[2] >> 32;
+	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
+	reg_state[CTX_PDP1_UDW+1] = (u64)ppgtt->pd_dma_addr[1] >> 32;
+	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
+	reg_state[CTX_PDP0_UDW+1] = (u64)ppgtt->pd_dma_addr[0] >> 32;
+	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
+	if (ring->id == RCS) {
+		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
+		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
+		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
+	}
+
+	kunmap_atomic(reg_state);
+
+	ctx_obj->dirty = 1;
+	set_page_dirty(page);
+	i915_gem_object_unpin_pages(ctx_obj);
+
+	return 0;
+}
+
 static uint32_t get_lr_context_size(struct intel_engine *ring)
 {
 	int ret = 0;
@@ -135,6 +275,19 @@ int gen8_create_lr_context(struct i915_hw_context *ctx,
 	ctx->engine[ring->id].ringbuf = ringbuf;
 	ctx->engine[ring->id].obj = ctx_obj;
 
+	ret = intel_populate_lr_context(ctx, ring);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Failed to populate LRC: %d\n", ret);
+		ctx->engine[ring->id].ringbuf = NULL;
+		ctx->engine[ring->id].obj = NULL;
+		intel_destroy_ring_buffer(ringbuf);
+		if (file_priv)
+			kfree(ringbuf);
+		i915_gem_object_ggtt_unpin(ctx_obj);
+		drm_gem_object_unreference(&ctx_obj->base);
+		return ret;
+	}
+
 	return 0;
 }
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 49.1/50] drm/i915/bdw: Do not call intel_runtime_pm_get() in an interrupt
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (49 preceding siblings ...)
  2014-05-09 12:09 ` [PATCH 50/50] drm/i915/bdw: Enable logical ring contexts oscar.mateo
@ 2014-05-12 17:04 ` oscar.mateo
  2014-05-13 13:48 ` [PATCH 00/50] Execlists v2 Daniel Vetter
  51 siblings, 0 replies; 94+ messages in thread
From: oscar.mateo @ 2014-05-12 17:04 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Or with a spinlock grabbed, because intel_runtime_pm_get() might
sleep, which is not a nice thing to do. Instead, do the runtime_pm
get/put together with request creation/destruction, and handle the
forcewake get/put directly.

This can be squashed with:

[PATCH 35/50] drm/i915/bdw: Add forcewake lock around ELSP writes

but it needs some patch reordering, so I will leave it for the next
patchset version.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e624580..55255e8 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -205,6 +205,7 @@ static void submit_execlist(struct intel_engine *ring,
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	uint64_t temp = 0;
 	uint32_t desc[4];
+	unsigned long flags;
 
 	/* XXX: You must always write both descriptors in the order below. */
 	if (ctx_obj1)
@@ -218,9 +219,17 @@ static void submit_execlist(struct intel_engine *ring,
 	desc[3] = (u32)(temp >> 32);
 	desc[2] = (u32)temp;
 
-	/* Set Force Wakeup bit to prevent GT from entering C6 while
-	 * ELSP writes are in progress */
-	gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
+	/* Set Force Wakeup bit to prevent GT from entering C6 while ELSP writes
+	 * are in progress.
+	 *
+	 * The other problem is that we can't just call gen6_gt_force_wake_get()
+	 * because that function calls intel_runtime_pm_get(), which might sleep.
+	 * Instead, we do the runtime_pm_get/put when creating/destroying requests.
+	 */
+	spin_lock_irqsave(&dev_priv->uncore.lock, flags);
+	if (dev_priv->uncore.forcewake_count++ == 0)
+		dev_priv->uncore.funcs.force_wake_get(dev_priv, FORCEWAKE_ALL);
+	spin_unlock_irqrestore(&dev_priv->uncore.lock, flags);
 
 	I915_RAW_WRITE(RING_ELSP(ring), desc[1]);
 	I915_RAW_WRITE(RING_ELSP(ring), desc[0]);
@@ -231,8 +240,11 @@ static void submit_execlist(struct intel_engine *ring,
 	/* ELSP is a write only register, so this serves as a posting read */
 	POSTING_READ(RING_EXECLIST_STATUS(ring));
 
-	/* Release Force Wakeup */
-	gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
+	/* Release Force Wakeup (see the big comment above). */
+	spin_lock_irqsave(&dev_priv->uncore.lock, flags);
+	if (--dev_priv->uncore.forcewake_count == 0)
+		dev_priv->uncore.funcs.force_wake_put(dev_priv, FORCEWAKE_ALL);
+	spin_unlock_irqrestore(&dev_priv->uncore.lock, flags);
 }
 
 static int lr_context_write_tail(struct drm_i915_gem_object *ctx_obj, u32 tail)
@@ -398,6 +410,9 @@ static void free_request_task(struct work_struct *work)
 	struct drm_i915_gem_request *req =
 			container_of(work, struct drm_i915_gem_request, work);
 	struct drm_device *dev = req->ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	intel_runtime_pm_put(dev_priv);
 
 	mutex_lock(&dev->struct_mutex);
 	i915_gem_context_unreference(req->ctx);
@@ -424,6 +439,7 @@ int gen8_switch_context_queue(struct intel_engine *ring,
 	i915_gem_context_reference(req->ctx);
 	req->tail = tail;
 	INIT_WORK(&req->work, free_request_task);
+	intel_runtime_pm_get(dev_priv);
 
 	spin_lock_irqsave(&ring->execlist_lock, flags);
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH 02/50] drm/i915: for_each_ring
  2014-05-09 12:08 ` [PATCH 02/50] drm/i915: for_each_ring oscar.mateo
@ 2014-05-13 13:25   ` Daniel Vetter
  2014-05-19 16:33   ` Volkin, Bradley D
  1 sibling, 0 replies; 94+ messages in thread
From: Daniel Vetter @ 2014-05-13 13:25 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx, Ben Widawsky, Ben Widawsky

On Fri, May 09, 2014 at 01:08:32PM +0100, oscar.mateo@intel.com wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> for_each_ring() iterates over all rings supported by the hardware, not
> just those which have been initialized, as in for_each_active_ring().
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Acked-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a53a028..b1725c6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1544,6 +1544,17 @@ static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
>  	return dev->dev_private;
>  }
>  
> +/* NB: Typically you want to use for_each_ring in init code before ringbuffers
> + * are set up, or in debug code. for_each_active_ring is more suited for code
> + * which is dynamically handling active rings, i.e. normal code. In most cases
> + * (currently all, except on pre-production hardware) for_each_ring will
> + * work even if it's a bad idea to use it - so be careful.
> + */

A proper kerneldoc comment would look neater than an "NB:", imo. Bonus
points if you pull it into the drm docbook (just the header files with the
!I directive in the DocBook template).
-Daniel
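
For illustration, a kerneldoc version of that comment could look roughly
like this (a sketch only, not the eventual upstream wording):

/**
 * for_each_ring - iterate over all rings supported by the hardware
 * @ring__: loop cursor
 * @dev_priv__: i915 device private
 * @i__: ring index
 *
 * Unlike for_each_active_ring(), this iterates over every ring the
 * hardware supports, whether or not it has been initialized, so it is
 * mainly useful in init and debug code.
 */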

> +#define for_each_ring(ring__, dev_priv__, i__) \
> +	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
> +		if (((ring__) = &(dev_priv__)->ring[(i__)]), \
> +		    INTEL_INFO((dev_priv__)->dev)->ring_mask & (1<<(i__)))
> +
>  /* Iterate over initialised rings */
>  #define for_each_active_ring(ring__, dev_priv__, i__) \
>  	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 04/50] drm/i915: Extract trivial parts of ring init (early init)
  2014-05-09 12:08 ` [PATCH 04/50] drm/i915: Extract trivial parts of ring init (early init) oscar.mateo
@ 2014-05-13 13:26   ` Daniel Vetter
  2014-05-13 13:47     ` Chris Wilson
  2014-05-14 11:53     ` Mateo Lozano, Oscar
  0 siblings, 2 replies; 94+ messages in thread
From: Daniel Vetter @ 2014-05-13 13:26 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx, Ben Widawsky, Ben Widawsky

On Fri, May 09, 2014 at 01:08:34PM +0100, oscar.mateo@intel.com wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> It's beneficial to be able to get a name, base, and id before we've
> actually initialized the rings. This ability was effectively destroyed
> in the ringbuffer fire which Daniel started.
> 
> With the simple early init function, that ability is restored.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> v2: The Full PPGTT series have moved things around a little bit.
> Also, don't forget the VEBOX.
> 
> v3: Checking ring->dev is not a good way to test if a ring is
> initialized...
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>

Needs to be updated for VEBOX2. Also, I don't really see the point: where
exactly do we need this? Ripping apart the ring init like this doesn't
look too great imo.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem.c         |  2 ++
>  drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 60 ++++++++++++++++++---------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>  4 files changed, 37 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ce941cf..6ef53bd 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4502,6 +4502,8 @@ int i915_gem_init(struct drm_device *dev)
>  
>  	i915_gem_init_global_gtt(dev);
>  
> +	intel_init_rings_early(dev);
> +
>  	ret = i915_gem_context_init(dev);
>  	if (ret) {
>  		mutex_unlock(&dev->struct_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2d81985..8f37238 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -886,7 +886,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
>  	for (i = 0; i < I915_NUM_RINGS; i++) {
>  		struct intel_ring_buffer *ring = &dev_priv->ring[i];
>  
> -		if (ring->dev == NULL)
> +		if (!intel_ring_initialized(ring))
>  			continue;
>  
>  		error->ring[i].valid = true;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index a112971..fc737c8 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1417,7 +1417,6 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>  {
>  	int ret;
>  
> -	ring->dev = dev;
>  	INIT_LIST_HEAD(&ring->active_list);
>  	INIT_LIST_HEAD(&ring->request_list);
>  	ring->size = 32 * PAGE_SIZE;
> @@ -1908,10 +1907,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
>  
> -	ring->name = "render ring";
> -	ring->id = RCS;
> -	ring->mmio_base = RENDER_RING_BASE;
> -
>  	if (INTEL_INFO(dev)->gen >= 6) {
>  		ring->add_request = gen6_add_request;
>  		ring->flush = gen7_render_ring_flush;
> @@ -2019,10 +2014,6 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
>  	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
>  	int ret;
>  
> -	ring->name = "render ring";
> -	ring->id = RCS;
> -	ring->mmio_base = RENDER_RING_BASE;
> -
>  	if (INTEL_INFO(dev)->gen >= 6) {
>  		/* non-kms not supported on gen6+ */
>  		return -ENODEV;
> @@ -2056,7 +2047,6 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
>  	ring->init = init_render_ring;
>  	ring->cleanup = render_ring_cleanup;
>  
> -	ring->dev = dev;
>  	INIT_LIST_HEAD(&ring->active_list);
>  	INIT_LIST_HEAD(&ring->request_list);
>  
> @@ -2086,12 +2076,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_ring_buffer *ring = &dev_priv->ring[VCS];
>  
> -	ring->name = "bsd ring";
> -	ring->id = VCS;
> -
>  	ring->write_tail = ring_write_tail;
>  	if (INTEL_INFO(dev)->gen >= 6) {
> -		ring->mmio_base = GEN6_BSD_RING_BASE;
>  		/* gen6 bsd needs a special wa for tail updates */
>  		if (IS_GEN6(dev))
>  			ring->write_tail = gen6_bsd_ring_write_tail;
> @@ -2132,7 +2118,6 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
>  		ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
>  		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
>  	} else {
> -		ring->mmio_base = BSD_RING_BASE;
>  		ring->flush = bsd_ring_flush;
>  		ring->add_request = i9xx_add_request;
>  		ring->get_seqno = ring_get_seqno;
> @@ -2167,11 +2152,7 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
>  		return -EINVAL;
>  	}
>  
> -	ring->name = "bds2_ring";
> -	ring->id = VCS2;
> -
>  	ring->write_tail = ring_write_tail;
> -	ring->mmio_base = GEN8_BSD2_RING_BASE;
>  	ring->flush = gen6_bsd_ring_flush;
>  	ring->add_request = gen6_add_request;
>  	ring->get_seqno = gen6_ring_get_seqno;
> @@ -2210,10 +2191,6 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_ring_buffer *ring = &dev_priv->ring[BCS];
>  
> -	ring->name = "blitter ring";
> -	ring->id = BCS;
> -
> -	ring->mmio_base = BLT_RING_BASE;
>  	ring->write_tail = ring_write_tail;
>  	ring->flush = gen6_ring_flush;
>  	ring->add_request = gen6_add_request;
> @@ -2259,10 +2236,6 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_ring_buffer *ring = &dev_priv->ring[VECS];
>  
> -	ring->name = "video enhancement ring";
> -	ring->id = VECS;
> -
> -	ring->mmio_base = VEBOX_RING_BASE;
>  	ring->write_tail = ring_write_tail;
>  	ring->flush = gen6_ring_flush;
>  	ring->add_request = gen6_add_request;
> @@ -2351,3 +2324,36 @@ intel_stop_ring_buffer(struct intel_ring_buffer *ring)
>  
>  	stop_ring(ring);
>  }
> +
> +void intel_init_rings_early(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +	dev_priv->ring[RCS].name = "render ring";
> +	dev_priv->ring[RCS].id = RCS;
> +	dev_priv->ring[RCS].mmio_base = RENDER_RING_BASE;
> +	dev_priv->ring[RCS].dev = dev;
> +
> +	dev_priv->ring[BCS].name = "blitter ring";
> +	dev_priv->ring[BCS].id = BCS;
> +	dev_priv->ring[BCS].mmio_base = BLT_RING_BASE;
> +	dev_priv->ring[BCS].dev = dev;
> +
> +	dev_priv->ring[VCS].name = "bsd ring";
> +	dev_priv->ring[VCS].id = VCS;
> +	if (INTEL_INFO(dev)->gen >= 6)
> +		dev_priv->ring[VCS].mmio_base = GEN6_BSD_RING_BASE;
> +	else
> +		dev_priv->ring[VCS].mmio_base = BSD_RING_BASE;
> +	dev_priv->ring[VCS].dev = dev;
> +
> +	dev_priv->ring[VCS2].name = "bsd2 ring";
> +	dev_priv->ring[VCS2].id = VCS2;
> +	dev_priv->ring[VCS2].mmio_base = GEN8_BSD2_RING_BASE;
> +	dev_priv->ring[VCS2].dev = dev;
> +
> +	dev_priv->ring[VECS].name = "video enhancement ring";
> +	dev_priv->ring[VECS].id = VECS;
> +	dev_priv->ring[VECS].mmio_base = VEBOX_RING_BASE;
> +	dev_priv->ring[VECS].dev = dev;
> +}
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 72c3c15..b1bf767 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -297,6 +297,7 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno);
>  int intel_ring_flush_all_caches(struct intel_ring_buffer *ring);
>  int intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring);
>  
> +void intel_init_rings_early(struct drm_device *dev);
>  int intel_init_render_ring_buffer(struct drm_device *dev);
>  int intel_init_bsd_ring_buffer(struct drm_device *dev);
>  int intel_init_bsd2_ring_buffer(struct drm_device *dev);
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-09 12:08 ` [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine oscar.mateo
@ 2014-05-13 13:28   ` Daniel Vetter
  2014-05-14 13:26     ` Damien Lespiau
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-05-13 13:28 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx

On Fri, May 09, 2014 at 01:08:36PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> In the upcoming patches, we plan to break the correlation between
> engines (a.k.a. rings) and ringbuffers, so it makes sense to
> refactor the code and make the change obvious.
> 
> No functional changes.
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>

If we rename stuff I'd vote for something close to Bspec language, like
CS. So maybe intel_cs_engine?

/me sucks at this naming game
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_cmd_parser.c     |  16 +--
>  drivers/gpu/drm/i915/i915_debugfs.c        |  16 +--
>  drivers/gpu/drm/i915/i915_dma.c            |  10 +-
>  drivers/gpu/drm/i915/i915_drv.h            |  32 +++---
>  drivers/gpu/drm/i915/i915_gem.c            |  58 +++++------
>  drivers/gpu/drm/i915/i915_gem_context.c    |  14 +--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  18 ++--
>  drivers/gpu/drm/i915/i915_gem_gtt.c        |  18 ++--
>  drivers/gpu/drm/i915/i915_gem_gtt.h        |   2 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c      |   6 +-
>  drivers/gpu/drm/i915/i915_irq.c            |  28 ++---
>  drivers/gpu/drm/i915/i915_trace.h          |  26 ++---
>  drivers/gpu/drm/i915/intel_display.c       |  14 +--
>  drivers/gpu/drm/i915/intel_drv.h           |   4 +-
>  drivers/gpu/drm/i915/intel_overlay.c       |  12 +--
>  drivers/gpu/drm/i915/intel_pm.c            |  10 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c    | 158 ++++++++++++++---------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    |  76 +++++++-------
>  18 files changed, 259 insertions(+), 259 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 69d34e4..3234d36 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -498,7 +498,7 @@ static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header)
>  	return 0;
>  }
>  
> -static bool validate_cmds_sorted(struct intel_ring_buffer *ring)
> +static bool validate_cmds_sorted(struct intel_engine *ring)
>  {
>  	int i;
>  	bool ret = true;
> @@ -550,7 +550,7 @@ static bool check_sorted(int ring_id, const u32 *reg_table, int reg_count)
>  	return ret;
>  }
>  
> -static bool validate_regs_sorted(struct intel_ring_buffer *ring)
> +static bool validate_regs_sorted(struct intel_engine *ring)
>  {
>  	return check_sorted(ring->id, ring->reg_table, ring->reg_count) &&
>  		check_sorted(ring->id, ring->master_reg_table,
> @@ -562,10 +562,10 @@ static bool validate_regs_sorted(struct intel_ring_buffer *ring)
>   * @ring: the ringbuffer to initialize
>   *
>   * Optionally initializes fields related to batch buffer command parsing in the
> - * struct intel_ring_buffer based on whether the platform requires software
> + * struct intel_engine based on whether the platform requires software
>   * command parsing.
>   */
> -void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring)
> +void i915_cmd_parser_init_ring(struct intel_engine *ring)
>  {
>  	if (!IS_GEN7(ring->dev))
>  		return;
> @@ -664,7 +664,7 @@ find_cmd_in_table(const struct drm_i915_cmd_table *table,
>   * ring's default length encoding and returns default_desc.
>   */
>  static const struct drm_i915_cmd_descriptor*
> -find_cmd(struct intel_ring_buffer *ring,
> +find_cmd(struct intel_engine *ring,
>  	 u32 cmd_header,
>  	 struct drm_i915_cmd_descriptor *default_desc)
>  {
> @@ -744,7 +744,7 @@ finish:
>   *
>   * Return: true if the ring requires software command parsing
>   */
> -bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
> +bool i915_needs_cmd_parser(struct intel_engine *ring)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  
> @@ -763,7 +763,7 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
>  	return (i915.enable_cmd_parser == 1);
>  }
>  
> -static bool check_cmd(const struct intel_ring_buffer *ring,
> +static bool check_cmd(const struct intel_engine *ring,
>  		      const struct drm_i915_cmd_descriptor *desc,
>  		      const u32 *cmd,
>  		      const bool is_master,
> @@ -865,7 +865,7 @@ static bool check_cmd(const struct intel_ring_buffer *ring,
>   *
>   * Return: non-zero if the parser finds violations or otherwise fails
>   */
> -int i915_parse_cmds(struct intel_ring_buffer *ring,
> +int i915_parse_cmds(struct intel_engine *ring,
>  		    struct drm_i915_gem_object *batch_obj,
>  		    u32 batch_start_offset,
>  		    bool is_master)
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 103e62c..0052460 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -562,7 +562,7 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	struct drm_i915_gem_request *gem_request;
>  	int ret, count, i;
>  
> @@ -594,7 +594,7 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
>  }
>  
>  static void i915_ring_seqno_info(struct seq_file *m,
> -				 struct intel_ring_buffer *ring)
> +				 struct intel_engine *ring)
>  {
>  	if (ring->get_seqno) {
>  		seq_printf(m, "Current sequence (%s): %u\n",
> @@ -607,7 +607,7 @@ static int i915_gem_seqno_info(struct seq_file *m, void *data)
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int ret, i;
>  
>  	ret = mutex_lock_interruptible(&dev->struct_mutex);
> @@ -630,7 +630,7 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int ret, i, pipe;
>  
>  	ret = mutex_lock_interruptible(&dev->struct_mutex);
> @@ -800,7 +800,7 @@ static int i915_hws_info(struct seq_file *m, void *data)
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	const u32 *hws;
>  	int i;
>  
> @@ -1677,7 +1677,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	struct i915_hw_context *ctx;
>  	int ret, i;
>  
> @@ -1826,7 +1826,7 @@ static int per_file_ctx(int id, void *ptr, void *data)
>  static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
>  	int unused, i;
>  
> @@ -1850,7 +1850,7 @@ static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
>  static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	struct drm_file *file;
>  	int i;
>  
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index d02c8de..5263d63 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -119,7 +119,7 @@ static void i915_write_hws_pga(struct drm_device *dev)
>  static void i915_free_hws(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> +	struct intel_engine *ring = LP_RING(dev_priv);
>  
>  	if (dev_priv->status_page_dmah) {
>  		drm_pci_free(dev, dev_priv->status_page_dmah);
> @@ -139,7 +139,7 @@ void i915_kernel_lost_context(struct drm_device * dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_master_private *master_priv;
> -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> +	struct intel_engine *ring = LP_RING(dev_priv);
>  
>  	/*
>  	 * We should never lose context on the ring with modesetting
> @@ -234,7 +234,7 @@ static int i915_initialize(struct drm_device * dev, drm_i915_init_t * init)
>  static int i915_dma_resume(struct drm_device * dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> +	struct intel_engine *ring = LP_RING(dev_priv);
>  
>  	DRM_DEBUG_DRIVER("%s\n", __func__);
>  
> @@ -782,7 +782,7 @@ static int i915_wait_irq(struct drm_device * dev, int irq_nr)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_master_private *master_priv = dev->primary->master->driver_priv;
>  	int ret = 0;
> -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> +	struct intel_engine *ring = LP_RING(dev_priv);
>  
>  	DRM_DEBUG_DRIVER("irq_nr=%d breadcrumb=%d\n", irq_nr,
>  		  READ_BREADCRUMB(dev_priv));
> @@ -1073,7 +1073,7 @@ static int i915_set_status_page(struct drm_device *dev, void *data,
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	drm_i915_hws_addr_t *hws = data;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  
>  	if (drm_core_check_feature(dev, DRIVER_MODESET))
>  		return -ENODEV;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b1725c6..3b7a36f9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -594,7 +594,7 @@ struct i915_hw_context {
>  	bool is_initialized;
>  	uint8_t remap_slice;
>  	struct drm_i915_file_private *file_priv;
> -	struct intel_ring_buffer *last_ring;
> +	struct intel_engine *last_ring;
>  	struct drm_i915_gem_object *obj;
>  	struct i915_ctx_hang_stats hang_stats;
>  	struct i915_address_space *vm;
> @@ -1354,7 +1354,7 @@ struct drm_i915_private {
>  	wait_queue_head_t gmbus_wait_queue;
>  
>  	struct pci_dev *bridge_dev;
> -	struct intel_ring_buffer ring[I915_NUM_RINGS];
> +	struct intel_engine ring[I915_NUM_RINGS];
>  	uint32_t last_seqno, next_seqno;
>  
>  	drm_dma_handle_t *status_page_dmah;
> @@ -1675,7 +1675,7 @@ struct drm_i915_gem_object {
>  	void *dma_buf_vmapping;
>  	int vmapping_count;
>  
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  
>  	/** Breadcrumb of last rendering to the buffer. */
>  	uint32_t last_read_seqno;
> @@ -1714,7 +1714,7 @@ struct drm_i915_gem_object {
>   */
>  struct drm_i915_gem_request {
>  	/** On Which ring this request was generated */
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  
>  	/** GEM sequence number associated with this request. */
>  	uint32_t seqno;
> @@ -1755,7 +1755,7 @@ struct drm_i915_file_private {
>  
>  	struct i915_hw_context *private_default_ctx;
>  	atomic_t rps_wait_boost;
> -	struct  intel_ring_buffer *bsd_ring;
> +	struct  intel_engine *bsd_ring;
>  };
>  
>  /*
> @@ -2182,9 +2182,9 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>  
>  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> -			 struct intel_ring_buffer *to);
> +			 struct intel_engine *to);
>  void i915_vma_move_to_active(struct i915_vma *vma,
> -			     struct intel_ring_buffer *ring);
> +			     struct intel_engine *ring);
>  int i915_gem_dumb_create(struct drm_file *file_priv,
>  			 struct drm_device *dev,
>  			 struct drm_mode_create_dumb *args);
> @@ -2226,7 +2226,7 @@ i915_gem_object_unpin_fence(struct drm_i915_gem_object *obj)
>  }
>  
>  struct drm_i915_gem_request *
> -i915_gem_find_active_request(struct intel_ring_buffer *ring);
> +i915_gem_find_active_request(struct intel_engine *ring);
>  
>  bool i915_gem_retire_requests(struct drm_device *dev);
>  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
> @@ -2264,18 +2264,18 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
>  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
>  int __must_check i915_gem_init(struct drm_device *dev);
>  int __must_check i915_gem_init_hw(struct drm_device *dev);
> -int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice);
> +int i915_gem_l3_remap(struct intel_engine *ring, int slice);
>  void i915_gem_init_swizzling(struct drm_device *dev);
>  void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>  int __must_check i915_gpu_idle(struct drm_device *dev);
>  int __must_check i915_gem_suspend(struct drm_device *dev);
> -int __i915_add_request(struct intel_ring_buffer *ring,
> +int __i915_add_request(struct intel_engine *ring,
>  		       struct drm_file *file,
>  		       struct drm_i915_gem_object *batch_obj,
>  		       u32 *seqno);
>  #define i915_add_request(ring, seqno) \
>  	__i915_add_request(ring, NULL, NULL, seqno)
> -int __must_check i915_wait_seqno(struct intel_ring_buffer *ring,
> +int __must_check i915_wait_seqno(struct intel_engine *ring,
>  				 uint32_t seqno);
>  int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
>  int __must_check
> @@ -2286,7 +2286,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
>  int __must_check
>  i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  				     u32 alignment,
> -				     struct intel_ring_buffer *pipelined);
> +				     struct intel_engine *pipelined);
>  void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj);
>  int i915_gem_attach_phys_object(struct drm_device *dev,
>  				struct drm_i915_gem_object *obj,
> @@ -2388,7 +2388,7 @@ void i915_gem_context_reset(struct drm_device *dev);
>  int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
>  int i915_gem_context_enable(struct drm_i915_private *dev_priv);
>  void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
> -int i915_switch_context(struct intel_ring_buffer *ring,
> +int i915_switch_context(struct intel_engine *ring,
>  			struct i915_hw_context *to);
>  struct i915_hw_context *
>  i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id);
> @@ -2497,9 +2497,9 @@ const char *i915_cache_level_str(int type);
>  
>  /* i915_cmd_parser.c */
>  int i915_cmd_parser_get_version(void);
> -void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring);
> -bool i915_needs_cmd_parser(struct intel_ring_buffer *ring);
> -int i915_parse_cmds(struct intel_ring_buffer *ring,
> +void i915_cmd_parser_init_ring(struct intel_engine *ring);
> +bool i915_needs_cmd_parser(struct intel_engine *ring);
> +int i915_parse_cmds(struct intel_engine *ring,
>  		    struct drm_i915_gem_object *batch_obj,
>  		    u32 batch_start_offset,
>  		    bool is_master);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6ef53bd..a3b697b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -64,7 +64,7 @@ static unsigned long i915_gem_inactive_scan(struct shrinker *shrinker,
>  static unsigned long i915_gem_purge(struct drm_i915_private *dev_priv, long target);
>  static unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv);
>  static void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
> -static void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
> +static void i915_gem_retire_requests_ring(struct intel_engine *ring);
>  
>  static bool cpu_cache_is_coherent(struct drm_device *dev,
>  				  enum i915_cache_level level)
> @@ -977,7 +977,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
>   * equal.
>   */
>  static int
> -i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
> +i915_gem_check_olr(struct intel_engine *ring, u32 seqno)
>  {
>  	int ret;
>  
> @@ -996,7 +996,7 @@ static void fake_irq(unsigned long data)
>  }
>  
>  static bool missed_irq(struct drm_i915_private *dev_priv,
> -		       struct intel_ring_buffer *ring)
> +		       struct intel_engine *ring)
>  {
>  	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
>  }
> @@ -1027,7 +1027,7 @@ static bool can_wait_boost(struct drm_i915_file_private *file_priv)
>   * Returns 0 if the seqno was found within the allotted time. Else returns the
>   * errno with remaining time filled in timeout argument.
>   */
> -static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
> +static int __wait_seqno(struct intel_engine *ring, u32 seqno,
>  			unsigned reset_counter,
>  			bool interruptible,
>  			struct timespec *timeout,
> @@ -1134,7 +1134,7 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
>   * request and object lists appropriately for that event.
>   */
>  int
> -i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno)
> +i915_wait_seqno(struct intel_engine *ring, uint32_t seqno)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1159,7 +1159,7 @@ i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno)
>  
>  static int
>  i915_gem_object_wait_rendering__tail(struct drm_i915_gem_object *obj,
> -				     struct intel_ring_buffer *ring)
> +				     struct intel_engine *ring)
>  {
>  	if (!obj->active)
>  		return 0;
> @@ -1184,7 +1184,7 @@ static __must_check int
>  i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
>  			       bool readonly)
>  {
> -	struct intel_ring_buffer *ring = obj->ring;
> +	struct intel_engine *ring = obj->ring;
>  	u32 seqno;
>  	int ret;
>  
> @@ -1209,7 +1209,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = obj->ring;
> +	struct intel_engine *ring = obj->ring;
>  	unsigned reset_counter;
>  	u32 seqno;
>  	int ret;
> @@ -2011,7 +2011,7 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>  
>  static void
>  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> -			       struct intel_ring_buffer *ring)
> +			       struct intel_engine *ring)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -2049,7 +2049,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  }
>  
>  void i915_vma_move_to_active(struct i915_vma *vma,
> -			     struct intel_ring_buffer *ring)
> +			     struct intel_engine *ring)
>  {
>  	list_move_tail(&vma->mm_list, &vma->vm->active_list);
>  	return i915_gem_object_move_to_active(vma->obj, ring);
> @@ -2090,7 +2090,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  static void
>  i915_gem_object_retire(struct drm_i915_gem_object *obj)
>  {
> -	struct intel_ring_buffer *ring = obj->ring;
> +	struct intel_engine *ring = obj->ring;
>  
>  	if (ring == NULL)
>  		return;
> @@ -2104,7 +2104,7 @@ static int
>  i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int ret, i, j;
>  
>  	/* Carefully retire all requests without writing to the rings */
> @@ -2170,7 +2170,7 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
>  	return 0;
>  }
>  
> -int __i915_add_request(struct intel_ring_buffer *ring,
> +int __i915_add_request(struct intel_engine *ring,
>  		       struct drm_file *file,
>  		       struct drm_i915_gem_object *obj,
>  		       u32 *out_seqno)
> @@ -2330,7 +2330,7 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request)
>  }
>  
>  struct drm_i915_gem_request *
> -i915_gem_find_active_request(struct intel_ring_buffer *ring)
> +i915_gem_find_active_request(struct intel_engine *ring)
>  {
>  	struct drm_i915_gem_request *request;
>  	u32 completed_seqno;
> @@ -2348,7 +2348,7 @@ i915_gem_find_active_request(struct intel_ring_buffer *ring)
>  }
>  
>  static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv,
> -				       struct intel_ring_buffer *ring)
> +				       struct intel_engine *ring)
>  {
>  	struct drm_i915_gem_request *request;
>  	bool ring_hung;
> @@ -2367,7 +2367,7 @@ static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv,
>  }
>  
>  static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
> -					struct intel_ring_buffer *ring)
> +					struct intel_engine *ring)
>  {
>  	while (!list_empty(&ring->active_list)) {
>  		struct drm_i915_gem_object *obj;
> @@ -2426,7 +2426,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  void i915_gem_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int i;
>  
>  	/*
> @@ -2449,7 +2449,7 @@ void i915_gem_reset(struct drm_device *dev)
>   * This function clears the request list as sequence numbers are passed.
>   */
>  static void
> -i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> +i915_gem_retire_requests_ring(struct intel_engine *ring)
>  {
>  	uint32_t seqno;
>  
> @@ -2512,7 +2512,7 @@ bool
>  i915_gem_retire_requests(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	bool idle = true;
>  	int i;
>  
> @@ -2606,7 +2606,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_wait *args = data;
>  	struct drm_i915_gem_object *obj;
> -	struct intel_ring_buffer *ring = NULL;
> +	struct intel_engine *ring = NULL;
>  	struct timespec timeout_stack, *timeout = NULL;
>  	unsigned reset_counter;
>  	u32 seqno = 0;
> @@ -2677,9 +2677,9 @@ out:
>   */
>  int
>  i915_gem_object_sync(struct drm_i915_gem_object *obj,
> -		     struct intel_ring_buffer *to)
> +		     struct intel_engine *to)
>  {
> -	struct intel_ring_buffer *from = obj->ring;
> +	struct intel_engine *from = obj->ring;
>  	u32 seqno;
>  	int ret, idx;
>  
> @@ -2800,7 +2800,7 @@ int i915_vma_unbind(struct i915_vma *vma)
>  int i915_gpu_idle(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int ret, i;
>  
>  	/* Flush everything onto the inactive list. */
> @@ -3659,7 +3659,7 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
>  int
>  i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  				     u32 alignment,
> -				     struct intel_ring_buffer *pipelined)
> +				     struct intel_engine *pipelined)
>  {
>  	u32 old_read_domains, old_write_domain;
>  	int ret;
> @@ -3812,7 +3812,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
>  	unsigned long recent_enough = jiffies - msecs_to_jiffies(20);
>  	struct drm_i915_gem_request *request;
> -	struct intel_ring_buffer *ring = NULL;
> +	struct intel_engine *ring = NULL;
>  	unsigned reset_counter;
>  	u32 seqno = 0;
>  	int ret;
> @@ -4258,7 +4258,7 @@ static void
>  i915_gem_stop_ringbuffers(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int i;
>  
>  	for_each_active_ring(ring, dev_priv, i)
> @@ -4307,7 +4307,7 @@ err:
>  	return ret;
>  }
>  
> -int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
> +int i915_gem_l3_remap(struct intel_engine *ring, int slice)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -4532,7 +4532,7 @@ void
>  i915_gem_cleanup_ringbuffer(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int i;
>  
>  	for_each_active_ring(ring, dev_priv, i)
> @@ -4608,7 +4608,7 @@ i915_gem_lastclose(struct drm_device *dev)
>  }
>  
>  static void
> -init_ring_lists(struct intel_ring_buffer *ring)
> +init_ring_lists(struct intel_engine *ring)
>  {
>  	INIT_LIST_HEAD(&ring->active_list);
>  	INIT_LIST_HEAD(&ring->request_list);
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 014fb8f..4d37e20 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -359,7 +359,7 @@ err_destroy:
>  void i915_gem_context_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int i;
>  
>  	/* Prevent the hardware from restoring the last context (which hung) on
> @@ -392,7 +392,7 @@ int i915_gem_context_init(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct i915_hw_context *ctx;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int unused;
>  
>  	/* Init should only be called once per module load. Eventually the
> @@ -428,7 +428,7 @@ void i915_gem_context_fini(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct i915_hw_context *dctx = dev_priv->ring[RCS].default_context;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int unused;
>  
>  	if (dctx->obj) {
> @@ -467,7 +467,7 @@ void i915_gem_context_fini(struct drm_device *dev)
>  
>  int i915_gem_context_enable(struct drm_i915_private *dev_priv)
>  {
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int ret, i;
>  
>  	/* This is the only place the aliasing PPGTT gets enabled, which means
> @@ -546,7 +546,7 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
>  }
>  
>  static inline int
> -mi_set_context(struct intel_ring_buffer *ring,
> +mi_set_context(struct intel_engine *ring,
>  	       struct i915_hw_context *new_context,
>  	       u32 hw_flags)
>  {
> @@ -596,7 +596,7 @@ mi_set_context(struct intel_ring_buffer *ring,
>  	return ret;
>  }
>  
> -static int do_switch(struct intel_ring_buffer *ring,
> +static int do_switch(struct intel_engine *ring,
>  		     struct i915_hw_context *to)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> @@ -726,7 +726,7 @@ unpin_out:
>   * it will have a refcount > 1. This allows us to destroy the context abstract
>   * object while letting the normal object tracking destroy the backing BO.
>   */
> -int i915_switch_context(struct intel_ring_buffer *ring,
> +int i915_switch_context(struct intel_engine *ring,
>  			struct i915_hw_context *to)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 47fe8ec..95e797e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -541,7 +541,7 @@ need_reloc_mappable(struct i915_vma *vma)
>  
>  static int
>  i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> -				struct intel_ring_buffer *ring,
> +				struct intel_engine *ring,
>  				bool *need_reloc)
>  {
>  	struct drm_i915_gem_object *obj = vma->obj;
> @@ -596,7 +596,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
>  }
>  
>  static int
> -i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> +i915_gem_execbuffer_reserve(struct intel_engine *ring,
>  			    struct list_head *vmas,
>  			    bool *need_relocs)
>  {
> @@ -711,7 +711,7 @@ static int
>  i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>  				  struct drm_i915_gem_execbuffer2 *args,
>  				  struct drm_file *file,
> -				  struct intel_ring_buffer *ring,
> +				  struct intel_engine *ring,
>  				  struct eb_vmas *eb,
>  				  struct drm_i915_gem_exec_object2 *exec)
>  {
> @@ -827,7 +827,7 @@ err:
>  }
>  
>  static int
> -i915_gem_execbuffer_move_to_gpu(struct intel_ring_buffer *ring,
> +i915_gem_execbuffer_move_to_gpu(struct intel_engine *ring,
>  				struct list_head *vmas)
>  {
>  	struct i915_vma *vma;
> @@ -912,7 +912,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
>  
>  static struct i915_hw_context *
>  i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
> -			  struct intel_ring_buffer *ring, const u32 ctx_id)
> +			  struct intel_engine *ring, const u32 ctx_id)
>  {
>  	struct i915_hw_context *ctx = NULL;
>  	struct i915_ctx_hang_stats *hs;
> @@ -935,7 +935,7 @@ i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
>  
>  static void
>  i915_gem_execbuffer_move_to_active(struct list_head *vmas,
> -				   struct intel_ring_buffer *ring)
> +				   struct intel_engine *ring)
>  {
>  	struct i915_vma *vma;
>  
> @@ -970,7 +970,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>  static void
>  i915_gem_execbuffer_retire_commands(struct drm_device *dev,
>  				    struct drm_file *file,
> -				    struct intel_ring_buffer *ring,
> +				    struct intel_engine *ring,
>  				    struct drm_i915_gem_object *obj)
>  {
>  	/* Unconditionally force add_request to emit a full flush. */
> @@ -982,7 +982,7 @@ i915_gem_execbuffer_retire_commands(struct drm_device *dev,
>  
>  static int
>  i915_reset_gen7_sol_offsets(struct drm_device *dev,
> -			    struct intel_ring_buffer *ring)
> +			    struct intel_engine *ring)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	int ret, i;
> @@ -1048,7 +1048,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	struct eb_vmas *eb;
>  	struct drm_i915_gem_object *batch_obj;
>  	struct drm_clip_rect *cliprects = NULL;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	struct i915_hw_context *ctx;
>  	struct i915_address_space *vm;
>  	const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 1dff805..31b58ee 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -207,7 +207,7 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr,
>  }
>  
>  /* Broadwell Page Directory Pointer Descriptors */
> -static int gen8_write_pdp(struct intel_ring_buffer *ring, unsigned entry,
> +static int gen8_write_pdp(struct intel_engine *ring, unsigned entry,
>  			   uint64_t val, bool synchronous)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> @@ -237,7 +237,7 @@ static int gen8_write_pdp(struct intel_ring_buffer *ring, unsigned entry,
>  }
>  
>  static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
> -			  struct intel_ring_buffer *ring,
> +			  struct intel_engine *ring,
>  			  bool synchronous)
>  {
>  	int i, ret;
> @@ -716,7 +716,7 @@ static uint32_t get_pd_offset(struct i915_hw_ppgtt *ppgtt)
>  }
>  
>  static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
> -			 struct intel_ring_buffer *ring,
> +			 struct intel_engine *ring,
>  			 bool synchronous)
>  {
>  	struct drm_device *dev = ppgtt->base.dev;
> @@ -760,7 +760,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
>  }
>  
>  static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
> -			  struct intel_ring_buffer *ring,
> +			  struct intel_engine *ring,
>  			  bool synchronous)
>  {
>  	struct drm_device *dev = ppgtt->base.dev;
> @@ -811,7 +811,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
>  }
>  
>  static int gen6_mm_switch(struct i915_hw_ppgtt *ppgtt,
> -			  struct intel_ring_buffer *ring,
> +			  struct intel_engine *ring,
>  			  bool synchronous)
>  {
>  	struct drm_device *dev = ppgtt->base.dev;
> @@ -832,7 +832,7 @@ static int gen8_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
>  {
>  	struct drm_device *dev = ppgtt->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int j, ret;
>  
>  	for_each_active_ring(ring, dev_priv, j) {
> @@ -862,7 +862,7 @@ static int gen7_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
>  {
>  	struct drm_device *dev = ppgtt->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	uint32_t ecochk, ecobits;
>  	int i;
>  
> @@ -901,7 +901,7 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
>  {
>  	struct drm_device *dev = ppgtt->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	uint32_t ecochk, gab_ctl, ecobits;
>  	int i;
>  
> @@ -1269,7 +1269,7 @@ static void undo_idling(struct drm_i915_private *dev_priv, bool interruptible)
>  void i915_check_and_clear_faults(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int i;
>  
>  	if (INTEL_INFO(dev)->gen < 6)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index cfca023..0775662 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -261,7 +261,7 @@ struct i915_hw_ppgtt {
>  
>  	int (*enable)(struct i915_hw_ppgtt *ppgtt);
>  	int (*switch_mm)(struct i915_hw_ppgtt *ppgtt,
> -			 struct intel_ring_buffer *ring,
> +			 struct intel_engine *ring,
>  			 bool synchronous);
>  	void (*debug_dump)(struct i915_hw_ppgtt *ppgtt, struct seq_file *m);
>  };
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 8f37238..0853db3 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -745,7 +745,7 @@ static void i915_gem_record_fences(struct drm_device *dev,
>  }
>  
>  static void i915_record_ring_state(struct drm_device *dev,
> -				   struct intel_ring_buffer *ring,
> +				   struct intel_engine *ring,
>  				   struct drm_i915_error_ring *ering)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -857,7 +857,7 @@ static void i915_record_ring_state(struct drm_device *dev,
>  }
>  
>  
> -static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
> +static void i915_gem_record_active_context(struct intel_engine *ring,
>  					   struct drm_i915_error_state *error,
>  					   struct drm_i915_error_ring *ering)
>  {
> @@ -884,7 +884,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
>  	int i, count;
>  
>  	for (i = 0; i < I915_NUM_RINGS; i++) {
> -		struct intel_ring_buffer *ring = &dev_priv->ring[i];
> +		struct intel_engine *ring = &dev_priv->ring[i];
>  
>  		if (!intel_ring_initialized(ring))
>  			continue;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 4a8e8cb..58c8812 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1077,7 +1077,7 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
>  }
>  
>  static void notify_ring(struct drm_device *dev,
> -			struct intel_ring_buffer *ring)
> +			struct intel_engine *ring)
>  {
>  	if (ring->obj == NULL)
>  		return;
> @@ -2111,7 +2111,7 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
>  static void i915_error_wake_up(struct drm_i915_private *dev_priv,
>  			       bool reset_completed)
>  {
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int i;
>  
>  	/*
> @@ -2544,14 +2544,14 @@ static void gen8_disable_vblank(struct drm_device *dev, int pipe)
>  }
>  
>  static u32
> -ring_last_seqno(struct intel_ring_buffer *ring)
> +ring_last_seqno(struct intel_engine *ring)
>  {
>  	return list_entry(ring->request_list.prev,
>  			  struct drm_i915_gem_request, list)->seqno;
>  }
>  
>  static bool
> -ring_idle(struct intel_ring_buffer *ring, u32 seqno)
> +ring_idle(struct intel_engine *ring, u32 seqno)
>  {
>  	return (list_empty(&ring->request_list) ||
>  		i915_seqno_passed(seqno, ring_last_seqno(ring)));
> @@ -2574,11 +2574,11 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
>  	}
>  }
>  
> -static struct intel_ring_buffer *
> -semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
> +static struct intel_engine *
> +semaphore_wait_to_signaller_ring(struct intel_engine *ring, u32 ipehr)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> -	struct intel_ring_buffer *signaller;
> +	struct intel_engine *signaller;
>  	int i;
>  
>  	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
> @@ -2606,8 +2606,8 @@ semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
>  	return NULL;
>  }
>  
> -static struct intel_ring_buffer *
> -semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
> +static struct intel_engine *
> +semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	u32 cmd, ipehr, head;
> @@ -2649,10 +2649,10 @@ semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
>  	return semaphore_wait_to_signaller_ring(ring, ipehr);
>  }
>  
> -static int semaphore_passed(struct intel_ring_buffer *ring)
> +static int semaphore_passed(struct intel_engine *ring)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> -	struct intel_ring_buffer *signaller;
> +	struct intel_engine *signaller;
>  	u32 seqno, ctl;
>  
>  	ring->hangcheck.deadlock = true;
> @@ -2671,7 +2671,7 @@ static int semaphore_passed(struct intel_ring_buffer *ring)
>  
>  static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
>  {
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int i;
>  
>  	for_each_active_ring(ring, dev_priv, i)
> @@ -2679,7 +2679,7 @@ static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
>  }
>  
>  static enum intel_ring_hangcheck_action
> -ring_stuck(struct intel_ring_buffer *ring, u64 acthd)
> +ring_stuck(struct intel_engine *ring, u64 acthd)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -2735,7 +2735,7 @@ static void i915_hangcheck_elapsed(unsigned long data)
>  {
>  	struct drm_device *dev = (struct drm_device *)data;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	int i;
>  	int busy_count = 0, rings_hung = 0;
>  	bool stuck[I915_NUM_RINGS] = { 0 };
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index b29d7b1..a4f9e62 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -326,8 +326,8 @@ TRACE_EVENT(i915_gem_evict_vm,
>  );
>  
>  TRACE_EVENT(i915_gem_ring_sync_to,
> -	    TP_PROTO(struct intel_ring_buffer *from,
> -		     struct intel_ring_buffer *to,
> +	    TP_PROTO(struct intel_engine *from,
> +		     struct intel_engine *to,
>  		     u32 seqno),
>  	    TP_ARGS(from, to, seqno),
>  
> @@ -352,7 +352,7 @@ TRACE_EVENT(i915_gem_ring_sync_to,
>  );
>  
>  TRACE_EVENT(i915_gem_ring_dispatch,
> -	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno, u32 flags),
> +	    TP_PROTO(struct intel_engine *ring, u32 seqno, u32 flags),
>  	    TP_ARGS(ring, seqno, flags),
>  
>  	    TP_STRUCT__entry(
> @@ -375,7 +375,7 @@ TRACE_EVENT(i915_gem_ring_dispatch,
>  );
>  
>  TRACE_EVENT(i915_gem_ring_flush,
> -	    TP_PROTO(struct intel_ring_buffer *ring, u32 invalidate, u32 flush),
> +	    TP_PROTO(struct intel_engine *ring, u32 invalidate, u32 flush),
>  	    TP_ARGS(ring, invalidate, flush),
>  
>  	    TP_STRUCT__entry(
> @@ -398,7 +398,7 @@ TRACE_EVENT(i915_gem_ring_flush,
>  );
>  
>  DECLARE_EVENT_CLASS(i915_gem_request,
> -	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
> +	    TP_PROTO(struct intel_engine *ring, u32 seqno),
>  	    TP_ARGS(ring, seqno),
>  
>  	    TP_STRUCT__entry(
> @@ -418,12 +418,12 @@ DECLARE_EVENT_CLASS(i915_gem_request,
>  );
>  
>  DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
> -	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
> +	    TP_PROTO(struct intel_engine *ring, u32 seqno),
>  	    TP_ARGS(ring, seqno)
>  );
>  
>  TRACE_EVENT(i915_gem_request_complete,
> -	    TP_PROTO(struct intel_ring_buffer *ring),
> +	    TP_PROTO(struct intel_engine *ring),
>  	    TP_ARGS(ring),
>  
>  	    TP_STRUCT__entry(
> @@ -443,12 +443,12 @@ TRACE_EVENT(i915_gem_request_complete,
>  );
>  
>  DEFINE_EVENT(i915_gem_request, i915_gem_request_retire,
> -	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
> +	    TP_PROTO(struct intel_engine *ring, u32 seqno),
>  	    TP_ARGS(ring, seqno)
>  );
>  
>  TRACE_EVENT(i915_gem_request_wait_begin,
> -	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
> +	    TP_PROTO(struct intel_engine *ring, u32 seqno),
>  	    TP_ARGS(ring, seqno),
>  
>  	    TP_STRUCT__entry(
> @@ -477,12 +477,12 @@ TRACE_EVENT(i915_gem_request_wait_begin,
>  );
>  
>  DEFINE_EVENT(i915_gem_request, i915_gem_request_wait_end,
> -	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
> +	    TP_PROTO(struct intel_engine *ring, u32 seqno),
>  	    TP_ARGS(ring, seqno)
>  );
>  
>  DECLARE_EVENT_CLASS(i915_ring,
> -	    TP_PROTO(struct intel_ring_buffer *ring),
> +	    TP_PROTO(struct intel_engine *ring),
>  	    TP_ARGS(ring),
>  
>  	    TP_STRUCT__entry(
> @@ -499,12 +499,12 @@ DECLARE_EVENT_CLASS(i915_ring,
>  );
>  
>  DEFINE_EVENT(i915_ring, i915_ring_wait_begin,
> -	    TP_PROTO(struct intel_ring_buffer *ring),
> +	    TP_PROTO(struct intel_engine *ring),
>  	    TP_ARGS(ring)
>  );
>  
>  DEFINE_EVENT(i915_ring, i915_ring_wait_end,
> -	    TP_PROTO(struct intel_ring_buffer *ring),
> +	    TP_PROTO(struct intel_engine *ring),
>  	    TP_ARGS(ring)
>  );
>  
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index c65e7f7..f821147 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -1944,7 +1944,7 @@ static int intel_align_height(struct drm_device *dev, int height, bool tiled)
>  int
>  intel_pin_and_fence_fb_obj(struct drm_device *dev,
>  			   struct drm_i915_gem_object *obj,
> -			   struct intel_ring_buffer *pipelined)
> +			   struct intel_engine *pipelined)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	u32 alignment;
> @@ -8424,7 +8424,7 @@ out:
>  }
>  
>  void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
> -			struct intel_ring_buffer *ring)
> +			struct intel_engine *ring)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_crtc *crtc;
> @@ -8582,7 +8582,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  	u32 flip_mask;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	int ret;
>  
>  	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
> @@ -8627,7 +8627,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  	u32 flip_mask;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	int ret;
>  
>  	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
> @@ -8669,7 +8669,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  	uint32_t pf, pipesrc;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	int ret;
>  
>  	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
> @@ -8717,7 +8717,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	uint32_t pf, pipesrc;
>  	int ret;
>  
> @@ -8762,7 +8762,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	uint32_t plane_bit = 0;
>  	int len, ret;
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index d8b540b..23b5abf 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -694,7 +694,7 @@ int intel_pch_rawclk(struct drm_device *dev);
>  int valleyview_cur_cdclk(struct drm_i915_private *dev_priv);
>  void intel_mark_busy(struct drm_device *dev);
>  void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
> -			struct intel_ring_buffer *ring);
> +			struct intel_engine *ring);
>  void intel_mark_idle(struct drm_device *dev);
>  void intel_crtc_restore_mode(struct drm_crtc *crtc);
>  void intel_crtc_update_dpms(struct drm_crtc *crtc);
> @@ -726,7 +726,7 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
>  				    struct intel_load_detect_pipe *old);
>  int intel_pin_and_fence_fb_obj(struct drm_device *dev,
>  			       struct drm_i915_gem_object *obj,
> -			       struct intel_ring_buffer *pipelined);
> +			       struct intel_engine *pipelined);
>  void intel_unpin_fb_obj(struct drm_i915_gem_object *obj);
>  struct drm_framebuffer *
>  __intel_framebuffer_create(struct drm_device *dev,
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index d8adc91..965eec1 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -213,7 +213,7 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
>  {
>  	struct drm_device *dev = overlay->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	int ret;
>  
>  	BUG_ON(overlay->last_flip_req);
> @@ -236,7 +236,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
>  {
>  	struct drm_device *dev = overlay->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	int ret;
>  
>  	BUG_ON(overlay->active);
> @@ -263,7 +263,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
>  {
>  	struct drm_device *dev = overlay->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	u32 flip_addr = overlay->flip_addr;
>  	u32 tmp;
>  	int ret;
> @@ -320,7 +320,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
>  {
>  	struct drm_device *dev = overlay->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	u32 flip_addr = overlay->flip_addr;
>  	int ret;
>  
> @@ -363,7 +363,7 @@ static int intel_overlay_recover_from_interrupt(struct intel_overlay *overlay)
>  {
>  	struct drm_device *dev = overlay->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	int ret;
>  
>  	if (overlay->last_flip_req == 0)
> @@ -389,7 +389,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
>  {
>  	struct drm_device *dev = overlay->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	int ret;
>  
>  	/* Only wait if there is actually an old frame to release to
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index acfded3..17f636e 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3379,7 +3379,7 @@ static void parse_rp_state_cap(struct drm_i915_private *dev_priv, u32 rp_state_c
>  static void gen8_enable_rps(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	uint32_t rc6_mask = 0, rp_state_cap;
>  	int unused;
>  
> @@ -3454,7 +3454,7 @@ static void gen8_enable_rps(struct drm_device *dev)
>  static void gen6_enable_rps(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	u32 rp_state_cap;
>  	u32 gt_perf_status;
>  	u32 rc6vids, pcu_mbox = 0, rc6_mask = 0;
> @@ -3783,7 +3783,7 @@ static void valleyview_cleanup_gt_powersave(struct drm_device *dev)
>  static void valleyview_enable_rps(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	u32 gtfifodbg, val, rc6_mode = 0;
>  	int i;
>  
> @@ -3914,7 +3914,7 @@ static int ironlake_setup_rc6(struct drm_device *dev)
>  static void ironlake_enable_rc6(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	bool was_interruptible;
>  	int ret;
>  
> @@ -4426,7 +4426,7 @@ EXPORT_SYMBOL_GPL(i915_gpu_lower);
>  bool i915_gpu_busy(void)
>  {
>  	struct drm_i915_private *dev_priv;
> -	struct intel_ring_buffer *ring;
> +	struct intel_engine *ring;
>  	bool ret = false;
>  	int i;
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 5d61923..4c3cc44 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -40,7 +40,7 @@
>   */
>  #define CACHELINE_BYTES 64
>  
> -static inline int ring_space(struct intel_ring_buffer *ring)
> +static inline int ring_space(struct intel_engine *ring)
>  {
>  	int space = (ring->head & HEAD_ADDR) - (ring->tail + I915_RING_FREE_SPACE);
>  	if (space < 0)
> @@ -48,13 +48,13 @@ static inline int ring_space(struct intel_ring_buffer *ring)
>  	return space;
>  }
>  
> -static bool intel_ring_stopped(struct intel_ring_buffer *ring)
> +static bool intel_ring_stopped(struct intel_engine *ring)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	return dev_priv->gpu_error.stop_rings & intel_ring_flag(ring);
>  }
>  
> -void __intel_ring_advance(struct intel_ring_buffer *ring)
> +void __intel_ring_advance(struct intel_engine *ring)
>  {
>  	ring->tail &= ring->size - 1;
>  	if (intel_ring_stopped(ring))
> @@ -63,7 +63,7 @@ void __intel_ring_advance(struct intel_ring_buffer *ring)
>  }
>  
>  static int
> -gen2_render_ring_flush(struct intel_ring_buffer *ring,
> +gen2_render_ring_flush(struct intel_engine *ring,
>  		       u32	invalidate_domains,
>  		       u32	flush_domains)
>  {
> @@ -89,7 +89,7 @@ gen2_render_ring_flush(struct intel_ring_buffer *ring,
>  }
>  
>  static int
> -gen4_render_ring_flush(struct intel_ring_buffer *ring,
> +gen4_render_ring_flush(struct intel_engine *ring,
>  		       u32	invalidate_domains,
>  		       u32	flush_domains)
>  {
> @@ -184,7 +184,7 @@ gen4_render_ring_flush(struct intel_ring_buffer *ring,
>   * really our business.  That leaves only stall at scoreboard.
>   */
>  static int
> -intel_emit_post_sync_nonzero_flush(struct intel_ring_buffer *ring)
> +intel_emit_post_sync_nonzero_flush(struct intel_engine *ring)
>  {
>  	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
>  	int ret;
> @@ -219,7 +219,7 @@ intel_emit_post_sync_nonzero_flush(struct intel_ring_buffer *ring)
>  }
>  
>  static int
> -gen6_render_ring_flush(struct intel_ring_buffer *ring,
> +gen6_render_ring_flush(struct intel_engine *ring,
>                           u32 invalidate_domains, u32 flush_domains)
>  {
>  	u32 flags = 0;
> @@ -271,7 +271,7 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
>  }
>  
>  static int
> -gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
> +gen7_render_ring_cs_stall_wa(struct intel_engine *ring)
>  {
>  	int ret;
>  
> @@ -289,7 +289,7 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>  	return 0;
>  }
>  
> -static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
> +static int gen7_ring_fbc_flush(struct intel_engine *ring, u32 value)
>  {
>  	int ret;
>  
> @@ -313,7 +313,7 @@ static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
>  }
>  
>  static int
> -gen7_render_ring_flush(struct intel_ring_buffer *ring,
> +gen7_render_ring_flush(struct intel_engine *ring,
>  		       u32 invalidate_domains, u32 flush_domains)
>  {
>  	u32 flags = 0;
> @@ -374,7 +374,7 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>  }
>  
>  static int
> -gen8_render_ring_flush(struct intel_ring_buffer *ring,
> +gen8_render_ring_flush(struct intel_engine *ring,
>  		       u32 invalidate_domains, u32 flush_domains)
>  {
>  	u32 flags = 0;
> @@ -414,14 +414,14 @@ gen8_render_ring_flush(struct intel_ring_buffer *ring,
>  
>  }
>  
> -static void ring_write_tail(struct intel_ring_buffer *ring,
> +static void ring_write_tail(struct intel_engine *ring,
>  			    u32 value)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	I915_WRITE_TAIL(ring, value);
>  }
>  
> -u64 intel_ring_get_active_head(struct intel_ring_buffer *ring)
> +u64 intel_ring_get_active_head(struct intel_engine *ring)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	u64 acthd;
> @@ -437,7 +437,7 @@ u64 intel_ring_get_active_head(struct intel_ring_buffer *ring)
>  	return acthd;
>  }
>  
> -static void ring_setup_phys_status_page(struct intel_ring_buffer *ring)
> +static void ring_setup_phys_status_page(struct intel_engine *ring)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	u32 addr;
> @@ -448,7 +448,7 @@ static void ring_setup_phys_status_page(struct intel_ring_buffer *ring)
>  	I915_WRITE(HWS_PGA, addr);
>  }
>  
> -static bool stop_ring(struct intel_ring_buffer *ring)
> +static bool stop_ring(struct intel_engine *ring)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(ring->dev);
>  
> @@ -472,7 +472,7 @@ static bool stop_ring(struct intel_ring_buffer *ring)
>  	return (I915_READ_HEAD(ring) & HEAD_ADDR) == 0;
>  }
>  
> -static int init_ring_common(struct intel_ring_buffer *ring)
> +static int init_ring_common(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -550,7 +550,7 @@ out:
>  }
>  
>  static int
> -init_pipe_control(struct intel_ring_buffer *ring)
> +init_pipe_control(struct intel_engine *ring)
>  {
>  	int ret;
>  
> @@ -591,7 +591,7 @@ err:
>  	return ret;
>  }
>  
> -static int init_render_ring(struct intel_ring_buffer *ring)
> +static int init_render_ring(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -647,7 +647,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  	return ret;
>  }
>  
> -static void render_ring_cleanup(struct intel_ring_buffer *ring)
> +static void render_ring_cleanup(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  
> @@ -663,12 +663,12 @@ static void render_ring_cleanup(struct intel_ring_buffer *ring)
>  	ring->scratch.obj = NULL;
>  }
>  
> -static int gen6_signal(struct intel_ring_buffer *signaller,
> +static int gen6_signal(struct intel_engine *signaller,
>  		       unsigned int num_dwords)
>  {
>  	struct drm_device *dev = signaller->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *useless;
> +	struct intel_engine *useless;
>  	int i, ret;
>  
>  	/* NB: In order to be able to do semaphore MBOX updates for varying
> @@ -713,7 +713,7 @@ static int gen6_signal(struct intel_ring_buffer *signaller,
>   * This acts like a signal in the canonical semaphore.
>   */
>  static int
> -gen6_add_request(struct intel_ring_buffer *ring)
> +gen6_add_request(struct intel_engine *ring)
>  {
>  	int ret;
>  
> @@ -745,8 +745,8 @@ static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
>   * @seqno - seqno which the waiter will block on
>   */
>  static int
> -gen6_ring_sync(struct intel_ring_buffer *waiter,
> -	       struct intel_ring_buffer *signaller,
> +gen6_ring_sync(struct intel_engine *waiter,
> +	       struct intel_engine *signaller,
>  	       u32 seqno)
>  {
>  	u32 dw1 = MI_SEMAPHORE_MBOX |
> @@ -794,7 +794,7 @@ do {									\
>  } while (0)
>  
>  static int
> -pc_render_add_request(struct intel_ring_buffer *ring)
> +pc_render_add_request(struct intel_engine *ring)
>  {
>  	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
>  	int ret;
> @@ -842,7 +842,7 @@ pc_render_add_request(struct intel_ring_buffer *ring)
>  }
>  
>  static u32
> -gen6_ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
> +gen6_ring_get_seqno(struct intel_engine *ring, bool lazy_coherency)
>  {
>  	/* Workaround to force correct ordering between irq and seqno writes on
>  	 * ivb (and maybe also on snb) by reading from a CS register (like
> @@ -856,31 +856,31 @@ gen6_ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
>  }
>  
>  static u32
> -ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
> +ring_get_seqno(struct intel_engine *ring, bool lazy_coherency)
>  {
>  	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
>  }
>  
>  static void
> -ring_set_seqno(struct intel_ring_buffer *ring, u32 seqno)
> +ring_set_seqno(struct intel_engine *ring, u32 seqno)
>  {
>  	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
>  }
>  
>  static u32
> -pc_render_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
> +pc_render_get_seqno(struct intel_engine *ring, bool lazy_coherency)
>  {
>  	return ring->scratch.cpu_page[0];
>  }
>  
>  static void
> -pc_render_set_seqno(struct intel_ring_buffer *ring, u32 seqno)
> +pc_render_set_seqno(struct intel_engine *ring, u32 seqno)
>  {
>  	ring->scratch.cpu_page[0] = seqno;
>  }
>  
>  static bool
> -gen5_ring_get_irq(struct intel_ring_buffer *ring)
> +gen5_ring_get_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -898,7 +898,7 @@ gen5_ring_get_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static void
> -gen5_ring_put_irq(struct intel_ring_buffer *ring)
> +gen5_ring_put_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -911,7 +911,7 @@ gen5_ring_put_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static bool
> -i9xx_ring_get_irq(struct intel_ring_buffer *ring)
> +i9xx_ring_get_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -932,7 +932,7 @@ i9xx_ring_get_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static void
> -i9xx_ring_put_irq(struct intel_ring_buffer *ring)
> +i9xx_ring_put_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -948,7 +948,7 @@ i9xx_ring_put_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static bool
> -i8xx_ring_get_irq(struct intel_ring_buffer *ring)
> +i8xx_ring_get_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -969,7 +969,7 @@ i8xx_ring_get_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static void
> -i8xx_ring_put_irq(struct intel_ring_buffer *ring)
> +i8xx_ring_put_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -984,7 +984,7 @@ i8xx_ring_put_irq(struct intel_ring_buffer *ring)
>  	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
>  }
>  
> -void intel_ring_setup_status_page(struct intel_ring_buffer *ring)
> +void intel_ring_setup_status_page(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> @@ -1047,7 +1047,7 @@ void intel_ring_setup_status_page(struct intel_ring_buffer *ring)
>  }
>  
>  static int
> -bsd_ring_flush(struct intel_ring_buffer *ring,
> +bsd_ring_flush(struct intel_engine *ring,
>  	       u32     invalidate_domains,
>  	       u32     flush_domains)
>  {
> @@ -1064,7 +1064,7 @@ bsd_ring_flush(struct intel_ring_buffer *ring,
>  }
>  
>  static int
> -i9xx_add_request(struct intel_ring_buffer *ring)
> +i9xx_add_request(struct intel_engine *ring)
>  {
>  	int ret;
>  
> @@ -1082,7 +1082,7 @@ i9xx_add_request(struct intel_ring_buffer *ring)
>  }
>  
>  static bool
> -gen6_ring_get_irq(struct intel_ring_buffer *ring)
> +gen6_ring_get_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1107,7 +1107,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static void
> -gen6_ring_put_irq(struct intel_ring_buffer *ring)
> +gen6_ring_put_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1125,7 +1125,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static bool
> -hsw_vebox_get_irq(struct intel_ring_buffer *ring)
> +hsw_vebox_get_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1145,7 +1145,7 @@ hsw_vebox_get_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static void
> -hsw_vebox_put_irq(struct intel_ring_buffer *ring)
> +hsw_vebox_put_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1163,7 +1163,7 @@ hsw_vebox_put_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static bool
> -gen8_ring_get_irq(struct intel_ring_buffer *ring)
> +gen8_ring_get_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1189,7 +1189,7 @@ gen8_ring_get_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static void
> -gen8_ring_put_irq(struct intel_ring_buffer *ring)
> +gen8_ring_put_irq(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1209,7 +1209,7 @@ gen8_ring_put_irq(struct intel_ring_buffer *ring)
>  }
>  
>  static int
> -i965_dispatch_execbuffer(struct intel_ring_buffer *ring,
> +i965_dispatch_execbuffer(struct intel_engine *ring,
>  			 u64 offset, u32 length,
>  			 unsigned flags)
>  {
> @@ -1232,7 +1232,7 @@ i965_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  /* Just userspace ABI convention to limit the wa batch bo to a resonable size */
>  #define I830_BATCH_LIMIT (256*1024)
>  static int
> -i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
> +i830_dispatch_execbuffer(struct intel_engine *ring,
>  				u64 offset, u32 len,
>  				unsigned flags)
>  {
> @@ -1283,7 +1283,7 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  }
>  
>  static int
> -i915_dispatch_execbuffer(struct intel_ring_buffer *ring,
> +i915_dispatch_execbuffer(struct intel_engine *ring,
>  			 u64 offset, u32 len,
>  			 unsigned flags)
>  {
> @@ -1300,7 +1300,7 @@ i915_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  	return 0;
>  }
>  
> -static void cleanup_status_page(struct intel_ring_buffer *ring)
> +static void cleanup_status_page(struct intel_engine *ring)
>  {
>  	struct drm_i915_gem_object *obj;
>  
> @@ -1314,7 +1314,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
>  	ring->status_page.obj = NULL;
>  }
>  
> -static int init_status_page(struct intel_ring_buffer *ring)
> +static int init_status_page(struct intel_engine *ring)
>  {
>  	struct drm_i915_gem_object *obj;
>  
> @@ -1351,7 +1351,7 @@ err_unref:
>  	return 0;
>  }
>  
> -static int init_phys_status_page(struct intel_ring_buffer *ring)
> +static int init_phys_status_page(struct intel_engine *ring)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  
> @@ -1368,7 +1368,7 @@ static int init_phys_status_page(struct intel_ring_buffer *ring)
>  	return 0;
>  }
>  
> -void intel_destroy_ring_buffer(struct intel_ring_buffer *ring)
> +void intel_destroy_ring_buffer(struct intel_engine *ring)
>  {
>  	if (!ring->obj)
>  		return;
> @@ -1379,7 +1379,7 @@ void intel_destroy_ring_buffer(struct intel_ring_buffer *ring)
>  	ring->obj = NULL;
>  }
>  
> -int intel_allocate_ring_buffer(struct intel_ring_buffer *ring)
> +int intel_allocate_ring_buffer(struct intel_engine *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> @@ -1424,7 +1424,7 @@ err_unref:
>  }
>  
>  static int intel_init_ring_buffer(struct drm_device *dev,
> -				  struct intel_ring_buffer *ring)
> +				  struct intel_engine *ring)
>  {
>  	int ret;
>  
> @@ -1465,7 +1465,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>  	return ring->init(ring);
>  }
>  
> -void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
> +void intel_cleanup_ring_buffer(struct intel_engine *ring)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(ring->dev);
>  
> @@ -1485,7 +1485,7 @@ void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
>  	cleanup_status_page(ring);
>  }
>  
> -static int intel_ring_wait_request(struct intel_ring_buffer *ring, int n)
> +static int intel_ring_wait_request(struct intel_engine *ring, int n)
>  {
>  	struct drm_i915_gem_request *request;
>  	u32 seqno = 0, tail;
> @@ -1538,7 +1538,7 @@ static int intel_ring_wait_request(struct intel_ring_buffer *ring, int n)
>  	return 0;
>  }
>  
> -static int ring_wait_for_space(struct intel_ring_buffer *ring, int n)
> +static int ring_wait_for_space(struct intel_engine *ring, int n)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1586,7 +1586,7 @@ static int ring_wait_for_space(struct intel_ring_buffer *ring, int n)
>  	return -EBUSY;
>  }
>  
> -static int intel_wrap_ring_buffer(struct intel_ring_buffer *ring)
> +static int intel_wrap_ring_buffer(struct intel_engine *ring)
>  {
>  	uint32_t __iomem *virt;
>  	int rem = ring->size - ring->tail;
> @@ -1608,7 +1608,7 @@ static int intel_wrap_ring_buffer(struct intel_ring_buffer *ring)
>  	return 0;
>  }
>  
> -int intel_ring_idle(struct intel_ring_buffer *ring)
> +int intel_ring_idle(struct intel_engine *ring)
>  {
>  	u32 seqno;
>  	int ret;
> @@ -1632,7 +1632,7 @@ int intel_ring_idle(struct intel_ring_buffer *ring)
>  }
>  
>  static int
> -intel_ring_alloc_seqno(struct intel_ring_buffer *ring)
> +intel_ring_alloc_seqno(struct intel_engine *ring)
>  {
>  	if (ring->outstanding_lazy_seqno)
>  		return 0;
> @@ -1650,7 +1650,7 @@ intel_ring_alloc_seqno(struct intel_ring_buffer *ring)
>  	return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno);
>  }
>  
> -static int __intel_ring_prepare(struct intel_ring_buffer *ring,
> +static int __intel_ring_prepare(struct intel_engine *ring,
>  				int bytes)
>  {
>  	int ret;
> @@ -1670,7 +1670,7 @@ static int __intel_ring_prepare(struct intel_ring_buffer *ring,
>  	return 0;
>  }
>  
> -int intel_ring_begin(struct intel_ring_buffer *ring,
> +int intel_ring_begin(struct intel_engine *ring,
>  		     int num_dwords)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> @@ -1695,7 +1695,7 @@ int intel_ring_begin(struct intel_ring_buffer *ring,
>  }
>  
>  /* Align the ring tail to a cacheline boundary */
> -int intel_ring_cacheline_align(struct intel_ring_buffer *ring)
> +int intel_ring_cacheline_align(struct intel_engine *ring)
>  {
>  	int num_dwords = (ring->tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t);
>  	int ret;
> @@ -1716,7 +1716,7 @@ int intel_ring_cacheline_align(struct intel_ring_buffer *ring)
>  	return 0;
>  }
>  
> -void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno)
> +void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  
> @@ -1733,7 +1733,7 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno)
>  	ring->hangcheck.seqno = seqno;
>  }
>  
> -static void gen6_bsd_ring_write_tail(struct intel_ring_buffer *ring,
> +static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
>  				     u32 value)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> @@ -1766,7 +1766,7 @@ static void gen6_bsd_ring_write_tail(struct intel_ring_buffer *ring,
>  		   _MASKED_BIT_DISABLE(GEN6_BSD_SLEEP_MSG_DISABLE));
>  }
>  
> -static int gen6_bsd_ring_flush(struct intel_ring_buffer *ring,
> +static int gen6_bsd_ring_flush(struct intel_engine *ring,
>  			       u32 invalidate, u32 flush)
>  {
>  	uint32_t cmd;
> @@ -1802,7 +1802,7 @@ static int gen6_bsd_ring_flush(struct intel_ring_buffer *ring,
>  }
>  
>  static int
> -gen8_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
> +gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
>  			      u64 offset, u32 len,
>  			      unsigned flags)
>  {
> @@ -1826,7 +1826,7 @@ gen8_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  }
>  
>  static int
> -hsw_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
> +hsw_ring_dispatch_execbuffer(struct intel_engine *ring,
>  			      u64 offset, u32 len,
>  			      unsigned flags)
>  {
> @@ -1847,7 +1847,7 @@ hsw_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  }
>  
>  static int
> -gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
> +gen6_ring_dispatch_execbuffer(struct intel_engine *ring,
>  			      u64 offset, u32 len,
>  			      unsigned flags)
>  {
> @@ -1869,7 +1869,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  
>  /* Blitter support (SandyBridge+) */
>  
> -static int gen6_ring_flush(struct intel_ring_buffer *ring,
> +static int gen6_ring_flush(struct intel_engine *ring,
>  			   u32 invalidate, u32 flush)
>  {
>  	struct drm_device *dev = ring->dev;
> @@ -1912,7 +1912,7 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
>  int intel_init_render_ring_buffer(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  
>  	if (INTEL_INFO(dev)->gen >= 6) {
>  		ring->add_request = gen6_add_request;
> @@ -2018,7 +2018,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>  int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +	struct intel_engine *ring = &dev_priv->ring[RCS];
>  	int ret;
>  
>  	if (INTEL_INFO(dev)->gen >= 6) {
> @@ -2081,7 +2081,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
>  int intel_init_bsd_ring_buffer(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[VCS];
> +	struct intel_engine *ring = &dev_priv->ring[VCS];
>  
>  	ring->write_tail = ring_write_tail;
>  	if (INTEL_INFO(dev)->gen >= 6) {
> @@ -2152,7 +2152,7 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
>  int intel_init_bsd2_ring_buffer(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[VCS2];
> +	struct intel_engine *ring = &dev_priv->ring[VCS2];
>  
>  	if ((INTEL_INFO(dev)->gen != 8)) {
>  		DRM_ERROR("No dual-BSD ring on non-BDW machine\n");
> @@ -2196,7 +2196,7 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
>  int intel_init_blt_ring_buffer(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[BCS];
> +	struct intel_engine *ring = &dev_priv->ring[BCS];
>  
>  	ring->write_tail = ring_write_tail;
>  	ring->flush = gen6_ring_flush;
> @@ -2241,7 +2241,7 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
>  int intel_init_vebox_ring_buffer(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = &dev_priv->ring[VECS];
> +	struct intel_engine *ring = &dev_priv->ring[VECS];
>  
>  	ring->write_tail = ring_write_tail;
>  	ring->flush = gen6_ring_flush;
> @@ -2279,7 +2279,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
>  }
>  
>  int
> -intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
> +intel_ring_flush_all_caches(struct intel_engine *ring)
>  {
>  	int ret;
>  
> @@ -2297,7 +2297,7 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
>  }
>  
>  int
> -intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
> +intel_ring_invalidate_all_caches(struct intel_engine *ring)
>  {
>  	uint32_t flush_domains;
>  	int ret;
> @@ -2317,7 +2317,7 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
>  }
>  
>  void
> -intel_stop_ring_buffer(struct intel_ring_buffer *ring)
> +intel_stop_ring_buffer(struct intel_engine *ring)
>  {
>  	int ret;
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 680e451..50cc525 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -54,7 +54,7 @@ struct intel_ring_hangcheck {
>  	bool deadlock;
>  };
>  
> -struct  intel_ring_buffer {
> +struct intel_engine {
>  	const char	*name;
>  	enum intel_ring_id {
>  		RCS = 0x0,
> @@ -90,33 +90,33 @@ struct  intel_ring_buffer {
>  	unsigned irq_refcount; /* protected by dev_priv->irq_lock */
>  	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
>  	u32		trace_irq_seqno;
> -	bool __must_check (*irq_get)(struct intel_ring_buffer *ring);
> -	void		(*irq_put)(struct intel_ring_buffer *ring);
> +	bool __must_check (*irq_get)(struct intel_engine *ring);
> +	void		(*irq_put)(struct intel_engine *ring);
>  
> -	int		(*init)(struct intel_ring_buffer *ring);
> +	int		(*init)(struct intel_engine *ring);
>  
> -	void		(*write_tail)(struct intel_ring_buffer *ring,
> +	void		(*write_tail)(struct intel_engine *ring,
>  				      u32 value);
> -	int __must_check (*flush)(struct intel_ring_buffer *ring,
> +	int __must_check (*flush)(struct intel_engine *ring,
>  				  u32	invalidate_domains,
>  				  u32	flush_domains);
> -	int		(*add_request)(struct intel_ring_buffer *ring);
> +	int		(*add_request)(struct intel_engine *ring);
>  	/* Some chipsets are not quite as coherent as advertised and need
>  	 * an expensive kick to force a true read of the up-to-date seqno.
>  	 * However, the up-to-date seqno is not always required and the last
>  	 * seen value is good enough. Note that the seqno will always be
>  	 * monotonic, even if not coherent.
>  	 */
> -	u32		(*get_seqno)(struct intel_ring_buffer *ring,
> +	u32		(*get_seqno)(struct intel_engine *ring,
>  				     bool lazy_coherency);
> -	void		(*set_seqno)(struct intel_ring_buffer *ring,
> +	void		(*set_seqno)(struct intel_engine *ring,
>  				     u32 seqno);
> -	int		(*dispatch_execbuffer)(struct intel_ring_buffer *ring,
> +	int		(*dispatch_execbuffer)(struct intel_engine *ring,
>  					       u64 offset, u32 length,
>  					       unsigned flags);
>  #define I915_DISPATCH_SECURE 0x1
>  #define I915_DISPATCH_PINNED 0x2
> -	void		(*cleanup)(struct intel_ring_buffer *ring);
> +	void		(*cleanup)(struct intel_engine *ring);
>  
>  	struct {
>  		u32	sync_seqno[I915_NUM_RINGS-1];
> @@ -129,10 +129,10 @@ struct  intel_ring_buffer {
>  		} mbox;
>  
>  		/* AKA wait() */
> -		int	(*sync_to)(struct intel_ring_buffer *ring,
> -				   struct intel_ring_buffer *to,
> +		int	(*sync_to)(struct intel_engine *ring,
> +				   struct intel_engine *to,
>  				   u32 seqno);
> -		int	(*signal)(struct intel_ring_buffer *signaller,
> +		int	(*signal)(struct intel_engine *signaller,
>  				  /* num_dwords needed by caller */
>  				  unsigned int num_dwords);
>  	} semaphore;
> @@ -210,20 +210,20 @@ struct  intel_ring_buffer {
>  };
>  
>  static inline bool
> -intel_ring_initialized(struct intel_ring_buffer *ring)
> +intel_ring_initialized(struct intel_engine *ring)
>  {
>  	return ring->obj != NULL;
>  }
>  
>  static inline unsigned
> -intel_ring_flag(struct intel_ring_buffer *ring)
> +intel_ring_flag(struct intel_engine *ring)
>  {
>  	return 1 << ring->id;
>  }
>  
>  static inline u32
> -intel_ring_sync_index(struct intel_ring_buffer *ring,
> -		      struct intel_ring_buffer *other)
> +intel_ring_sync_index(struct intel_engine *ring,
> +		      struct intel_engine *other)
>  {
>  	int idx;
>  
> @@ -241,7 +241,7 @@ intel_ring_sync_index(struct intel_ring_buffer *ring,
>  }
>  
>  static inline u32
> -intel_read_status_page(struct intel_ring_buffer *ring,
> +intel_read_status_page(struct intel_engine *ring,
>  		       int reg)
>  {
>  	/* Ensure that the compiler doesn't optimize away the load. */
> @@ -250,7 +250,7 @@ intel_read_status_page(struct intel_ring_buffer *ring,
>  }
>  
>  static inline void
> -intel_write_status_page(struct intel_ring_buffer *ring,
> +intel_write_status_page(struct intel_engine *ring,
>  			int reg, u32 value)
>  {
>  	ring->status_page.page_addr[reg] = value;
> @@ -275,27 +275,27 @@ intel_write_status_page(struct intel_ring_buffer *ring,
>  #define I915_GEM_HWS_SCRATCH_INDEX	0x30
>  #define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
>  
> -void intel_stop_ring_buffer(struct intel_ring_buffer *ring);
> -void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring);
> +void intel_stop_ring_buffer(struct intel_engine *ring);
> +void intel_cleanup_ring_buffer(struct intel_engine *ring);
>  
> -int __must_check intel_ring_begin(struct intel_ring_buffer *ring, int n);
> -int __must_check intel_ring_cacheline_align(struct intel_ring_buffer *ring);
> -static inline void intel_ring_emit(struct intel_ring_buffer *ring,
> +int __must_check intel_ring_begin(struct intel_engine *ring, int n);
> +int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
> +static inline void intel_ring_emit(struct intel_engine *ring,
>  				   u32 data)
>  {
>  	iowrite32(data, ring->virtual_start + ring->tail);
>  	ring->tail += 4;
>  }
> -static inline void intel_ring_advance(struct intel_ring_buffer *ring)
> +static inline void intel_ring_advance(struct intel_engine *ring)
>  {
>  	ring->tail &= ring->size - 1;
>  }
> -void __intel_ring_advance(struct intel_ring_buffer *ring);
> +void __intel_ring_advance(struct intel_engine *ring);
>  
> -int __must_check intel_ring_idle(struct intel_ring_buffer *ring);
> -void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno);
> -int intel_ring_flush_all_caches(struct intel_ring_buffer *ring);
> -int intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring);
> +int __must_check intel_ring_idle(struct intel_engine *ring);
> +void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
> +int intel_ring_flush_all_caches(struct intel_engine *ring);
> +int intel_ring_invalidate_all_caches(struct intel_engine *ring);
>  
>  void intel_init_rings_early(struct drm_device *dev);
>  int intel_init_render_ring_buffer(struct drm_device *dev);
> @@ -304,24 +304,24 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev);
>  int intel_init_blt_ring_buffer(struct drm_device *dev);
>  int intel_init_vebox_ring_buffer(struct drm_device *dev);
>  
> -u64 intel_ring_get_active_head(struct intel_ring_buffer *ring);
> -void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
> +u64 intel_ring_get_active_head(struct intel_engine *ring);
> +void intel_ring_setup_status_page(struct intel_engine *ring);
>  
> -void intel_destroy_ring_buffer(struct intel_ring_buffer *ring);
> -int intel_allocate_ring_buffer(struct intel_ring_buffer *ring);
> +void intel_destroy_ring_buffer(struct intel_engine *ring);
> +int intel_allocate_ring_buffer(struct intel_engine *ring);
>  
> -static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
> +static inline u32 intel_ring_get_tail(struct intel_engine *ring)
>  {
>  	return ring->tail;
>  }
>  
> -static inline u32 intel_ring_get_seqno(struct intel_ring_buffer *ring)
> +static inline u32 intel_ring_get_seqno(struct intel_engine *ring)
>  {
>  	BUG_ON(ring->outstanding_lazy_seqno == 0);
>  	return ring->outstanding_lazy_seqno;
>  }
>  
> -static inline void i915_trace_irq_get(struct intel_ring_buffer *ring, u32 seqno)
> +static inline void i915_trace_irq_get(struct intel_engine *ring, u32 seqno)
>  {
>  	if (ring->trace_irq_seqno == 0 && ring->irq_get(ring))
>  		ring->trace_irq_seqno = seqno;
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 26/50] drm/i915/bdw: Allow non-default, non-render, user-created LRCs
  2014-05-09 12:08 ` [PATCH 26/50] drm/i915/bdw: Allow non-default, non-render, " oscar.mateo
@ 2014-05-13 13:35   ` Daniel Vetter
  2014-05-14 11:38     ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-05-13 13:35 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx

On Fri, May 09, 2014 at 01:08:56PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> This commit changes the ABI, so it is provided separately so that it can be
> dropped by the maintainer if he so wishes.
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>

This looks eerily like a patch from full-ppgtt that I've reverted,
originally authored by Ben ;-) Anyway, now that we have per-file contexts I
think we can forgo this for now, since every open fd will get its own
per-ring implicit context for blt, vcs and vecs anyway.
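
What per-file, per-ring implicit contexts buy us, roughly (a sketch; the
field names here are from memory and may not match the series exactly):

	/* on drm fd open, each file gets a default context ... */
	struct i915_hw_context *ctx = file_priv->private_default_ctx;

	/* ... which, as of this series, carries one backing object (and
	 * one ringbuffer) per engine */
	struct drm_i915_gem_object *state = ctx->engine[ring->id].obj;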

And if we have a need for more contexts on those engines the usual rules
apply: I need a fully enabled open-source userspace implementation using
this new capability.
-Daniel

> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index a656b48..0a944c2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -85,12 +85,6 @@ gen8_gem_validate_context(struct drm_device *dev, struct drm_file *file,
>  	struct i915_hw_context *ctx = NULL;
>  	struct i915_ctx_hang_stats *hs;
>  
> -	/* There is no reason why we cannot accept non-default, non-render contexts,
> -	 * other than it changes the ABI (these kind of custom contexts have not been
> -	 * allowed before) */
> -	if (ring->id != RCS && ctx_id != DEFAULT_CONTEXT_ID)
> -		return ERR_PTR(-EINVAL);
> -
>  	ctx = i915_gem_context_get(file->driver_priv, ctx_id);
>  	if (IS_ERR(ctx))
>  		return ctx;
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 04/50] drm/i915: Extract trivial parts of ring init (early init)
  2014-05-13 13:26   ` Daniel Vetter
@ 2014-05-13 13:47     ` Chris Wilson
  2014-05-14 11:53     ` Mateo Lozano, Oscar
  1 sibling, 0 replies; 94+ messages in thread
From: Chris Wilson @ 2014-05-13 13:47 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Ben Widawsky, Ben Widawsky

On Tue, May 13, 2014 at 03:26:51PM +0200, Daniel Vetter wrote:
> On Fri, May 09, 2014 at 01:08:34PM +0100, oscar.mateo@intel.com wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > index 2d81985..8f37238 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > @@ -886,7 +886,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
> >  	for (i = 0; i < I915_NUM_RINGS; i++) {
> >  		struct intel_ring_buffer *ring = &dev_priv->ring[i];
> >  
> > -		if (ring->dev == NULL)
> > +		if (!intel_ring_initialized(ring))
> >  			continue;

Besides, this was deliberately written not to use
intel_ring_initialized().
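
The helper tests whether the ringbuffer object has been allocated, which
is not quite the question this loop is asking; roughly (per the header
change quoted earlier in the thread):

	static inline bool
	intel_ring_initialized(struct intel_engine *ring)
	{
		return ring->obj != NULL;	/* ringbuffer object exists */
	}

	/* vs. the open-coded capture check: was the engine set up at all? */
	if (ring->dev == NULL)
		continue;

The two can disagree, e.g. for an engine whose dev pointer was set during
setup but whose buffer allocation failed.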
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 00/50] Execlists v2
  2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
                   ` (50 preceding siblings ...)
  2014-05-12 17:04 ` [PATCH 49.1/50] drm/i915/bdw: Do not call intel_runtime_pm_get() in an interrupt oscar.mateo
@ 2014-05-13 13:48 ` Daniel Vetter
  51 siblings, 0 replies; 94+ messages in thread
From: Daniel Vetter @ 2014-05-13 13:48 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx

On Fri, May 09, 2014 at 01:08:30PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> For a description of this patchset, please check the previous cover letter [1].
> 
> Together with this patchset, I'm also submitting an IGT test: gem_execlist [2].
> 
> v2:
> - Use the same context struct for all the different engines (suggested by Brad Volkin).
> - Rename write_tail to submit (suggested by Brad).
> - Simplify hardware submission id creation by using LRCA[31:11] as hwCtxId[18:0].
> - Non-render contexts are only two pages long (suggested by Damien Lespiau).
> - Disable HWSTAM, as no one listens to it anyway (suggested by Damien).
> - Do not write PDPs in the context every time, doing it at context creation time is enough.
> - Various kmalloc changes in gen8_switch_context_queue (suggested by Damien).
> - Module parameter to disable Execlists (as per Damien's patches).
> - Update the HW read pointer in CONTEXT_STATUS_PTR (suggested by Damien).
> - Fixed gpu reset and basic error reporting (verified by the new gem_error_capture test).
> - Fix for unexpected full preemption in some scenarios (instead of lite restore).
> - Ack the context switch interrupts as soon as possible (fix by Bob Beckett).
> - Move default context backing object creation to intel_init_ring.
> - Take into account the second BSD ring.
> - Help out the ctx switch interrupt handler by sharing the burden of squashing requests
>   together.
> 
> What I haven't done in this release:
> 
> - Get the context sizes from the CXT_SIZE registers, as suggested by Damien: the BSpec is full 
>   of holes with regards to the various CXT_SIZE registers, but the hardcoded values seem pretty
>   clear.
> - Allocate the ringbuffer together with the context, as suggested by Damien: now that every
>   context has NUM_RINGS ringbuffers on it, the advantage of this is not clear anymore.
> - Damien pointed out that we are missing the RS context restore, but I don't see any RS values
>   that are needed on the first execution (the first save should take care of these).
> - I have added a comment to clarify how the context population takes place (MI_LOAD_REGISTER_IMM
>   plus <reg,value> pairs) but I haven't provided names for each position (as Jeff Mcgee suggested)
>   or created an OUT_BATCH_REG_WRITE(reg, value) (as Daniel Vetter suggested).
> 
> [1]
> http://lists.freedesktop.org/archives/intel-gfx/2014-March/042563.html
> [2]
> http://lists.freedesktop.org/archives/intel-gfx/2014-May/044846.html

I've done a very cursory read of this, and my original comment from my
high-level review on the internal list still stands: I'm freaked out by
how invasively this reaches into the existing ring code. All the changes
in i915_dma.c look very suspicious, since that code is for the legacy ums
crap and will _never_ run on bdw - nor even on anything more modern than
g4x platforms (gen4).

Apparently that review has been lost/ignored, so I'll quote it in full:

"In reading through your patches the big thing which jumped out
is how the new execlist code is intermingled with the old legacy gen8
framebuffer stuff. Imo those two modes don't match at all, and we need to
resolve this mismatch with another abstraction layer ;-)

"I'm thinking of dev_priv->gt.do_execbuf which takes care of all the
lower-level request tracking and command submission. Feel free to massively
bikeshed the name. I'm thinking that we should move everything from
i915_gem_execbuffer_move_to_gpu to i915_gem_execbuffer_retire_commands into
that callback. With the current code that'd include the active list tracking,
maybe we should move that part out again. Otoh if we go wild with scheduling
and preemption, active bo tracking _will_ be rather different from previous
platforms. To support execlist we might need some more vfuncs in the
ringbuffer struct to support execlist specific stuff (submit execlist, enable
context switch interrupts), but a lot of the existing stuff will be redundant.
At the end (once things settles) we should then document which kind of
do_execbuf uses which kinds of low-level ring interfaces.

"With that abstraction:
- We can separate gen8 execlist from legacy gen8 code, and so should avoid
regressions (and so blocking mesa).
- Play around with different execlist approaches (guc, deferred execution,
whatever, ...) since it'll just be different copy&pasta.

"Maybe we also need a similar abstraction for a wait_seqno/request interface.
But since has/fulsim can't simulate interrupts properly that discussion is a
bit moot for now.

"Finally I think our immediate focus for execlist enabling should be to get
multi-context execlists going, so that we can validate whether that'll work
together with mesa/hw contexts. If it doesn't, not much point in bothering.
The simplest way is to just block in the ->do_execbuf callback if we can't
submit the new context right away. It'll suck a bit perf-wise, but will get
the hw going.

"In summary execlists looks like a big&invasive feature. My aim here with those
ideas is purely risk mitigation: I want to avoid committing resources for a
(potentially) dead-end design, while still giving us enough flexibility to do
the necessary prototyping to figure out the right answers."

That mail was from Mar 25th, 2013.
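
To make the do_execbuf idea concrete, a minimal sketch (every name below
is a placeholder for illustration, not code from the series):

	struct i915_gt_funcs {
		/* covers everything from move_to_gpu to retire_commands */
		int (*do_execbuf)(struct drm_device *dev,
				  struct drm_file *file,
				  struct intel_engine *ring,
				  struct i915_hw_context *ctx,
				  struct drm_i915_gem_execbuffer2 *args,
				  struct drm_i915_gem_object *batch_obj);
	};

	/* chosen once at init time, e.g. off the execlists module option */
	if (i915.enable_execlists)
		dev_priv->gt.do_execbuf = gen8_execlists_do_execbuf;
	else
		dev_priv->gt.do_execbuf = i915_gem_ringbuffer_do_execbuf;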

So essentially what I'd prefer is that we keep all the existing ringbuffer
code as-is, and throw in a complete new set (with fairly new
datastructures) for execlists. The only interaction points would be:
- Ring init either calls into legacy ring init or new fancy execlist ring
  init.
- Execbuf calls ring->do_submit with ring/engine, ctx object, batch state
  and otherwise doesn't care one bit how it will all get submitted.
- Context state needs to be frobbed a bit so that we create the correct
  backing object (i.e. legacy hw state or execlist ring+ctx). To make this
  feasible it's probably best to switch the implicit per-fd ctx to be
  per-ring. That way we still have the fixed hw-context->ring/engine
  relationship and don't need to play tricks with lazy context allocation
  (because those beasts are so big with execlists).

Yes, this means that a bunch of things, e.g. seqno emission and flushing
in intel_ringbuffer.c, will be duplicated into intel_lrc.c. But imo that's
a feature, not a bug - I would be massively surprised if there aren't
subtle differences here we need to be able to cope with.
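
For the ring init interaction point, I mean something along these lines
(again a sketch, with placeholder function names):

	int intel_init_render_ring_buffer(struct drm_device *dev)
	{
		struct drm_i915_private *dev_priv = dev->dev_private;
		struct intel_engine *ring = &dev_priv->ring[RCS];

		if (i915.enable_execlists)
			/* the complete new set, living in intel_lrc.c */
			return logical_render_ring_init(dev, ring);

		/* the legacy setup stays exactly as it is today */
		return legacy_render_ring_init(dev, ring);
	}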

Cheers, Daniel
> 
> Ben Widawsky (13):
>   drm/i915: s/for_each_ring/for_each_active_ring
>   drm/i915: for_each_ring
>   drm/i915: Extract trivial parts of ring init (early init)
>   drm/i915/bdw: Macro and module parameter for LRCs (Logical Ring
>     Contexts)
>   drm/i915/bdw: Rework init code for Logical Ring Contexts
>   drm/i915/bdw: A bit more advanced context init/fini
>   drm/i915/bdw: Populate LR contexts (somewhat)
>   drm/i915/bdw: Status page for LR contexts
>   drm/i915/bdw: Enable execlists in the hardware
>   drm/i915/bdw: LR context ring init
>   drm/i915/bdw: GEN8 new ring flush
>   drm/i915/bdw: Implement context switching (somewhat)
>   drm/i915/bdw: Print context state in debugfs
> 
> Michel Thierry (1):
>   drm/i915/bdw: Get prepared for a two-stage execlist submit process
> 
> Oscar Mateo (33):
>   drm/i915: Simplify a couple of functions thanks to for_each_ring
>   drm/i915: Extract ringbuffer destroy, make destroy & alloc outside
>     accesible
>   drm/i915: s/intel_ring_buffer/intel_engine
>   drm/i915: Split the ringbuffers and the rings
>   drm/i915: Rename functions that mention ringbuffers (meaning rings)
>   drm/i915: Plumb the context everywhere in the execbuffer path
>   drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit
>   drm/i915: Write a new set of context-aware ringbuffer management
>     functions
>   drm/i915: Final touches to ringbuffer and context plumbing and
>     refactoring
>   drm/i915: s/write_tail/submit
>   drm/i915: Introduce one context backing object per engine
>   drm/i915: Make i915_gem_create_context outside accessible
>   drm/i915: Option to skip backing object allocation during context
>     creation
>   drm/i915: Extract context backing object allocation
>   drm/i915/bdw: New file for Logical Ring Contexts and Execlists
>   drm/i915/bdw: Allocate ringbuffer backing objects for default global
>     LRC
>   drm/i915/bdw: Allocate ringbuffer for user-created LRCs
>   drm/i915/bdw: Deferred creation of user-created LRCs
>   drm/i915/bdw: Allow non-default, non-render, user-created LRCs
>   drm/i915/bdw: Execlists ring tail writing
>   drm/i915/bdw: Set the request context information correctly in the LRC
>     case
>   drm/i915/bdw: Always write seqno to default context
>   drm/i915/bdw: Write the tail pointer, LRC style
>   drm/i915/bdw: Don't write PDP in the legacy way when using LRCs
>   drm/i915/bdw: Start queueing contexts to be submitted
>   drm/i915/bdw: Display execlists info in debugfs
>   drm/i915/bdw: Display context backing obj & ringbuffer info in debugfs
>   drm/i915/bdw: Document execlists and logical ring contexts
>   drm/i915/bdw: Avoid non-lite-restore preemptions
>   drm/i915/bdw: Make sure gpu reset still works with Execlists
>   drm/i915/bdw: Make sure error capture keeps working with Execlists
>   drm/i915/bdw: Help out the ctx switch interrupt handler
>   drm/i915/bdw: Enable logical ring contexts
> 
> Thomas Daniel (3):
>   drm/i915/bdw: Add forcewake lock around ELSP writes
>   drm/i915/bdw: LR context switch interrupts
>   drm/i915/bdw: Handle context switch events
> 
>  drivers/gpu/drm/i915/Makefile              |   1 +
>  drivers/gpu/drm/i915/i915_cmd_parser.c     |  16 +-
>  drivers/gpu/drm/i915/i915_debugfs.c        | 180 ++++++-
>  drivers/gpu/drm/i915/i915_dma.c            |  48 +-
>  drivers/gpu/drm/i915/i915_drv.h            |  97 +++-
>  drivers/gpu/drm/i915/i915_gem.c            | 172 ++++---
>  drivers/gpu/drm/i915/i915_gem_context.c    | 220 +++++---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 ++--
>  drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.h        |   2 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c      |  19 +-
>  drivers/gpu/drm/i915/i915_irq.c            | 102 ++--
>  drivers/gpu/drm/i915/i915_params.c         |   6 +
>  drivers/gpu/drm/i915/i915_reg.h            |  11 +
>  drivers/gpu/drm/i915/i915_trace.h          |  26 +-
>  drivers/gpu/drm/i915/intel_display.c       |  26 +-
>  drivers/gpu/drm/i915/intel_drv.h           |   4 +-
>  drivers/gpu/drm/i915/intel_lrc.c           | 729 ++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_overlay.c       |  12 +-
>  drivers/gpu/drm/i915/intel_pm.c            |  18 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c    | 792 +++++++++++++++++++----------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    | 196 ++++---
>  22 files changed, 2107 insertions(+), 696 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/intel_lrc.c
> 
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 26/50] drm/i915/bdw: Allow non-default, non-render, user-created LRCs
  2014-05-13 13:35   ` Daniel Vetter
@ 2014-05-14 11:38     ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-14 11:38 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Tuesday, May 13, 2014 2:36 PM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 26/50] drm/i915/bdw: Allow non-default, non-
> render, user-created LRCs
> 
> On Fri, May 09, 2014 at 01:08:56PM +0100, oscar.mateo@intel.com wrote:
> > From: Oscar Mateo <oscar.mateo@intel.com>
> >
> > This commit changes the ABI, so it is provided separately so that it
> > can be dropped by the maintainer if he so wishes.
> >
> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> 
> This looks eerily like a patch from full-ppgtt that I've reverted, originally
> authored by Ben ;-) Anyway now that we have per-file contexts I think we can
> forgo this for now, since every open fd will get its own per-ring implicit context
> for blt, vcs and vecs anyway.
> 
> And if we have a need for more contexts on those engines the usual rules
> apply: I need a fully enabled open-source userspace implementation using this
> new capability.
> -Daniel

I know: I saw the patch and your revert, but I had to try anyway :)

* Re: [PATCH 04/50] drm/i915: Extract trivial parts of ring init (early init)
  2014-05-13 13:26   ` Daniel Vetter
  2014-05-13 13:47     ` Chris Wilson
@ 2014-05-14 11:53     ` Mateo Lozano, Oscar
  2014-05-14 12:28       ` Daniel Vetter
  1 sibling, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-14 11:53 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Widawsky, Benjamin

> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Tuesday, May 13, 2014 2:27 PM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org; Ben Widawsky; Widawsky, Benjamin
> Subject: Re: [Intel-gfx] [PATCH 04/50] drm/i915: Extract trivial parts of ring init
> (early init)
> 
> On Fri, May 09, 2014 at 01:08:34PM +0100, oscar.mateo@intel.com wrote:
> > From: Ben Widawsky <benjamin.widawsky@intel.com>
> >
> > It's beneficial to be able to get a name, base, and id before we've
> > actually initialized the rings. This ability was effectively destroyed
> > in the ringbuffer fire which Daniel started.
> >
> > With the simple early init function, that ability is restored.
> >
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >
> > v2: The Full PPGTT series have moved things around a little bit.
> > Also, don't forget the VEBOX.
> >
> > v3: Checking ring->dev is not a good way to test if a ring is
> > initialized...
> >
> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> 
> Needs to be updated for VEBOX2. Also I don't really see the point: where
> exactly do we need this? Ripping apart the ring init like this doesn't look too
> great imo.

VEBOX2?? Not one week ago I updated it for BSD2 :(
Anyway, this patch is a legacy carry-over from previous versions; I can perfectly well make do without it.
Thanks!

* Re: [PATCH 04/50] drm/i915: Extract trivial parts of ring init (early init)
  2014-05-14 11:53     ` Mateo Lozano, Oscar
@ 2014-05-14 12:28       ` Daniel Vetter
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Vetter @ 2014-05-14 12:28 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx, Widawsky, Benjamin

On Wed, May 14, 2014 at 11:53:46AM +0000, Mateo Lozano, Oscar wrote:
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> > Sent: Tuesday, May 13, 2014 2:27 PM
> > To: Mateo Lozano, Oscar
> > Cc: intel-gfx@lists.freedesktop.org; Ben Widawsky; Widawsky, Benjamin
> > Subject: Re: [Intel-gfx] [PATCH 04/50] drm/i915: Extract trivial parts of ring init
> > (early init)
> > 
> > On Fri, May 09, 2014 at 01:08:34PM +0100, oscar.mateo@intel.com wrote:
> > > From: Ben Widawsky <benjamin.widawsky@intel.com>
> > >
> > > It's beneficial to be able to get a name, base, and id before we've
> > > actually initialized the rings. This ability was effectively destroyed
> > > in the ringbuffer fire which Daniel started.
> > >
> > > With the simple early init function, that ability is restored.
> > >
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > >
> > > v2: The Full PPGTT series have moved things around a little bit.
> > > Also, don't forget the VEBOX.
> > >
> > > v3: Checking ring->dev is not a good way to test if a ring is
> > > initialized...
> > >
> > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > 
> > Needs to be updated for VEBOX2. Also I don't really see the point, where
> > exactly do we need this? Ripping apart the ring init like this doesn't look too
> > great imo.
> 
> VEBOX2?? Not one week ago I updated it for BSD2 :(
> Anyway, this patch is a legacy carry over from previous versions, I can perfectly make do without it.

Oops, I've meant VCS2, must have been asleep while reading it ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-13 13:28   ` Daniel Vetter
@ 2014-05-14 13:26     ` Damien Lespiau
  2014-05-15 14:17       ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Damien Lespiau @ 2014-05-14 13:26 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Tue, May 13, 2014 at 03:28:27PM +0200, Daniel Vetter wrote:
> On Fri, May 09, 2014 at 01:08:36PM +0100, oscar.mateo@intel.com wrote:
> > From: Oscar Mateo <oscar.mateo@intel.com>
> > 
> > In the upcoming patches, we plan to break the correlation between
> > engines (a.k.a. rings) and ringbuffers, so it makes sense to
> > refactor the code and make the change obvious.
> > 
> > No functional changes.
> > 
> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> 
> If we rename stuff I'd vote for something close to Bspec language, like
> CS. So maybe intel_cs_engine?

Also, can we have such patches (and the like of "drm/i915:
for_each_ring") pushed early when everyone is happy with them? They
cause constant rebasing pain.

Thanks,

-- 
Damien

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-14 13:26     ` Damien Lespiau
@ 2014-05-15 14:17       ` Mateo Lozano, Oscar
  2014-05-15 20:52         ` Daniel Vetter
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-15 14:17 UTC (permalink / raw)
  To: Lespiau, Damien, Daniel Vetter; +Cc: intel-gfx

> -----Original Message-----
> From: Lespiau, Damien
> Sent: Wednesday, May 14, 2014 2:26 PM
> To: Daniel Vetter
> Cc: Mateo Lozano, Oscar; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> s/intel_ring_buffer/intel_engine
> 
> On Tue, May 13, 2014 at 03:28:27PM +0200, Daniel Vetter wrote:
> > On Fri, May 09, 2014 at 01:08:36PM +0100, oscar.mateo@intel.com wrote:
> > > From: Oscar Mateo <oscar.mateo@intel.com>
> > >
> > > In the upcoming patches, we plan to break the correlation between
> > > engines (a.k.a. rings) and ringbuffers, so it makes sense to
> > > refactor the code and make the change obvious.
> > >
> > > No functional changes.
> > >
> > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> >
> > If we rename stuff I'd vote for something close to Bspec language,
> > like CS. So maybe intel_cs_engine?

Bikeshedding much, are we? :)
If we want to get closer to bspecish, intel_engine_cs would be better.

> Also, can we have such patches (and the like of "drm/i915:
> for_each_ring") pushed early when everyone is happy with them, they cause
> constant rebasing pain.

I second that motion!

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-15 14:17       ` Mateo Lozano, Oscar
@ 2014-05-15 20:52         ` Daniel Vetter
  2014-05-19 10:02           ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-05-15 20:52 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Thu, May 15, 2014 at 02:17:23PM +0000, Mateo Lozano, Oscar wrote:
> > -----Original Message-----
> > From: Lespiau, Damien
> > Sent: Wednesday, May 14, 2014 2:26 PM
> > To: Daniel Vetter
> > Cc: Mateo Lozano, Oscar; intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > s/intel_ring_buffer/intel_engine
> > 
> > On Tue, May 13, 2014 at 03:28:27PM +0200, Daniel Vetter wrote:
> > > On Fri, May 09, 2014 at 01:08:36PM +0100, oscar.mateo@intel.com wrote:
> > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > >
> > > > In the upcoming patches, we plan to break the correlation between
> > > > engines (a.k.a. rings) and ringbuffers, so it makes sense to
> > > > refactor the code and make the change obvious.
> > > >
> > > > No functional changes.
> > > >
> > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > >
> > > If we rename stuff I'd vote for something close to Bspec language,
> > > like CS. So maybe intel_cs_engine?
> 
> Bikeshedding much, are we? :)
> If we want to get closer to bspecish, intel_engine_cs would be better.

I'm ok with that too ;-)

> > Also, can we have such patches (and the like of "drm/i915:
> > for_each_ring") pushed early when everyone is happy with them, they cause
> > constant rebasing pain.
> 
> I second that motion!

Fully agreed - as soon as we have a rough sketch of where we want to go
I'll pull in the rename. As an aside, I highly suggest doing the rename with
Coccinelle and regenerating it on rebases - that's much less error-prone than
doing it by hand.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
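
A rename like this can be captured in a small Coccinelle semantic patch and
simply regenerated on every rebase - a minimal, untested sketch (hypothetical
file name; the target type name follows the naming discussion above):

  // rename.cocci: replace the type wherever it appears
  @@
  @@
  - struct intel_ring_buffer
  + struct intel_engine_cs

Applied tree-wide with something along the lines of
"spatch --sp-file rename.cocci --in-place drivers/gpu/drm/i915/*.[ch]".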

* Re: [PATCH 09/50] drm/i915: Plumb the context everywhere in the execbuffer path
  2014-05-09 12:08 ` [PATCH 09/50] drm/i915: Plumb the context everywhere in the execbuffer path oscar.mateo
@ 2014-05-16 11:04   ` Chris Wilson
  2014-05-16 11:11     ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Chris Wilson @ 2014-05-16 11:04 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx

On Fri, May 09, 2014 at 01:08:39PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> The contexts are going to become very important pretty soon, and
> we need to be able to access them in a number of places inside
> the command submission path. The idea is that, when we need to
> place commands inside a ringbuffer or update the tail register,
> we know which context we are working with.
> 
> We left intel_ring_begin() as a function macro to quickly adapt
> legacy code, and introduce intel_ringbuffer_begin() as the first
> of a set of new functions for ringbuffer manipulation (the rest
> will come in subsequent patches).
> 
> No functional changes.
> 
> v2: Do not set the context to NULL. In legacy code, set it to
> the default ring context (even if it doesn't get used later on).

Won't rings be stored within the context? So the context should be
derivable from which ring the operation is being issued on.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
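
The macro indirection described in the quoted commit message would look
roughly like the sketch below (illustrative signatures and field names, not
the actual patch):

  /* New ringbuffer-level entry point: waits for space in the given
   * ringbuffer and prepares it for num_dwords of command emission. */
  int intel_ringbuffer_begin(struct intel_ringbuffer *ringbuf, int num_dwords);

  /* Legacy callers keep the old name; the macro resolves the engine's
   * own ringbuffer (hypothetical field) and forwards to the new helper. */
  #define intel_ring_begin(ring, num_dwords) \
          intel_ringbuffer_begin(&(ring)->buffer, (num_dwords))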

* Re: [PATCH 09/50] drm/i915: Plumb the context everywhere in the execbuffer path
  2014-05-16 11:04   ` Chris Wilson
@ 2014-05-16 11:11     ` Mateo Lozano, Oscar
  2014-05-16 11:31       ` Chris Wilson
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-16 11:11 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

> -----Original Message-----
> From: Chris Wilson [mailto:chris@chris-wilson.co.uk]
> Sent: Friday, May 16, 2014 12:05 PM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 09/50] drm/i915: Plumb the context everywhere
> in the execbuffer path
> 
> On Fri, May 09, 2014 at 01:08:39PM +0100, oscar.mateo@intel.com wrote:
> > From: Oscar Mateo <oscar.mateo@intel.com>
> >
> > The contexts are going to become very important pretty soon, and we
> > need to be able to access them in a number of places inside the
> > command submission path. The idea is that, when we need to place
> > commands inside a ringbuffer or update the tail register, we know
> > which context we are working with.
> >
> > We left intel_ring_begin() as a function macro to quickly adapt legacy
> > code, and introduce intel_ringbuffer_begin() as the first of a set of
> > new functions for ringbuffer manipulation (the rest will come in
> > subsequent patches).
> >
> > No functional changes.
> >
> > v2: Do not set the context to NULL. In legacy code, set it to the
> > default ring context (even if it doesn't get used later on).
> 
> Won't rings be stored within the context? So the context should be derivable
> from which ring the operation is being issued on.
> -Chris

Rings (as in "engine command streamer") still remain in dev_priv and there are only four/five of them. What we store in the context is the new ringbuffer structure (which stores the head, tail, etc...) and the ringbuffer backing object. Knowing only the ring is not enough to derive the context.

- Oscar
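
A compact way to picture the layout Oscar describes (illustrative types
only, not the merged code):

  /* Engines are global, shared state hanging off dev_priv. */
  struct intel_engine_cs {
          const char *name;
          u32 mmio_base;
  };

  /* Each context owns one ringbuffer (state plus backing object) per
   * engine. Many contexts can have work queued on the same engine, so
   * the engine alone cannot identify which context is in use. */
  struct intel_context {
          struct {
                  struct drm_i915_gem_object *ringbuf_obj;
                  u32 head, tail;
          } engine[I915_NUM_RINGS];
  };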

* Re: [PATCH 09/50] drm/i915: Plumb the context everywhere in the execbuffer path
  2014-05-16 11:11     ` Mateo Lozano, Oscar
@ 2014-05-16 11:31       ` Chris Wilson
  0 siblings, 0 replies; 94+ messages in thread
From: Chris Wilson @ 2014-05-16 11:31 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Fri, May 16, 2014 at 11:11:38AM +0000, Mateo Lozano, Oscar wrote:
> > -----Original Message-----
> > From: Chris Wilson [mailto:chris@chris-wilson.co.uk]
> > Sent: Friday, May 16, 2014 12:05 PM
> > To: Mateo Lozano, Oscar
> > Cc: intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 09/50] drm/i915: Plumb the context everywhere
> > in the execbuffer path
> > 
> > On Fri, May 09, 2014 at 01:08:39PM +0100, oscar.mateo@intel.com wrote:
> > > From: Oscar Mateo <oscar.mateo@intel.com>
> > >
> > > The contexts are going to become very important pretty soon, and we
> > > need to be able to access them in a number of places inside the
> > > command submission path. The idea is that, when we need to place
> > > commands inside a ringbuffer or update the tail register, we know
> > > which context we are working with.
> > >
> > > We left intel_ring_begin() as a function macro to quickly adapt legacy
> > > code, and introduce intel_ringbuffer_begin() as the first of a set of
> > > new functions for ringbuffer manipulation (the rest will come in
> > > subsequent patches).
> > >
> > > No functional changes.
> > >
> > > v2: Do not set the context to NULL. In legacy code, set it to the
> > > default ring context (even if it doesn't get used later on).
> > 
> > Won't rings be stored within the context? So the context should be derivable
> > from which ring the operation is being issued on.
> > -Chris
> 
> Rings (as in "engine command streamer") still remain in dev_priv and there are only four/five of them. What we store in the context is the new ringbuffer structure (which stores the head, tail, etc...) and the ringbuffer backing object. Knowing only the ring is not enough to derive the context.

Ugh, I thought an earlier restructuring request was that the logical
ring interface was context-specific.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-15 20:52         ` Daniel Vetter
@ 2014-05-19 10:02           ` Mateo Lozano, Oscar
  2014-05-19 12:20             ` Daniel Vetter
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-19 10:02 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

Hi Daniel,

> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Thursday, May 15, 2014 9:52 PM
> To: Mateo Lozano, Oscar
> Cc: Lespiau, Damien; Daniel Vetter; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> s/intel_ring_buffer/intel_engine
> 
> On Thu, May 15, 2014 at 02:17:23PM +0000, Mateo Lozano, Oscar wrote:
> > > -----Original Message-----
> > > From: Lespiau, Damien
> > > Sent: Wednesday, May 14, 2014 2:26 PM
> > > To: Daniel Vetter
> > > Cc: Mateo Lozano, Oscar; intel-gfx@lists.freedesktop.org
> > > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > > s/intel_ring_buffer/intel_engine
> > >
> > > On Tue, May 13, 2014 at 03:28:27PM +0200, Daniel Vetter wrote:
> > > > On Fri, May 09, 2014 at 01:08:36PM +0100, oscar.mateo@intel.com
> wrote:
> > > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > > >
> > > > > In the upcoming patches, we plan to break the correlation
> > > > > between engines (a.k.a. rings) and ringbuffers, so it makes
> > > > > sense to refactor the code and make the change obvious.
> > > > >
> > > > > No functional changes.
> > > > >
> > > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > >
> > > > If we rename stuff I'd vote for something close to Bspec language,
> > > > like CS. So maybe intel_cs_engine?
> >
> > Bikeshedding much, are we? :)
> > If we want to get closer to bspecish, intel_engine_cs would be better.
> 
> I'm ok with that too ;-)
> 
> > > Also, can we have such patches (and the like of "drm/i915:
> > > for_each_ring") pushed early when everyone is happy with them, they
> > > cause constant rebasing pain.
> >
> > I second that motion!
> 
> Fully agreed - as soon as we have a rough sketch of where we want to go to I'll
> pull in the rename. Aside I highly suggest to do the rename with coccinelle and
> regenerate it on rebases - that's much less error-prone than doing it by hand.
> -Daniel

I propose the following code refactoring at a minimum. Even if I abstract away all the "i915_gem_context.c" and "intel_ringbuffer.c" functionality, and part of "i915_gem_execbuffer.c", to keep changes to legacy code to a minimum, I still think the following changes are good for the overall code:

1)	s/intel_ring_buffer/intel_engine_cs

Straight renaming: if the actual ring buffers can live either inside the engine/ring (legacy ringbuffer submission) or inside the context (execlists), it doesn't make sense that the engine/ring is called "intel_ring_buffer".

2)	Split the ringbuffers and the rings

New struct:

+struct intel_ringbuffer {
+       struct drm_i915_gem_object *obj;
+       void __iomem *virtual_start;
+
+       u32 head;
+       u32 tail;
+       int space;
+       int size;
+       int effective_size;
+
+       /** We track the position of the requests in the ring buffer, and
+        * when each is retired we increment last_retired_head as the GPU
+        * must have finished processing the request and so we know we
+        * can advance the ringbuffer up to that position.
+        *
+        * last_retired_head is set to -1 after the value is consumed so
+        * we can detect new retirements.
+        */
+       u32 last_retired_head;
+};

And "struct intel_engine_cs" now groups all these elements into "buffer":

-       void            __iomem *virtual_start;
-       struct          drm_i915_gem_object *obj;
-       u32             head;
-       u32             tail;
-       int             space;
-       int             size;
-       int             effective_size;
-       u32             last_retired_head;
+       struct intel_ringbuffer buffer;

3)	Introduce one context backing object per engine

-       struct drm_i915_gem_object *obj;
+       struct {
+               struct drm_i915_gem_object *obj;
+       } engine[I915_NUM_RINGS];

Legacy code only ever uses engine[RCS], so I can use it everywhere in the existing code.

If we agree on this minimum set, I'll send the patches right away.

Cheers,
Oscar 

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 10:02           ` Mateo Lozano, Oscar
@ 2014-05-19 12:20             ` Daniel Vetter
  2014-05-19 13:41               ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-05-19 12:20 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Mon, May 19, 2014 at 10:02:07AM +0000, Mateo Lozano, Oscar wrote:
> Hi Daniel,
> 
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> > Sent: Thursday, May 15, 2014 9:52 PM
> > To: Mateo Lozano, Oscar
> > Cc: Lespiau, Damien; Daniel Vetter; intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > s/intel_ring_buffer/intel_engine
> > 
> > On Thu, May 15, 2014 at 02:17:23PM +0000, Mateo Lozano, Oscar wrote:
> > > > -----Original Message-----
> > > > From: Lespiau, Damien
> > > > Sent: Wednesday, May 14, 2014 2:26 PM
> > > > To: Daniel Vetter
> > > > Cc: Mateo Lozano, Oscar; intel-gfx@lists.freedesktop.org
> > > > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > > > s/intel_ring_buffer/intel_engine
> > > >
> > > > On Tue, May 13, 2014 at 03:28:27PM +0200, Daniel Vetter wrote:
> > > > > On Fri, May 09, 2014 at 01:08:36PM +0100, oscar.mateo@intel.com
> > wrote:
> > > > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > > > >
> > > > > > In the upcoming patches, we plan to break the correlation
> > > > > > between engines (a.k.a. rings) and ringbuffers, so it makes
> > > > > > sense to refactor the code and make the change obvious.
> > > > > >
> > > > > > No functional changes.
> > > > > >
> > > > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > >
> > > > > If we rename stuff I'd vote for something close to Bspec language,
> > > > > like CS. So maybe intel_cs_engine?
> > >
> > > Bikeshedding much, are we? :)
> > > If we want to get closer to bspecish, intel_engine_cs would be better.
> > 
> > I'm ok with that too ;-)
> > 
> > > > Also, can we have such patches (and the like of "drm/i915:
> > > > for_each_ring") pushed early when everyone is happy with them, they
> > > > cause constant rebasing pain.
> > >
> > > I second that motion!
> > 
> > Fully agreed - as soon as we have a rough sketch of where we want to go to I'll
> > pull in the rename. Aside I highly suggest to do the rename with coccinelle and
> > regerate it on rebases - that's much less error-prone than doing it by hand.
> > -Daniel
> 
> I propose the following code refactoring at a minimum. Even if I
> abstract away all the "i915_gem_context.c" and "intel_ringbuffer.c"
> functionality, and part of "i915_gem_execbuffer.c", to keep changes to
> legacy code to a minimum, I still think the following changes are good
> for the overall code:
> 
> 1)	s/intel_ring_buffer/intel_engine_cs
> 
> Straight renaming: if the actual ring buffers can live either inside the
> engine/ring (legacy ringbuffer submission) or inside the context
> (execlists), it doesn't make sense that the engine/ring is called
> "intel_ring_buffer".

Ack. Like I've said, this is probably best done with Coccinelle to cut down on
potential mistakes. One thing this provokes is the obj->ring pointers we
use all over the place. But I guess we can fix those up later on once this
all has settled.

> 2)	Split the ringbuffers and the rings
> 
> New struct:
> 
> +struct intel_ringbuffer {
> +       struct drm_i915_gem_object *obj;
> +       void __iomem *virtual_start;
> +
> +       u32 head;
> +       u32 tail;
> +       int space;
> +       int size;
> +       int effective_size;
> +
> +       /** We track the position of the requests in the ring buffer, and
> +        * when each is retired we increment last_retired_head as the GPU
> +        * must have finished processing the request and so we know we
> +        * can advance the ringbuffer up to that position.
> +        *
> +        * last_retired_head is set to -1 after the value is consumed so
> +        * we can detect new retirements.
> +        */
> +       u32 last_retired_head;
> +};
> 
> And "struct intel_engine_cs" now groups all these elements into "buffer":
> 
> -       void            __iomem *virtual_start;
> -       struct          drm_i915_gem_object *obj;
> -       u32             head;
> -       u32             tail;
> -       int             space;
> -       int             size;
> -       int             effective_size;
> -       u32             last_retired_head;
> +       struct intel_ringbuffer buffer;

Is the idea to embed this new intel_ringbuffer struct into the lrc context
structure so that we can share a bit of the low-level frobbing? I wonder
whether we should switch right away to a pointer and leave the
engine_cs->ring pointer NULL for lrc. That would be good for catching bugs
where we accidentally mix up old and new styles. If you agree that an
engine_cs->ring pointer would fit your design this is acked.

Btw, for such code design issues I usually refer to Rusty's API design
manifesto:

http://sweng.the-davies.net/Home/rustys-api-design-manifesto

Oopsing at runtime is still considered a bad level, but much better than
silently failing.

Again, I recommend looking into Coccinelle for the sed part of this.

> 3)	Introduce one context backing object per engine
> 
> -       struct drm_i915_gem_object *obj;
> +       struct {
> +               struct drm_i915_gem_object *obj;
> +       } engine[I915_NUM_RINGS];
> 
> Legacy code only ever uses engine[RCS], so I can use it everywhere in the existing code.

Unsure about this. If I understand this correctly we only need to be able
to support multiple backing objects for the same logical ring object (i.e.
what is tracked by struct i915_hw_context) for the implicit per-filp
context 0. But our handling of context id 0 is already magical, so we
might as well add a bit more magic and shovel this array into the filp
data structure:

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 108e1ec2fa4b..e34db43dead3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1825,7 +1825,9 @@ struct drm_i915_file_private {
 	} mm;
 	struct idr context_idr;
 
-	struct i915_hw_context *private_default_ctx;
+	/* default context for each ring, NULL if hw doesn't support hw contexts
+	 * (or fancy new lrcs) on that ring. */
+	struct i915_hw_context *private_default_ctx[I915_NUM_RINGS];
 	atomic_t rps_wait_boost;
 };
 
Of course we need to add an i915_hw_context->engine_cs pointer and we need
to check that at execbuf to make sure we don't run contexts on the wrong
engine.

If we later on decide that we want to expose multiple hw contexts for !RCS
to userspace we can easily add a bunch of ring flags to the context create
ioctl. So this doesn't restrict us at all in the features we can support
without jumping through hoops.

Also if we'd shovel all per-ring lrcs into the same i915_hw_context
structure then we'd need to rename that and drop the _hw part - it's no
longer a 1:1 correspondence to an actual hw ring/context/lrc/whatever
wizzbang thingy.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
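
The execbuf-time check suggested here could be as small as the following
sketch (engine_cs being the hypothetical pointer proposed above, not an
existing field):

  /* In the execbuffer path, after looking up the context: refuse to run
   * a context on an engine other than the one it was created for. */
  if (ctx->engine_cs && ctx->engine_cs != ring)
          return -EINVAL;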

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 12:20             ` Daniel Vetter
@ 2014-05-19 13:41               ` Mateo Lozano, Oscar
  2014-05-19 13:52                 ` Daniel Vetter
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-19 13:41 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Monday, May 19, 2014 1:21 PM
> To: Mateo Lozano, Oscar
> Cc: Daniel Vetter; Lespiau, Damien; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> s/intel_ring_buffer/intel_engine
> 
> On Mon, May 19, 2014 at 10:02:07AM +0000, Mateo Lozano, Oscar wrote:
> > Hi Daniel,
> >
> > > -----Original Message-----
> > > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of
> > > Daniel Vetter
> > > Sent: Thursday, May 15, 2014 9:52 PM
> > > To: Mateo Lozano, Oscar
> > > Cc: Lespiau, Damien; Daniel Vetter; intel-gfx@lists.freedesktop.org
> > > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > > s/intel_ring_buffer/intel_engine
> > >
> > > On Thu, May 15, 2014 at 02:17:23PM +0000, Mateo Lozano, Oscar wrote:
> > > > > -----Original Message-----
> > > > > From: Lespiau, Damien
> > > > > Sent: Wednesday, May 14, 2014 2:26 PM
> > > > > To: Daniel Vetter
> > > > > Cc: Mateo Lozano, Oscar; intel-gfx@lists.freedesktop.org
> > > > > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > > > > s/intel_ring_buffer/intel_engine
> > > > >
> > > > > On Tue, May 13, 2014 at 03:28:27PM +0200, Daniel Vetter wrote:
> > > > > > On Fri, May 09, 2014 at 01:08:36PM +0100,
> > > > > > oscar.mateo@intel.com
> > > wrote:
> > > > > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > > > > >
> > > > > > > In the upcoming patches, we plan to break the correlation
> > > > > > > between engines (a.k.a. rings) and ringbuffers, so it makes
> > > > > > > sense to refactor the code and make the change obvious.
> > > > > > >
> > > > > > > No functional changes.
> > > > > > >
> > > > > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > > >
> > > > > > If we rename stuff I'd vote for something close to Bspec
> > > > > > language, like CS. So maybe intel_cs_engine?
> > > >
> > > > Bikeshedding much, are we? :)
> > > > If we want to get closer to bspecish, intel_engine_cs would be better.
> > >
> > > I'm ok with that too ;-)
> > >
> > > > > Also, can we have such patches (and the like of "drm/i915:
> > > > > for_each_ring") pushed early when everyone is happy with them,
> > > > > they cause constant rebasing pain.
> > > >
> > > > I second that motion!
> > >
> > > Fully agreed - as soon as we have a rough sketch of where we want to
> > > go to I'll pull in the rename. Aside I highly suggest to do the
> > > rename with coccinelle and regenerate it on rebases - that's much less error-
> prone than doing it by hand.
> > > -Daniel
> >
> > I propose the following code refactoring at a minimum. Even if I
> > abstract away all the "i915_gem_context.c" and "intel_ringbuffer.c"
> > functionality, and part of "i915_gem_execbuffer.c", to keep changes to
> > legacy code to a minimum, I still think the following changes are good
> > for the overall code:
> >
> > 1)	s/intel_ring_buffer/intel_engine_cs
> >
> > Straight renaming: if the actual ring buffers can live either inside
> > the engine/ring (legacy ringbuffer submission) or inside the context
> > (execlists), it doesn't make sense that the engine/ring is called
> > "intel_ring_buffer".
> 
> Ack. Like I've said probably best done with coccinelle to cut down on potential
> mistakes. One thing this provokes is the obj->ring pointers we use all over the
> place. But I guess we can fix those up later on once this all has settled.

ACK, FIN

> > 2)	Split the ringbuffers and the rings
> >
> > New struct:
> >
> > +struct intel_ringbuffer {
> > +       struct drm_i915_gem_object *obj;
> > +       void __iomem *virtual_start;
> > +
> > +       u32 head;
> > +       u32 tail;
> > +       int space;
> > +       int size;
> > +       int effective_size;
> > +
> > +       /** We track the position of the requests in the ring buffer, and
> > +        * when each is retired we increment last_retired_head as the GPU
> > +        * must have finished processing the request and so we know we
> > +        * can advance the ringbuffer up to that position.
> > +        *
> > +        * last_retired_head is set to -1 after the value is consumed so
> > +        * we can detect new retirements.
> > +        */
> > +       u32 last_retired_head;
> > +};
> >
> > And "struct intel_engine_cs" now groups all these elements into "buffer":
> >
> > -       void            __iomem *virtual_start;
> > -       struct          drm_i915_gem_object *obj;
> > -       u32             head;
> > -       u32             tail;
> > -       int             space;
> > -       int             size;
> > -       int             effective_size;
> > -       u32             last_retired_head;
> > +       struct intel_ringbuffer buffer;
> 
> Is the idea to embed this new intel_ringbuffer struct into the lrc context
> structure so that we can share a bit of the low-level frobbing?

Yes, that is the idea.

> I wonder
> whether we should switch right away to a pointer and leave the
> engine_cs->ring pointer NULL for lrc. That would be good for catching bugs where we
> accidentally mix up old and new styles. If you agree that a engine_cs->ring
> pointer would fit your design this is acked.
> Btw for such code design issues I usually refer to Rusty's api design
> manifesto:
> 
> http://sweng.the-davies.net/Home/rustys-api-design-manifesto
> 
> Oopsing at runtime is still considerid a bad level, but much better than silently
> failing.
> 
> Again I recommend to look into coccinelle for the sed part of this.

Ok, "struct intel_ringbuffer *buffer;" it is, then (NULL when we use LRCs).

> > 3)	Introduce one context backing object per engine
> >
> > -       struct drm_i915_gem_object *obj;
> > +       struct {
> > +               struct drm_i915_gem_object *obj;
> > +       } engine[I915_NUM_RINGS];
> >
> > Legacy code only ever uses engine[RCS], so I can use it everywhere in the
> existing code.
> 
> Unsure about this. If I understand this correctly we only need to be able to
> support multiple backing objects for the same logical ring object (i.e.
> what is tracked by struct i915_hw_context) for the implicit per-filp context 0.
> But our handling of context id 0 is already magical, so we might as well add a
> bit more magic and shovel this array into the filp data structure:
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h index 108e1ec2fa4b..e34db43dead3
> 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1825,7 +1825,9 @@ struct drm_i915_file_private {
>  	} mm;
>  	struct idr context_idr;
> 
> -	struct i915_hw_context *private_default_ctx;
> +	/* default context for each ring, NULL if hw doesn't support hw
> contexts
> +	 * (or fancy new lrcs) on that ring. */
> +	struct i915_hw_context *private_default_ctx[I915_NUM_RINGS];
>  	atomic_t rps_wait_boost;
>  };
> 
> Of course we need to add an i915_hw_context->engine_cs pointer and we need
> to check that at execbuf to make sure we don't run contexts on the wrong
> engine.
> If we later on decide that we want to expose multiple hw contexts for !RCS to
> userspace we can easily add a bunch of ring flags to the context create ioctl. So
> this doesn't restrict us at all in the features we can support without jumping
> through hoops.
> 
> Also if we'd shovel all per-ring lrcs into the same i915_hw_context structure
> then we'd need to rename that and drop the _hw part - it's no longer a 1:1
> correspondence to an actual hw ring/context/lrc/whatever wizzbang thingy.

Ok, so we create I915_NUM_RINGS contexts for the global default contexts, plus I915_NUM_RINGS contexts for every filp and 1 render context for every create ioctl.
But the magic stuff is going to pop out in many more places: I cannot idr_alloc/idr_find for the per-filp contexts, because all of them cannot have ctx->id = DEFAULT_CONTEXT_ID at the same time (I'll have to special-case them by using dev_priv->private_default_ctx[RING] to find them). Of course, if you prefer, I can abstract away most of the functionality in i915_gem_context.c and make sure this kind of magic is only done for the LRC path (similar to what you propose to do with intel_ringbuffer.c).

-- Oscar

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 13:41               ` Mateo Lozano, Oscar
@ 2014-05-19 13:52                 ` Daniel Vetter
  2014-05-19 14:43                   ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-05-19 13:52 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Mon, May 19, 2014 at 3:41 PM, Mateo Lozano, Oscar
<oscar.mateo@intel.com> wrote:
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h index 108e1ec2fa4b..e34db43dead3
>> 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1825,7 +1825,9 @@ struct drm_i915_file_private {
>>       } mm;
>>       struct idr context_idr;
>>
>> -     struct i915_hw_context *private_default_ctx;
>> +     /* default context for each ring, NULL if hw doesn't support hw
>> contexts
>> +      * (or fancy new lrcs) on that ring. */
>> +     struct i915_hw_context *private_default_ctx[I915_NUM_RINGS];
>>       atomic_t rps_wait_boost;
>>  };
>>
>> Of course we need to add an i915_hw_context->engine_cs pointer and we need
>> to check that at execbuf to make sure we don't run contexts on the wrong
>> engine.
>> If we later on decide that we want to expose multiple hw contexts for !RCS to
>> userspace we can easily add a bunch of ring flags to the context create ioctl. So
>> this doesn't restrict us at all in the features we can support without jumping
>> through hoops.
>>
>> Also if we'd shovel all per-ring lrcs into the same i915_hw_context structure
>> then we'd need to rename that and drop the _hw part - it's no longer a 1:1
>> correspondence to an actual hw ring/context/lrc/whatever wizzbang thingy.
>
> Ok, so we create I915_NUM_RINGS contexts for the global default contexts, plus I915_NUM_RINGS contexts for every filp and 1 render context for every create ioctl.
> But the magic stuff is going to pop out in many more places: I cannot idr_alloc/idr_find for the per-filp contexts, because all of them cannot have ctx->id = DEFAULT_CONTEXT_ID at the same time (I'll have to special-case them by using dev_priv->private_default_ctx[RING] to find them). Of course, if you prefer, I can abstract away most of the functionality in i915_gem_context.c and make sure this kind of magic is only done for the LRC path (similar to what you propose to do with intel_ringbuffer.c).

Argh, forgotten about the pageflips again. But for those we already
need some other context pointer, and thus far we've only supported
ring-switching on one ring (well, almost everywhere at least). Since
the MMIO-based pageflip patch seems mostly ready I think we could just
merge that one first and then forget about ring-based pageflips for
execlists. Way too much pain to be worth it really ;-)

For the default context special-casing I somehow thought we
special-cased that in the lookup code. But the code in there is a bit
convoluted, so a bit of tidying up (and shoveling more of the checking
and lookup logic into i915_gem_context.c) can't hurt really. Also we
seem to lack error checking for the creation of the default context.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 13:52                 ` Daniel Vetter
@ 2014-05-19 14:43                   ` Mateo Lozano, Oscar
  2014-05-19 15:11                     ` Daniel Vetter
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-19 14:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

> -----Original Message-----
> From: daniel.vetter@ffwll.ch [mailto:daniel.vetter@ffwll.ch] On Behalf Of
> Daniel Vetter
> Sent: Monday, May 19, 2014 2:53 PM
> To: Mateo Lozano, Oscar
> Cc: Lespiau, Damien; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> s/intel_ring_buffer/intel_engine
> 
> On Mon, May 19, 2014 at 3:41 PM, Mateo Lozano, Oscar
> <oscar.mateo@intel.com> wrote:
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> >> b/drivers/gpu/drm/i915/i915_drv.h index 108e1ec2fa4b..e34db43dead3
> >> 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -1825,7 +1825,9 @@ struct drm_i915_file_private {
> >>       } mm;
> >>       struct idr context_idr;
> >>
> >> -     struct i915_hw_context *private_default_ctx;
> >> +     /* default context for each ring, NULL if hw doesn't support hw
> >> contexts
> >> +      * (or fancy new lrcs) on that ring. */
> >> +     struct i915_hw_context *private_default_ctx[I915_NUM_RINGS];
> >>       atomic_t rps_wait_boost;
> >>  };
> >>
> >> Of course we need to add an i915_hw_context->engine_cs pointer and we
> >> need to check that at execbuf to make sure we don't run contexts on
> >> the wrong engine.
> >> If we later on decide that we want to expose multiple hw contexts for
> >> !RCS to userspace we can easily add a bunch of ring flags to the
> >> context create ioctl. So this doesn't restrict us at all in the
> >> features we can support without jumping through hoops.
> >>
> >> Also if we'd shovel all per-ring lrcs into the same i915_hw_context
> >> structure then we'd need to rename that and drop the _hw part - it's
> >> no longer a 1:1 correspondence to an actual hw ring/context/lrc/whatever
> wizzbang thingy.
> >
> > Ok, so we create I915_NUM_RINGS contexts for the global default contexts,
> plus I915_NUM_RINGS contexts for every filp and 1 render context for every
> create ioctl.
> > But the magic stuff is going to pop out in many more places: I cannot
> idr_alloc/idr_find for the per-filp contexts, because all of them cannot have
> ctx->id = DEFAULT_CONTEXT_ID at the same time (I'll have to special-case
> them by using dev_priv->private_default_ctx[RING] to find them). Of course, if
> you prefer, I can abstract away most of the functionality in i915_gem_context.c
> and make sure this kind magic is only done for the LRC path (similar to what
> you propose to do with intel_ringbuffer.c).
> 
> Argh, forgotten about the pageflips again. But for those we already need some
> other context pointer, and thus far we've only supported ring-switching on one
> ring (well, almost everywhere at least). Since the mmio base pageflip patch
> seems mostly ready I think we could just merge that one first and then forget
> about ring-based pageflips for execlists. Way too much pain to be worth it
> really ;-)

Sounds like a plan :)

> For the default context special-casing I've somehow though we special-case
> that in the lookup code. But the code in there is a bit convoluted, so a bit of
> tidying up (and shoveling more of the checking and lookup logic into
> i915_gem_context.c) can't hurt really. Also we seem to lack error checking for
> the creation of the default context.

Nope, we don't special-case the per-filp default context search: it uses an idr_find, same as the others. Actually, I don't really see why private_default_ctx is needed at all in the current code.

So, for the per-filp default contexts:

+     struct i915_hw_context *private_default_ctx[I915_NUM_RINGS];

and we special-case the hell out of them, for both legacy and execlists
code? Or do you want to abstract i915_gem_context.c away as well?

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 14:43                   ` Mateo Lozano, Oscar
@ 2014-05-19 15:11                     ` Daniel Vetter
  2014-05-19 15:26                       ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-05-19 15:11 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Mon, May 19, 2014 at 02:43:05PM +0000, Mateo Lozano, Oscar wrote:
> > -----Original Message-----
> > From: daniel.vetter@ffwll.ch [mailto:daniel.vetter@ffwll.ch] On Behalf Of
> > Daniel Vetter
> > Sent: Monday, May 19, 2014 2:53 PM
> > To: Mateo Lozano, Oscar
> > Cc: Lespiau, Damien; intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > s/intel_ring_buffer/intel_engine
> > 
> > On Mon, May 19, 2014 at 3:41 PM, Mateo Lozano, Oscar
> > <oscar.mateo@intel.com> wrote:
> > >> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > >> b/drivers/gpu/drm/i915/i915_drv.h index 108e1ec2fa4b..e34db43dead3
> > >> 100644
> > >> --- a/drivers/gpu/drm/i915/i915_drv.h
> > >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> > >> @@ -1825,7 +1825,9 @@ struct drm_i915_file_private {
> > >>       } mm;
> > >>       struct idr context_idr;
> > >>
> > >> -     struct i915_hw_context *private_default_ctx;
> > >> +     /* default context for each ring, NULL if hw doesn't support hw
> > >> contexts
> > >> +      * (or fancy new lrcs) on that ring. */
> > >> +     struct i915_hw_context *private_default_ctx[I915_NUM_RINGS];
> > >>       atomic_t rps_wait_boost;
> > >>  };
> > >>
> > >> Of course we need to add an i915_hw_context->engine_cs pointer and we
> > >> need to check that at execbuf to make sure we don't run contexts on
> > >> the wrong engine.
> > >> If we later on decide that we want to expose multiple hw contexts for
> > >> !RCS to userspace we can easily add a bunch of ring flags to the
> > >> context create ioctl. So this doesn't restrict us at all in the
> > >> features we can support without jumping through hoops.
> > >>
> > >> Also if we'd shovel all per-ring lrcs into the same i915_hw_context
> > >> structure then we'd need to rename that and drop the _hw part - it's
> > >> no longer a 1:1 correspondence to an actual hw ring/context/lrc/whatever
> > wizzbang thingy.
> > >
> > > Ok, so we create I915_NUM_RINGS contexts for the global default contexts,
> > plus I915_NUM_RINGS contexts for every filp and 1 render context for every
> > create ioctl.
> > > But the magic stuff is going to pop out in many more places: I cannot
> > idr_alloc/idr_find for the per-filp contexts, because all of them cannot have
> > ctx->id = DEFAULT_CONTEXT_ID at the same time (I'll have to special-case
> > them by using dev_priv->private_default_ctx[RING] to find them). Of course, if
> > you prefer, I can abstract away most of the functionality in i915_gem_context.c
> > and make sure this kind magic is only done for the LRC path (similar to what
> > you propose to do with intel_ringbuffer.c).
> > 
> > Argh, forgotten about the pageflips again. But for those we already need some
> > other context pointer, and thus far we've only supported ring-switching on one
> > ring (well, almost everywhere at least). Since the mmio base pageflip patch
> > seems mostly ready I think we could just merge that one first and then forget
> > about ring-based pageflips for execlists. Way too much pain to be worth it
> > really ;-)
> 
> Sound like a plan :)
> 
> > For the default context special-casing I've somehow though we special-case
> > that in the lookup code. But the code in there is a bit convoluted, so a bit of
> > tidying up (and shoveling more of the checking and lookup logic into
> > i915_gem_context.c) can't hurt really. Also we seem to lack error checking for
> > the creation of the default context.
> 
> Nope, we don't special case the per-filp default context search: it uses an idr_find, same as the others. Actually, I don't really see why private_default_ctx is needed at all in the current code?
> 
> So, for the per-filp default contexts:
> 
> +     struct i915_hw_context *private_default_ctx[I915_NUM_RINGS];
> 
> and we special-case the hell out of them?
> for legacy and execlists code, or do you want to abstract i915_gem_context.c away as well?

I think special-casing the i915_gem_context_get function for the default
context and using private_default_ctx a bit more sounds good. We need to
adjust the idr allocator a bit though to reserve 0, and do a bit of frobbing
in the context create code.

Wrt ctx abstraction I think separate functions for execlist/legacy
contexts should be good enough. The lookup/create/destroy logic should
carry over.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
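
A sketch of that special-casing (hypothetical helper shape; the extra ring
parameter is the per-engine twist discussed above):

  /* At create time, start the idr at 1 so that id 0 stays reserved for
   * the implicit default context: */
  ret = idr_alloc(&file_priv->context_idr, ctx,
                  DEFAULT_CONTEXT_ID + 1, 0, GFP_KERNEL);

  /* ...and special-case the lookup accordingly: */
  static struct i915_hw_context *
  i915_gem_context_get(struct drm_i915_file_private *file_priv,
                       u32 id, int ring_id)
  {
          if (id == DEFAULT_CONTEXT_ID)
                  return file_priv->private_default_ctx[ring_id];

          return idr_find(&file_priv->context_idr, id);
  }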

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 15:11                     ` Daniel Vetter
@ 2014-05-19 15:26                       ` Mateo Lozano, Oscar
  2014-05-19 15:49                         ` Daniel Vetter
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-19 15:26 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Monday, May 19, 2014 4:12 PM
> To: Mateo Lozano, Oscar
> Cc: Daniel Vetter; Lespiau, Damien; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> s/intel_ring_buffer/intel_engine
> 
> On Mon, May 19, 2014 at 02:43:05PM +0000, Mateo Lozano, Oscar wrote:
> > > -----Original Message-----
> > > From: daniel.vetter@ffwll.ch [mailto:daniel.vetter@ffwll.ch] On
> > > Behalf Of Daniel Vetter
> > > Sent: Monday, May 19, 2014 2:53 PM
> > > To: Mateo Lozano, Oscar
> > > Cc: Lespiau, Damien; intel-gfx@lists.freedesktop.org
> > > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > > s/intel_ring_buffer/intel_engine
> > >
> > > On Mon, May 19, 2014 at 3:41 PM, Mateo Lozano, Oscar
> > > <oscar.mateo@intel.com> wrote:
> > > >> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > >> b/drivers/gpu/drm/i915/i915_drv.h index
> > > >> 108e1ec2fa4b..e34db43dead3
> > > >> 100644
> > > >> --- a/drivers/gpu/drm/i915/i915_drv.h
> > > >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > >> @@ -1825,7 +1825,9 @@ struct drm_i915_file_private {
> > > >>       } mm;
> > > >>       struct idr context_idr;
> > > >>
> > > >> -     struct i915_hw_context *private_default_ctx;
> > > >> +     /* default context for each ring, NULL if hw doesn't
> > > >> + support hw
> > > >> contexts
> > > >> +      * (or fancy new lrcs) on that ring. */
> > > >> +     struct i915_hw_context
> > > >> + *private_default_ctx[I915_NUM_RINGS];
> > > >>       atomic_t rps_wait_boost;
> > > >>  };
> > > >>
> > > >> Of course we need to add an i915_hw_context->engine_cs pointer
> > > >> and we need to check that at execbuf to make sure we don't run
> > > >> contexts on the wrong engine.
> > > >> If we later on decide that we want to expose multiple hw contexts
> > > >> for !RCS to userspace we can easily add a bunch of ring flags to
> > > >> the context create ioctl. So this doesn't restrict us at all in
> > > >> the features we can support without jumping through hoops.
> > > >>
> > > >> Also if we'd shovel all per-ring lrcs into the same
> > > >> i915_hw_context structure then we'd need to rename that and drop
> > > >> the _hw part - it's no longer a 1:1 correspondence to an actual
> > > >> hw ring/context/lrc/whatever
> > > wizzbang thingy.
> > > >
> > > > Ok, so we create I915_NUM_RINGS contexts for the global default
> > > > contexts,
> > > plus I915_NUM_RINGS contexts for every filp and 1 render context for
> > > every create ioctl.
> > > > But the magic stuff is going to pop out in many more places: I
> > > > cannot
> > > idr_alloc/idr_find for the per-filp contexts, because all of them
> > > cannot have
> > > ctx->id = DEFAULT_CONTEXT_ID at the same time (I'll have to
> > > special-case
> > > them by using dev_priv->private_default_ctx[RING] to find them). Of
> > > course, if you prefer, I can abstract away most of the functionality
> > > in i915_gem_context.c and make sure this kind magic is only done for
> > > the LRC path (similar to what you propose to do with intel_ringbuffer.c).
> > >
> > > Argh, forgotten about the pageflips again. But for those we already
> > > need some other context pointer, and thus far we've only supported
> > > ring-switching on one ring (well, almost everywhere at least). Since
> > > the mmio base pageflip patch seems mostly ready I think we could
> > > just merge that one first and then forget about ring-based pageflips
> > > for execlists. Way too much pain to be worth it really ;-)
> >
> > Sound like a plan :)
> >
> > > For the default context special-casing I've somehow though we
> > > special-case that in the lookup code. But the code in there is a bit
> > > convoluted, so a bit of tidying up (and shoveling more of the
> > > checking and lookup logic into
> > > i915_gem_context.c) can't hurt really. Also we seem to lack error
> > > checking for the creation of the default context.
> >
> > Nope, we don't special case the per-filp default context search: it uses an
> idr_find, same as the others. Actually, I don't really see why
> private_default_ctx is needed at all in the current code?
> >
> > So, for the per-filp default contexts:
> >
> > +     struct i915_hw_context *private_default_ctx[I915_NUM_RINGS];
> >
> > and we special-case the hell out of them?
> > for legacy and execlists code, or do you want to abstract i915_gem_context.c
> away as well?
> 
> I think special-casing the i915_gem_context_get function for the default
> context and using private_default_ctx a bit more sounds good. We need to
> adjust the idr allocator a bit though to reserve 0, and a bit of frobbing in the
> context create code.

Ok, no problem. I'll send an early patch that uses private_default_ctx to do the special-casing (no functional change) together with the other refactoring patches.

> Wrt ctx abstraction I think separate functions for execlist/legacy contexts
> should be good enough. The lookup/create/destroy logic should carry over.

Including the creation of I915_NUM_RINGS contexts per-filp? Do you want that to happen for the legacy case as well? This implies a number of other changes, like struct intel_engine *last_ring not making sense anymore and frobbing i915_gem_create_context so that we can create more than one context with the same ppgtt...

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 15:26                       ` Mateo Lozano, Oscar
@ 2014-05-19 15:49                         ` Daniel Vetter
  2014-05-19 16:12                           ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-05-19 15:49 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Mon, May 19, 2014 at 03:26:09PM +0000, Mateo Lozano, Oscar wrote:
> > I think special-casing the i915_gem_context_get function for the default
> > context and using private_default_ctx a bit more sounds good. We need to
> > adjust the idr allocator a bit though to reserve 0, and a bit of frobbing in the
> > context create code.
> 
> Ok, no problem. I'll send an early patch that uses private_default_ctx
> to do the special casing (no functionality change) together with the
> other refactoring patches.
> 
> > Wrt ctx abstraction I think separate functions for execlist/legacy contexts
> > should be good enough. The lookup/create/destroy logic should carry over.
> 
> Including the creation of I915_NUM_RINGS contexts per-filp? do you want
> that to happen for the legacy case as well? This implies a number of
> other changes, like struct intel_engine *last_ring not making sense
> anymore and frobbing i915_gem_create_context so that we can create more
> than one context with the same ppgtt...

Damn, I didn't think this through. ctx->hang_stats already tracks random
stuff for logical contexts and we only have one of those for the implicit
per-filp default context. Still, allocating backing storage for the big
execlist contexts looks really wasteful since a lot of userspace will just
use 1 ring (e.g. mesa or opencl).

Also we've decided to at least for now support vcs2 implicitly, so
I915_NUM_RINGS is kinda already the wrong thing to use (presuming we can
indeed switch hw contexts between vcs and vcs2).

Otoh adding fake contexts for the legacy case also feels ugly.

So maybe we should rename i915_hw_context to i915_context and figure out a
sane design for the differences later on. Another option would be to embed
struct i915_context into i915_legacy_rcs_context and
i915_execlist_context structures, but meh.

So now I think we should:
a) s/i915_hw_context/i915_context/ since it's long past a 1:1 relationship
with hw
b) create a union in i915_context for legacy and execlists contexts.
Legacy contexts would just have the single gem bo backing storage needed
for rcs, execlists contexts would have an array of rings plus the backing
storage (since the ring is just embedded in there).
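
Roughly something like this, i.e. just a sketch with all the field names
invented:

struct i915_context {
	struct kref ref;
	/* common sw state: hang_stats, ppgtt/vm pointer, filp link, ... */
	union {
		/* legacy: the single bo holding the rcs hw context image */
		struct drm_i915_gem_object *rcs_obj;
		/* execlists: per-engine context image + its own ringbuffer */
		struct {
			struct drm_i915_gem_object *obj;
			struct intel_ringbuffer *ringbuf;
		} engine[I915_NUM_RINGS];
	};
};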

How does this sound?

If we decide later on that the union is too ugly and we want a cleaner
split, we can do the usual subclassing later on.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 15:49                         ` Daniel Vetter
@ 2014-05-19 16:12                           ` Mateo Lozano, Oscar
  2014-05-19 16:24                             ` Volkin, Bradley D
  2014-05-20  8:11                             ` Daniel Vetter
  0 siblings, 2 replies; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-19 16:12 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx



> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Monday, May 19, 2014 4:50 PM
> To: Mateo Lozano, Oscar
> Cc: Daniel Vetter; Lespiau, Damien; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> s/intel_ring_buffer/intel_engine
> 
> On Mon, May 19, 2014 at 03:26:09PM +0000, Mateo Lozano, Oscar wrote:
> > > I think special-casing the i915_gem_context_get function for the
> > > default context and using private_default_ctx a bit more sounds
> > > good. We need to adjust the idr allocator a bit though to reserve 0,
> > > and a bit of frobbing in the context create code.
> >
> > Ok, no problem. I'll send an early patch that uses private_default_ctx
> > to do the special casing (no functionality change) together with the
> > other refactoring patches.
> >
> > > Wrt ctx abstraction I think separate functions for execlist/legacy
> > > contexts should be good enough. The lookup/create/destroy logic should
> carry over.
> >
> > Including the creation of I915_NUM_RINGS contexts per-filp? Do you
> > want that to happen for the legacy case as well? This implies a number
> > of other changes, like struct intel_engine *last_ring not making sense
> > anymore and frobbing i915_gem_create_context so that we can create
> > more than one context with the same ppgtt...
> 
> Damn, I didn't think this through. ctx->hang_stats already tracks random stuff
> for logical contexts and we only have one of those for the implicit per-filp
> default context. Still, allocating backing storage for the big execlist contexts
> looks really wasteful since a lot of userspace will just use 1 ring (e.g. mesa or
> opencl).
> 
> Also we've decided to at least for now support vcs2 implicitly, so
> I915_NUM_RINGS is kinda already the wrong thing to use (presuming we can
> indeed switch hw contexts between vcs and vcs2).
> 
> Otoh adding fake contexts for the legacy case also feels ugly.
> 
> So maybe we should rename i915_hw_context to i915_context and figure out a
> sane design for the differences later on. Another option would be to embed
> struct i915_context into i915_legacy_rcs_context and i915_execlist_context
> structures, but meh.
> 
> So now I think we should:
> a) s/i915_hw_context/i915_context/ since it's long past a 1:1 relationship with
> hw
> b) create a union in i915_context for legacy and execlists contexts.
> Legacy contexts would just have the single gem bo backing storage needed for
> rcs, execlists contexts would have an array of rings plus the backing storage
> (since the ring is just embedded in there).
> 
> How does this sound?

Sounds good: other than using a union, my latest patchset already looked like this :)
(and this means bringing the deferred ctx creation idea back to life...)

Summarizing the entire conversation, we have agreed to three refactoring patches:

A) s/intel_ring_buffer/intel_engine_cs/
B) Split the ringbuffers and the rings (the ringbuffer lives inside the ring through a pointer, which will be NULL in the LRC case; rough sketch below).
C) s/i915_hw_context/i915_context/

Plus an LRC-specific patch:

D) Introduce one context backing object per engine (using a union that leaves one backing object for the legacy case and an array of backing objects and ringbuffers for the LRC case).
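
For (B), roughly (again just a sketch; names and fields are placeholders):

struct intel_ringbuffer {
	struct drm_i915_gem_object *obj;	/* backing storage */
	void __iomem *virtual_start;
	u32 head;
	u32 tail;
	int size;
};

struct intel_engine_cs {
	const char *name;
	int id;
	/* Legacy: points at the engine's single global ringbuffer.
	 * Execlists: NULL, since every context brings its own. */
	struct intel_ringbuffer *buffer;
	/* ... */
};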

BTW: do you want me to kill private_default_ctx as well? It doesn't look very useful...

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 16:12                           ` Mateo Lozano, Oscar
@ 2014-05-19 16:24                             ` Volkin, Bradley D
  2014-05-19 16:33                               ` Mateo Lozano, Oscar
  2014-05-20  8:11                             ` Daniel Vetter
  1 sibling, 1 reply; 94+ messages in thread
From: Volkin, Bradley D @ 2014-05-19 16:24 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Mon, May 19, 2014 at 09:12:26AM -0700, Mateo Lozano, Oscar wrote:
> BTW: do you want me to kill private_default_ctx as well? It doesn't look very useful...

Isn't private_default_ctx the one that's actually used when userspace
specifies DEFAULT_CONTEXT_ID?

Brad

> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 16:24                             ` Volkin, Bradley D
@ 2014-05-19 16:33                               ` Mateo Lozano, Oscar
  2014-05-19 16:40                                 ` Volkin, Bradley D
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-19 16:33 UTC (permalink / raw)
  To: Volkin, Bradley D; +Cc: intel-gfx

> -----Original Message-----
> From: Volkin, Bradley D
> Sent: Monday, May 19, 2014 5:24 PM
> To: Mateo Lozano, Oscar
> Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> s/intel_ring_buffer/intel_engine
> 
> On Mon, May 19, 2014 at 09:12:26AM -0700, Mateo Lozano, Oscar wrote:
> > BTW: do you want me to kill private_default_ctx as well? It doesn't look very
> > useful...
> 
> Isn't private_default_ctx the one that's actually used when userspace specifies
> DEFAULT_CONTEXT_ID?

What I see is a normal idr_find:

struct i915_hw_context *
i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
{
	struct i915_hw_context *ctx;

	ctx = (struct i915_hw_context *)idr_find(&file_priv->context_idr, id);
	if (!ctx)
		return ERR_PTR(-ENOENT);

	return ctx;
}

I think Chris has almost killed it off completely:

commit 691e6415c891b8b2b082a120b896b443531c4d45
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Apr 9 09:07:36 2014 +0100

    drm/i915: Always use kref tracking for all contexts.
    
    If we always initialize kref for the context, even if we are using fake
    contexts for hangstats when there is no hw support, we can forgo the
    dance to dereference the ctx->obj and inspect whether we are permitted
    to use kref inside i915_gem_context_reference() and _unreference().
    
    My ulterior motive here is to improve the debugging of a use-after-free
    of ctx->obj. This patch avoids the dereference here and instead forces
    the assertion checks associated with kref.
    
    v2: Refactor the fake contexts to being even more like the real
    contexts, so that there is much less duplicated and special case code.
    
    v3: Tweaks.
    v4: Tweaks, minor.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 02/50] drm/i915: for_each_ring
  2014-05-09 12:08 ` [PATCH 02/50] drm/i915: for_each_ring oscar.mateo
  2014-05-13 13:25   ` Daniel Vetter
@ 2014-05-19 16:33   ` Volkin, Bradley D
  2014-05-19 16:36     ` Mateo Lozano, Oscar
  1 sibling, 1 reply; 94+ messages in thread
From: Volkin, Bradley D @ 2014-05-19 16:33 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx, Ben Widawsky, Widawsky, Benjamin

On Fri, May 09, 2014 at 05:08:32AM -0700, oscar.mateo@intel.com wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> for_each_ring() iterates over all rings supported by the hardware, not
> just those which have been initialized as in for_each_active_ring()

I think we should give this a new name; something like for_each_supported_ring.
My concern is that, with all of the patches in flight, we'll merge something
that uses for_each_ring when it should have been changed to for_each_active_ring.
Better that such a patch not even compile.
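
Something like this, then (the macro from the patch below, with only the
name changed):

#define for_each_supported_ring(ring__, dev_priv__, i__) \
	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
		if (((ring__) = &(dev_priv__)->ring[(i__)]), \
		    INTEL_INFO((dev_priv__)->dev)->ring_mask & (1<<(i__)))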

Brad

> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Acked-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a53a028..b1725c6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1544,6 +1544,17 @@ static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
>  	return dev->dev_private;
>  }
>  
> +/* NB: Typically you want to use for_each_ring in init code before ringbuffers
> + * are setup, or in debug code. for_each_active_ring is more suited for code
> + * which is dynamically handling active rings, ie. normal code. In most
> + * (currently all cases except on pre-production hardware) for_each_ring will
> + * work even if it's a bad idea to use it - so be careful.
> + */
> +#define for_each_ring(ring__, dev_priv__, i__) \
> +	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
> +		if (((ring__) = &(dev_priv__)->ring[(i__)]), \
> +		    INTEL_INFO((dev_priv__)->dev)->ring_mask & (1<<(i__)))
> +
>  /* Iterate over initialised rings */
>  #define for_each_active_ring(ring__, dev_priv__, i__) \
>  	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 02/50] drm/i915: for_each_ring
  2014-05-19 16:33   ` Volkin, Bradley D
@ 2014-05-19 16:36     ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-19 16:36 UTC (permalink / raw)
  To: Volkin, Bradley D; +Cc: intel-gfx, Ben Widawsky

> -----Original Message-----
> From: Volkin, Bradley D
> Sent: Monday, May 19, 2014 5:34 PM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org; Ben Widawsky; Widawsky, Benjamin
> Subject: Re: [Intel-gfx] [PATCH 02/50] drm/i915: for_each_ring
> 
> On Fri, May 09, 2014 at 05:08:32AM -0700, oscar.mateo@intel.com wrote:
> > From: Ben Widawsky <benjamin.widawsky@intel.com>
> >
> > for_each_ring() iterates over all rings supported by the hardware, not
> > just those which have been initialized as in for_each_active_ring()
> 
> I think we should give this a new name; something like
> for_each_supported_ring.
> My concern is that, with all of the patches in flight, we'll merge something that
> uses for_each_ring when it should have been changed to for_each_active_ring.
> Better that such a patch not even compile.

I can kill this patch off completely: when we started with Execlists, it simplified a lot of things and made life easier in general, but with each new iteration it just becomes more and more useless...

> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > Acked-by: Oscar Mateo <oscar.mateo@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h index a53a028..b1725c6 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1544,6 +1544,17 @@ static inline struct drm_i915_private
> *to_i915(const struct drm_device *dev)
> >  	return dev->dev_private;
> >  }
> >
> > +/* NB: Typically you want to use for_each_ring in init code before
> > +ringbuffers
> > + * are setup, or in debug code. for_each_active_ring is more suited
> > +for code
> > + * which is dynamically handling active rings, ie. normal code. In
> > +most
> > + * (currently all cases except on pre-production hardware)
> > +for_each_ring will
> > + * work even if it's a bad idea to use it - so be careful.
> > + */
> > +#define for_each_ring(ring__, dev_priv__, i__) \
> > +	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
> > +		if (((ring__) = &(dev_priv__)->ring[(i__)]), \
> > +		    INTEL_INFO((dev_priv__)->dev)->ring_mask & (1<<(i__)))
> > +
> >  /* Iterate over initialised rings */
> >  #define for_each_active_ring(ring__, dev_priv__, i__) \
> >  	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
> > --
> > 1.9.0
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 16:33                               ` Mateo Lozano, Oscar
@ 2014-05-19 16:40                                 ` Volkin, Bradley D
  2014-05-19 16:49                                   ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Volkin, Bradley D @ 2014-05-19 16:40 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Mon, May 19, 2014 at 09:33:37AM -0700, Mateo Lozano, Oscar wrote:
> > -----Original Message-----
> > From: Volkin, Bradley D
> > Sent: Monday, May 19, 2014 5:24 PM
> > To: Mateo Lozano, Oscar
> > Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > s/intel_ring_buffer/intel_engine
> > 
> > On Mon, May 19, 2014 at 09:12:26AM -0700, Mateo Lozano, Oscar wrote:
> > > BTW: do you want me to kill private_default_ctx as well? It doesn't look very
> > > useful...
> > 
> > Isn't private_default_ctx the one that's actually used when userspace specifies
> > DEFAULT_CONTEXT_ID?
> 
> What I see is a normal idr_find:

Right, but i915_gem_context_open() does:
	idr_init(&file_priv->context_idr);
	file_priv->private_default_ctx =
		i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev));

And i915_gem_create_context() calls __create_hw_context(), which does:
	if (file_priv != NULL) {
		ret = idr_alloc(&file_priv->context_idr, ctx,
				DEFAULT_CONTEXT_ID, 0, GFP_KERNEL);
		if (ret < 0)
			goto err_out;
	} else
		ret = DEFAULT_CONTEXT_ID;

So I think the idr_find() should indirectly give us private_default_ctx.

Brad

> 
> struct i915_hw_context *
> i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
> {
> 	struct i915_hw_context *ctx;
> 
> 	ctx = (struct i915_hw_context *)idr_find(&file_priv->context_idr, id);
> 	if (!ctx)
> 		return ERR_PTR(-ENOENT);
> 
> 	return ctx;
> }
> 
> I think Chris has almost killed it off completely:
> 
> commit 691e6415c891b8b2b082a120b896b443531c4d45
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Wed Apr 9 09:07:36 2014 +0100
> 
>     drm/i915: Always use kref tracking for all contexts.
>     
>     If we always initialize kref for the context, even if we are using fake
>     contexts for hangstats when there is no hw support, we can forgo the
>     dance to dereference the ctx->obj and inspect whether we are permitted
>     to use kref inside i915_gem_context_reference() and _unreference().
>     
>     My ulterior motive here is to improve the debugging of a use-after-free
>     of ctx->obj. This patch avoids the dereference here and instead forces
>     the assertion checks associated with kref.
>     
>     v2: Refactor the fake contexts to being even more like the real
>     contexts, so that there is much less duplicated and special case code.
>     
>     v3: Tweaks.
>     v4: Tweaks, minor.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 16:40                                 ` Volkin, Bradley D
@ 2014-05-19 16:49                                   ` Mateo Lozano, Oscar
  2014-05-19 17:00                                     ` Volkin, Bradley D
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-19 16:49 UTC (permalink / raw)
  To: Volkin, Bradley D; +Cc: intel-gfx

> -----Original Message-----
> From: Volkin, Bradley D
> Sent: Monday, May 19, 2014 5:41 PM
> To: Mateo Lozano, Oscar
> Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> s/intel_ring_buffer/intel_engine
> 
> On Mon, May 19, 2014 at 09:33:37AM -0700, Mateo Lozano, Oscar wrote:
> > > -----Original Message-----
> > > From: Volkin, Bradley D
> > > Sent: Monday, May 19, 2014 5:24 PM
> > > To: Mateo Lozano, Oscar
> > > Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org
> > > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > > s/intel_ring_buffer/intel_engine
> > >
> > > On Mon, May 19, 2014 at 09:12:26AM -0700, Mateo Lozano, Oscar wrote:
> > > > BTW: do you want me to kill private_default_ctx as well? It
> > > > doesn't look very
> > > > useful...
> > >
> > > Isn't private_default_ctx the one that's actually used when
> > > userspace specifies DEFAULT_CONTEXT_ID?
> >
> > What I see is a normal idr_find:
> 
> Right, but i915_gem_context_open() does:
> 	idr_init(&file_priv->context_idr);
> 	file_priv->private_default_ctx =
> 		i915_gem_create_context(dev, file_priv,
> USES_FULL_PPGTT(dev));
> 
> And i915_gem_create_context() calls __create_hw_context(), which does:
> 	if (file_priv != NULL) {
> 		ret = idr_alloc(&file_priv->context_idr, ctx,
> 				DEFAULT_CONTEXT_ID, 0, GFP_KERNEL);
> 		if (ret < 0)
> 			goto err_out;
> 	} else
> 		ret = DEFAULT_CONTEXT_ID;
> 
> So I think the idr_find() should indirectly give us private_default_ctx.

Exactly! Why are we keeping file_priv->private_default_ctx then? If you need to get it somewhere, you can simply do idr_find(&file_priv->context_idr, DEFAULT_CONTEXT_ID);
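
In other words, a trivial helper like this (sketch only, name invented)
would replace the extra pointer:

static inline struct i915_hw_context *
get_default_context(struct drm_i915_file_private *file_priv)
{
	return i915_gem_context_get(file_priv, DEFAULT_CONTEXT_ID);
}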

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 16:49                                   ` Mateo Lozano, Oscar
@ 2014-05-19 17:00                                     ` Volkin, Bradley D
  0 siblings, 0 replies; 94+ messages in thread
From: Volkin, Bradley D @ 2014-05-19 17:00 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Mon, May 19, 2014 at 09:49:31AM -0700, Mateo Lozano, Oscar wrote:
> > -----Original Message-----
> > From: Volkin, Bradley D
> > Sent: Monday, May 19, 2014 5:41 PM
> > To: Mateo Lozano, Oscar
> > Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > s/intel_ring_buffer/intel_engine
> > 
> > On Mon, May 19, 2014 at 09:33:37AM -0700, Mateo Lozano, Oscar wrote:
> > > > -----Original Message-----
> > > > From: Volkin, Bradley D
> > > > Sent: Monday, May 19, 2014 5:24 PM
> > > > To: Mateo Lozano, Oscar
> > > > Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org
> > > > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > > > s/intel_ring_buffer/intel_engine
> > > >
> > > > On Mon, May 19, 2014 at 09:12:26AM -0700, Mateo Lozano, Oscar wrote:
> > > > > BTW: do you want me to kill private_default_ctx as well? It
> > > > > doesn't look very
> > > > > useful...
> > > >
> > > > Isn't private_default_ctx the one that's actually used when
> > > > userspace specifies DEFAULT_CONTEXT_ID?
> > >
> > > What I see is a normal idr_find:
> > 
> > Right, but i915_gem_context_open() does:
> > 	idr_init(&file_priv->context_idr);
> > 	file_priv->private_default_ctx =
> > 		i915_gem_create_context(dev, file_priv,
> > USES_FULL_PPGTT(dev));
> > 
> > And i915_gem_create_context() calls __create_hw_context(), which does:
> > 	if (file_priv != NULL) {
> > 		ret = idr_alloc(&file_priv->context_idr, ctx,
> > 				DEFAULT_CONTEXT_ID, 0, GFP_KERNEL);
> > 		if (ret < 0)
> > 			goto err_out;
> > 	} else
> > 		ret = DEFAULT_CONTEXT_ID;
> > 
> > So I think the idr_find() should indirectly give us private_default_ctx.
> 
> Exactly!: why are we keeping file_priv->private_default_ctx then? If you need to get it somewehere, you can simply do idr_find(&file_priv->context_idr, DEFAULT_CONTEXT_ID);

Oh, sorry, I completely misunderstood what you were suggesting. Kill away :)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine
  2014-05-19 16:12                           ` Mateo Lozano, Oscar
  2014-05-19 16:24                             ` Volkin, Bradley D
@ 2014-05-20  8:11                             ` Daniel Vetter
  1 sibling, 0 replies; 94+ messages in thread
From: Daniel Vetter @ 2014-05-20  8:11 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Mon, May 19, 2014 at 04:12:26PM +0000, Mateo Lozano, Oscar wrote:
> 
> 
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> > Sent: Monday, May 19, 2014 4:50 PM
> > To: Mateo Lozano, Oscar
> > Cc: Daniel Vetter; Lespiau, Damien; intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 06/50] drm/i915:
> > s/intel_ring_buffer/intel_engine
> > 
> > On Mon, May 19, 2014 at 03:26:09PM +0000, Mateo Lozano, Oscar wrote:
> > > > I think special-casing the i915_gem_context_get function for the
> > > > default context and using private_default_ctx a bit more sounds
> > > > good. We need to adjust the idr allocator a bit though to reserve 0,
> > > > and a bit of frobbing in the context create code.
> > >
> > > Ok, no problem. I'll send an early patch that uses private_default_ctx
> > > to do the special casing (no functionality change) together with the
> > > other refactoring patches.
> > >
> > > > Wrt ctx abstraction I think separate functions for execlist/legacy
> > > > contexts should be good enough. The lookup/create/destroy logic should
> > carry over.
> > >
> > > Including the creation of I915_NUM_RINGS contexts per-filp? Do you
> > > want that to happen for the legacy case as well? This implies a number
> > > of other changes, like struct intel_engine *last_ring not making sense
> > > anymore and frobbing i915_gem_create_context so that we can create
> > > more than one context with the same ppgtt...
> > 
> > Damn, I didn't think this through. ctx->hang_stats already tracks random stuff
> > for logical contexts and we only have one of those for the implicit per-filp
> > default context. Still, allocating backing storage for the big execlist contexts
> > looks really wasteful since a lot of userspace will just use 1 ring (e.g. mesa or
> > opencl).
> > 
> > Also we've decided to at least for now support vcs2 implicitly, so
> > I915_NUM_RINGS is kinda already the wrong thing to use (presuming we can
> > indeed switch hw contexts between vcs and vcs2).
> > 
> > Otoh adding fake contexts for the legacy case also feels ugly.
> > 
> > So maybe we should rename i915_hw_context to i915_context and figure out a
> > sane design for the differences later on. Another option would be to embed
> > struct i915_context into i915_legacy_rcs_context and i915_execlist_context
> > structures, but meh.
> > 
> > So now I think we should:
> > a) s/i915_hw_context/i915_context/ since it's long past a 1:1 relationship with
> > hw
> > b) create a union in i915_context for legacy and execlists contexts.
> > Legacy contexts would just have the single gem bo backing storage needed for
> > rcs, execlists contexts would have an array of rings plus the backing storage
> > (since the ring is just embedded in there).
> > 
> > How does this sound?
> 
> Sounds good: other than using a union, my latest patchset already looked like this :)
> (and this means bringing the deferred ctx creation idea back to life...)

Yeah, it took me a while to see the light here ;-) But if we split up the
sw ctx tracking structure a bit from the low-level stuff I think this
indeed makes sense. The lazy context allocation simply irked me while we
still called the structure a _hw_ context, which at that point it just
isn't any more.
> 
> Summarizing the entire conversation, we have agreed to three refactoring patches:
> 
> A) s/intel_ring_buffer/intel_engine_cs/
> B) Split the ringbuffers and the rings (the ringbuffer lives inside the
> ring through a pointer, which will be NULL in the LRC case).
> C) s/i915_hw_context/i915_context/
> 
> Plus an LRC-specific patch:
> 
> D) Introduce one context backing object per engine (using a union that
> leaves one backing object for the legacy case and an array of backing
> objects and ringbuffers for the LRC case).

Yes, ack on this prep-work refactor plan. And in case I didn't mention it
yet ofc ack for shoveling all the new execlist stuff into intel_lrc.c.

Btw for intel_lrc.c can you please (at the end is probably best) add a
patch to add kerneldoc for non-static functions and pull them into the
i915 guide? Similar to how we've done it for Brad's cmd parser. I think we
need better docs, so best to get started with new features.

> BTW: do you want me to kill private_default_ctx as well? It doesn't look very useful...

Yeah, might as well throw this on top.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler
  2014-05-09 12:09 ` [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler oscar.mateo
@ 2014-06-11 11:50   ` Daniel Vetter
  2014-06-11 12:01     ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-06-11 11:50 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx

On Fri, May 09, 2014 at 01:09:19PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> If we receive a storm of requests for the same context (see gem_storedw_loop_*)
> we might end up iterating over too many elements in interrupt time, looking for
> contexts to squash together. Instead, share the burden by giving more
> intelligence to the queue function. At most, the interrupt will iterate over
> three elements.
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index d9edd10..0aad721 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -410,9 +410,11 @@ int gen8_switch_context_queue(struct intel_engine *ring,
>  			      struct i915_hw_context *to,
>  			      u32 tail)
>  {
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	struct drm_i915_gem_request *req = NULL;
>  	unsigned long flags;
> -	bool was_empty;
> +	struct drm_i915_gem_request *cursor;
> +	int num_elements = 0;
>  
>  	req = kzalloc(sizeof(*req), GFP_KERNEL);
>  	if (req == NULL)
> @@ -425,9 +427,24 @@ int gen8_switch_context_queue(struct intel_engine *ring,
>  
>  	spin_lock_irqsave(&ring->execlist_lock, flags);
>  
> -	was_empty = list_empty(&ring->execlist_queue);
> +	list_for_each_entry(cursor, &ring->execlist_queue, execlist_link)
> +		if (++num_elements > 2)
> +			break;
> +
> +	if (num_elements > 2) {
> +		struct drm_i915_gem_request *tail_req =
> +				list_last_entry(&ring->execlist_queue,
> +					struct drm_i915_gem_request, execlist_link);
> +		if (to == tail_req->ctx) {
> +			WARN(tail_req->elsp_submitted != 0,
> +					"More than 2 already-submitted reqs queued\n");
> +			list_del(&tail_req->execlist_link);
> +			queue_work(dev_priv->wq, &tail_req->work);
> +		}
> +	}

Completely forgot to mention this: Chris & I discussed this on irc and I
guess this issue will disappear if we track contexts instead of requests
in the scheduler. I guess this is an artifact of the gen7 scheduler you've
based this on, but even for that I think scheduling contexts (with preempt
point after each batch) is the right approach. But I haven't dug out the
scheduler patches again, so I might be wrong about that.
-Daniel

> +
>  	list_add_tail(&req->execlist_link, &ring->execlist_queue);
> -	if (was_empty)
> +	if (num_elements == 0)
>  		gen8_switch_context_unqueue(ring);
>  
>  	spin_unlock_irqrestore(&ring->execlist_lock, flags);
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 40/50] drm/i915/bdw: Handle context switch events
  2014-05-09 12:09 ` [PATCH 40/50] drm/i915/bdw: Handle context switch events oscar.mateo
@ 2014-06-11 11:52   ` Daniel Vetter
  2014-06-11 12:02     ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-06-11 11:52 UTC (permalink / raw)
  To: oscar.mateo; +Cc: Thomas Daniel, intel-gfx

On Fri, May 09, 2014 at 01:09:10PM +0100, oscar.mateo@intel.com wrote:
> From: Thomas Daniel <thomas.daniel@intel.com>
> 
> Handle all context status events in the context status buffer on every
> context switch interrupt. We only remove work from the execlist queue
> after a context status buffer reports that it has completed and we only
> attempt to schedule new contexts on interrupt when a previously submitted
> context completes (unless no contexts are queued, which means the GPU is
> free).
> 
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> 
> v2: Unreferencing the context when we are freeing the request might free
> the backing bo, which requires the struct_mutex to be grabbed, so defer
> unreferencing and freeing to a bottom half.
> 
> v3:
> - Ack the interrupt immediately, before trying to handle it (fix for
> missing interrupts by Bob Beckett <robert.beckett@intel.com>).

This interrupt handling change is interesting since it might explain our
irq handling woes on gen5+ with the two-level GT interrupt handling
scheme. Can you please roll this out as a prep patch for all the existing
gt interrupt sources we handle already for gen5+?
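
Reduced to a sketch (mirroring the GT0 hunk in the patch below), the
pattern is:

	tmp = I915_READ(GEN8_GT_IIR(0));
	if (tmp) {
		/* Ack first: clearing IIR before servicing means that an
		 * event arriving while we handle this one re-asserts the
		 * interrupt instead of getting lost. */
		I915_WRITE(GEN8_GT_IIR(0), tmp);
		ret = IRQ_HANDLED;
		/* ... demux tmp and handle the individual sources ... */
	} else
		DRM_ERROR("The master control interrupt lied (GT0)!\n");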

Thanks, Daniel

> - Update the Context Status Buffer Read Pointer, just in case (spotted
> by Damien Lespiau).
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |   3 +
>  drivers/gpu/drm/i915/i915_irq.c         |  38 +++++++-----
>  drivers/gpu/drm/i915/intel_lrc.c        | 102 +++++++++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/intel_ringbuffer.c |   1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
>  5 files changed, 129 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f2aae6a..07b8bdc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1748,6 +1748,8 @@ struct drm_i915_gem_request {
>  
>  	/** execlist queue entry for this request */
>  	struct list_head execlist_link;
> +	/** Struct to handle this request in the bottom half of an interrupt */
> +	struct work_struct work;
>  };
>  
>  struct drm_i915_file_private {
> @@ -2449,6 +2451,7 @@ static inline u32 intel_get_lr_contextid(struct drm_i915_gem_object *ctx_obj)
>  int gen8_switch_context_queue(struct intel_engine *ring,
>  			      struct i915_hw_context *to,
>  			      u32 tail);
> +void gen8_handle_context_events(struct intel_engine *ring);
>  
>  /* i915_gem_evict.c */
>  int __must_check i915_gem_evict_something(struct drm_device *dev,
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index a28cf6c..fbffead 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1300,6 +1300,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
>  				       struct drm_i915_private *dev_priv,
>  				       u32 master_ctl)
>  {
> +	struct intel_engine *ring;
>  	u32 rcs, bcs, vcs, vecs;
>  	uint32_t tmp = 0;
>  	irqreturn_t ret = IRQ_NONE;
> @@ -1307,16 +1308,22 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
>  	if (master_ctl & (GEN8_GT_RCS_IRQ | GEN8_GT_BCS_IRQ)) {
>  		tmp = I915_READ(GEN8_GT_IIR(0));
>  		if (tmp) {
> +			I915_WRITE(GEN8_GT_IIR(0), tmp);
>  			ret = IRQ_HANDLED;
> +
>  			rcs = tmp >> GEN8_RCS_IRQ_SHIFT;
> -			bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
> +			ring = &dev_priv->ring[RCS];
>  			if (rcs & GT_RENDER_USER_INTERRUPT)
> -				notify_ring(dev, &dev_priv->ring[RCS]);
> +				notify_ring(dev, ring);
> +			if (rcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> +				gen8_handle_context_events(ring);
> +
> +			bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
> +			ring = &dev_priv->ring[BCS];
>  			if (bcs & GT_RENDER_USER_INTERRUPT)
> -				notify_ring(dev, &dev_priv->ring[BCS]);
> -			if ((rcs | bcs) & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> -			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
> -			I915_WRITE(GEN8_GT_IIR(0), tmp);
> +				notify_ring(dev, ring);
> +			if (bcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> +				gen8_handle_context_events(ring);
>  		} else
>  			DRM_ERROR("The master control interrupt lied (GT0)!\n");
>  	}
> @@ -1324,18 +1331,20 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
>  	if (master_ctl & (GEN8_GT_VCS1_IRQ | GEN8_GT_VCS2_IRQ)) {
>  		tmp = I915_READ(GEN8_GT_IIR(1));
>  		if (tmp) {
> +			I915_WRITE(GEN8_GT_IIR(1), tmp);
>  			ret = IRQ_HANDLED;
>  			vcs = tmp >> GEN8_VCS1_IRQ_SHIFT;
> +			ring = &dev_priv->ring[VCS];
>  			if (vcs & GT_RENDER_USER_INTERRUPT)
> -				notify_ring(dev, &dev_priv->ring[VCS]);
> +				notify_ring(dev, ring);
>  			if (vcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> -			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
> +			     gen8_handle_context_events(ring);
>  			vcs = tmp >> GEN8_VCS2_IRQ_SHIFT;
> +			ring = &dev_priv->ring[VCS2];
>  			if (vcs & GT_RENDER_USER_INTERRUPT)
> -				notify_ring(dev, &dev_priv->ring[VCS2]);
> +				notify_ring(dev, ring);
>  			if (vcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> -			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
> -			I915_WRITE(GEN8_GT_IIR(1), tmp);
> +			     gen8_handle_context_events(ring);
>  		} else
>  			DRM_ERROR("The master control interrupt lied (GT1)!\n");
>  	}
> @@ -1343,13 +1352,14 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
>  	if (master_ctl & GEN8_GT_VECS_IRQ) {
>  		tmp = I915_READ(GEN8_GT_IIR(3));
>  		if (tmp) {
> +			I915_WRITE(GEN8_GT_IIR(3), tmp);
>  			ret = IRQ_HANDLED;
>  			vecs = tmp >> GEN8_VECS_IRQ_SHIFT;
> +			ring = &dev_priv->ring[VECS];
>  			if (vecs & GT_RENDER_USER_INTERRUPT)
> -				notify_ring(dev, &dev_priv->ring[VECS]);
> +				notify_ring(dev, ring);
>  			if (vecs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> -			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
> -			I915_WRITE(GEN8_GT_IIR(3), tmp);
> +				gen8_handle_context_events(ring);
>  		} else
>  			DRM_ERROR("The master control interrupt lied (GT3)!\n");
>  	}
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6da7db9..1ff493a 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -49,6 +49,22 @@
>  #define RING_ELSP(ring)			((ring)->mmio_base+0x230)
>  #define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
>  #define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
> +#define RING_CONTEXT_STATUS_BUF(ring)	((ring)->mmio_base+0x370)
> +#define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
> +
> +#define RING_EXECLIST_QFULL		(1 << 0x2)
> +#define RING_EXECLIST1_VALID		(1 << 0x3)
> +#define RING_EXECLIST0_VALID		(1 << 0x4)
> +#define RING_EXECLIST_ACTIVE_STATUS	(3 << 0xE)
> +#define RING_EXECLIST1_ACTIVE		(1 << 0x11)
> +#define RING_EXECLIST0_ACTIVE		(1 << 0x12)
> +
> +#define GEN8_CTX_STATUS_IDLE_ACTIVE	(1 << 0)
> +#define GEN8_CTX_STATUS_PREEMPTED	(1 << 1)
> +#define GEN8_CTX_STATUS_ELEMENT_SWITCH	(1 << 2)
> +#define GEN8_CTX_STATUS_ACTIVE_IDLE	(1 << 3)
> +#define GEN8_CTX_STATUS_COMPLETE	(1 << 4)
> +#define GEN8_CTX_STATUS_LITE_RESTORE	(1 << 15)
>  
>  #define CTX_LRI_HEADER_0		0x01
>  #define CTX_CONTEXT_CONTROL		0x02
> @@ -203,6 +219,9 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
>  {
>  	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
>  	struct drm_i915_gem_request *cursor = NULL, *tmp = NULL;
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +
> +	assert_spin_locked(&ring->execlist_lock);
>  
>  	if (list_empty(&ring->execlist_queue))
>  		return;
> @@ -215,8 +234,7 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
>  			/* Same ctx: ignore first request, as second request
>  			 * will update tail past first request's workload */
>  			list_del(&req0->execlist_link);
> -			i915_gem_context_unreference(req0->ctx);
> -			kfree(req0);
> +			queue_work(dev_priv->wq, &req0->work);
>  			req0 = cursor;
>  		} else {
>  			req1 = cursor;
> @@ -228,6 +246,85 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
>  			req1? req1->ctx : NULL, req1? req1->tail : 0));
>  }
>  
> +static bool check_remove_request(struct intel_engine *ring, u32 request_id)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	struct drm_i915_gem_request *head_req;
> +
> +	assert_spin_locked(&ring->execlist_lock);
> +
> +	head_req = list_first_entry_or_null(&ring->execlist_queue,
> +			struct drm_i915_gem_request, execlist_link);
> +	if (head_req != NULL) {
> +		struct drm_i915_gem_object *ctx_obj =
> +				head_req->ctx->engine[ring->id].obj;
> +		if (intel_get_lr_contextid(ctx_obj) == request_id) {
> +			list_del(&head_req->execlist_link);
> +			queue_work(dev_priv->wq, &head_req->work);
> +			return true;
> +		}
> +	}
> +
> +	return false;
> +}
> +
> +void gen8_handle_context_events(struct intel_engine *ring)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	u32 status_pointer;
> +	u8 read_pointer;
> +	u8 write_pointer;
> +	u32 status;
> +	u32 status_id;
> +	u32 submit_contexts = 0;
> +
> +	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
> +
> +	read_pointer = ring->next_context_status_buffer;
> +	write_pointer = status_pointer & 0x07;
> +	if (read_pointer > write_pointer)
> +		write_pointer += 6;
> +
> +	spin_lock(&ring->execlist_lock);
> +
> +	while (read_pointer < write_pointer) {
> +		read_pointer++;
> +		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> +				(read_pointer % 6) * 8);
> +		status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> +				(read_pointer % 6) * 8 + 4);
> +
> +		if (status & GEN8_CTX_STATUS_COMPLETE) {
> +			if (check_remove_request(ring, status_id))
> +				submit_contexts++;
> +		}
> +	}
> +
> +	if (submit_contexts != 0)
> +		gen8_switch_context_unqueue(ring);
> +
> +	spin_unlock(&ring->execlist_lock);
> +
> +	WARN(submit_contexts > 2, "More than two context complete events?\n");
> +	ring->next_context_status_buffer = write_pointer % 6;
> +
> +	I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
> +			((u32)ring->next_context_status_buffer & 0x07) << 8);
> +}
> +
> +static void free_request_task(struct work_struct *work)
> +{
> +	struct drm_i915_gem_request *req =
> +			container_of(work, struct drm_i915_gem_request, work);
> +	struct drm_device *dev = req->ring->dev;
> +
> +	mutex_lock(&dev->struct_mutex);
> +	i915_gem_context_unreference(req->ctx);
> +	mutex_unlock(&dev->struct_mutex);
> +
> +	kfree(req);
> +}
> +
>  int gen8_switch_context_queue(struct intel_engine *ring,
>  			      struct i915_hw_context *to,
>  			      u32 tail)
> @@ -243,6 +340,7 @@ int gen8_switch_context_queue(struct intel_engine *ring,
>  	req->ctx = to;
>  	i915_gem_context_reference(req->ctx);
>  	req->tail = tail;
> +	INIT_WORK(&req->work, free_request_task);
>  
>  	spin_lock_irqsave(&ring->execlist_lock, flags);
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 35ced7c..9cd6ee8 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1573,6 +1573,7 @@ static int intel_init_ring(struct drm_device *dev,
>  		if (ring->status_page.page_addr == NULL)
>  			return -ENOMEM;
>  		ring->status_page.obj = obj;
> +		ring->next_context_status_buffer = 0;
>  	} else if (I915_NEED_GFX_HWS(dev)) {
>  		ret = init_status_page(ring);
>  		if (ret)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index daf91de..f3ae547 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -178,6 +178,7 @@ struct intel_engine {
>  
>  	spinlock_t execlist_lock;
>  	struct list_head execlist_queue;
> +	u8 next_context_status_buffer;
>  
>  	struct i915_hw_context *default_context;
>  	struct i915_hw_context *last_context;
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler
  2014-06-11 11:50   ` Daniel Vetter
@ 2014-06-11 12:01     ` Mateo Lozano, Oscar
  2014-06-11 13:57       ` Daniel Vetter
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-06-11 12:01 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Wednesday, June 11, 2014 12:50 PM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 49/50] drm/i915/bdw: Help out the ctx switch
> interrupt handler
> 
> On Fri, May 09, 2014 at 01:09:19PM +0100, oscar.mateo@intel.com wrote:
> > From: Oscar Mateo <oscar.mateo@intel.com>
> >
> > If we receive a storm of requests for the same context (see
> > gem_storedw_loop_*) we might end up iterating over too many elements
> > in interrupt time, looking for contexts to squash together. Instead,
> > share the burden by giving more intelligence to the queue function. At
> > most, the interrupt will iterate over three elements.
> >
> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_lrc.c | 23 ++++++++++++++++++++---
> >  1 file changed, 20 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> > b/drivers/gpu/drm/i915/intel_lrc.c
> > index d9edd10..0aad721 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -410,9 +410,11 @@ int gen8_switch_context_queue(struct
> intel_engine *ring,
> >  			      struct i915_hw_context *to,
> >  			      u32 tail)
> >  {
> > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >  	struct drm_i915_gem_request *req = NULL;
> >  	unsigned long flags;
> > -	bool was_empty;
> > +	struct drm_i915_gem_request *cursor;
> > +	int num_elements = 0;
> >
> >  	req = kzalloc(sizeof(*req), GFP_KERNEL);
> >  	if (req == NULL)
> > @@ -425,9 +427,24 @@ int gen8_switch_context_queue(struct
> intel_engine
> > *ring,
> >
> >  	spin_lock_irqsave(&ring->execlist_lock, flags);
> >
> > -	was_empty = list_empty(&ring->execlist_queue);
> > +	list_for_each_entry(cursor, &ring->execlist_queue, execlist_link)
> > +		if (++num_elements > 2)
> > +			break;
> > +
> > +	if (num_elements > 2) {
> > +		struct drm_i915_gem_request *tail_req =
> > +				list_last_entry(&ring->execlist_queue,
> > +					struct drm_i915_gem_request,
> execlist_link);
> > +		if (to == tail_req->ctx) {
> > +			WARN(tail_req->elsp_submitted != 0,
> > +					"More than 2 already-submitted reqs
> queued\n");
> > +			list_del(&tail_req->execlist_link);
> > +			queue_work(dev_priv->wq, &tail_req->work);
> > +		}
> > +	}
> 
> Completely forgot to mention this: Chris & I discussed this on irc and I
> guess this issue will disappear if we track contexts instead of requests in the
> scheduler. I guess this is an artifact of the gen7 scheduler you've based this
> on, but even for that I think scheduling contexts (with preempt point after
> each batch) is the right approach. But I haven't dug out the scheduler patches
> again, so I might be wrong about that.
> -Daniel

Hmmmm... I didn't really base this on the scheduler. Some kind of queue to hold context submissions until the hardware was ready was needed, and queuing drm_i915_gem_requests seemed like a good choice at the time (by the way, in the next version I am using a new struct intel_ctx_submit_request, since I don't need most of the fields in drm_i915_gem_requests, and I have to add a couple of new ones anyway).

What do you mean by "scheduling contexts"? Notice that the requests I am queuing basically just contain the context and the tail at the point it was submitted for execution...

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 40/50] drm/i915/bdw: Handle context switch events
  2014-06-11 11:52   ` Daniel Vetter
@ 2014-06-11 12:02     ` Mateo Lozano, Oscar
  2014-06-11 15:23       ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-06-11 12:02 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel, Thomas, intel-gfx

> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Wednesday, June 11, 2014 12:52 PM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org; Daniel, Thomas
> Subject: Re: [Intel-gfx] [PATCH 40/50] drm/i915/bdw: Handle context switch
> events
> 
> On Fri, May 09, 2014 at 01:09:10PM +0100, oscar.mateo@intel.com wrote:
> > From: Thomas Daniel <thomas.daniel@intel.com>
> >
> > Handle all context status events in the context status buffer on every
> > context switch interrupt. We only remove work from the execlist queue
> > after a context status buffer reports that it has completed and we
> > only attempt to schedule new contexts on interrupt when a previously
> > submitted context completes (unless no contexts are queued, which
> > means the GPU is free).
> >
> > Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> >
> > v2: Unreferencing the context when we are freeing the request might
> > free the backing bo, which requires the struct_mutex to be grabbed, so
> > defer unreferencing and freeing to a bottom half.
> >
> > v3:
> > - Ack the interrupt immediately, before trying to handle it (fix for
> > missing interrupts by Bob Beckett <robert.beckett@intel.com>).
> 
> This interrupt handling change is interesting since it might explain our irq
> handling woes on gen5+ with the two-level GT interrupt handling scheme.
> Can you please roll this out as a prep patch for all the existing gt interrupt
> sources we handle already for gen5+?
> 
> Thanks, Daniel

Can do.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler
  2014-06-11 12:01     ` Mateo Lozano, Oscar
@ 2014-06-11 13:57       ` Daniel Vetter
  2014-06-11 14:26         ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 94+ messages in thread
From: Daniel Vetter @ 2014-06-11 13:57 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx

On Wed, Jun 11, 2014 at 12:01:42PM +0000, Mateo Lozano, Oscar wrote:
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel
> > Vetter
> > Sent: Wednesday, June 11, 2014 12:50 PM
> > To: Mateo Lozano, Oscar
> > Cc: intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 49/50] drm/i915/bdw: Help out the ctx switch
> > interrupt handler
> > 
> > On Fri, May 09, 2014 at 01:09:19PM +0100, oscar.mateo@intel.com wrote:
> > > From: Oscar Mateo <oscar.mateo@intel.com>
> > >
> > > If we receive a storm of requests for the same context (see
> > > gem_storedw_loop_*) we might end up iterating over too many elements
> > > in interrupt time, looking for contexts to squash together. Instead,
> > > share the burden by giving more intelligence to the queue function. At
> > > most, the interrupt will iterate over three elements.
> > >
> > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/intel_lrc.c | 23 ++++++++++++++++++++---
> > >  1 file changed, 20 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> > > b/drivers/gpu/drm/i915/intel_lrc.c
> > > index d9edd10..0aad721 100644
> > > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > > @@ -410,9 +410,11 @@ int gen8_switch_context_queue(struct
> > intel_engine *ring,
> > >  			      struct i915_hw_context *to,
> > >  			      u32 tail)
> > >  {
> > > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > >  	struct drm_i915_gem_request *req = NULL;
> > >  	unsigned long flags;
> > > -	bool was_empty;
> > > +	struct drm_i915_gem_request *cursor;
> > > +	int num_elements = 0;
> > >
> > >  	req = kzalloc(sizeof(*req), GFP_KERNEL);
> > >  	if (req == NULL)
> > > @@ -425,9 +427,24 @@ int gen8_switch_context_queue(struct
> > intel_engine
> > > *ring,
> > >
> > >  	spin_lock_irqsave(&ring->execlist_lock, flags);
> > >
> > > -	was_empty = list_empty(&ring->execlist_queue);
> > > +	list_for_each_entry(cursor, &ring->execlist_queue, execlist_link)
> > > +		if (++num_elements > 2)
> > > +			break;
> > > +
> > > +	if (num_elements > 2) {
> > > +		struct drm_i915_gem_request *tail_req =
> > > +				list_last_entry(&ring->execlist_queue,
> > > +					struct drm_i915_gem_request,
> > execlist_link);
> > > +		if (to == tail_req->ctx) {
> > > +			WARN(tail_req->elsp_submitted != 0,
> > > +					"More than 2 already-submitted reqs
> > queued\n");
> > > +			list_del(&tail_req->execlist_link);
> > > +			queue_work(dev_priv->wq, &tail_req->work);
> > > +		}
> > > +	}
> > 
> > Completely forgot to mention this: Chris & I discussed this on irc and I
> > guess this issue will disappear if we track contexts instead of requests in the
> > scheduler. I guess this is an artifact of the gen7 scheduler you've based this
> > on, but even for that I think scheduling contexts (with preempt point after
> > each batch) is the right approach. But I haven't dug out the scheduler patches
> > again, so I might be wrong about that.
> > -Daniel
> 
> Hmmmm... I didn't really base this on the scheduler. Some kind of queue
> to hold context submissions until the hardware was ready was needed, and
> queuing drm_i915_gem_requests seemed like a good choice at the time (by
> the way, in the next version I am using a new struct
> intel_ctx_submit_request, since I don't need most of the fields in
> drm_i915_gem_requests, and I have to add a couple of new ones anyway).
> 
> What do you mean by "scheduling contexts"? Notice that the requests I am
> queuing basically just contain the context and the tail at the point it
> was submitted for execution...

Well, I'd thought we could just throw contexts at the hardware and throw
new ones at it when the old ones get stuck/are completed. But now I've
realized that since we do the cross-engine/ctx dependency tracking in
software it's not quite that easy and we can't unconditionally update the
tail-pointer.

Still, for the degenerate case of one ctx submitting batches exclusively,
I'd hoped that just updating the tail pointer in the context and telling the
hw to reload the current context would be enough. Or at least I'd hoped
so, and that should take (mostly) care of the insane request
overload case your patch here addresses.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler
  2014-06-11 13:57       ` Daniel Vetter
@ 2014-06-11 14:26         ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-06-11 14:26 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx



> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Wednesday, June 11, 2014 2:58 PM
> To: Mateo Lozano, Oscar
> Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 49/50] drm/i915/bdw: Help out the ctx switch
> interrupt handler
> 
> On Wed, Jun 11, 2014 at 12:01:42PM +0000, Mateo Lozano, Oscar wrote:
> > > -----Original Message-----
> > > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of
> > > Daniel Vetter
> > > Sent: Wednesday, June 11, 2014 12:50 PM
> > > To: Mateo Lozano, Oscar
> > > Cc: intel-gfx@lists.freedesktop.org
> > > Subject: Re: [Intel-gfx] [PATCH 49/50] drm/i915/bdw: Help out the
> > > ctx switch interrupt handler
> > >
> > > On Fri, May 09, 2014 at 01:09:19PM +0100, oscar.mateo@intel.com
> wrote:
> > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > >
> > > > If we receive a storm of requests for the same context (see
> > > > gem_storedw_loop_*) we might end up iterating over too many
> > > > elements in interrupt time, looking for contexts to squash
> > > > together. Instead, share the burden by giving more intelligence to
> > > > the queue function. At most, the interrupt will iterate over three
> elements.
> > > >
> > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_lrc.c | 23 ++++++++++++++++++++---
> > > >  1 file changed, 20 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> > > > b/drivers/gpu/drm/i915/intel_lrc.c
> > > > index d9edd10..0aad721 100644
> > > > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > > > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > > > @@ -410,9 +410,11 @@ int gen8_switch_context_queue(struct
> > > intel_engine *ring,
> > > >  			      struct i915_hw_context *to,
> > > >  			      u32 tail)
> > > >  {
> > > > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > > >  	struct drm_i915_gem_request *req = NULL;
> > > >  	unsigned long flags;
> > > > -	bool was_empty;
> > > > +	struct drm_i915_gem_request *cursor;
> > > > +	int num_elements = 0;
> > > >
> > > >  	req = kzalloc(sizeof(*req), GFP_KERNEL);
> > > >  	if (req == NULL)
> > > > @@ -425,9 +427,24 @@ int gen8_switch_context_queue(struct
> > > intel_engine
> > > > *ring,
> > > >
> > > >  	spin_lock_irqsave(&ring->execlist_lock, flags);
> > > >
> > > > -	was_empty = list_empty(&ring->execlist_queue);
> > > > +	list_for_each_entry(cursor, &ring->execlist_queue, execlist_link)
> > > > +		if (++num_elements > 2)
> > > > +			break;
> > > > +
> > > > +	if (num_elements > 2) {
> > > > +		struct drm_i915_gem_request *tail_req =
> > > > +				list_last_entry(&ring->execlist_queue,
> > > > +					struct drm_i915_gem_request,
> > > execlist_link);
> > > > +		if (to == tail_req->ctx) {
> > > > +			WARN(tail_req->elsp_submitted != 0,
> > > > +					"More than 2 already-submitted reqs
> > > queued\n");
> > > > +			list_del(&tail_req->execlist_link);
> > > > +			queue_work(dev_priv->wq, &tail_req->work);
> > > > +		}
> > > > +	}
> > >
> > > Completely forgot to mention this: Chris & I discussed this on irc
> > > and I guess this issue will disappear if we track contexts instead
> > > of requests in the scheduler. I guess this is an artifact of the
> > > gen7 scheduler you've based this on, but even for that I think
> > > scheduling contexts (with preempt point after each batch) is the
> > > right approach. But I haven't dug out the scheduler patches again, so
> > > I might be wrong about that.
> > > -Daniel
> >
> > Hmmmm... I didn't really base this on the scheduler. Some kind of
> > queue to hold context submissions until the hardware was ready was
> > needed, and queuing drm_i915_gem_requests seemed like a good choice at
> > the time (by the way, in the next version I am using a new struct
> > intel_ctx_submit_request, since I don´t need most of the fields in
> > drm_i915_gem_requests, and I have to add a couple of new ones anyway).
> >
> > What do you mean by "scheduling contexts"? Notice that the requests I
> > am queuing basically just contain the context and the tail at the
> > point it was submitted for execution...
> 
> Well, I thought we could just throw contexts at the hardware and throw
> new ones at it when the old ones get stuck/are completed. But now I've
> realized that since we do the cross-engine/ctx dependency tracking in software,
> it's not quite that easy and we can't unconditionally update the tail pointer.

Exactly: unconditionally updating the tail pointer also means the seqnos might get executed out of order, which is not nice (at least until there is a scheduler keeping track of the dependencies).
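
As an aside on that intel_ctx_submit_request: a minimal sketch of what it
might look like is below. Only ctx and tail are confirmed in this thread;
the list link, the elsp_submitted count and the work item are assumptions
carried over from the patch quoted above.

/* Sketch only: fields beyond ctx and tail are assumptions */
struct intel_ctx_submit_request {
	struct i915_hw_context *ctx;	/* context to be executed */
	u32 tail;			/* ringbuffer tail when it was submitted */

	struct list_head execlist_link;	/* linkage in ring->execlist_queue */
	int elsp_submitted;		/* times this has been written to ELSP */
	struct work_struct work;	/* deferred freeing outside irq time */
};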

> Still, for the degenerate case of one ctx submitting batches exclusively, I'd
> hoped that just updating the tail pointer in the context and telling the hw to
> reload the current context would be enough. That should take (mostly) care of
> the insane request overload case your patch here addresses.

What we had before this patch:

A) The submitter places one request per new batch received in the queue, regardless of what is already queued.
B) The interrupt handler traverses the queue backwards, squashing together all requests for the same context that come in a row. Notice that the first one in the queue might already be in execution, in which case the squashing ends up doing exactly what you said: updating the tail pointer.
C) When the ELSP is written to, if the same context was already in execution, the GPU performs a Lite Restore (the new tail is sampled and execution continues).

After this patch:

A) The submitter places one request per new batch received in the queue. But, if the last request in the queue belongs to the same context and is not suspected of being in execution (which means it occupies the 3rd position or higher), it squashes the two requests (the old and the new) together.
B) The interrupt handler traverses the queue backwards, squashing together all requests for the same context that come in a row (maximum of two: one in execution, and the pending one).
C) When the ELSP is written to, if the same context was already in execution, the GPU performs a Lite Restore (the new tail is sampled and execution continues).

In graphical form, if A and B are contexts and (*) means they are in execution:

B3 -> B2 B1* A1* (the submitter wants to queue B with tail 3)
B3 B1* A1* (as B2 is redundant, the submitter drops it and queues B3 in place)
B3 B1* (the interrupt informs us that A is complete, so we remove it from the head of the queue)
-> B3 (since B1 is redundant, we submit B3, causing a Lite Restore)

(* context in execution)
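
For illustration, a minimal sketch of the interrupt-side squashing in (B) is
below, reusing the struct fields from the patch quoted above. This is only a
sketch: submit_to_elsp() is a hypothetical helper, and the deferred freeing
via dev_priv->wq just follows the pattern in the diff.

/* Sketch only: called from the ctx switch interrupt with
 * ring->execlist_lock held. Consecutive requests for the same context at
 * the head of the queue are squashed so that only the newest tail gets
 * (re)submitted. */
static void ctx_switch_unqueue_sketch(struct intel_engine *ring)
{
	struct drm_i915_private *dev_priv = ring->dev->dev_private;
	struct drm_i915_gem_request *req = NULL, *cursor, *tmp;

	list_for_each_entry_safe(cursor, tmp, &ring->execlist_queue,
				 execlist_link) {
		if (req == NULL) {
			req = cursor;
		} else if (req->ctx == cursor->ctx) {
			/* Same context in a row: drop the older request and
			 * keep only the newest tail. If the context is
			 * already executing, the ELSP write below becomes a
			 * Lite Restore to the new tail. */
			list_del(&req->execlist_link);
			queue_work(dev_priv->wq, &req->work);
			req = cursor;
		} else {
			break;	/* a different context follows: stop here */
		}
	}

	if (req)
		submit_to_elsp(ring, req);	/* hypothetical helper */
}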

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 40/50] drm/i915/bdw: Handle context switch events
  2014-06-11 12:02     ` Mateo Lozano, Oscar
@ 2014-06-11 15:23       ` Mateo Lozano, Oscar
  2014-06-12  6:53         ` Daniel Vetter
  0 siblings, 1 reply; 94+ messages in thread
From: Mateo Lozano, Oscar @ 2014-06-11 15:23 UTC (permalink / raw)
  To: Mateo Lozano, Oscar, Daniel Vetter; +Cc: Daniel, Thomas, intel-gfx

> > > - Ack the interrupt immediately, before trying to handle it (fix for
> > > missing interrupts by Bob Beckett <robert.beckett@intel.com>).
> >
> > This interrupt handling change is interesting since it might explain
> > our irq handling woes on gen5+ with the two-level GT interrupt handling
> scheme.
> > Can you please roll this out as a prep patch for all the existing gt
> > interrupt sources we handle already for gen5+?
> >
> > Thanks, Daniel
> 
> Can do.

One question, though: why only the GT interrupts? What about DE, PM, etc.?

It looks like the BSpec is pretty clear on this (a sketch of the ordering follows the list):

1 - Disable Master Interrupt Control
2 - Find the category of interrupt that is pending
3 - Find the source(s) of the interrupt and ***clear the Interrupt Identity bits (IIR)***
4 - Process the interrupt(s) that had bits set in the IIRs
5 - Re-enable Master Interrupt Control
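
A sketch of that ordering for a single gen8 GT interrupt category might look
like the following; handle_gt_irq() is a placeholder, and a real handler of
course walks all the pending categories before re-enabling the master control:

/* Sketch only: the BSpec ordering above for one interrupt category */
static irqreturn_t irq_handler_sketch(int irq, void *arg)
{
	struct drm_device *dev = arg;
	struct drm_i915_private *dev_priv = dev->dev_private;
	u32 master_ctl, iir;

	/* 1 - Disable Master Interrupt Control */
	master_ctl = I915_READ(GEN8_MASTER_IRQ);
	master_ctl &= ~GEN8_MASTER_IRQ_CONTROL;
	if (!master_ctl)
		return IRQ_NONE;
	I915_WRITE(GEN8_MASTER_IRQ, 0);

	/* 2 - Find the category of interrupt that is pending */
	if (master_ctl & (GEN8_GT_RCS_IRQ | GEN8_GT_BCS_IRQ)) {
		/* 3 - Clear the IIR bits *before* processing, so an event
		 * that fires meanwhile re-raises the bit instead of being
		 * lost */
		iir = I915_READ(GEN8_GT_IIR(0));
		if (iir) {
			I915_WRITE(GEN8_GT_IIR(0), iir);
			/* 4 - Process the interrupt(s) that had bits set */
			handle_gt_irq(dev_priv, iir);
		}
	}

	/* 5 - Re-enable Master Interrupt Control */
	I915_WRITE(GEN8_MASTER_IRQ, GEN8_MASTER_IRQ_CONTROL);
	POSTING_READ(GEN8_MASTER_IRQ);

	return IRQ_HANDLED;
}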

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 40/50] drm/i915/bdw: Handle context switch events
  2014-06-11 15:23       ` Mateo Lozano, Oscar
@ 2014-06-12  6:53         ` Daniel Vetter
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Vetter @ 2014-06-12  6:53 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: Daniel, Thomas, intel-gfx

On Wed, Jun 11, 2014 at 03:23:33PM +0000, Mateo Lozano, Oscar wrote:
> > > > - Ack the interrupt immediately, before trying to handle it (fix for
> > > > missing interrupts by Bob Beckett <robert.beckett@intel.com>).
> > >
> > > This interrupt handling change is interesting since it might explain
> > > our irq handling woes on gen5+ with the two-level GT interrupt handling
> > scheme.
> > > Can you please roll this out as a prep patch for all the existing gt
> > > interrupt sources we handle already for gen5+?
> > >
> > > Thanks, Daniel
> > 
> > Can do.
> 
> One question, though: why only the GT interrupts? What about DE, PM, etc.?
> 
> It looks like the BSpec is pretty clear on this:
> 
> 1 - Disable Master Interrupt Control
> 2 - Find the category of interrupt that is pending
> 3 - Find the source(s) of the interrupt and ***clear the Interrupt Identity bits (IIR)***
> 4 - Process the interrupt(s) that had bits set in the IIRs
> 5 - Re-enable Master Interrupt Control

Yeah, makes sense to do it for all. When I looked at it, the funky part
was the SDE interrupts, where we (at least on pre-bdw) have this crazy
hack. I guess at least that one we should leave in, since apparently it
works.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 94+ messages in thread

end of thread, other threads:[~2014-06-12  6:53 UTC | newest]

Thread overview: 94+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
2014-05-09 12:08 ` [PATCH 01/50] drm/i915: s/for_each_ring/for_each_active_ring oscar.mateo
2014-05-09 12:08 ` [PATCH 02/50] drm/i915: for_each_ring oscar.mateo
2014-05-13 13:25   ` Daniel Vetter
2014-05-19 16:33   ` Volkin, Bradley D
2014-05-19 16:36     ` Mateo Lozano, Oscar
2014-05-09 12:08 ` [PATCH 03/50] drm/i915: Simplify a couple of functions thanks to for_each_ring oscar.mateo
2014-05-09 12:08 ` [PATCH 04/50] drm/i915: Extract trivial parts of ring init (early init) oscar.mateo
2014-05-13 13:26   ` Daniel Vetter
2014-05-13 13:47     ` Chris Wilson
2014-05-14 11:53     ` Mateo Lozano, Oscar
2014-05-14 12:28       ` Daniel Vetter
2014-05-09 12:08 ` [PATCH 05/50] drm/i915: Extract ringbuffer destroy, make destroy & alloc outside accesible oscar.mateo
2014-05-09 12:08 ` [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine oscar.mateo
2014-05-13 13:28   ` Daniel Vetter
2014-05-14 13:26     ` Damien Lespiau
2014-05-15 14:17       ` Mateo Lozano, Oscar
2014-05-15 20:52         ` Daniel Vetter
2014-05-19 10:02           ` Mateo Lozano, Oscar
2014-05-19 12:20             ` Daniel Vetter
2014-05-19 13:41               ` Mateo Lozano, Oscar
2014-05-19 13:52                 ` Daniel Vetter
2014-05-19 14:43                   ` Mateo Lozano, Oscar
2014-05-19 15:11                     ` Daniel Vetter
2014-05-19 15:26                       ` Mateo Lozano, Oscar
2014-05-19 15:49                         ` Daniel Vetter
2014-05-19 16:12                           ` Mateo Lozano, Oscar
2014-05-19 16:24                             ` Volkin, Bradley D
2014-05-19 16:33                               ` Mateo Lozano, Oscar
2014-05-19 16:40                                 ` Volkin, Bradley D
2014-05-19 16:49                                   ` Mateo Lozano, Oscar
2014-05-19 17:00                                     ` Volkin, Bradley D
2014-05-20  8:11                             ` Daniel Vetter
2014-05-09 12:08 ` [PATCH 07/50] drm/i915: Split the ringbuffers and the rings oscar.mateo
2014-05-09 12:08 ` [PATCH 08/50] drm/i915: Rename functions that mention ringbuffers (meaning rings) oscar.mateo
2014-05-09 12:08 ` [PATCH 09/50] drm/i915: Plumb the context everywhere in the execbuffer path oscar.mateo
2014-05-16 11:04   ` Chris Wilson
2014-05-16 11:11     ` Mateo Lozano, Oscar
2014-05-16 11:31       ` Chris Wilson
2014-05-09 12:08 ` [PATCH 10/50] drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit oscar.mateo
2014-05-09 12:08 ` [PATCH 11/50] drm/i915: Write a new set of context-aware ringbuffer management functions oscar.mateo
2014-05-09 12:08 ` [PATCH 12/50] drm/i915: Final touches to ringbuffer and context plumbing and refactoring oscar.mateo
2014-05-09 12:08 ` [PATCH 13/50] drm/i915: s/write_tail/submit oscar.mateo
2014-05-09 12:08 ` [PATCH 14/50] drm/i915: Introduce one context backing object per engine oscar.mateo
2014-05-09 12:08 ` [PATCH 15/50] drm/i915: Make i915_gem_create_context outside accessible oscar.mateo
2014-05-09 12:08 ` [PATCH 16/50] drm/i915: Option to skip backing object allocation during context creation oscar.mateo
2014-05-09 12:08 ` [PATCH 17/50] drm/i915: Extract context backing object allocation oscar.mateo
2014-05-09 12:08 ` [PATCH 18/50] drm/i915/bdw: Macro and module parameter for LRCs (Logical Ring Contexts) oscar.mateo
2014-05-09 12:08 ` [PATCH 19/50] drm/i915/bdw: New file for Logical Ring Contexts and Execlists oscar.mateo
2014-05-09 12:08 ` [PATCH 20/50] drm/i915/bdw: Rework init code for Logical Ring Contexts oscar.mateo
2014-05-09 12:08 ` [PATCH 21/50] drm/i915/bdw: A bit more advanced context init/fini oscar.mateo
2014-05-09 12:08 ` [PATCH 22/50] drm/i915/bdw: Allocate ringbuffer backing objects for default global LRC oscar.mateo
2014-05-09 12:08 ` [PATCH 23/50] drm/i915/bdw: Allocate ringbuffer for user-created LRCs oscar.mateo
2014-05-09 12:08 ` [PATCH 24/50] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
2014-05-09 13:36   ` Damien Lespiau
2014-05-12 17:00   ` [PATCH v2 " oscar.mateo
2014-05-09 12:08 ` [PATCH 25/50] drm/i915/bdw: Deferred creation of user-created LRCs oscar.mateo
2014-05-09 12:08 ` [PATCH 26/50] drm/i915/bdw: Allow non-default, non-render, " oscar.mateo
2014-05-13 13:35   ` Daniel Vetter
2014-05-14 11:38     ` Mateo Lozano, Oscar
2014-05-09 12:08 ` [PATCH 27/50] drm/i915/bdw: Status page for LR contexts oscar.mateo
2014-05-09 12:08 ` [PATCH 28/50] drm/i915/bdw: Enable execlists in the hardware oscar.mateo
2014-05-09 12:08 ` [PATCH 29/50] drm/i915/bdw: Execlists ring tail writing oscar.mateo
2014-05-09 12:09 ` [PATCH 30/50] drm/i915/bdw: LR context ring init oscar.mateo
2014-05-09 12:09 ` [PATCH 31/50] drm/i915/bdw: Set the request context information correctly in the LRC case oscar.mateo
2014-05-09 12:09 ` [PATCH 32/50] drm/i915/bdw: GEN8 new ring flush oscar.mateo
2014-05-09 12:09 ` [PATCH 33/50] drm/i915/bdw: Always write seqno to default context oscar.mateo
2014-05-09 12:09 ` [PATCH 34/50] drm/i915/bdw: Implement context switching (somewhat) oscar.mateo
2014-05-09 12:09 ` [PATCH 35/50] drm/i915/bdw: Add forcewake lock around ELSP writes oscar.mateo
2014-05-09 12:09 ` [PATCH 36/50] drm/i915/bdw: Write the tail pointer, LRC style oscar.mateo
2014-05-09 12:09 ` [PATCH 37/50] drm/i915/bdw: Don't write PDP in the legacy way when using LRCs oscar.mateo
2014-05-09 12:09 ` [PATCH 38/50] drm/i915/bdw: LR context switch interrupts oscar.mateo
2014-05-09 12:09 ` [PATCH 39/50] drm/i915/bdw: Get prepared for a two-stage execlist submit process oscar.mateo
2014-05-09 12:09 ` [PATCH 40/50] drm/i915/bdw: Handle context switch events oscar.mateo
2014-06-11 11:52   ` Daniel Vetter
2014-06-11 12:02     ` Mateo Lozano, Oscar
2014-06-11 15:23       ` Mateo Lozano, Oscar
2014-06-12  6:53         ` Daniel Vetter
2014-05-09 12:09 ` [PATCH 41/50] drm/i915/bdw: Start queueing contexts to be submitted oscar.mateo
2014-05-09 12:09 ` [PATCH 42/50] drm/i915/bdw: Display execlists info in debugfs oscar.mateo
2014-05-09 12:09 ` [PATCH 43/50] drm/i915/bdw: Display context backing obj & ringbuffer " oscar.mateo
2014-05-09 12:09 ` [PATCH 44/50] drm/i915/bdw: Print context state " oscar.mateo
2014-05-09 12:09 ` [PATCH 45/50] drm/i915/bdw: Document execlists and logical ring contexts oscar.mateo
2014-05-09 12:09 ` [PATCH 46/50] drm/i915/bdw: Avoid non-lite-restore preemptions oscar.mateo
2014-05-09 12:09 ` [PATCH 47/50] drm/i915/bdw: Make sure gpu reset still works with Execlists oscar.mateo
2014-05-09 12:09 ` [PATCH 48/50] drm/i915/bdw: Make sure error capture keeps working " oscar.mateo
2014-05-09 12:09 ` [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler oscar.mateo
2014-06-11 11:50   ` Daniel Vetter
2014-06-11 12:01     ` Mateo Lozano, Oscar
2014-06-11 13:57       ` Daniel Vetter
2014-06-11 14:26         ` Mateo Lozano, Oscar
2014-05-09 12:09 ` [PATCH 50/50] drm/i915/bdw: Enable logical ring contexts oscar.mateo
2014-05-12 17:04 ` [PATCH 49.1/50] drm/i915/bdw: Do not call intel_runtime_pm_get() in an interrupt oscar.mateo
2014-05-13 13:48 ` [PATCH 00/50] Execlists v2 Daniel Vetter
