* Re: [PATCH 13/49] drm/i915/bdw: Execlists ring tail writing
  2014-03-27 17:59 ` [PATCH 13/49] drm/i915/bdw: Execlists ring tail writing oscar.mateo
@ 2014-03-27 17:13   ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-03-27 17:13 UTC (permalink / raw)
  To: intel-gfx

I already got a review from Brad Volkin on this that I agree with: change the "write_tail" vfunc name to something different, like "submit". If no one disagrees, I'll change it in the next submission.


> -----Original Message-----
> From: Mateo Lozano, Oscar
> Sent: Thursday, March 27, 2014 6:00 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: Mateo Lozano, Oscar
> Subject: [PATCH 13/49] drm/i915/bdw: Execlists ring tail writing
> 
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> The write tail function is a very special place for execlists: since all access to
> the ring is mediated through requests (thanks to Chris Wilson's "Write
> RING_TAIL once per-request" for that) and every request ends with a tail write,
> this is the place we are going to use to submit contexts for execution.
> 
> For the moment, just mark the place (we still need to do a lot of preparation
> before execlists are ready to start submitting things).
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 35e022f..a18dcf7 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -413,6 +413,12 @@ static void ring_write_tail(struct intel_engine *ring,
>  	I915_WRITE_TAIL(ring, value);
>  }
> 
> +static void gen8_write_tail_lrc(struct intel_engine *ring,
> +				u32 value)
> +{
> +	DRM_ERROR("Execlists still not ready!\n");
> +}
> +
>  u32 intel_ring_get_active_head(struct intel_engine *ring)
>  {
>  	drm_i915_private_t *dev_priv = ring->dev->dev_private;
> @@ -1907,12 +1913,15 @@ int intel_init_render_ring(struct drm_device *dev)
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	struct intel_engine *ring = &dev_priv->ring[RCS];
> 
> +	ring->write_tail = ring_write_tail;
>  	if (INTEL_INFO(dev)->gen >= 6) {
>  		ring->add_request = gen6_add_request;
>  		ring->flush = gen7_render_ring_flush;
>  		if (INTEL_INFO(dev)->gen == 6)
>  			ring->flush = gen6_render_ring_flush;
>  		if (INTEL_INFO(dev)->gen >= 8) {
> +			if (dev_priv->lrc_enabled)
> +				ring->write_tail = gen8_write_tail_lrc;
>  			ring->flush = gen8_render_ring_flush;
>  			ring->irq_get = gen8_ring_get_irq;
>  			ring->irq_put = gen8_ring_put_irq;
> @@ -1958,7 +1967,7 @@ int intel_init_render_ring(struct drm_device *dev)
>  		}
>  		ring->irq_enable_mask = I915_USER_INTERRUPT;
>  	}
> -	ring->write_tail = ring_write_tail;
> +
>  	if (IS_HASWELL(dev))
>  		ring->dispatch_execbuffer = hsw_ring_dispatch_execbuffer;
>  	else if (IS_GEN8(dev))
> @@ -2079,6 +2088,8 @@ int intel_init_bsd_ring(struct drm_device *dev)
>  		ring->get_seqno = gen6_ring_get_seqno;
>  		ring->set_seqno = ring_set_seqno;
>  		if (INTEL_INFO(dev)->gen >= 8) {
> +			if (dev_priv->lrc_enabled)
> +				ring->write_tail = gen8_write_tail_lrc;
>  			ring->irq_enable_mask =
>  				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
>  			ring->irq_get = gen8_ring_get_irq;
> @@ -2133,6 +2144,8 @@ int intel_init_blt_ring(struct drm_device *dev)
>  	ring->get_seqno = gen6_ring_get_seqno;
>  	ring->set_seqno = ring_set_seqno;
>  	if (INTEL_INFO(dev)->gen >= 8) {
> +		if (dev_priv->lrc_enabled)
> +			ring->write_tail = gen8_write_tail_lrc;
>  		ring->irq_enable_mask =
>  			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
>  		ring->irq_get = gen8_ring_get_irq;
> @@ -2170,6 +2183,8 @@ int intel_init_vebox_ring(struct drm_device *dev)
>  	ring->set_seqno = ring_set_seqno;
> 
>  	if (INTEL_INFO(dev)->gen >= 8) {
> +		if (dev_priv->lrc_enabled)
> +			ring->write_tail = gen8_write_tail_lrc;
>  		ring->irq_enable_mask =
>  			GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
>  		ring->irq_get = gen8_ring_get_irq;
> --
> 1.9.0


* Re: [PATCH 31/49] drm/i915/bdw: Introduce dependent contexts
  2014-03-27 18:00 ` [PATCH 31/49] drm/i915/bdw: Introduce dependent contexts oscar.mateo
@ 2014-03-27 17:21   ` Mateo Lozano, Oscar
  2014-04-09 16:54     ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-03-27 17:21 UTC (permalink / raw)
  To: intel-gfx

I already got a fair review comment from Brad Volkin on this: he proposes to do this instead:

	struct i915_hw_context {
		struct i915_address_space *vm;
		struct {
			struct drm_i915_gem_object *ctx_obj;
			struct intel_ringbuffer *ringbuf;
		} engine[I915_MAX_RINGS];
		...
	};

That is: instead of creating extra contexts with the same context ID, modify the current i915_hw_context to work with all engines. I agree this alternative looks less *hackish*, but I want to get more eyes on it (several things need careful consideration if we do this, e.g.: should the hang_stats also be per-engine?)
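
To illustrate that open question, here is a hypothetical variant of Brad's layout with the hang stats moved into the per-engine block. This is a strawman of mine, not something from the series; i915_ctx_hang_stats is the existing per-context hang stats struct:

	/* Strawman only: hang_stats moved per-engine. */
	struct i915_hw_context {
		struct i915_address_space *vm;
		struct {
			struct drm_i915_gem_object *ctx_obj;
			struct intel_ringbuffer *ringbuf;
			struct i915_ctx_hang_stats hang_stats;	/* now per-engine */
		} engine[I915_MAX_RINGS];
		...
	};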

> -----Original Message-----
> From: Mateo Lozano, Oscar
> Sent: Thursday, March 27, 2014 6:00 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: Mateo Lozano, Oscar
> Subject: [PATCH 31/49] drm/i915/bdw: Introduce dependent contexts
> 
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> From here on, we define a stand-alone context as the first context with a given
> ID to be created for a new fd or a new context create ioctl. This is the one we
> can easily find using integer ID management. On the other hand, dependent
> contexts are subsequently created with the same ID and simply hang from the
> stand-alone one.
> 
> This patch, together with the two previous ones and the next, is meant to solve a
> big problem we have: with execlists, we need contexts to work with all engines,
> and we cannot reuse one context for more than one engine.
> 
> Because, on a new fd or a context create ioctl, we really don't know which
> engine is going to be used later on, we create a "blank" context at that
> point and assign it to an engine in a deferred way (during the execbuffer,
> to be precise). If we later execbuffer on a different engine, we create a
> new dependent context hanging from the previous one.
> 
> Note: I have tried to colour this patch in a different way, using a different struct
> (a "context group") to hold the context ID from where the per-engine contexts
> hang, but it makes legacy contexts unnecessarily complex.
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  6 +++++-
>  drivers/gpu/drm/i915/i915_gem_context.c | 17 +++++++++++++--
>  drivers/gpu/drm/i915/i915_lrc.c         | 37 ++++++++++++++++++++++++++++++---
>  3 files changed, 54 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 91b0886..d9470a4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -602,6 +602,9 @@ struct i915_hw_context {
>  	struct i915_address_space *vm;
> 
>  	struct list_head link;
> +
> +	/* Advanced contexts only */
> +	struct list_head dependent_contexts;
>  };
> 
>  struct i915_fbc {
> @@ -2321,7 +2324,8 @@ int gen8_gem_context_init(struct drm_device *dev);
>  void gen8_gem_context_fini(struct drm_device *dev);
>  struct i915_hw_context *gen8_gem_create_context(struct drm_device *dev,
>  			struct intel_engine *ring,
> -			struct drm_i915_file_private *file_priv, bool create_vm);
> +			struct drm_i915_file_private *file_priv,
> +			struct i915_hw_context *standalone_ctx, bool create_vm);
>  void gen8_gem_context_free(struct i915_hw_context *ctx);
> 
>  /* i915_gem_evict.c */
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 6baa5ab..17015b2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -271,6 +271,8 @@ __create_hw_context(struct drm_device *dev,
>  	 * is no remap info, it will be a NOP. */
>  	ctx->remap_slice = (1 << NUM_L3_SLICES(dev)) - 1;
> 
> +	INIT_LIST_HEAD(&ctx->dependent_contexts);
> +
>  	return ctx;
> 
>  err_out:
> @@ -511,6 +513,12 @@ int i915_gem_context_enable(struct drm_i915_private *dev_priv)
>  static int context_idr_cleanup(int id, void *p, void *data)
>  {
>  	struct i915_hw_context *ctx = p;
> +	struct i915_hw_context *cursor, *tmp;
> +
> +	list_for_each_entry_safe(cursor, tmp, &ctx->dependent_contexts, dependent_contexts) {
> +		list_del(&cursor->dependent_contexts);
> +		i915_gem_context_unreference(cursor);
> +	}
> 
>  	/* Ignore the default context because close will handle it */
>  	if (i915_gem_context_is_default(ctx))
> @@ -543,7 +551,7 @@ int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
>  	if (dev_priv->lrc_enabled)
>  		file_priv->private_default_ctx = gen8_gem_create_context(dev,
>  						&dev_priv->ring[RCS], file_priv,
> -						USES_FULL_PPGTT(dev));
> +						NULL, USES_FULL_PPGTT(dev));
>  	else
>  		file_priv->private_default_ctx = i915_gem_create_context(dev,
>  						file_priv, USES_FULL_PPGTT(dev));
> @@ -805,7 +813,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> 
>  	if (dev_priv->lrc_enabled)
>  		ctx = gen8_gem_create_context(dev, &dev_priv->ring[RCS],
> -					file_priv, USES_FULL_PPGTT(dev));
> +					file_priv, NULL, USES_FULL_PPGTT(dev));
>  	else
>  		ctx = i915_gem_create_context(dev, file_priv,
>  					USES_FULL_PPGTT(dev));
> @@ -825,6 +833,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device
> *dev, void *data,
>  	struct drm_i915_gem_context_destroy *args = data;
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
>  	struct i915_hw_context *ctx;
> +	struct i915_hw_context *cursor, *tmp;
>  	int ret;
> 
>  	if (args->ctx_id == DEFAULT_CONTEXT_ID)
> @@ -841,6 +850,10 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>  	}
> 
>  	idr_remove(&ctx->file_priv->context_idr, ctx->id);
> +	list_for_each_entry_safe(cursor, tmp, &ctx->dependent_contexts, dependent_contexts) {
> +		list_del(&cursor->dependent_contexts);
> +		i915_gem_context_unreference(cursor);
> +	}
>  	i915_gem_context_unreference(ctx);
>  	mutex_unlock(&dev->struct_mutex);
> 
> diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
> index 124e5f2..99011cc 100644
> --- a/drivers/gpu/drm/i915/i915_lrc.c
> +++ b/drivers/gpu/drm/i915/i915_lrc.c
> @@ -195,23 +195,54 @@ intel_populate_lrc(struct i915_hw_context *ctx,
>  	return 0;
>  }
> 
> +static void assert_on_ppgtt_release(struct kref *kref)
> +{
> +	WARN(1, "Are we trying to free the aliasing PPGTT?\n");
> +}
> +
>  struct i915_hw_context *
>  gen8_gem_create_context(struct drm_device *dev,
>  			struct intel_engine *ring,
>  			struct drm_i915_file_private *file_priv,
> +			struct i915_hw_context *standalone_ctx,
>  			bool create_vm)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct i915_hw_context *ctx = NULL;
>  	struct drm_i915_gem_object *ring_obj = NULL;
>  	struct intel_ringbuffer *ringbuf = NULL;
> +	bool is_dependent;
>  	int ret;
> 
> -	ctx = i915_gem_create_context(dev, file_priv, create_vm);
> +	/* NB: a standalone context is the first context with a given id to be
> +	 * created for a new fd. Dependent contexts simply hang from the stand-alone,
> +	 * sharing their ID and their PPGTT */
> +	is_dependent = (file_priv != NULL) && (standalone_ctx != NULL);
> +
> +	ctx = i915_gem_create_context(dev, is_dependent? NULL : file_priv,
> +					is_dependent? false : create_vm);
>  	if (IS_ERR_OR_NULL(ctx))
>  		return ctx;
> 
> -	if (file_priv) {
> +	if (is_dependent) {
> +		struct i915_hw_ppgtt *ppgtt;
> +
> +		/* We take the same PPGTT as the standalone */
> +		ppgtt = ctx_to_ppgtt(ctx);
> +		kref_put(&ppgtt->ref, assert_on_ppgtt_release);
> +		ppgtt = ctx_to_ppgtt(standalone_ctx);
> +		ctx->vm = &ppgtt->base;
> +		kref_get(&ppgtt->ref);
> +
> +		ctx->file_priv = file_priv;
> +		ctx->id = standalone_ctx->id;
> +		ctx->remap_slice = standalone_ctx->remap_slice;
> +
> +		list_add_tail(&ctx->dependent_contexts,
> +				&standalone_ctx->dependent_contexts);
> +	}
> +
> +	if (file_priv && !is_dependent) {
>  		ret = i915_gem_obj_ggtt_pin(ctx->obj, GEN8_CONTEXT_ALIGN, 0);
>  		if (ret) {
>  			DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
> @@ -337,7 +368,7 @@ int gen8_gem_context_init(struct drm_device *dev)
> 
>  	for_each_ring(ring, dev_priv, ring_id) {
>  		ring->default_context = gen8_gem_create_context(dev, ring,
> -						NULL, (ring_id == RCS));
> +					NULL, NULL, (ring_id == RCS));
>  		if (IS_ERR_OR_NULL(ring->default_context)) {
>  			ret = PTR_ERR(ring->default_context);
>  			DRM_DEBUG_DRIVER("Create ctx failed: %d\n", ret);
> --
> 1.9.0


* [PATCH 00/49] Execlists
@ 2014-03-27 17:59 oscar.mateo
  2014-03-27 17:59 ` [PATCH 01/49] drm/i915/bdw: Macro to distinguish LRCs (Logical Ring Contexts) oscar.mateo
                   ` (49 more replies)
  0 siblings, 50 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Hi all,

This patch series implements execlists for GEN8+. Before continuing, it is important to mention that, while I have taken it upon myself to assemble the series and rewrite it for upstreaming, many people have worked on it before me. Namely:

Ben Widawsky (benjamin.widawsky@intel.com).
Jesse Barnes (jbarnes@virtuousgeek.org).
Michel Thierry (michel.thierry@intel.com).
Thomas Daniel (thomas.daniel@intel.com).
Rafael Barbalho (rafael.barbalho@intel.com).

All good ideas in the series belong to these authors, and so I have tried to maintain authorship in the patches accordingly (to the extent possible, since the patches have suffered a lot of squashing & splitting). These authors do not, however, bear any of the blame for errors: I am solely responsible for them. 

Now, let's get back to the subject at hand:

With GEN8 comes an expansion of the HW contexts: "Logical Ring Contexts". One of the main differences from the legacy HW contexts is that logical ring contexts incorporate many more things into the context's state, like PDPs or ringbuffer control registers. These logical ring contexts enable a number of new abilities, especially "Execlists". Execlists are the new method by which, on GEN8+ hardware, workloads are submitted for execution (as opposed to the legacy, ringbuffer-based method). With this new method, commands in the context's ringbuffer are executed when the GPU switches to that context from a previous one (a.k.a. a context switch).

On a context switch, the GPU has to remember the current state of the context being switched out, including the head and tail pointers of the ring buffer, so it:

- Flushes the pipe.
- Saves ringbuffer head pointer.
- Saves engine state.

Similarly, on a context restore (when a previously switched-out context is resubmitted), the GPU restores the saved context and resumes execution where it stopped:

- Restores PDPs and sets-up PPGTT.
- Restores ringbuffer.
- Restores engine state.

Contexts are submitted for execution through the GPU's ExecLists Submit Port (ELSP, for short). This port supports the submission of two contexts at a time, which are executed serially (Context-0 first, then Context-1 upon its completion). The GPU keeps the software informed about the status of this list via context switch interrupts and context status buffers, to help software keep track of the progress. The existence of a second context ensures that some useful work is done in HW while the Context-0 switch status is being processed by SW. After Context-1 completes, the HW goes idle if no further contexts are scheduled in the ELSP.

Submitting a new execution list to the ELSP while one of its contexts is already running results in a Lite Restore (a sampling of the new tail pointer).
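
To make the two-slot mechanism concrete, here is a minimal sketch of an ELSP submission. The helper name and the descriptor layout are assumptions (the real descriptor packs the submission ID and the context's state-page address); only the RING_ELSP register itself comes from the series:

	/* Illustrative sketch only: submit up to two contexts to the ELSP. */
	static void elsp_submit_pair(struct intel_engine *ring,
				     u64 desc0, u64 desc1)
	{
		struct drm_i915_private *dev_priv = ring->dev->dev_private;

		/* Context-1 (possibly a null descriptor) is written first... */
		I915_WRITE(RING_ELSP(ring), upper_32_bits(desc1));
		I915_WRITE(RING_ELSP(ring), lower_32_bits(desc1));
		/* ...and writing Context-0's lower dword triggers the actual
		 * submission of the execution list. */
		I915_WRITE(RING_ELSP(ring), upper_32_bits(desc0));
		I915_WRITE(RING_ELSP(ring), lower_32_bits(desc0));
	}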

Regarding the creation of logical ring contexts, we previously had (since PPGTT was introduced):

- One global default context.
- One private default context for each opened fd.
- One extra private context for each context create ioctl call.

The global default context existed for future shrinker usage as well as for reset handling. At the same time, every file got its own context, plus any number of extra contexts if the context create ioctl was used by the userspace driver. These private contexts were the ones used by the driver for execbuffer calls.

Now that ringbuffers belong to contexts (and not to engines, as before) and contexts are uniquely tied to a given engine (and are not reusable, as before), we need:

- No. of engines global default contexts.
- Up to no. of engines private default contexts for each opened fd.
- Up to no. of engines extra private contexts for each context create ioctl call.

Given that at creation time of a non-global context we don't know which engine is going to use it, we have implemented a deferred creation of logical ring contexts: the private default context starts its life as a hollow or blank holder that gets populated once we receive an execbuffer ioctl (for a particular engine) on that fd. If we later receive another execbuffer ioctl for a different engine, we create a second private default context, and so on. The same rules apply to the context create ioctl call.
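
A hedged sketch of that deferral, as seen from the execbuffer path. Only gen8_gem_create_context() and intel_populate_lrc() appear in the patches; the wrapper name, the ctx->ring field and the exact populate call are invented here for illustration:

	static struct i915_hw_context *
	lrc_get_for_engine(struct drm_device *dev,
			   struct drm_i915_file_private *file_priv,
			   struct i915_hw_context *ctx,
			   struct intel_engine *ring)
	{
		/* Still a hollow holder? Populate it for this engine now. */
		if (!ctx->obj)
			return intel_populate_lrc(ctx, ring) ? NULL : ctx;

		/* Already tied to the engine being used: nothing to do. */
		if (ctx->ring == ring)
			return ctx;

		/* Execbuffer on a different engine: create a dependent
		 * context hanging from the stand-alone one. */
		return gen8_gem_create_context(dev, ring, file_priv,
					       ctx, false);
	}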

Execlists have been implemented as follows:

When a request is committed, its commands (the BB start and any leading or trailing commands, like the seqno breadcrumbs) are placed in the ringbuffer for the appropriate context. The tail pointer in the hardware context is not updated at this time, but instead, kept by the driver in the ringbuffer structure. A structure representing this execution request is added to a request queue for the appropriate engine: this structure contains a copy of the context's tail after the request was written to the ringbuffer and a pointer to the context itself.

If the engine's request queue was empty before the request was added, the queue is processed immediately. Otherwise the queue will be processed during a context switch interrupt. In any case, elements on the queue will get sent (in pairs) to the ELSP with a globally unique 20-bit submission ID (constructed from the fd's ID, plus our own context ID, plus the engine's ID).
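
For concreteness, a sketch of the queue element and ID packing this paragraph describes. The struct, its field names and the bit split of the 20-bit ID are all assumptions, not the series' actual definitions:

	/* Illustrative sketch of a per-engine submission-queue element. */
	struct intel_ctx_submit_request {
		struct i915_hw_context *ctx;	/* the context to execute */
		u32 tail;		/* ringbuffer tail after this request */
		u32 submission_id;	/* globally unique 20-bit ID */
		struct list_head execlist_link;	/* node in the engine's queue */
	};

	/* One possible 20-bit packing: 12 bits of fd ID, 5 bits of context
	 * ID, 3 bits of engine ID (the widths are an assumption). */
	#define LRC_SUBMISSION_ID(fd_id, ctx_id, ring_id) \
		((((fd_id) & 0xfff) << 8) | (((ctx_id) & 0x1f) << 3) | ((ring_id) & 0x7))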

When execution of a request completes, the GPU updates the context status buffer with a context complete event and generates a context switch interrupt. During context switch interrupt handling, the driver examines the context status events in the context status buffer: for each context complete event, if the announced ID matches that on the head of the request queue, then that request is retired and removed from the queue.

After processing, if any requests were retired and the queue is not empty, then a new execution list can be submitted. The two requests at the front of the queue are next to be submitted but, since a context may not occur twice in an execution list, if subsequent requests have the same ID as the first then the two requests must be combined. This is done simply by discarding requests at the head of the queue until either only one request is left (in which case we use a NULL second context) or the first two requests have unique IDs.
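
A sketch of that coalescing rule (the queue name and the submit helper are assumptions; see the request struct sketched above):

	static void execlists_submit_next_pair(struct intel_engine *ring)
	{
		struct intel_ctx_submit_request *req0 = NULL, *req1 = NULL;
		struct intel_ctx_submit_request *cursor, *tmp;

		list_for_each_entry_safe(cursor, tmp, &ring->execlist_queue,
					 execlist_link) {
			if (req0 == NULL) {
				req0 = cursor;
			} else if (req0->submission_id == cursor->submission_id) {
				/* Same ID: only the newest tail matters, so
				 * drop the older request from the queue. */
				list_del(&req0->execlist_link);
				req0 = cursor;
			} else {
				/* First request with a distinct ID: done. */
				req1 = cursor;
				break;
			}
		}

		/* req1 may be NULL, i.e. a single-context execution list. */
		submit_execlist(ring, req0, req1);
	}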

By always executing the first two requests in the queue the driver ensures that the GPU is kept as busy as possible. In the case where a single context completes but a second context is still executing, the request for the second context will be at the head of the queue when we remove the first one. This request will then be resubmitted along with a new request for a different context, which will cause the hardware to continue executing the second request and queue the new request (the GPU detects the condition of a context getting preempted with the same context and optimizes the context switch flow by not doing preemption, but just sampling the new tail pointer).

Because the GPU continues to execute while the context switch interrupt is being handled, there is a race condition where a second context completes while the completion of the previous one is still being handled. This results in the second context being resubmitted (potentially along with a third), and an extra context complete event for that context will occur later. The request will be removed from the queue at the first context complete event; the second context complete event will not remove a request from the queue, because the IDs of the request and the event will not match.
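
A sketch of the ID check that makes this race benign (again with invented names): the stale second event simply no longer matches the queue head and is dropped.

	static void handle_context_complete(struct intel_engine *ring,
					    u32 event_id)
	{
		struct intel_ctx_submit_request *head;

		if (list_empty(&ring->execlist_queue))
			return;

		head = list_first_entry(&ring->execlist_queue,
					struct intel_ctx_submit_request,
					execlist_link);

		/* An extra event for an already-retired request carries an ID
		 * that no longer matches the head of the queue: ignore it. */
		if (head->submission_id != event_id)
			return;

		list_del(&head->execlist_link);
		/* ...retire the request, then resubmit from the queue. */
	}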

Cheers,
Oscar

Ben Widawsky (15):
  drm/i915/bdw: Macro to distinguish LRCs (Logical Ring Contexts)
  drm/i915: s/for_each_ring/for_each_active_ring
  drm/i915: for_each_ring
  drm/i915: Extract trivial parts of ring init (early init)
  drm/i915/bdw: Rework init code for gen8 contexts
  drm/i915: Extract ringbuffer obj alloc & destroy
  drm/i915/bdw: LR context ring init
  drm/i915/bdw: GEN8 semaphoreless ring add request
  drm/i915/bdw: GEN8 new ring flush
  drm/i915/bdw: A bit more advanced context init/fini
  drm/i915/bdw: Allocate ringbuffer for LR contexts
  drm/i915/bdw: Populate LR contexts (somewhat)
  drm/i915/bdw: Status page for LR contexts
  drm/i915/bdw: Enable execlists in the hardware
  drm/i915/bdw: Implement context switching (somewhat)

Michel Thierry (1):
  drm/i915/bdw: Get prepared for a two-stage execlist submit process

Oscar Mateo (30):
  drm/i915: Simplify a couple of functions thanks to for_each_ring
  drm/i915/bdw: New file for logical ring contexts and execlists
  drm/i915: Make i915_gem_create_context outside accessible
  drm/i915: s/intel_ring_buffer/intel_engine
  drm/i915: Split the ringbuffers and the rings
  drm/i915: Rename functions that mention ringbuffers (meaning rings)
  drm/i915/bdw: Execlists ring tail writing
  drm/i915/bdw: Plumbing for user LR context switching
  drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit
  drm/i915/bdw: Write a new set of context-aware ringbuffer management
    functions
  drm/i915: Final touches to LR contexts plumbing and refactoring
  drm/i915/bdw: Set the request context information correctly in the LRC
    case
  drm/i915/bdw: Prepare for user-created LR contexts
  drm/i915/bdw: Start creating & destroying user LR contexts
  drm/i915/bdw: Pin context pages at context create time
  drm/i915/bdw: Extract LR context object populating
  drm/i915/bdw: Introduce dependent contexts
  drm/i915/bdw: Create stand-alone and dependent contexts
  drm/i915/bdw: Allow non-default, non-render user LR contexts
  drm/i915/bdw: Fix reset stats ioctl with LR contexts
  drm/i915: Allocate an integer ID for each new file descriptor
  drm/i915/bdw: Prepare for a 20-bits globally unique submission ID
  drm/i915/bdw: Swap the PPGTT PDPs, LRC style
  drm/i915/bdw: Write the tail pointer, LRC style
  drm/i915/bdw: Display execlists info in debugfs
  drm/i915/bdw: Display context ringbuffer info in debugfs
  drm/i915/bdw: Start queueing contexts to be submitted
  drm/i915/bdw: Always write seqno to default context
  drm/i915/bdw: Enable logical ring contexts
  drm/i915/bdw: Document execlists and logical ring contexts

Thomas Daniel (3):
  drm/i915/bdw: Add forcewake lock around ELSP writes
  drm/i915/bdw: LR context switch interrupts
  drm/i915/bdw: Handle context switch events

 drivers/gpu/drm/i915/Makefile              |   1 +
 drivers/gpu/drm/i915/i915_cmd_parser.c     |  14 +-
 drivers/gpu/drm/i915/i915_debugfs.c        | 103 +++-
 drivers/gpu/drm/i915/i915_dma.c            |  57 +-
 drivers/gpu/drm/i915/i915_drv.h            |  90 +++-
 drivers/gpu/drm/i915/i915_gem.c            | 153 +++---
 drivers/gpu/drm/i915/i915_gem_context.c    | 109 ++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +--
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  39 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h        |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  12 +-
 drivers/gpu/drm/i915/i915_irq.c            |  93 ++--
 drivers/gpu/drm/i915/i915_lrc.c            | 826 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h            |  10 +
 drivers/gpu/drm/i915/i915_trace.h          |  26 +-
 drivers/gpu/drm/i915/intel_display.c       |  26 +-
 drivers/gpu/drm/i915/intel_drv.h           |   4 +-
 drivers/gpu/drm/i915/intel_overlay.c       |  12 +-
 drivers/gpu/drm/i915/intel_pm.c            |  18 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 796 +++++++++++++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 187 ++++---
 drivers/gpu/drm/i915/intel_uncore.c        |  15 +
 22 files changed, 2043 insertions(+), 635 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_lrc.c

-- 
1.9.0


* [PATCH 01/49] drm/i915/bdw: Macro to distinguish LRCs (Logical Ring Contexts)
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 02/49] drm/i915: s/for_each_ring/for_each_active_ring oscar.mateo
                   ` (48 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

GEN8 brings an expansion of the HW contexts: "Logical Ring Contexts".
These expanded contexts enable a number of new abilities, especially
"Execlists".

In dev_priv, lrc_enabled will reflect whether or not we've actually
initialized these new contexts properly. This helps the
transition in the code but is a candidate for removal at some point.

The macro is defined to 0 until we have enough in place for it to
work.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Rename "advanced contexts" to the more correct "logical ring
contexts"

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ef7e0ff..53196d0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1430,6 +1430,8 @@ typedef struct drm_i915_private {
 
 	uint32_t hw_context_size;
 	struct list_head context_list;
+	/* Logical Ring Contexts */
+	bool lrc_enabled;
 
 	u32 fdi_rx_config;
 
@@ -1822,6 +1824,7 @@ struct drm_i915_cmd_table {
 #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
 
 #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
+#define HAS_LOGICAL_RING_CONTEXTS(dev)	0
 #define HAS_ALIASING_PPGTT(dev)	(INTEL_INFO(dev)->gen >= 6 && !IS_VALLEYVIEW(dev))
 #define HAS_PPGTT(dev)		(INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) \
 				 && !IS_BROADWELL(dev))
-- 
1.9.0


* [PATCH 02/49] drm/i915: s/for_each_ring/for_each_active_ring
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
  2014-03-27 17:59 ` [PATCH 01/49] drm/i915/bdw: Macro to distinguish LRCs (Logical Ring Contexts) oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 03/49] drm/i915: for_each_ring oscar.mateo
                   ` (47 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

The name "active" was recommended by Chris.

With the ordering change of how we initialize things, it is desirable to
be able to address each ring, whether initialized or not.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Several rebases.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 12 ++++++------
 drivers/gpu/drm/i915/i915_drv.h         |  2 +-
 drivers/gpu/drm/i915/i915_gem.c         | 14 +++++++-------
 drivers/gpu/drm/i915/i915_gem_context.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c     | 10 +++++-----
 drivers/gpu/drm/i915/i915_irq.c         |  8 ++++----
 drivers/gpu/drm/i915/intel_pm.c         |  8 ++++----
 drivers/gpu/drm/i915/intel_ringbuffer.c |  2 +-
 8 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 049dcb5..f423eb6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -571,7 +571,7 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 		return ret;
 
 	count = 0;
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		if (list_empty(&ring->request_list))
 			continue;
 
@@ -615,7 +615,7 @@ static int i915_gem_seqno_info(struct seq_file *m, void *data)
 		return ret;
 	intel_runtime_pm_get(dev_priv);
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		i915_ring_seqno_info(m, ring);
 
 	intel_runtime_pm_put(dev_priv);
@@ -752,7 +752,7 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 		seq_printf(m, "Graphics Interrupt mask:		%08x\n",
 			   I915_READ(GTIMR));
 	}
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		if (INTEL_INFO(dev)->gen >= 6) {
 			seq_printf(m,
 				   "Graphics Interrupt mask (%s):	%08x\n",
@@ -1677,7 +1677,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	list_for_each_entry(ctx, &dev_priv->context_list, link) {
 		seq_puts(m, "HW context ");
 		describe_ctx(m, ctx);
-		for_each_ring(ring, dev_priv, i)
+		for_each_active_ring(ring, dev_priv, i)
 			if (ring->default_context == ctx)
 				seq_printf(m, "(default context %s) ", ring->name);
 
@@ -1809,7 +1809,7 @@ static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 
 	seq_printf(m, "Page directories: %d\n", ppgtt->num_pd_pages);
 	seq_printf(m, "Page tables: %d\n", ppgtt->num_pd_entries);
-	for_each_ring(ring, dev_priv, unused) {
+	for_each_active_ring(ring, dev_priv, unused) {
 		seq_printf(m, "%s\n", ring->name);
 		for (i = 0; i < 4; i++) {
 			u32 offset = 0x270 + i * 8;
@@ -1832,7 +1832,7 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen == 6)
 		seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(GFX_MODE));
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		seq_printf(m, "%s\n", ring->name);
 		if (INTEL_INFO(dev)->gen == 7)
 			seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(RING_MODE_GEN7(ring)));
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 53196d0..57bf4a7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1470,7 +1470,7 @@ static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
 }
 
 /* Iterate over initialised rings */
-#define for_each_ring(ring__, dev_priv__, i__) \
+#define for_each_active_ring(ring__, dev_priv__, i__) \
 	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
 		if (((ring__) = &(dev_priv__)->ring[(i__)]), intel_ring_initialized((ring__)))
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 33bbaa0..1d85dc9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2097,7 +2097,7 @@ i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
 	int ret, i, j;
 
 	/* Carefully retire all requests without writing to the rings */
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		ret = intel_ring_idle(ring);
 		if (ret)
 			return ret;
@@ -2105,7 +2105,7 @@ i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
 	i915_gem_retire_requests(dev);
 
 	/* Finally reset hw state */
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		intel_ring_init_seqno(ring, seqno);
 
 		for (j = 0; j < ARRAY_SIZE(ring->sync_seqno); j++)
@@ -2417,10 +2417,10 @@ void i915_gem_reset(struct drm_device *dev)
 	 * them for finding the guilty party. As the requests only borrow
 	 * their reference to the objects, the inspection must be done first.
 	 */
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		i915_gem_reset_ring_status(dev_priv, ring);
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		i915_gem_reset_ring_cleanup(dev_priv, ring);
 
 	i915_gem_cleanup_ringbuffer(dev);
@@ -2501,7 +2501,7 @@ i915_gem_retire_requests(struct drm_device *dev)
 	bool idle = true;
 	int i;
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		i915_gem_retire_requests_ring(ring);
 		idle &= list_empty(&ring->request_list);
 	}
@@ -2789,7 +2789,7 @@ int i915_gpu_idle(struct drm_device *dev)
 	int ret, i;
 
 	/* Flush everything onto the inactive list. */
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		ret = i915_switch_context(ring, NULL, ring->default_context);
 		if (ret)
 			return ret;
@@ -4492,7 +4492,7 @@ i915_gem_cleanup_ringbuffer(struct drm_device *dev)
 	struct intel_ring_buffer *ring;
 	int i;
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		intel_cleanup_ring_buffer(ring);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 6043062..3cfdfbe 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -502,7 +502,7 @@ int i915_gem_context_enable(struct drm_i915_private *dev_priv)
 
 	BUG_ON(!dev_priv->ring[RCS].default_context);
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		ret = do_switch(ring, ring->default_context);
 		if (ret)
 			return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5774eb2..5ff0b20 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -806,7 +806,7 @@ static int gen8_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 	struct intel_ring_buffer *ring;
 	int j, ret;
 
-	for_each_ring(ring, dev_priv, j) {
+	for_each_active_ring(ring, dev_priv, j) {
 		I915_WRITE(RING_MODE_GEN7(ring),
 			   _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
 
@@ -823,7 +823,7 @@ static int gen8_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 	return 0;
 
 err_out:
-	for_each_ring(ring, dev_priv, j)
+	for_each_active_ring(ring, dev_priv, j)
 		I915_WRITE(RING_MODE_GEN7(ring),
 			   _MASKED_BIT_DISABLE(GFX_PPGTT_ENABLE));
 	return ret;
@@ -849,7 +849,7 @@ static int gen7_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 	}
 	I915_WRITE(GAM_ECOCHK, ecochk);
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		int ret;
 		/* GFX_MODE is per-ring on gen7+ */
 		I915_WRITE(RING_MODE_GEN7(ring),
@@ -888,7 +888,7 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 
 	I915_WRITE(GFX_MODE, _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		int ret = ppgtt->switch_mm(ppgtt, ring, true);
 		if (ret)
 			return ret;
@@ -1246,7 +1246,7 @@ void i915_check_and_clear_faults(struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen < 6)
 		return;
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		u32 fault_reg;
 		fault_reg = I915_READ(RING_FAULT_REG(ring));
 		if (fault_reg & RING_FAULT_VALID) {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index bc36c8e..1fdec5f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2114,7 +2114,7 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 	 */
 
 	/* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		wake_up_all(&ring->irq_queue);
 
 	/* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
@@ -2657,7 +2657,7 @@ static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
 	struct intel_ring_buffer *ring;
 	int i;
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		ring->hangcheck.deadlock = false;
 }
 
@@ -2729,7 +2729,7 @@ static void i915_hangcheck_elapsed(unsigned long data)
 	if (!i915.enable_hangcheck)
 		return;
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		u32 seqno, acthd;
 		bool busy = true;
 
@@ -2807,7 +2807,7 @@ static void i915_hangcheck_elapsed(unsigned long data)
 		busy_count += busy;
 	}
 
-	for_each_ring(ring, dev_priv, i) {
+	for_each_active_ring(ring, dev_priv, i) {
 		if (ring->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG) {
 			DRM_INFO("%s on %s\n",
 				 stuck[i] ? "stuck" : "no progress",
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index fd68f93..c14a6ac 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3271,7 +3271,7 @@ static void gen8_enable_rps(struct drm_device *dev)
 	I915_WRITE(GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16);
 	I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */
 	I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */
-	for_each_ring(ring, dev_priv, unused)
+	for_each_active_ring(ring, dev_priv, unused)
 		I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
 	I915_WRITE(GEN6_RC_SLEEP, 0);
 	I915_WRITE(GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */
@@ -3379,7 +3379,7 @@ static void gen6_enable_rps(struct drm_device *dev)
 	I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000);
 	I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25);
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
 
 	I915_WRITE(GEN6_RC_SLEEP, 0);
@@ -3631,7 +3631,7 @@ static void valleyview_enable_rps(struct drm_device *dev)
 	I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000);
 	I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25);
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
 
 	I915_WRITE(GEN6_RC6_THRESHOLD, 0x557);
@@ -4273,7 +4273,7 @@ bool i915_gpu_busy(void)
 		goto out_unlock;
 	dev_priv = i915_mch_dev;
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_active_ring(ring, dev_priv, i)
 		ret |= !list_empty(&ring->request_list);
 
 out_unlock:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 87d1a2d..489046a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -678,7 +678,7 @@ gen6_add_request(struct intel_ring_buffer *ring)
 		return ret;
 
 	if (i915_semaphore_is_enabled(dev)) {
-		for_each_ring(useless, dev_priv, i) {
+		for_each_active_ring(useless, dev_priv, i) {
 			u32 mbox_reg = ring->signal_mbox[i];
 			if (mbox_reg != GEN6_NOSYNC)
 				update_mboxes(ring, mbox_reg);
-- 
1.9.0


* [PATCH 03/49] drm/i915: for_each_ring
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
  2014-03-27 17:59 ` [PATCH 01/49] drm/i915/bdw: Macro to distinguish LRCs (Logical Ring Contexts) oscar.mateo
  2014-03-27 17:59 ` [PATCH 02/49] drm/i915: s/for_each_ring/for_each_active_ring oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 04/49] drm/i915: Simplify a couple of functions thanks to for_each_ring oscar.mateo
                   ` (46 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

for_each_ring() iterates over all rings supported by the hardware, not
just those which have been initialized, as for_each_active_ring() does.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 57bf4a7..488d406 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1469,6 +1469,17 @@ static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
 	return dev->dev_private;
 }
 
+/* NB: Typically you want to use for_each_ring in init code before ringbuffers
+ * are setup, or in debug code. for_each_active_ring is more suited for code
+ * which is dynamically handling active rings, ie. normal code. In most
+ * (currently all cases except on pre-production hardware) for_each_ring will
+ * work even if it's a bad idea to use it - so be careful.
+ */
+#define for_each_ring(ring__, dev_priv__, i__) \
+	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
+		if (((ring__) = &(dev_priv__)->ring[(i__)]), \
+		    INTEL_INFO((dev_priv__)->dev)->ring_mask & (1<<(i__)))
+
 /* Iterate over initialised rings */
 #define for_each_active_ring(ring__, dev_priv__, i__) \
 	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
-- 
1.9.0


* [PATCH 04/49] drm/i915: Simplify a couple of functions thanks to for_each_ring
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (2 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 03/49] drm/i915: for_each_ring oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 05/49] drm/i915: Extract trivial parts of ring init (early init) oscar.mateo
                   ` (45 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

This patch should have no functional changes.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 37 +++++++++++----------------------
 1 file changed, 12 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 3cfdfbe..e1c544e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -360,13 +360,10 @@ void i915_gem_context_reset(struct drm_device *dev)
 
 	/* Prevent the hardware from restoring the last context (which hung) on
 	 * the next switch */
-	for (i = 0; i < I915_NUM_RINGS; i++) {
+	for_each_ring(ring, dev_priv, i) {
 		struct i915_hw_context *dctx;
-		if (!(INTEL_INFO(dev)->ring_mask & (1<<i)))
-			continue;
 
 		/* Do a fake switch to the default context */
-		ring = &dev_priv->ring[i];
 		dctx = ring->default_context;
 		if (WARN_ON(!dctx))
 			continue;
@@ -395,7 +392,8 @@ int i915_gem_context_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring;
-	int i;
+	struct i915_hw_context *default_context;
+	int unused;
 
 	if (!HAS_HW_CONTEXTS(dev))
 		return 0;
@@ -412,23 +410,15 @@ int i915_gem_context_init(struct drm_device *dev)
 		return -E2BIG;
 	}
 
-	dev_priv->ring[RCS].default_context =
-		i915_gem_create_context(dev, NULL, USES_PPGTT(dev));
-
-	if (IS_ERR_OR_NULL(dev_priv->ring[RCS].default_context)) {
+	default_context = i915_gem_create_context(dev, NULL, USES_PPGTT(dev));
+	if (IS_ERR_OR_NULL(default_context)) {
 		DRM_DEBUG_DRIVER("Disabling HW Contexts; create failed %ld\n",
-				 PTR_ERR(dev_priv->ring[RCS].default_context));
-		return PTR_ERR(dev_priv->ring[RCS].default_context);
+				 PTR_ERR(default_context));
+		return PTR_ERR(default_context);
 	}
 
-	for (i = RCS + 1; i < I915_NUM_RINGS; i++) {
-		if (!(INTEL_INFO(dev)->ring_mask & (1<<i)))
-			continue;
-
-		ring = &dev_priv->ring[i];
-
-		/* NB: RCS will hold a ref for all rings */
-		ring->default_context = dev_priv->ring[RCS].default_context;
+	for_each_ring(ring, dev_priv, unused) {
+		ring->default_context = default_context;
 	}
 
 	DRM_DEBUG_DRIVER("HW context support initialized\n");
@@ -439,7 +429,8 @@ void i915_gem_context_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *dctx = dev_priv->ring[RCS].default_context;
-	int i;
+	struct intel_ring_buffer *ring;
+	int unused;
 
 	if (!HAS_HW_CONTEXTS(dev))
 		return;
@@ -464,11 +455,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 		dev_priv->ring[RCS].last_context = NULL;
 	}
 
-	for (i = 0; i < I915_NUM_RINGS; i++) {
-		struct intel_ring_buffer *ring = &dev_priv->ring[i];
-		if (!(INTEL_INFO(dev)->ring_mask & (1<<i)))
-			continue;
-
+	for_each_ring(ring, dev_priv, unused) {
 		if (ring->last_context)
 			i915_gem_context_unreference(ring->last_context);
 
-- 
1.9.0


* [PATCH 05/49] drm/i915: Extract trivial parts of ring init (early init)
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (3 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 04/49] drm/i915: Simplify a couple of functions thanks to for_each_ring oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 06/49] drm/i915/bdw: New file for logical ring contexts and execlists oscar.mateo
                   ` (44 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

It's beneficial to be able to get a name, base, and id before we've
actually initialized the rings. This ability was effectively destroyed
in the ringbuffer fire which Daniel started.

With the simple early init function, that ability is restored.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: The Full PPGTT series has moved things around a little bit.
Also, don't forget the VEBOX.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |  2 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 57 +++++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 3 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1d85dc9..f429887 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4464,6 +4464,8 @@ int i915_gem_init(struct drm_device *dev)
 
 	i915_gem_init_global_gtt(dev);
 
+	intel_init_rings_early(dev);
+
 	ret = i915_gem_context_init(dev);
 	if (ret) {
 		mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 489046a..659fed0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1875,10 +1875,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
 
-	ring->name = "render ring";
-	ring->id = RCS;
-	ring->mmio_base = RENDER_RING_BASE;
-
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
@@ -1977,10 +1973,6 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
 	int ret;
 
-	ring->name = "render ring";
-	ring->id = RCS;
-	ring->mmio_base = RENDER_RING_BASE;
-
 	if (INTEL_INFO(dev)->gen >= 6) {
 		/* non-kms not supported on gen6+ */
 		return -ENODEV;
@@ -2044,12 +2036,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring = &dev_priv->ring[VCS];
 
-	ring->name = "bsd ring";
-	ring->id = VCS;
-
 	ring->write_tail = ring_write_tail;
 	if (INTEL_INFO(dev)->gen >= 6) {
-		ring->mmio_base = GEN6_BSD_RING_BASE;
 		/* gen6 bsd needs a special wa for tail updates */
 		if (IS_GEN6(dev))
 			ring->write_tail = gen6_bsd_ring_write_tail;
@@ -2081,7 +2069,6 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 		ring->signal_mbox[BCS] = GEN6_BVSYNC;
 		ring->signal_mbox[VECS] = GEN6_VEVSYNC;
 	} else {
-		ring->mmio_base = BSD_RING_BASE;
 		ring->flush = bsd_ring_flush;
 		ring->add_request = i9xx_add_request;
 		ring->get_seqno = ring_get_seqno;
@@ -2107,10 +2094,6 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring = &dev_priv->ring[BCS];
 
-	ring->name = "blitter ring";
-	ring->id = BCS;
-
-	ring->mmio_base = BLT_RING_BASE;
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
@@ -2147,10 +2130,6 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring = &dev_priv->ring[VECS];
 
-	ring->name = "video enhancement ring";
-	ring->id = VECS;
-
-	ring->mmio_base = VEBOX_RING_BASE;
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
@@ -2220,3 +2199,39 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
 	ring->gpu_caches_dirty = false;
 	return 0;
 }
+
+void intel_init_rings_early(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	dev_priv->ring[RCS].name = "render ring";
+	dev_priv->ring[RCS].id = RCS;
+	dev_priv->ring[RCS].mmio_base = RENDER_RING_BASE;
+	dev_priv->ring[RCS].dev = dev;
+	dev_priv->ring[RCS].head = 0;
+	dev_priv->ring[RCS].tail = 0;
+
+	dev_priv->ring[BCS].name = "blitter ring";
+	dev_priv->ring[BCS].id = BCS;
+	dev_priv->ring[BCS].mmio_base = BLT_RING_BASE;
+	dev_priv->ring[BCS].dev = dev;
+	dev_priv->ring[BCS].head = 0;
+	dev_priv->ring[BCS].tail = 0;
+
+	dev_priv->ring[VCS].name = "bsd ring";
+	dev_priv->ring[VCS].id = VCS;
+	if (INTEL_INFO(dev)->gen >= 6)
+		dev_priv->ring[VCS].mmio_base = GEN6_BSD_RING_BASE;
+	else
+		dev_priv->ring[VCS].mmio_base = BSD_RING_BASE;
+	dev_priv->ring[VCS].dev = dev;
+	dev_priv->ring[VCS].head = 0;
+	dev_priv->ring[VCS].tail = 0;
+
+	dev_priv->ring[VECS].name = "video enhancement ring";
+	dev_priv->ring[VECS].id = VECS;
+	dev_priv->ring[VECS].mmio_base = VEBOX_RING_BASE;
+	dev_priv->ring[VECS].dev = dev;
+	dev_priv->ring[VECS].head = 0;
+	dev_priv->ring[VECS].tail = 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f11ceb2..135bdc1 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -287,6 +287,7 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno);
 int intel_ring_flush_all_caches(struct intel_ring_buffer *ring);
 int intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring);
 
+void intel_init_rings_early(struct drm_device *dev);
 int intel_init_render_ring_buffer(struct drm_device *dev);
 int intel_init_bsd_ring_buffer(struct drm_device *dev);
 int intel_init_blt_ring_buffer(struct drm_device *dev);
-- 
1.9.0


* [PATCH 06/49] drm/i915/bdw: New file for logical ring contexts and execlists
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (4 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 05/49] drm/i915: Extract trivial parts of ring init (early init) oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 07/49] drm/i915/bdw: Rework init code for gen8 contexts oscar.mateo
                   ` (43 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Some legacy HW context and ringbuffer code assumptions don't make sense
for this new submission method, so we will place this stuff in a separate
file.

Note for reviewers: I've carefully considered the best name for this file
and this was my best option (other possibilities were intel_lr_context.c
or intel_execlist.c). I am open to some bikeshedding on this matter,
anyway. Regarding splitting execlists and logical ring contexts, it is
probably not worth it for the moment.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/Makefile   |  1 +
 drivers/gpu/drm/i915/i915_lrc.c | 42 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_lrc.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b1445b7..e81aed7 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -17,6 +17,7 @@ i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o
 
 # GEM code
 i915-y += i915_cmd_parser.o \
+	  i915_lrc.o \
 	  i915_gem_context.o \
 	  i915_gem_debug.o \
 	  i915_gem_dmabuf.o \
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
new file mode 100644
index 0000000..49bb6fc
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -0,0 +1,42 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Ben Widawsky <ben@bwidawsk.net>
+ *    Michel Thierry <michel.thierry@intel.com>
+ *    Thomas Daniel <thomas.daniel@intel.com>
+ *    Oscar Mateo <oscar.mateo@intel.com>
+ *
+ */
+
+/*
+ * GEN8 brings an expansion of the HW contexts: "Logical Ring Contexts".
+ * These expanded contexts enable a number of new abilities, especially
+ * "Execlists" (also implemented in this file).
+ *
+ * Execlists are the new method by which, on gen8+ hardware, workloads are
+ * submitted for execution (as opposed to the legacy, ringbuffer-based, method).
+ */
+
+#include <drm/drmP.h>
+#include <drm/i915_drm.h>
+#include "i915_drv.h"
-- 
1.9.0


* [PATCH 07/49] drm/i915/bdw: Rework init code for gen8 contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (5 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 06/49] drm/i915/bdw: New file for logical ring contexts and execlists oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 08/49] drm/i915: Make i915_gem_create_context outside accessible oscar.mateo
                   ` (42 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

This modifies the init code to try to start logical ring contexts when
possible, and fall back to legacy ringbuffers when not.

Most importantly, it makes things easy if we do the context creation
before ringbuffer initialization. Upcoming patches will make it clearer
why that is. For the initial enabling of execlists, the ringbuffer code
will be reused a decent amount. As such this code will have to change,
but it helps with enabling to be able to run through a bunch of the
context init, and still have a system boot.

Finally, for the bikeshedders out there: I've tried merging the legacy
HW context init functionality (i.e. one init function, with this logic
in the context creation). The resulting code was much uglier, for no
real gain.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Rebased on top of the Full PPGTT series.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  3 +++
 drivers/gpu/drm/i915/i915_gem.c | 16 +++++++++++-----
 drivers/gpu/drm/i915/i915_lrc.c |  5 +++++
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 488d406..872849e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2312,6 +2312,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file);
 
+/* i915_lrc.c */
+int gen8_gem_context_init(struct drm_device *dev);
+
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
 					  struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f429887..d734ee3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4451,7 +4451,7 @@ err_out:
 int i915_gem_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret;
+	int ret = -1;
 
 	mutex_lock(&dev->struct_mutex);
 
@@ -4466,11 +4466,17 @@ int i915_gem_init(struct drm_device *dev)
 
 	intel_init_rings_early(dev);
 
-	ret = i915_gem_context_init(dev);
+	if (HAS_LOGICAL_RING_CONTEXTS(dev) && USES_PPGTT(dev))
+		ret = gen8_gem_context_init(dev);
+
 	if (ret) {
-		mutex_unlock(&dev->struct_mutex);
-		return ret;
-	}
+		ret = i915_gem_context_init(dev);
+		if (ret) {
+			mutex_unlock(&dev->struct_mutex);
+			return ret;
+		}
+	} else
+		dev_priv->lrc_enabled = true;
 
 	ret = i915_gem_init_hw(dev);
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 49bb6fc..3a93e99 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -40,3 +40,8 @@
 #include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
+
+int gen8_gem_context_init(struct drm_device *dev)
+{
+	return -ENOSYS;
+}
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 08/49] drm/i915: Make i915_gem_create_context outside accessible
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (6 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 07/49] drm/i915/bdw: Rework init code for gen8 contexts oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 09/49] drm/i915: Extract ringbuffer obj alloc & destroy oscar.mateo
                   ` (41 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

We are going to reuse it during logical ring context creation.
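
For illustration, this is the kind of call site the LRC code can then
use; a minimal sketch (the gen8 caller and its arguments are
hypothetical, only the i915_gem_create_context() prototype comes from
this patch):

	struct i915_hw_context *ctx;

	/* hypothetical caller in the future gen8/LRC init path */
	ctx = i915_gem_create_context(dev, file_priv, USES_PPGTT(dev));
	if (IS_ERR(ctx))
		return PTR_ERR(ctx);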

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 2 ++
 drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 872849e..5a1cf27 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2281,6 +2281,8 @@ void i915_gem_object_ggtt_unpin(struct drm_i915_gem_object *obj);
 #define ctx_to_ppgtt(ctx) container_of((ctx)->vm, struct i915_hw_ppgtt, base)
 int __must_check i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
+struct i915_hw_context *i915_gem_create_context(struct drm_device *dev,
+		struct drm_i915_file_private *file_priv, bool create_vm);
 void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
 int i915_gem_context_enable(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e1c544e..cdd44fa 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -278,7 +278,7 @@ err_out:
  * context state of the GPU for applications that don't utilize HW contexts, as
  * well as an idle case.
  */
-static struct i915_hw_context *
+struct i915_hw_context *
 i915_gem_create_context(struct drm_device *dev,
 			struct drm_i915_file_private *file_priv,
 			bool create_vm)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 09/49] drm/i915: Extract ringbuffer obj alloc & destroy
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (7 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 08/49] drm/i915: Make i915_gem_create_context outside accessible oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 10/49] drm/i915: s/intel_ring_buffer/intel_engine oscar.mateo
                   ` (40 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

There should be no functional changes.
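
The point of the extraction is that allocation and teardown become a
matched pair that later patches can reuse; sketched usage, mirroring
the hunks below:

	ret = alloc_ring_buffer(ring);	/* alloc, pin, set to GTT domain */
	if (ret)
		return ret;
	/* ... use ring->obj ... */
	destroy_ring_buffer(ring);	/* unpin, unreference, obj = NULL */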

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Multiple rebases.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 73 +++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 659fed0..f734c9d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1332,6 +1332,45 @@ static int init_phys_status_page(struct intel_ring_buffer *ring)
 	return 0;
 }
 
+static void destroy_ring_buffer(struct intel_ring_buffer *ring)
+{
+	i915_gem_object_ggtt_unpin(ring->obj);
+	drm_gem_object_unreference(&ring->obj->base);
+	ring->obj = NULL;
+}
+
+static int alloc_ring_buffer(struct intel_ring_buffer *ring)
+{
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_gem_object *obj = NULL;
+	int ret;
+
+	if (!HAS_LLC(dev))
+		obj = i915_gem_object_create_stolen(dev, ring->size);
+	if (obj == NULL)
+		obj = i915_gem_alloc_object(dev, ring->size);
+	if (obj == NULL) {
+		DRM_ERROR("Failed to allocate ringbuffer\n");
+		return -ENOMEM;
+	}
+
+	ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
+	if (ret) {
+		drm_gem_object_unreference(&obj->base);
+		return ret;
+	}
+
+	ret = i915_gem_object_set_to_gtt_domain(obj, true);
+	if (ret) {
+		destroy_ring_buffer(ring);
+		return ret;
+	}
+
+	ring->obj = obj;
+
+	return 0;
+}
+
 static int intel_init_ring_buffer(struct drm_device *dev,
 				  struct intel_ring_buffer *ring)
 {
@@ -1358,26 +1397,11 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 			return ret;
 	}
 
-	obj = NULL;
-	if (!HAS_LLC(dev))
-		obj = i915_gem_object_create_stolen(dev, ring->size);
-	if (obj == NULL)
-		obj = i915_gem_alloc_object(dev, ring->size);
-	if (obj == NULL) {
-		DRM_ERROR("Failed to allocate ringbuffer\n");
-		ret = -ENOMEM;
-		goto err_hws;
-	}
-
-	ring->obj = obj;
-
-	ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
+	ret = alloc_ring_buffer(ring);
 	if (ret)
-		goto err_unref;
+		goto err_hws;
 
-	ret = i915_gem_object_set_to_gtt_domain(obj, true);
-	if (ret)
-		goto err_unpin;
+	obj = ring->obj;
 
 	ring->virtual_start =
 		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj),
@@ -1385,7 +1409,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	if (ring->virtual_start == NULL) {
 		DRM_ERROR("Failed to map ringbuffer.\n");
 		ret = -EINVAL;
-		goto err_unpin;
+		goto destroy_ring;
 	}
 
 	ret = ring->init(ring);
@@ -1406,11 +1430,8 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 
 err_unmap:
 	iounmap(ring->virtual_start);
-err_unpin:
-	i915_gem_object_ggtt_unpin(obj);
-err_unref:
-	drm_gem_object_unreference(&obj->base);
-	ring->obj = NULL;
+destroy_ring:
+	destroy_ring_buffer(ring);
 err_hws:
 	cleanup_status_page(ring);
 	return ret;
@@ -1435,9 +1456,7 @@ void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
 
 	iounmap(ring->virtual_start);
 
-	i915_gem_object_ggtt_unpin(ring->obj);
-	drm_gem_object_unreference(&ring->obj->base);
-	ring->obj = NULL;
+	destroy_ring_buffer(ring);
 	ring->preallocated_lazy_request = NULL;
 	ring->outstanding_lazy_seqno = 0;
 
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 10/49] drm/i915: s/intel_ring_buffer/intel_engine
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (8 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 09/49] drm/i915: Extract ringbuffer obj alloc & destroy oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 11/49] drm/i915: Split the ringbuffers and the rings oscar.mateo
                   ` (39 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Since we are about to break the correlation between number of engines
(or rings) and number of ringbuffers, it makes sense to refactor the
code and make the change obvious.

No functional changes.
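
To make the direction concrete: the end goal is roughly the shape
below; a rough sketch, with structure and field names assumed for
illustration (this patch itself only performs the rename):

	/* The engine describes the hardware unit; the ringbuffer becomes
	 * a separate object, so that several ringbuffers (e.g. one per
	 * logical ring context) can eventually feed a single engine. */
	struct intel_ringbuffer {
		struct drm_i915_gem_object *obj;	/* backing storage */
		void __iomem *virtual_start;		/* CPU mapping */
		u32 head;
		u32 tail;
		int size;
	};

	struct intel_engine {
		const char *name;
		int id;					/* RCS, VCS, ... */
		struct intel_ringbuffer *buffer;	/* default buffer */
	};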

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_cmd_parser.c     |  14 +--
 drivers/gpu/drm/i915/i915_debugfs.c        |  16 +--
 drivers/gpu/drm/i915/i915_dma.c            |  10 +-
 drivers/gpu/drm/i915/i915_drv.h            |  30 +++---
 drivers/gpu/drm/i915/i915_gem.c            |  54 +++++------
 drivers/gpu/drm/i915/i915_gem_context.c    |  16 +--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  18 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  18 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.h        |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |   6 +-
 drivers/gpu/drm/i915/i915_irq.c            |  28 +++---
 drivers/gpu/drm/i915/i915_trace.h          |  26 ++---
 drivers/gpu/drm/i915/intel_display.c       |  14 +--
 drivers/gpu/drm/i915/intel_drv.h           |   4 +-
 drivers/gpu/drm/i915/intel_overlay.c       |  12 +--
 drivers/gpu/drm/i915/intel_pm.c            |  10 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 150 ++++++++++++++---------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  70 +++++++-------
 18 files changed, 250 insertions(+), 248 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 0eaed44..a29e893 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -137,7 +137,7 @@ static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header)
 	return 0;
 }
 
-static void validate_cmds_sorted(struct intel_ring_buffer *ring)
+static void validate_cmds_sorted(struct intel_engine *ring)
 {
 	int i;
 
@@ -179,7 +179,7 @@ static void check_sorted(int ring_id, const u32 *reg_table, int reg_count)
 	}
 }
 
-static void validate_regs_sorted(struct intel_ring_buffer *ring)
+static void validate_regs_sorted(struct intel_engine *ring)
 {
 	check_sorted(ring->id, ring->reg_table, ring->reg_count);
 	check_sorted(ring->id, ring->master_reg_table, ring->master_reg_count);
@@ -190,10 +190,10 @@ static void validate_regs_sorted(struct intel_ring_buffer *ring)
  * @ring: the ringbuffer to initialize
  *
  * Optionally initializes fields related to batch buffer command parsing in the
- * struct intel_ring_buffer based on whether the platform requires software
+ * struct intel_engine based on whether the platform requires software
  * command parsing.
  */
-void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring)
+void i915_cmd_parser_init_ring(struct intel_engine *ring)
 {
 	if (!IS_GEN7(ring->dev))
 		return;
@@ -249,7 +249,7 @@ find_cmd_in_table(const struct drm_i915_cmd_table *table,
  * ring's default length encoding and returns default_desc.
  */
 static const struct drm_i915_cmd_descriptor*
-find_cmd(struct intel_ring_buffer *ring,
+find_cmd(struct intel_engine *ring,
 	 u32 cmd_header,
 	 struct drm_i915_cmd_descriptor *default_desc)
 {
@@ -329,7 +329,7 @@ finish:
  *
  * Return: true if the ring requires software command parsing
  */
-bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
+bool i915_needs_cmd_parser(struct intel_engine *ring)
 {
 	/* No command tables indicates a platform without parsing */
 	if (!ring->cmd_tables)
@@ -352,7 +352,7 @@ bool i915_needs_cmd_parser(struct intel_ring_buffer *ring)
  *
  * Return: non-zero if the parser finds violations or otherwise fails
  */
-int i915_parse_cmds(struct intel_ring_buffer *ring,
+int i915_parse_cmds(struct intel_engine *ring,
 		    struct drm_i915_gem_object *batch_obj,
 		    u32 batch_start_offset,
 		    bool is_master)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f423eb6..8b06acb 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -562,7 +562,7 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct drm_i915_gem_request *gem_request;
 	int ret, count, i;
 
@@ -594,7 +594,7 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 }
 
 static void i915_ring_seqno_info(struct seq_file *m,
-				 struct intel_ring_buffer *ring)
+				 struct intel_engine *ring)
 {
 	if (ring->get_seqno) {
 		seq_printf(m, "Current sequence (%s): %u\n",
@@ -607,7 +607,7 @@ static int i915_gem_seqno_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -630,7 +630,7 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i, pipe;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -800,7 +800,7 @@ static int i915_hws_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	const u32 *hws;
 	int i;
 
@@ -1654,7 +1654,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct i915_hw_context *ctx;
 	int ret, i;
 
@@ -1800,7 +1800,7 @@ static int per_file_ctx(int id, void *ptr, void *data)
 static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
 	int unused, i;
 
@@ -1825,7 +1825,7 @@ static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct drm_file *file;
 	int i;
 
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 4e0a26a..43c5df0 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -119,7 +119,7 @@ static void i915_write_hws_pga(struct drm_device *dev)
 static void i915_free_hws(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
+	struct intel_engine *ring = LP_RING(dev_priv);
 
 	if (dev_priv->status_page_dmah) {
 		drm_pci_free(dev, dev_priv->status_page_dmah);
@@ -139,7 +139,7 @@ void i915_kernel_lost_context(struct drm_device * dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_master_private *master_priv;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
+	struct intel_engine *ring = LP_RING(dev_priv);
 
 	/*
 	 * We should never lose context on the ring with modesetting
@@ -234,7 +234,7 @@ static int i915_initialize(struct drm_device * dev, drm_i915_init_t * init)
 static int i915_dma_resume(struct drm_device * dev)
 {
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
+	struct intel_engine *ring = LP_RING(dev_priv);
 
 	DRM_DEBUG_DRIVER("%s\n", __func__);
 
@@ -782,7 +782,7 @@ static int i915_wait_irq(struct drm_device * dev, int irq_nr)
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
 	struct drm_i915_master_private *master_priv = dev->primary->master->driver_priv;
 	int ret = 0;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
+	struct intel_engine *ring = LP_RING(dev_priv);
 
 	DRM_DEBUG_DRIVER("irq_nr=%d breadcrumb=%d\n", irq_nr,
 		  READ_BREADCRUMB(dev_priv));
@@ -1070,7 +1070,7 @@ static int i915_set_status_page(struct drm_device *dev, void *data,
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	drm_i915_hws_addr_t *hws = data;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 
 	if (drm_core_check_feature(dev, DRIVER_MODESET))
 		return -ENODEV;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5a1cf27..c03c674 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -595,7 +595,7 @@ struct i915_hw_context {
 	bool is_initialized;
 	uint8_t remap_slice;
 	struct drm_i915_file_private *file_priv;
-	struct intel_ring_buffer *last_ring;
+	struct intel_engine *last_ring;
 	struct drm_i915_gem_object *obj;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_address_space *vm;
@@ -1283,7 +1283,7 @@ typedef struct drm_i915_private {
 	wait_queue_head_t gmbus_wait_queue;
 
 	struct pci_dev *bridge_dev;
-	struct intel_ring_buffer ring[I915_NUM_RINGS];
+	struct intel_engine ring[I915_NUM_RINGS];
 	uint32_t last_seqno, next_seqno;
 
 	drm_dma_handle_t *status_page_dmah;
@@ -1600,7 +1600,7 @@ struct drm_i915_gem_object {
 	void *dma_buf_vmapping;
 	int vmapping_count;
 
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 
 	/** Breadcrumb of last rendering to the buffer. */
 	uint32_t last_read_seqno;
@@ -1639,7 +1639,7 @@ struct drm_i915_gem_object {
  */
 struct drm_i915_gem_request {
 	/** On Which ring this request was generated */
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 
 	/** GEM sequence number associated with this request. */
 	uint32_t seqno;
@@ -2091,9 +2091,9 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
-			 struct intel_ring_buffer *to);
+			 struct intel_engine *to);
 void i915_vma_move_to_active(struct i915_vma *vma,
-			     struct intel_ring_buffer *ring);
+			     struct intel_engine *ring);
 int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_device *dev,
 			 struct drm_mode_create_dumb *args);
@@ -2135,7 +2135,7 @@ i915_gem_object_unpin_fence(struct drm_i915_gem_object *obj)
 }
 
 struct drm_i915_gem_request *
-i915_gem_find_active_request(struct intel_ring_buffer *ring);
+i915_gem_find_active_request(struct intel_engine *ring);
 
 bool i915_gem_retire_requests(struct drm_device *dev);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
@@ -2161,18 +2161,18 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_init(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
-int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice);
+int i915_gem_l3_remap(struct intel_engine *ring, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
-int __i915_add_request(struct intel_ring_buffer *ring,
+int __i915_add_request(struct intel_engine *ring,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *batch_obj,
 		       u32 *seqno);
 #define i915_add_request(ring, seqno) \
 	__i915_add_request(ring, NULL, NULL, seqno)
-int __must_check i915_wait_seqno(struct intel_ring_buffer *ring,
+int __must_check i915_wait_seqno(struct intel_engine *ring,
 				 uint32_t seqno);
 int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
 int __must_check
@@ -2183,7 +2183,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
 int __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
-				     struct intel_ring_buffer *pipelined);
+				     struct intel_engine *pipelined);
 void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj);
 int i915_gem_attach_phys_object(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
@@ -2287,7 +2287,7 @@ void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
 int i915_gem_context_enable(struct drm_i915_private *dev_priv);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
-int i915_switch_context(struct intel_ring_buffer *ring,
+int i915_switch_context(struct intel_engine *ring,
 			struct drm_file *file, struct i915_hw_context *to);
 struct i915_hw_context *
 i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id);
@@ -2400,9 +2400,9 @@ void i915_get_extra_instdone(struct drm_device *dev, uint32_t *instdone);
 const char *i915_cache_level_str(int type);
 
 /* i915_cmd_parser.c */
-void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring);
-bool i915_needs_cmd_parser(struct intel_ring_buffer *ring);
-int i915_parse_cmds(struct intel_ring_buffer *ring,
+void i915_cmd_parser_init_ring(struct intel_engine *ring);
+bool i915_needs_cmd_parser(struct intel_engine *ring);
+int i915_parse_cmds(struct intel_engine *ring,
 		    struct drm_i915_gem_object *batch_obj,
 		    u32 batch_start_offset,
 		    bool is_master);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d734ee3..37df622 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -61,7 +61,7 @@ static unsigned long i915_gem_inactive_scan(struct shrinker *shrinker,
 static unsigned long i915_gem_purge(struct drm_i915_private *dev_priv, long target);
 static unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv);
 static void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
-static void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
+static void i915_gem_retire_requests_ring(struct intel_engine *ring);
 
 static bool cpu_cache_is_coherent(struct drm_device *dev,
 				  enum i915_cache_level level)
@@ -970,7 +970,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * equal.
  */
 static int
-i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
+i915_gem_check_olr(struct intel_engine *ring, u32 seqno)
 {
 	int ret;
 
@@ -989,7 +989,7 @@ static void fake_irq(unsigned long data)
 }
 
 static bool missed_irq(struct drm_i915_private *dev_priv,
-		       struct intel_ring_buffer *ring)
+		       struct intel_engine *ring)
 {
 	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
 }
@@ -1020,7 +1020,7 @@ static bool can_wait_boost(struct drm_i915_file_private *file_priv)
  * Returns 0 if the seqno was found within the alloted time. Else returns the
  * errno with remaining time filled in timeout argument.
  */
-static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
+static int __wait_seqno(struct intel_engine *ring, u32 seqno,
 			unsigned reset_counter,
 			bool interruptible,
 			struct timespec *timeout,
@@ -1127,7 +1127,7 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
  * request and object lists appropriately for that event.
  */
 int
-i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno)
+i915_wait_seqno(struct intel_engine *ring, uint32_t seqno)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1152,7 +1152,7 @@ i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno)
 
 static int
 i915_gem_object_wait_rendering__tail(struct drm_i915_gem_object *obj,
-				     struct intel_ring_buffer *ring)
+				     struct intel_engine *ring)
 {
 	i915_gem_retire_requests_ring(ring);
 
@@ -1177,7 +1177,7 @@ static __must_check int
 i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 			       bool readonly)
 {
-	struct intel_ring_buffer *ring = obj->ring;
+	struct intel_engine *ring = obj->ring;
 	u32 seqno;
 	int ret;
 
@@ -1202,7 +1202,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = obj->ring;
+	struct intel_engine *ring = obj->ring;
 	unsigned reset_counter;
 	u32 seqno;
 	int ret;
@@ -2013,7 +2013,7 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 
 static void
 i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
-			       struct intel_ring_buffer *ring)
+			       struct intel_engine *ring)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -2051,7 +2051,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 }
 
 void i915_vma_move_to_active(struct i915_vma *vma,
-			     struct intel_ring_buffer *ring)
+			     struct intel_engine *ring)
 {
 	list_move_tail(&vma->mm_list, &vma->vm->active_list);
 	return i915_gem_object_move_to_active(vma->obj, ring);
@@ -2093,7 +2093,7 @@ static int
 i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i, j;
 
 	/* Carefully retire all requests without writing to the rings */
@@ -2159,7 +2159,7 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
 	return 0;
 }
 
-int __i915_add_request(struct intel_ring_buffer *ring,
+int __i915_add_request(struct intel_engine *ring,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *obj,
 		       u32 *out_seqno)
@@ -2318,7 +2318,7 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request)
 }
 
 struct drm_i915_gem_request *
-i915_gem_find_active_request(struct intel_ring_buffer *ring)
+i915_gem_find_active_request(struct intel_engine *ring)
 {
 	struct drm_i915_gem_request *request;
 	u32 completed_seqno;
@@ -2336,7 +2336,7 @@ i915_gem_find_active_request(struct intel_ring_buffer *ring)
 }
 
 static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv,
-				       struct intel_ring_buffer *ring)
+				       struct intel_engine *ring)
 {
 	struct drm_i915_gem_request *request;
 	bool ring_hung;
@@ -2355,7 +2355,7 @@ static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv,
 }
 
 static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
-					struct intel_ring_buffer *ring)
+					struct intel_engine *ring)
 {
 	while (!list_empty(&ring->active_list)) {
 		struct drm_i915_gem_object *obj;
@@ -2409,7 +2409,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	/*
@@ -2434,7 +2434,7 @@ void i915_gem_reset(struct drm_device *dev)
  * This function clears the request list as sequence numbers are passed.
  */
 static void
-i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
+i915_gem_retire_requests_ring(struct intel_engine *ring)
 {
 	uint32_t seqno;
 
@@ -2497,7 +2497,7 @@ bool
 i915_gem_retire_requests(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	bool idle = true;
 	int i;
 
@@ -2591,7 +2591,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_gem_wait *args = data;
 	struct drm_i915_gem_object *obj;
-	struct intel_ring_buffer *ring = NULL;
+	struct intel_engine *ring = NULL;
 	struct timespec timeout_stack, *timeout = NULL;
 	unsigned reset_counter;
 	u32 seqno = 0;
@@ -2662,9 +2662,9 @@ out:
  */
 int
 i915_gem_object_sync(struct drm_i915_gem_object *obj,
-		     struct intel_ring_buffer *to)
+		     struct intel_engine *to)
 {
-	struct intel_ring_buffer *from = obj->ring;
+	struct intel_engine *from = obj->ring;
 	u32 seqno;
 	int ret, idx;
 
@@ -2785,7 +2785,7 @@ int i915_vma_unbind(struct i915_vma *vma)
 int i915_gpu_idle(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i;
 
 	/* Flush everything onto the inactive list. */
@@ -3641,7 +3641,7 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
 int
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
-				     struct intel_ring_buffer *pipelined)
+				     struct intel_engine *pipelined)
 {
 	u32 old_read_domains, old_write_domain;
 	int ret;
@@ -3793,7 +3793,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	unsigned long recent_enough = jiffies - msecs_to_jiffies(20);
 	struct drm_i915_gem_request *request;
-	struct intel_ring_buffer *ring = NULL;
+	struct intel_engine *ring = NULL;
 	unsigned reset_counter;
 	u32 seqno = 0;
 	int ret;
@@ -4273,7 +4273,7 @@ err:
 	return ret;
 }
 
-int i915_gem_l3_remap(struct intel_ring_buffer *ring, int slice)
+int i915_gem_l3_remap(struct intel_engine *ring, int slice)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -4497,7 +4497,7 @@ void
 i915_gem_cleanup_ringbuffer(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	for_each_active_ring(ring, dev_priv, i)
@@ -4572,7 +4572,7 @@ i915_gem_lastclose(struct drm_device *dev)
 }
 
 static void
-init_ring_lists(struct intel_ring_buffer *ring)
+init_ring_lists(struct intel_engine *ring)
 {
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cdd44fa..e92b9c5 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -96,7 +96,7 @@
 #define GEN6_CONTEXT_ALIGN (64<<10)
 #define GEN7_CONTEXT_ALIGN 4096
 
-static int do_switch(struct intel_ring_buffer *ring,
+static int do_switch(struct intel_engine *ring,
 		     struct i915_hw_context *to);
 
 static void do_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
@@ -352,7 +352,7 @@ err_destroy:
 void i915_gem_context_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	if (!HAS_HW_CONTEXTS(dev))
@@ -391,7 +391,7 @@ void i915_gem_context_reset(struct drm_device *dev)
 int i915_gem_context_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct i915_hw_context *default_context;
 	int unused;
 
@@ -429,7 +429,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *dctx = dev_priv->ring[RCS].default_context;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int unused;
 
 	if (!HAS_HW_CONTEXTS(dev))
@@ -470,7 +470,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 
 int i915_gem_context_enable(struct drm_i915_private *dev_priv)
 {
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int ret, i;
 
 	if (!HAS_HW_CONTEXTS(dev_priv->dev))
@@ -572,7 +572,7 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 }
 
 static inline int
-mi_set_context(struct intel_ring_buffer *ring,
+mi_set_context(struct intel_engine *ring,
 	       struct i915_hw_context *new_context,
 	       u32 hw_flags)
 {
@@ -622,7 +622,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 	return ret;
 }
 
-static int do_switch(struct intel_ring_buffer *ring,
+static int do_switch(struct intel_engine *ring,
 		     struct i915_hw_context *to)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -753,7 +753,7 @@ unpin_out:
  * it will have a refoucnt > 1. This allows us to destroy the context abstract
  * object while letting the normal object tracking destroy the backing BO.
  */
-int i915_switch_context(struct intel_ring_buffer *ring,
+int i915_switch_context(struct intel_engine *ring,
 			struct drm_file *file,
 			struct i915_hw_context *to)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3851a1b..73f8712 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -538,7 +538,7 @@ need_reloc_mappable(struct i915_vma *vma)
 
 static int
 i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
-				struct intel_ring_buffer *ring,
+				struct intel_engine *ring,
 				bool *need_reloc)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
@@ -593,7 +593,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 }
 
 static int
-i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
+i915_gem_execbuffer_reserve(struct intel_engine *ring,
 			    struct list_head *vmas,
 			    bool *need_relocs)
 {
@@ -708,7 +708,7 @@ static int
 i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 				  struct drm_i915_gem_execbuffer2 *args,
 				  struct drm_file *file,
-				  struct intel_ring_buffer *ring,
+				  struct intel_engine *ring,
 				  struct eb_vmas *eb,
 				  struct drm_i915_gem_exec_object2 *exec)
 {
@@ -824,7 +824,7 @@ err:
 }
 
 static int
-i915_gem_execbuffer_move_to_gpu(struct intel_ring_buffer *ring,
+i915_gem_execbuffer_move_to_gpu(struct intel_engine *ring,
 				struct list_head *vmas)
 {
 	struct i915_vma *vma;
@@ -909,7 +909,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
 
 static struct i915_hw_context *
 i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
-			  struct intel_ring_buffer *ring, const u32 ctx_id)
+			  struct intel_engine *ring, const u32 ctx_id)
 {
 	struct i915_hw_context *ctx = NULL;
 	struct i915_ctx_hang_stats *hs;
@@ -932,7 +932,7 @@ i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
 
 static void
 i915_gem_execbuffer_move_to_active(struct list_head *vmas,
-				   struct intel_ring_buffer *ring)
+				   struct intel_engine *ring)
 {
 	struct i915_vma *vma;
 
@@ -964,7 +964,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 static void
 i915_gem_execbuffer_retire_commands(struct drm_device *dev,
 				    struct drm_file *file,
-				    struct intel_ring_buffer *ring,
+				    struct intel_engine *ring,
 				    struct drm_i915_gem_object *obj)
 {
 	/* Unconditionally force add_request to emit a full flush. */
@@ -976,7 +976,7 @@ i915_gem_execbuffer_retire_commands(struct drm_device *dev,
 
 static int
 i915_reset_gen7_sol_offsets(struct drm_device *dev,
-			    struct intel_ring_buffer *ring)
+			    struct intel_engine *ring)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	int ret, i;
@@ -1009,7 +1009,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct eb_vmas *eb;
 	struct drm_i915_gem_object *batch_obj;
 	struct drm_clip_rect *cliprects = NULL;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	struct i915_hw_context *ctx;
 	struct i915_address_space *vm;
 	const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5ff0b20..5333319 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -187,7 +187,7 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr,
 }
 
 /* Broadwell Page Directory Pointer Descriptors */
-static int gen8_write_pdp(struct intel_ring_buffer *ring, unsigned entry,
+static int gen8_write_pdp(struct intel_engine *ring, unsigned entry,
 			   uint64_t val, bool synchronous)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -217,7 +217,7 @@ static int gen8_write_pdp(struct intel_ring_buffer *ring, unsigned entry,
 }
 
 static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_ring_buffer *ring,
+			  struct intel_engine *ring,
 			  bool synchronous)
 {
 	int i, ret;
@@ -687,7 +687,7 @@ static uint32_t get_pd_offset(struct i915_hw_ppgtt *ppgtt)
 }
 
 static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			 struct intel_ring_buffer *ring,
+			 struct intel_engine *ring,
 			 bool synchronous)
 {
 	struct drm_device *dev = ppgtt->base.dev;
@@ -731,7 +731,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_ring_buffer *ring,
+			  struct intel_engine *ring,
 			  bool synchronous)
 {
 	struct drm_device *dev = ppgtt->base.dev;
@@ -782,7 +782,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static int gen6_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_ring_buffer *ring,
+			  struct intel_engine *ring,
 			  bool synchronous)
 {
 	struct drm_device *dev = ppgtt->base.dev;
@@ -803,7 +803,7 @@ static int gen8_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int j, ret;
 
 	for_each_active_ring(ring, dev_priv, j) {
@@ -833,7 +833,7 @@ static int gen7_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	uint32_t ecochk, ecobits;
 	int i;
 
@@ -872,7 +872,7 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	uint32_t ecochk, gab_ctl, ecobits;
 	int i;
 
@@ -1240,7 +1240,7 @@ static void undo_idling(struct drm_i915_private *dev_priv, bool interruptible)
 void i915_check_and_clear_faults(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	if (INTEL_INFO(dev)->gen < 6)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b5e8ac0..c6b61ea 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -260,7 +260,7 @@ struct i915_hw_ppgtt {
 
 	int (*enable)(struct i915_hw_ppgtt *ppgtt);
 	int (*switch_mm)(struct i915_hw_ppgtt *ppgtt,
-			 struct intel_ring_buffer *ring,
+			 struct intel_engine *ring,
 			 bool synchronous);
 	void (*debug_dump)(struct i915_hw_ppgtt *ppgtt, struct seq_file *m);
 };
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index baf1ca6..83d8db5 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -752,7 +752,7 @@ static void i915_gem_record_fences(struct drm_device *dev,
 }
 
 static void i915_record_ring_state(struct drm_device *dev,
-				   struct intel_ring_buffer *ring,
+				   struct intel_engine *ring,
 				   struct drm_i915_error_ring *ering)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -862,7 +862,7 @@ static void i915_record_ring_state(struct drm_device *dev,
 }
 
 
-static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
+static void i915_gem_record_active_context(struct intel_engine *ring,
 					   struct drm_i915_error_state *error,
 					   struct drm_i915_error_ring *ering)
 {
@@ -892,7 +892,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 	int i, count;
 
 	for (i = 0; i < I915_NUM_RINGS; i++) {
-		struct intel_ring_buffer *ring = &dev_priv->ring[i];
+		struct intel_engine *ring = &dev_priv->ring[i];
 
 		if (ring->dev == NULL)
 			continue;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1fdec5f..d30a30b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1073,7 +1073,7 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 }
 
 static void notify_ring(struct drm_device *dev,
-			struct intel_ring_buffer *ring)
+			struct intel_engine *ring)
 {
 	if (ring->obj == NULL)
 		return;
@@ -2103,7 +2103,7 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 			       bool reset_completed)
 {
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	/*
@@ -2526,14 +2526,14 @@ static void gen8_disable_vblank(struct drm_device *dev, int pipe)
 }
 
 static u32
-ring_last_seqno(struct intel_ring_buffer *ring)
+ring_last_seqno(struct intel_engine *ring)
 {
 	return list_entry(ring->request_list.prev,
 			  struct drm_i915_gem_request, list)->seqno;
 }
 
 static bool
-ring_idle(struct intel_ring_buffer *ring, u32 seqno)
+ring_idle(struct intel_engine *ring, u32 seqno)
 {
 	return (list_empty(&ring->request_list) ||
 		i915_seqno_passed(seqno, ring_last_seqno(ring)));
@@ -2556,11 +2556,11 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
 	}
 }
 
-static struct intel_ring_buffer *
-semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
+static struct intel_engine *
+semaphore_wait_to_signaller_ring(struct intel_engine *ring, u32 ipehr)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct intel_ring_buffer *signaller;
+	struct intel_engine *signaller;
 	int i;
 
 	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
@@ -2589,8 +2589,8 @@ semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
 	return NULL;
 }
 
-static struct intel_ring_buffer *
-semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
+static struct intel_engine *
+semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u32 cmd, ipehr, head;
@@ -2632,10 +2632,10 @@ semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
 	return semaphore_wait_to_signaller_ring(ring, ipehr);
 }
 
-static int semaphore_passed(struct intel_ring_buffer *ring)
+static int semaphore_passed(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct intel_ring_buffer *signaller;
+	struct intel_engine *signaller;
 	u32 seqno, ctl;
 
 	ring->hangcheck.deadlock = true;
@@ -2654,7 +2654,7 @@ static int semaphore_passed(struct intel_ring_buffer *ring)
 
 static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
 {
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 
 	for_each_active_ring(ring, dev_priv, i)
@@ -2662,7 +2662,7 @@ static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
 }
 
 static enum intel_ring_hangcheck_action
-ring_stuck(struct intel_ring_buffer *ring, u32 acthd)
+ring_stuck(struct intel_engine *ring, u32 acthd)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -2718,7 +2718,7 @@ static void i915_hangcheck_elapsed(unsigned long data)
 {
 	struct drm_device *dev = (struct drm_device *)data;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	int i;
 	int busy_count = 0, rings_hung = 0;
 	bool stuck[I915_NUM_RINGS] = { 0 };
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 23c26f1..b11044c 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -251,8 +251,8 @@ TRACE_EVENT(i915_gem_evict_vm,
 );
 
 TRACE_EVENT(i915_gem_ring_sync_to,
-	    TP_PROTO(struct intel_ring_buffer *from,
-		     struct intel_ring_buffer *to,
+	    TP_PROTO(struct intel_engine *from,
+		     struct intel_engine *to,
 		     u32 seqno),
 	    TP_ARGS(from, to, seqno),
 
@@ -277,7 +277,7 @@ TRACE_EVENT(i915_gem_ring_sync_to,
 );
 
 TRACE_EVENT(i915_gem_ring_dispatch,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno, u32 flags),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno, u32 flags),
 	    TP_ARGS(ring, seqno, flags),
 
 	    TP_STRUCT__entry(
@@ -300,7 +300,7 @@ TRACE_EVENT(i915_gem_ring_dispatch,
 );
 
 TRACE_EVENT(i915_gem_ring_flush,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 invalidate, u32 flush),
+	    TP_PROTO(struct intel_engine *ring, u32 invalidate, u32 flush),
 	    TP_ARGS(ring, invalidate, flush),
 
 	    TP_STRUCT__entry(
@@ -323,7 +323,7 @@ TRACE_EVENT(i915_gem_ring_flush,
 );
 
 DECLARE_EVENT_CLASS(i915_gem_request,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno),
 
 	    TP_STRUCT__entry(
@@ -343,12 +343,12 @@ DECLARE_EVENT_CLASS(i915_gem_request,
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno)
 );
 
 TRACE_EVENT(i915_gem_request_complete,
-	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_PROTO(struct intel_engine *ring),
 	    TP_ARGS(ring),
 
 	    TP_STRUCT__entry(
@@ -368,12 +368,12 @@ TRACE_EVENT(i915_gem_request_complete,
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_retire,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno)
 );
 
 TRACE_EVENT(i915_gem_request_wait_begin,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno),
 
 	    TP_STRUCT__entry(
@@ -402,12 +402,12 @@ TRACE_EVENT(i915_gem_request_wait_begin,
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_wait_end,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
+	    TP_PROTO(struct intel_engine *ring, u32 seqno),
 	    TP_ARGS(ring, seqno)
 );
 
 DECLARE_EVENT_CLASS(i915_ring,
-	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_PROTO(struct intel_engine *ring),
 	    TP_ARGS(ring),
 
 	    TP_STRUCT__entry(
@@ -424,12 +424,12 @@ DECLARE_EVENT_CLASS(i915_ring,
 );
 
 DEFINE_EVENT(i915_ring, i915_ring_wait_begin,
-	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_PROTO(struct intel_engine *ring),
 	    TP_ARGS(ring)
 );
 
 DEFINE_EVENT(i915_ring, i915_ring_wait_end,
-	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_PROTO(struct intel_engine *ring),
 	    TP_ARGS(ring)
 );
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 7e4ea8d..30ab378 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1954,7 +1954,7 @@ static int intel_align_height(struct drm_device *dev, int height, bool tiled)
 int
 intel_pin_and_fence_fb_obj(struct drm_device *dev,
 			   struct drm_i915_gem_object *obj,
-			   struct intel_ring_buffer *pipelined)
+			   struct intel_engine *pipelined)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 alignment;
@@ -8452,7 +8452,7 @@ out:
 }
 
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
-			struct intel_ring_buffer *ring)
+			struct intel_engine *ring)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_crtc *crtc;
@@ -8610,7 +8610,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	u32 flip_mask;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
@@ -8655,7 +8655,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	u32 flip_mask;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
@@ -8697,7 +8697,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	uint32_t pf, pipesrc;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
@@ -8745,7 +8745,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	uint32_t pf, pipesrc;
 	int ret;
 
@@ -8790,7 +8790,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	uint32_t plane_bit = 0;
 	int len, ret;
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index fa99104..a693843 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -667,7 +667,7 @@ bool intel_has_pending_fb_unpin(struct drm_device *dev);
 int intel_pch_rawclk(struct drm_device *dev);
 void intel_mark_busy(struct drm_device *dev);
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
-			struct intel_ring_buffer *ring);
+			struct intel_engine *ring);
 void intel_mark_idle(struct drm_device *dev);
 void intel_crtc_restore_mode(struct drm_crtc *crtc);
 void intel_crtc_update_dpms(struct drm_crtc *crtc);
@@ -699,7 +699,7 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
 				    struct intel_load_detect_pipe *old);
 int intel_pin_and_fence_fb_obj(struct drm_device *dev,
 			       struct drm_i915_gem_object *obj,
-			       struct intel_ring_buffer *pipelined);
+			       struct intel_engine *pipelined);
 void intel_unpin_fb_obj(struct drm_i915_gem_object *obj);
 struct drm_framebuffer *
 __intel_framebuffer_create(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 312961a..4e5662f 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -213,7 +213,7 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
 {
 	struct drm_device *dev = overlay->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	BUG_ON(overlay->last_flip_req);
@@ -236,7 +236,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
 {
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	BUG_ON(overlay->active);
@@ -263,7 +263,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
 {
 	struct drm_device *dev = overlay->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	u32 flip_addr = overlay->flip_addr;
 	u32 tmp;
 	int ret;
@@ -320,7 +320,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
 {
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	u32 flip_addr = overlay->flip_addr;
 	int ret;
 
@@ -363,7 +363,7 @@ static int intel_overlay_recover_from_interrupt(struct intel_overlay *overlay)
 {
 	struct drm_device *dev = overlay->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	if (overlay->last_flip_req == 0)
@@ -389,7 +389,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
 {
 	struct drm_device *dev = overlay->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	/* Only wait if there is actually an old frame to release to
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c14a6ac..b2decfa 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3251,7 +3251,7 @@ static void gen6_enable_rps_interrupts(struct drm_device *dev)
 static void gen8_enable_rps(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	uint32_t rc6_mask = 0, rp_state_cap;
 	int unused;
 
@@ -3323,7 +3323,7 @@ static void gen8_enable_rps(struct drm_device *dev)
 static void gen6_enable_rps(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	u32 rp_state_cap;
 	u32 gt_perf_status;
 	u32 rc6vids, pcu_mbox = 0, rc6_mask = 0;
@@ -3597,7 +3597,7 @@ out:
 static void valleyview_enable_rps(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	u32 gtfifodbg, val, rc6_mode = 0;
 	int i;
 
@@ -3752,7 +3752,7 @@ static int ironlake_setup_rc6(struct drm_device *dev)
 static void ironlake_enable_rc6(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	bool was_interruptible;
 	int ret;
 
@@ -4264,7 +4264,7 @@ EXPORT_SYMBOL_GPL(i915_gpu_lower);
 bool i915_gpu_busy(void)
 {
 	struct drm_i915_private *dev_priv;
-	struct intel_ring_buffer *ring;
+	struct intel_engine *ring;
 	bool ret = false;
 	int i;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index f734c9d..9387196 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -33,7 +33,7 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 
-static inline int ring_space(struct intel_ring_buffer *ring)
+static inline int ring_space(struct intel_engine *ring)
 {
 	int space = (ring->head & HEAD_ADDR) - (ring->tail + I915_RING_FREE_SPACE);
 	if (space < 0)
@@ -41,7 +41,7 @@ static inline int ring_space(struct intel_ring_buffer *ring)
 	return space;
 }
 
-void __intel_ring_advance(struct intel_ring_buffer *ring)
+void __intel_ring_advance(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
@@ -52,7 +52,7 @@ void __intel_ring_advance(struct intel_ring_buffer *ring)
 }
 
 static int
-gen2_render_ring_flush(struct intel_ring_buffer *ring,
+gen2_render_ring_flush(struct intel_engine *ring,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -78,7 +78,7 @@ gen2_render_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-gen4_render_ring_flush(struct intel_ring_buffer *ring,
+gen4_render_ring_flush(struct intel_engine *ring,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -173,7 +173,7 @@ gen4_render_ring_flush(struct intel_ring_buffer *ring,
  * really our business.  That leaves only stall at scoreboard.
  */
 static int
-intel_emit_post_sync_nonzero_flush(struct intel_ring_buffer *ring)
+intel_emit_post_sync_nonzero_flush(struct intel_engine *ring)
 {
 	u32 scratch_addr = ring->scratch.gtt_offset + 128;
 	int ret;
@@ -208,7 +208,7 @@ intel_emit_post_sync_nonzero_flush(struct intel_ring_buffer *ring)
 }
 
 static int
-gen6_render_ring_flush(struct intel_ring_buffer *ring,
+gen6_render_ring_flush(struct intel_engine *ring,
                          u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -260,7 +260,7 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
+gen7_render_ring_cs_stall_wa(struct intel_engine *ring)
 {
 	int ret;
 
@@ -278,7 +278,7 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
 	return 0;
 }
 
-static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
+static int gen7_ring_fbc_flush(struct intel_engine *ring, u32 value)
 {
 	int ret;
 
@@ -302,7 +302,7 @@ static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring, u32 value)
 }
 
 static int
-gen7_render_ring_flush(struct intel_ring_buffer *ring,
+gen7_render_ring_flush(struct intel_engine *ring,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -363,7 +363,7 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-gen8_render_ring_flush(struct intel_ring_buffer *ring,
+gen8_render_ring_flush(struct intel_engine *ring,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -403,14 +403,14 @@ gen8_render_ring_flush(struct intel_ring_buffer *ring,
 
 }
 
-static void ring_write_tail(struct intel_ring_buffer *ring,
+static void ring_write_tail(struct intel_engine *ring,
 			    u32 value)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
 	I915_WRITE_TAIL(ring, value);
 }
 
-u32 intel_ring_get_active_head(struct intel_ring_buffer *ring)
+u32 intel_ring_get_active_head(struct intel_engine *ring)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
 	u32 acthd_reg = INTEL_INFO(ring->dev)->gen >= 4 ?
@@ -419,7 +419,7 @@ u32 intel_ring_get_active_head(struct intel_ring_buffer *ring)
 	return I915_READ(acthd_reg);
 }
 
-static void ring_setup_phys_status_page(struct intel_ring_buffer *ring)
+static void ring_setup_phys_status_page(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u32 addr;
@@ -430,7 +430,7 @@ static void ring_setup_phys_status_page(struct intel_ring_buffer *ring)
 	I915_WRITE(HWS_PGA, addr);
 }
 
-static int init_ring_common(struct intel_ring_buffer *ring)
+static int init_ring_common(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -519,7 +519,7 @@ out:
 }
 
 static int
-init_pipe_control(struct intel_ring_buffer *ring)
+init_pipe_control(struct intel_engine *ring)
 {
 	int ret;
 
@@ -560,7 +560,7 @@ err:
 	return ret;
 }
 
-static int init_render_ring(struct intel_ring_buffer *ring)
+static int init_render_ring(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -620,7 +620,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 	return ret;
 }
 
-static void render_ring_cleanup(struct intel_ring_buffer *ring)
+static void render_ring_cleanup(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 
@@ -637,7 +637,7 @@ static void render_ring_cleanup(struct intel_ring_buffer *ring)
 }
 
 static void
-update_mboxes(struct intel_ring_buffer *ring,
+update_mboxes(struct intel_engine *ring,
 	      u32 mmio_offset)
 {
 /* NB: In order to be able to do semaphore MBOX updates for varying number
@@ -662,11 +662,11 @@ update_mboxes(struct intel_ring_buffer *ring,
  * This acts like a signal in the canonical semaphore.
  */
 static int
-gen6_add_request(struct intel_ring_buffer *ring)
+gen6_add_request(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *useless;
+	struct intel_engine *useless;
 	int i, ret, num_dwords = 4;
 
 	if (i915_semaphore_is_enabled(dev))
@@ -709,8 +709,8 @@ static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
  * @seqno - seqno which the waiter will block on
  */
 static int
-gen6_ring_sync(struct intel_ring_buffer *waiter,
-	       struct intel_ring_buffer *signaller,
+gen6_ring_sync(struct intel_engine *waiter,
+	       struct intel_engine *signaller,
 	       u32 seqno)
 {
 	int ret;
@@ -760,7 +760,7 @@ do {									\
 } while (0)
 
 static int
-pc_render_add_request(struct intel_ring_buffer *ring)
+pc_render_add_request(struct intel_engine *ring)
 {
 	u32 scratch_addr = ring->scratch.gtt_offset + 128;
 	int ret;
@@ -808,7 +808,7 @@ pc_render_add_request(struct intel_ring_buffer *ring)
 }
 
 static u32
-gen6_ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
+gen6_ring_get_seqno(struct intel_engine *ring, bool lazy_coherency)
 {
 	/* Workaround to force correct ordering between irq and seqno writes on
 	 * ivb (and maybe also on snb) by reading from a CS register (like
@@ -819,31 +819,31 @@ gen6_ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
 }
 
 static u32
-ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
+ring_get_seqno(struct intel_engine *ring, bool lazy_coherency)
 {
 	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
 static void
-ring_set_seqno(struct intel_ring_buffer *ring, u32 seqno)
+ring_set_seqno(struct intel_engine *ring, u32 seqno)
 {
 	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
 }
 
 static u32
-pc_render_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency)
+pc_render_get_seqno(struct intel_engine *ring, bool lazy_coherency)
 {
 	return ring->scratch.cpu_page[0];
 }
 
 static void
-pc_render_set_seqno(struct intel_ring_buffer *ring, u32 seqno)
+pc_render_set_seqno(struct intel_engine *ring, u32 seqno)
 {
 	ring->scratch.cpu_page[0] = seqno;
 }
 
 static bool
-gen5_ring_get_irq(struct intel_ring_buffer *ring)
+gen5_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -861,7 +861,7 @@ gen5_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-gen5_ring_put_irq(struct intel_ring_buffer *ring)
+gen5_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -874,7 +874,7 @@ gen5_ring_put_irq(struct intel_ring_buffer *ring)
 }
 
 static bool
-i9xx_ring_get_irq(struct intel_ring_buffer *ring)
+i9xx_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -895,7 +895,7 @@ i9xx_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-i9xx_ring_put_irq(struct intel_ring_buffer *ring)
+i9xx_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -911,7 +911,7 @@ i9xx_ring_put_irq(struct intel_ring_buffer *ring)
 }
 
 static bool
-i8xx_ring_get_irq(struct intel_ring_buffer *ring)
+i8xx_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -932,7 +932,7 @@ i8xx_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-i8xx_ring_put_irq(struct intel_ring_buffer *ring)
+i8xx_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -947,7 +947,7 @@ i8xx_ring_put_irq(struct intel_ring_buffer *ring)
 	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
 }
 
-void intel_ring_setup_status_page(struct intel_ring_buffer *ring)
+void intel_ring_setup_status_page(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
@@ -1005,7 +1005,7 @@ void intel_ring_setup_status_page(struct intel_ring_buffer *ring)
 }
 
 static int
-bsd_ring_flush(struct intel_ring_buffer *ring,
+bsd_ring_flush(struct intel_engine *ring,
 	       u32     invalidate_domains,
 	       u32     flush_domains)
 {
@@ -1022,7 +1022,7 @@ bsd_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-i9xx_add_request(struct intel_ring_buffer *ring)
+i9xx_add_request(struct intel_engine *ring)
 {
 	int ret;
 
@@ -1040,7 +1040,7 @@ i9xx_add_request(struct intel_ring_buffer *ring)
 }
 
 static bool
-gen6_ring_get_irq(struct intel_ring_buffer *ring)
+gen6_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -1065,7 +1065,7 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-gen6_ring_put_irq(struct intel_ring_buffer *ring)
+gen6_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -1083,7 +1083,7 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
 }
 
 static bool
-hsw_vebox_get_irq(struct intel_ring_buffer *ring)
+hsw_vebox_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1103,7 +1103,7 @@ hsw_vebox_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-hsw_vebox_put_irq(struct intel_ring_buffer *ring)
+hsw_vebox_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1121,7 +1121,7 @@ hsw_vebox_put_irq(struct intel_ring_buffer *ring)
 }
 
 static bool
-gen8_ring_get_irq(struct intel_ring_buffer *ring)
+gen8_ring_get_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1147,7 +1147,7 @@ gen8_ring_get_irq(struct intel_ring_buffer *ring)
 }
 
 static void
-gen8_ring_put_irq(struct intel_ring_buffer *ring)
+gen8_ring_put_irq(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1167,7 +1167,7 @@ gen8_ring_put_irq(struct intel_ring_buffer *ring)
 }
 
 static int
-i965_dispatch_execbuffer(struct intel_ring_buffer *ring,
+i965_dispatch_execbuffer(struct intel_engine *ring,
 			 u32 offset, u32 length,
 			 unsigned flags)
 {
@@ -1190,7 +1190,7 @@ i965_dispatch_execbuffer(struct intel_ring_buffer *ring,
 /* Just userspace ABI convention to limit the wa batch bo to a resonable size */
 #define I830_BATCH_LIMIT (256*1024)
 static int
-i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
+i830_dispatch_execbuffer(struct intel_engine *ring,
 				u32 offset, u32 len,
 				unsigned flags)
 {
@@ -1241,7 +1241,7 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
 }
 
 static int
-i915_dispatch_execbuffer(struct intel_ring_buffer *ring,
+i915_dispatch_execbuffer(struct intel_engine *ring,
 			 u32 offset, u32 len,
 			 unsigned flags)
 {
@@ -1258,7 +1258,7 @@ i915_dispatch_execbuffer(struct intel_ring_buffer *ring,
 	return 0;
 }
 
-static void cleanup_status_page(struct intel_ring_buffer *ring)
+static void cleanup_status_page(struct intel_engine *ring)
 {
 	struct drm_i915_gem_object *obj;
 
@@ -1272,7 +1272,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
 	ring->status_page.obj = NULL;
 }
 
-static int init_status_page(struct intel_ring_buffer *ring)
+static int init_status_page(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_gem_object *obj;
@@ -1315,7 +1315,7 @@ err:
 	return ret;
 }
 
-static int init_phys_status_page(struct intel_ring_buffer *ring)
+static int init_phys_status_page(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
@@ -1332,14 +1332,14 @@ static int init_phys_status_page(struct intel_ring_buffer *ring)
 	return 0;
 }
 
-static void destroy_ring_buffer(struct intel_ring_buffer *ring)
+static void destroy_ring_buffer(struct intel_engine *ring)
 {
 	i915_gem_object_ggtt_unpin(ring->obj);
 	drm_gem_object_unreference(&ring->obj->base);
 	ring->obj = NULL;
 }
 
-static int alloc_ring_buffer(struct intel_ring_buffer *ring)
+static int alloc_ring_buffer(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_gem_object *obj = NULL;
@@ -1372,7 +1372,7 @@ static int alloc_ring_buffer(struct intel_ring_buffer *ring)
 }
 
 static int intel_init_ring_buffer(struct drm_device *dev,
-				  struct intel_ring_buffer *ring)
+				  struct intel_engine *ring)
 {
 	struct drm_i915_gem_object *obj;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1437,7 +1437,7 @@ err_hws:
 	return ret;
 }
 
-void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
+void intel_cleanup_ring_buffer(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv;
 	int ret;
@@ -1466,7 +1466,7 @@ void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
 	cleanup_status_page(ring);
 }
 
-static int intel_ring_wait_request(struct intel_ring_buffer *ring, int n)
+static int intel_ring_wait_request(struct intel_engine *ring, int n)
 {
 	struct drm_i915_gem_request *request;
 	u32 seqno = 0, tail;
@@ -1519,7 +1519,7 @@ static int intel_ring_wait_request(struct intel_ring_buffer *ring, int n)
 	return 0;
 }
 
-static int ring_wait_for_space(struct intel_ring_buffer *ring, int n)
+static int ring_wait_for_space(struct intel_engine *ring, int n)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1567,7 +1567,7 @@ static int ring_wait_for_space(struct intel_ring_buffer *ring, int n)
 	return -EBUSY;
 }
 
-static int intel_wrap_ring_buffer(struct intel_ring_buffer *ring)
+static int intel_wrap_ring_buffer(struct intel_engine *ring)
 {
 	uint32_t __iomem *virt;
 	int rem = ring->size - ring->tail;
@@ -1589,7 +1589,7 @@ static int intel_wrap_ring_buffer(struct intel_ring_buffer *ring)
 	return 0;
 }
 
-int intel_ring_idle(struct intel_ring_buffer *ring)
+int intel_ring_idle(struct intel_engine *ring)
 {
 	u32 seqno;
 	int ret;
@@ -1613,7 +1613,7 @@ int intel_ring_idle(struct intel_ring_buffer *ring)
 }
 
 static int
-intel_ring_alloc_seqno(struct intel_ring_buffer *ring)
+intel_ring_alloc_seqno(struct intel_engine *ring)
 {
 	if (ring->outstanding_lazy_seqno)
 		return 0;
@@ -1631,7 +1631,7 @@ intel_ring_alloc_seqno(struct intel_ring_buffer *ring)
 	return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno);
 }
 
-static int __intel_ring_prepare(struct intel_ring_buffer *ring,
+static int __intel_ring_prepare(struct intel_engine *ring,
 				int bytes)
 {
 	int ret;
@@ -1651,7 +1651,7 @@ static int __intel_ring_prepare(struct intel_ring_buffer *ring,
 	return 0;
 }
 
-int intel_ring_begin(struct intel_ring_buffer *ring,
+int intel_ring_begin(struct intel_engine *ring,
 		     int num_dwords)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
@@ -1676,7 +1676,7 @@ int intel_ring_begin(struct intel_ring_buffer *ring,
 }
 
 /* Align the ring tail to a cacheline boundary */
-int intel_ring_cacheline_align(struct intel_ring_buffer *ring)
+int intel_ring_cacheline_align(struct intel_engine *ring)
 {
 	int num_dwords = (64 - (ring->tail & 63)) / sizeof(uint32_t);
 	int ret;
@@ -1696,7 +1696,7 @@ int intel_ring_cacheline_align(struct intel_ring_buffer *ring)
 	return 0;
 }
 
-void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno)
+void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
@@ -1713,7 +1713,7 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno)
 	ring->hangcheck.seqno = seqno;
 }
 
-static void gen6_bsd_ring_write_tail(struct intel_ring_buffer *ring,
+static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
 				     u32 value)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
@@ -1746,7 +1746,7 @@ static void gen6_bsd_ring_write_tail(struct intel_ring_buffer *ring,
 		   _MASKED_BIT_DISABLE(GEN6_BSD_SLEEP_MSG_DISABLE));
 }
 
-static int gen6_bsd_ring_flush(struct intel_ring_buffer *ring,
+static int gen6_bsd_ring_flush(struct intel_engine *ring,
 			       u32 invalidate, u32 flush)
 {
 	uint32_t cmd;
@@ -1782,7 +1782,7 @@ static int gen6_bsd_ring_flush(struct intel_ring_buffer *ring,
 }
 
 static int
-gen8_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
+gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 			      u32 offset, u32 len,
 			      unsigned flags)
 {
@@ -1806,7 +1806,7 @@ gen8_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 }
 
 static int
-hsw_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
+hsw_ring_dispatch_execbuffer(struct intel_engine *ring,
 			      u32 offset, u32 len,
 			      unsigned flags)
 {
@@ -1827,7 +1827,7 @@ hsw_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 }
 
 static int
-gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
+gen6_ring_dispatch_execbuffer(struct intel_engine *ring,
 			      u32 offset, u32 len,
 			      unsigned flags)
 {
@@ -1849,7 +1849,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
 
 /* Blitter support (SandyBridge+) */
 
-static int gen6_ring_flush(struct intel_ring_buffer *ring,
+static int gen6_ring_flush(struct intel_engine *ring,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
@@ -1892,7 +1892,7 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
 int intel_init_render_ring_buffer(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
@@ -1989,7 +1989,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct intel_engine *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	if (INTEL_INFO(dev)->gen >= 6) {
@@ -2053,7 +2053,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 int intel_init_bsd_ring_buffer(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[VCS];
+	struct intel_engine *ring = &dev_priv->ring[VCS];
 
 	ring->write_tail = ring_write_tail;
 	if (INTEL_INFO(dev)->gen >= 6) {
@@ -2111,7 +2111,7 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 int intel_init_blt_ring_buffer(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[BCS];
+	struct intel_engine *ring = &dev_priv->ring[BCS];
 
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
@@ -2147,7 +2147,7 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 int intel_init_vebox_ring_buffer(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[VECS];
+	struct intel_engine *ring = &dev_priv->ring[VECS];
 
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
@@ -2182,7 +2182,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 }
 
 int
-intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
+intel_ring_flush_all_caches(struct intel_engine *ring)
 {
 	int ret;
 
@@ -2200,7 +2200,7 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
 }
 
 int
-intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
+intel_ring_invalidate_all_caches(struct intel_engine *ring)
 {
 	uint32_t flush_domains;
 	int ret;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 135bdc1..a7c40a8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -53,7 +53,9 @@ struct intel_ring_hangcheck {
 	enum intel_ring_hangcheck_action action;
 };
 
-struct  intel_ring_buffer {
+struct i915_hw_context;
+
+struct intel_engine {
 	const char	*name;
 	enum intel_ring_id {
 		RCS = 0x0,
@@ -88,35 +90,35 @@ struct  intel_ring_buffer {
 	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
 	u32		trace_irq_seqno;
 	u32		sync_seqno[I915_NUM_RINGS-1];
-	bool __must_check (*irq_get)(struct intel_ring_buffer *ring);
-	void		(*irq_put)(struct intel_ring_buffer *ring);
+	bool __must_check (*irq_get)(struct intel_engine *ring);
+	void		(*irq_put)(struct intel_engine *ring);
 
-	int		(*init)(struct intel_ring_buffer *ring);
+	int		(*init)(struct intel_engine *ring);
 
-	void		(*write_tail)(struct intel_ring_buffer *ring,
+	void		(*write_tail)(struct intel_engine *ring,
 				      u32 value);
-	int __must_check (*flush)(struct intel_ring_buffer *ring,
+	int __must_check (*flush)(struct intel_engine *ring,
 				  u32	invalidate_domains,
 				  u32	flush_domains);
-	int		(*add_request)(struct intel_ring_buffer *ring);
+	int		(*add_request)(struct intel_engine *ring);
 	/* Some chipsets are not quite as coherent as advertised and need
 	 * an expensive kick to force a true read of the up-to-date seqno.
 	 * However, the up-to-date seqno is not always required and the last
 	 * seen value is good enough. Note that the seqno will always be
 	 * monotonic, even if not coherent.
 	 */
-	u32		(*get_seqno)(struct intel_ring_buffer *ring,
+	u32		(*get_seqno)(struct intel_engine *ring,
 				     bool lazy_coherency);
-	void		(*set_seqno)(struct intel_ring_buffer *ring,
+	void		(*set_seqno)(struct intel_engine *ring,
 				     u32 seqno);
-	int		(*dispatch_execbuffer)(struct intel_ring_buffer *ring,
+	int		(*dispatch_execbuffer)(struct intel_engine *ring,
 					       u32 offset, u32 length,
 					       unsigned flags);
 #define I915_DISPATCH_SECURE 0x1
 #define I915_DISPATCH_PINNED 0x2
-	void		(*cleanup)(struct intel_ring_buffer *ring);
-	int		(*sync_to)(struct intel_ring_buffer *ring,
-				   struct intel_ring_buffer *to,
+	void		(*cleanup)(struct intel_engine *ring);
+	int		(*sync_to)(struct intel_engine *ring,
+				   struct intel_engine *to,
 				   u32 seqno);
 
 	/* our mbox written by others */
@@ -201,20 +203,20 @@ struct  intel_ring_buffer {
 };
 
 static inline bool
-intel_ring_initialized(struct intel_ring_buffer *ring)
+intel_ring_initialized(struct intel_engine *ring)
 {
 	return ring->obj != NULL;
 }
 
 static inline unsigned
-intel_ring_flag(struct intel_ring_buffer *ring)
+intel_ring_flag(struct intel_engine *ring)
 {
 	return 1 << ring->id;
 }
 
 static inline u32
-intel_ring_sync_index(struct intel_ring_buffer *ring,
-		      struct intel_ring_buffer *other)
+intel_ring_sync_index(struct intel_engine *ring,
+		      struct intel_engine *other)
 {
 	int idx;
 
@@ -232,7 +234,7 @@ intel_ring_sync_index(struct intel_ring_buffer *ring,
 }
 
 static inline u32
-intel_read_status_page(struct intel_ring_buffer *ring,
+intel_read_status_page(struct intel_engine *ring,
 		       int reg)
 {
 	/* Ensure that the compiler doesn't optimize away the load. */
@@ -241,7 +243,7 @@ intel_read_status_page(struct intel_ring_buffer *ring,
 }
 
 static inline void
-intel_write_status_page(struct intel_ring_buffer *ring,
+intel_write_status_page(struct intel_engine *ring,
 			int reg, u32 value)
 {
 	ring->status_page.page_addr[reg] = value;
@@ -266,26 +268,26 @@ intel_write_status_page(struct intel_ring_buffer *ring,
 #define I915_GEM_HWS_SCRATCH_INDEX	0x30
 #define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
 
-void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring);
+void intel_cleanup_ring_buffer(struct intel_engine *ring);
 
-int __must_check intel_ring_begin(struct intel_ring_buffer *ring, int n);
-int __must_check intel_ring_cacheline_align(struct intel_ring_buffer *ring);
-static inline void intel_ring_emit(struct intel_ring_buffer *ring,
+int __must_check intel_ring_begin(struct intel_engine *ring, int n);
+int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
+static inline void intel_ring_emit(struct intel_engine *ring,
 				   u32 data)
 {
 	iowrite32(data, ring->virtual_start + ring->tail);
 	ring->tail += 4;
 }
-static inline void intel_ring_advance(struct intel_ring_buffer *ring)
+static inline void intel_ring_advance(struct intel_engine *ring)
 {
 	ring->tail &= ring->size - 1;
 }
-void __intel_ring_advance(struct intel_ring_buffer *ring);
+void __intel_ring_advance(struct intel_engine *ring);
 
-int __must_check intel_ring_idle(struct intel_ring_buffer *ring);
-void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno);
-int intel_ring_flush_all_caches(struct intel_ring_buffer *ring);
-int intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring);
+int __must_check intel_ring_idle(struct intel_engine *ring);
+void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
+int intel_ring_flush_all_caches(struct intel_engine *ring);
+int intel_ring_invalidate_all_caches(struct intel_engine *ring);
 
 void intel_init_rings_early(struct drm_device *dev);
 int intel_init_render_ring_buffer(struct drm_device *dev);
@@ -293,21 +295,21 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev);
 int intel_init_blt_ring_buffer(struct drm_device *dev);
 int intel_init_vebox_ring_buffer(struct drm_device *dev);
 
-u32 intel_ring_get_active_head(struct intel_ring_buffer *ring);
-void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
+u32 intel_ring_get_active_head(struct intel_engine *ring);
+void intel_ring_setup_status_page(struct intel_engine *ring);
 
-static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
+static inline u32 intel_ring_get_tail(struct intel_engine *ring)
 {
 	return ring->tail;
 }
 
-static inline u32 intel_ring_get_seqno(struct intel_ring_buffer *ring)
+static inline u32 intel_ring_get_seqno(struct intel_engine *ring)
 {
 	BUG_ON(ring->outstanding_lazy_seqno == 0);
 	return ring->outstanding_lazy_seqno;
 }
 
-static inline void i915_trace_irq_get(struct intel_ring_buffer *ring, u32 seqno)
+static inline void i915_trace_irq_get(struct intel_engine *ring, u32 seqno)
 {
 	if (ring->trace_irq_seqno == 0 && ring->irq_get(ring))
 		ring->trace_irq_seqno = seqno;
-- 
1.9.0

* [PATCH 11/49] drm/i915: Split the ringbuffers and the rings
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (9 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 10/49] drm/i915: s/intel_ring_buffer/intel_engine oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 12/49] drm/i915: Rename functions that mention ringbuffers (meaning rings) oscar.mateo
                   ` (38 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Following the logic behind the previous patch, the ringbuffers and the rings
belong in different structs. We keep the relationship between the two via the
default_ringbuf living inside each ring/engine.

This commit should not introduce any functional changes (unless I made an
error, that is).
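
In short, the layout after this patch looks like this (condensed from the
diff below; only representative fields shown):

	struct intel_ringbuffer {
		struct drm_i915_gem_object *obj;	/* backing GEM object */
		void __iomem *virtual_start;		/* CPU mapping of the buffer */
		u32 head;
		u32 tail;
		int space;
		int size;
		int effective_size;
		u32 last_retired_head;
	};

	struct intel_engine {
		const char *name;
		/* ... vfuncs, status page, request lists, etc. ... */
		struct intel_ringbuffer default_ringbuf;
	};

	/* Temporary accessor: every engine uses its default ringbuffer
	 * until per-context ringbuffers arrive later in the series. */
	static inline struct intel_ringbuffer *__get_ringbuf(struct intel_engine *ring)
	{
		return &ring->default_ringbuf;
	}

Code that used to touch ring->head or ring->tail directly now goes through
the accessor, e.g. __get_ringbuf(ring)->tail.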

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  25 +++---
 drivers/gpu/drm/i915/i915_gem.c         |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c   |   6 +-
 drivers/gpu/drm/i915/i915_irq.c         |   9 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 136 ++++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  61 ++++++++------
 6 files changed, 136 insertions(+), 103 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 43c5df0..288e1c9 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -47,6 +47,8 @@
 
 #define LP_RING(d) (&((struct drm_i915_private *)(d))->ring[RCS])
 
+#define LP_RINGBUF(d) (&((struct drm_i915_private *)(d))->ring[RCS].default_ringbuf)
+
 #define BEGIN_LP_RING(n) \
 	intel_ring_begin(LP_RING(dev_priv), (n))
 
@@ -63,7 +65,7 @@
  * has access to the ring.
  */
 #define RING_LOCK_TEST_WITH_RETURN(dev, file) do {			\
-	if (LP_RING(dev->dev_private)->obj == NULL)			\
+	if (LP_RINGBUF(dev->dev_private)->obj == NULL)			\
 		LOCK_TEST_WITH_RETURN(dev, file);			\
 } while (0)
 
@@ -140,6 +142,7 @@ void i915_kernel_lost_context(struct drm_device * dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_master_private *master_priv;
 	struct intel_engine *ring = LP_RING(dev_priv);
+	struct intel_ringbuffer *ringbuf = LP_RINGBUF(dev_priv);
 
 	/*
 	 * We should never lose context on the ring with modesetting
@@ -148,17 +151,17 @@ void i915_kernel_lost_context(struct drm_device * dev)
 	if (drm_core_check_feature(dev, DRIVER_MODESET))
 		return;
 
-	ring->head = I915_READ_HEAD(ring) & HEAD_ADDR;
-	ring->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
-	ring->space = ring->head - (ring->tail + I915_RING_FREE_SPACE);
-	if (ring->space < 0)
-		ring->space += ring->size;
+	ringbuf->head = I915_READ_HEAD(ring) & HEAD_ADDR;
+	ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
+	ringbuf->space = ringbuf->head - (ringbuf->tail + I915_RING_FREE_SPACE);
+	if (ringbuf->space < 0)
+		ringbuf->space += ringbuf->size;
 
 	if (!dev->primary->master)
 		return;
 
 	master_priv = dev->primary->master->driver_priv;
-	if (ring->head == ring->tail && master_priv->sarea_priv)
+	if (ringbuf->head == ringbuf->tail && master_priv->sarea_priv)
 		master_priv->sarea_priv->perf_boxes |= I915_BOX_RING_EMPTY;
 }
 
@@ -201,7 +204,7 @@ static int i915_initialize(struct drm_device * dev, drm_i915_init_t * init)
 	}
 
 	if (init->ring_size != 0) {
-		if (LP_RING(dev_priv)->obj != NULL) {
+		if (LP_RINGBUF(dev_priv)->obj != NULL) {
 			i915_dma_cleanup(dev);
 			DRM_ERROR("Client tried to initialize ringbuffer in "
 				  "GEM mode\n");
@@ -238,7 +241,7 @@ static int i915_dma_resume(struct drm_device * dev)
 
 	DRM_DEBUG_DRIVER("%s\n", __func__);
 
-	if (ring->virtual_start == NULL) {
+	if (__get_ringbuf(ring)->virtual_start == NULL) {
 		DRM_ERROR("can not ioremap virtual address for"
 			  " ring buffer\n");
 		return -ENOMEM;
@@ -360,7 +363,7 @@ static int i915_emit_cmds(struct drm_device * dev, int *buffer, int dwords)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	int i, ret;
 
-	if ((dwords+1) * sizeof(int) >= LP_RING(dev_priv)->size - 8)
+	if ((dwords+1) * sizeof(int) >= LP_RINGBUF(dev_priv)->size - 8)
 		return -EINVAL;
 
 	for (i = 0; i < dwords;) {
@@ -823,7 +826,7 @@ static int i915_irq_emit(struct drm_device *dev, void *data,
 	if (drm_core_check_feature(dev, DRIVER_MODESET))
 		return -ENODEV;
 
-	if (!dev_priv || !LP_RING(dev_priv)->virtual_start) {
+	if (!dev_priv || !LP_RINGBUF(dev_priv)->virtual_start) {
 		DRM_ERROR("called with no initialization\n");
 		return -EINVAL;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 37df622..26b89e9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2479,7 +2479,7 @@ i915_gem_retire_requests_ring(struct intel_engine *ring)
 		 * of tail of the request to update the last known position
 		 * of the GPU head.
 		 */
-		ring->last_retired_head = request->tail;
+		__get_ringbuf(ring)->last_retired_head = request->tail;
 
 		i915_gem_free_request(request);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 83d8db5..67a1fc7 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -828,8 +828,8 @@ static void i915_record_ring_state(struct drm_device *dev,
 		ering->hws = I915_READ(mmio);
 	}
 
-	ering->cpu_ring_head = ring->head;
-	ering->cpu_ring_tail = ring->tail;
+	ering->cpu_ring_head = __get_ringbuf(ring)->head;
+	ering->cpu_ring_tail = __get_ringbuf(ring)->tail;
 
 	ering->hangcheck_score = ring->hangcheck.score;
 	ering->hangcheck_action = ring->hangcheck.action;
@@ -936,7 +936,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 		}
 
 		error->ring[i].ringbuffer =
-			i915_error_ggtt_object_create(dev_priv, ring->obj);
+			i915_error_ggtt_object_create(dev_priv, __get_ringbuf(ring)->obj);
 
 		if (ring->status_page.obj)
 			error->ring[i].hws_page =
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index d30a30b..340cf34 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1075,7 +1075,7 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 static void notify_ring(struct drm_device *dev,
 			struct intel_engine *ring)
 {
-	if (ring->obj == NULL)
+	if (!intel_ring_initialized(ring))
 		return;
 
 	trace_i915_gem_request_complete(ring);
@@ -2593,6 +2593,7 @@ static struct intel_engine *
 semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	u32 cmd, ipehr, head;
 	int i;
 
@@ -2615,10 +2616,10 @@ semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 		 * our ring is smaller than what the hardware (and hence
 		 * HEAD_ADDR) allows. Also handles wrap-around.
 		 */
-		head &= ring->size - 1;
+		head &= ringbuf->size - 1;
 
 		/* This here seems to blow up */
-		cmd = ioread32(ring->virtual_start + head);
+		cmd = ioread32(ringbuf->virtual_start + head);
 		if (cmd == ipehr)
 			break;
 
@@ -2628,7 +2629,7 @@ semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 	if (!i)
 		return NULL;
 
-	*seqno = ioread32(ring->virtual_start + head + 4) + 1;
+	*seqno = ioread32(ringbuf->virtual_start + head + 4) + 1;
 	return semaphore_wait_to_signaller_ring(ring, ipehr);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 9387196..0da4289 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -35,20 +35,23 @@
 
 static inline int ring_space(struct intel_engine *ring)
 {
-	int space = (ring->head & HEAD_ADDR) - (ring->tail + I915_RING_FREE_SPACE);
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	int space = (ringbuf->head & HEAD_ADDR) - (ringbuf->tail + I915_RING_FREE_SPACE);
 	if (space < 0)
-		space += ring->size;
+		space += ringbuf->size;
 	return space;
 }
 
 void __intel_ring_advance(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 
-	ring->tail &= ring->size - 1;
+	ringbuf->tail &= ringbuf->size - 1;
 	if (dev_priv->gpu_error.stop_rings & intel_ring_flag(ring))
 		return;
-	ring->write_tail(ring, ring->tail);
+	ring->write_tail(ring, ringbuf->tail);
 }
 
 static int
@@ -434,7 +437,8 @@ static int init_ring_common(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *obj = ring->obj;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct drm_i915_gem_object *obj = ringbuf->obj;
 	int ret = 0;
 	u32 head;
 
@@ -483,7 +487,7 @@ static int init_ring_common(struct intel_engine *ring)
 	 * register values. */
 	I915_WRITE_START(ring, i915_gem_obj_ggtt_offset(obj));
 	I915_WRITE_CTL(ring,
-			((ring->size - PAGE_SIZE) & RING_NR_PAGES)
+			((ringbuf->size - PAGE_SIZE) & RING_NR_PAGES)
 			| RING_VALID);
 
 	/* If the head is still not zero, the ring is dead */
@@ -504,10 +508,10 @@ static int init_ring_common(struct intel_engine *ring)
 	if (!drm_core_check_feature(ring->dev, DRIVER_MODESET))
 		i915_kernel_lost_context(ring->dev);
 	else {
-		ring->head = I915_READ_HEAD(ring);
-		ring->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
-		ring->space = ring_space(ring);
-		ring->last_retired_head = -1;
+		ringbuf->head = I915_READ_HEAD(ring);
+		ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
+		ringbuf->space = ring_space(ring);
+		ringbuf->last_retired_head = -1;
 	}
 
 	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
@@ -1334,21 +1338,24 @@ static int init_phys_status_page(struct intel_engine *ring)
 
 static void destroy_ring_buffer(struct intel_engine *ring)
 {
-	i915_gem_object_ggtt_unpin(ring->obj);
-	drm_gem_object_unreference(&ring->obj->base);
-	ring->obj = NULL;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	i915_gem_object_ggtt_unpin(ringbuf->obj);
+	drm_gem_object_unreference(&ringbuf->obj->base);
+	ringbuf->obj = NULL;
 }
 
 static int alloc_ring_buffer(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_gem_object *obj = NULL;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	if (!HAS_LLC(dev))
-		obj = i915_gem_object_create_stolen(dev, ring->size);
+		obj = i915_gem_object_create_stolen(dev, ringbuf->size);
 	if (obj == NULL)
-		obj = i915_gem_alloc_object(dev, ring->size);
+		obj = i915_gem_alloc_object(dev, ringbuf->size);
 	if (obj == NULL) {
 		DRM_ERROR("Failed to allocate ringbuffer\n");
 		return -ENOMEM;
@@ -1366,7 +1373,7 @@ static int alloc_ring_buffer(struct intel_engine *ring)
 		return ret;
 	}
 
-	ring->obj = obj;
+	ringbuf->obj = obj;
 
 	return 0;
 }
@@ -1376,12 +1383,13 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 {
 	struct drm_i915_gem_object *obj;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	ring->dev = dev;
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
-	ring->size = 32 * PAGE_SIZE;
+	ringbuf->size = 32 * PAGE_SIZE;
 	memset(ring->sync_seqno, 0, sizeof(ring->sync_seqno));
 
 	init_waitqueue_head(&ring->irq_queue);
@@ -1401,12 +1409,12 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	if (ret)
 		goto err_hws;
 
-	obj = ring->obj;
+	obj = ringbuf->obj;
 
-	ring->virtual_start =
+	ringbuf->virtual_start =
 		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj),
-			   ring->size);
-	if (ring->virtual_start == NULL) {
+				ringbuf->size);
+	if (ringbuf->virtual_start == NULL) {
 		DRM_ERROR("Failed to map ringbuffer.\n");
 		ret = -EINVAL;
 		goto destroy_ring;
@@ -1420,16 +1428,16 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	 * the TAIL pointer points to within the last 2 cachelines
 	 * of the buffer.
 	 */
-	ring->effective_size = ring->size;
+	ringbuf->effective_size = ringbuf->size;
 	if (IS_I830(ring->dev) || IS_845G(ring->dev))
-		ring->effective_size -= 128;
+		ringbuf->effective_size -= 128;
 
 	i915_cmd_parser_init_ring(ring);
 
 	return 0;
 
 err_unmap:
-	iounmap(ring->virtual_start);
+	iounmap(ringbuf->virtual_start);
 destroy_ring:
 	destroy_ring_buffer(ring);
 err_hws:
@@ -1440,9 +1448,10 @@ err_hws:
 void intel_cleanup_ring_buffer(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
-	if (ring->obj == NULL)
+	if (ringbuf->obj == NULL)
 		return;
 
 	/* Disable the ring buffer. The ring must be idle at this point */
@@ -1454,7 +1463,7 @@ void intel_cleanup_ring_buffer(struct intel_engine *ring)
 
 	I915_WRITE_CTL(ring, 0);
 
-	iounmap(ring->virtual_start);
+	iounmap(ringbuf->virtual_start);
 
 	destroy_ring_buffer(ring);
 	ring->preallocated_lazy_request = NULL;
@@ -1469,15 +1478,16 @@ void intel_cleanup_ring_buffer(struct intel_engine *ring)
 static int intel_ring_wait_request(struct intel_engine *ring, int n)
 {
 	struct drm_i915_gem_request *request;
+	struct intel_ringbuffer *ring_buf = __get_ringbuf(ring);
 	u32 seqno = 0, tail;
 	int ret;
 
-	if (ring->last_retired_head != -1) {
-		ring->head = ring->last_retired_head;
-		ring->last_retired_head = -1;
+	if (ring_buf->last_retired_head != -1) {
+		ring_buf->head = ring_buf->last_retired_head;
+		ring_buf->last_retired_head = -1;
 
-		ring->space = ring_space(ring);
-		if (ring->space >= n)
+		ring_buf->space = ring_space(ring);
+		if (ring_buf->space >= n)
 			return 0;
 	}
 
@@ -1487,9 +1497,9 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 		if (request->tail == -1)
 			continue;
 
-		space = request->tail - (ring->tail + I915_RING_FREE_SPACE);
+		space = request->tail - (ring_buf->tail + I915_RING_FREE_SPACE);
 		if (space < 0)
-			space += ring->size;
+			space += ring_buf->size;
 		if (space >= n) {
 			seqno = request->seqno;
 			tail = request->tail;
@@ -1511,9 +1521,9 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 	if (ret)
 		return ret;
 
-	ring->head = tail;
-	ring->space = ring_space(ring);
-	if (WARN_ON(ring->space < n))
+	ring_buf->head = tail;
+	ring_buf->space = ring_space(ring);
+	if (WARN_ON(ring_buf->space < n))
 		return -ENOSPC;
 
 	return 0;
@@ -1523,6 +1533,7 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	unsigned long end;
 	int ret;
 
@@ -1542,9 +1553,9 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 	end = jiffies + 60 * HZ;
 
 	do {
-		ring->head = I915_READ_HEAD(ring);
-		ring->space = ring_space(ring);
-		if (ring->space >= n) {
+		ringbuf->head = I915_READ_HEAD(ring);
+		ringbuf->space = ring_space(ring);
+		if (ringbuf->space >= n) {
 			trace_i915_ring_wait_end(ring);
 			return 0;
 		}
@@ -1570,21 +1581,22 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 static int intel_wrap_ring_buffer(struct intel_engine *ring)
 {
 	uint32_t __iomem *virt;
-	int rem = ring->size - ring->tail;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	int rem = ringbuf->size - ringbuf->tail;
 
-	if (ring->space < rem) {
+	if (ringbuf->space < rem) {
 		int ret = ring_wait_for_space(ring, rem);
 		if (ret)
 			return ret;
 	}
 
-	virt = ring->virtual_start + ring->tail;
+	virt = ringbuf->virtual_start + ringbuf->tail;
 	rem /= 4;
 	while (rem--)
 		iowrite32(MI_NOOP, virt++);
 
-	ring->tail = 0;
-	ring->space = ring_space(ring);
+	ringbuf->tail = 0;
+	ringbuf->space = ring_space(ring);
 
 	return 0;
 }
@@ -1634,15 +1646,16 @@ intel_ring_alloc_seqno(struct intel_engine *ring)
 static int __intel_ring_prepare(struct intel_engine *ring,
 				int bytes)
 {
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
-	if (unlikely(ring->tail + bytes > ring->effective_size)) {
+	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
 		ret = intel_wrap_ring_buffer(ring);
 		if (unlikely(ret))
 			return ret;
 	}
 
-	if (unlikely(ring->space < bytes)) {
+	if (unlikely(ringbuf->space < bytes)) {
 		ret = ring_wait_for_space(ring, bytes);
 		if (unlikely(ret))
 			return ret;
@@ -1671,14 +1684,14 @@ int intel_ring_begin(struct intel_engine *ring,
 	if (ret)
 		return ret;
 
-	ring->space -= num_dwords * sizeof(uint32_t);
+	__get_ringbuf(ring)->space -= num_dwords * sizeof(uint32_t);
 	return 0;
 }
 
 /* Align the ring tail to a cacheline boundary */
 int intel_ring_cacheline_align(struct intel_engine *ring)
 {
-	int num_dwords = (64 - (ring->tail & 63)) / sizeof(uint32_t);
+	int num_dwords = (64 - (__get_ringbuf(ring)->tail & 63)) / sizeof(uint32_t);
 	int ret;
 
 	if (num_dwords == 0)
@@ -1990,6 +2003,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[RCS];
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	if (INTEL_INFO(dev)->gen >= 6) {
@@ -2029,13 +2043,13 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 
-	ring->size = size;
-	ring->effective_size = ring->size;
+	ringbuf->size = size;
+	ringbuf->effective_size = ringbuf->size;
 	if (IS_I830(ring->dev) || IS_845G(ring->dev))
-		ring->effective_size -= 128;
+		ringbuf->effective_size -= 128;
 
-	ring->virtual_start = ioremap_wc(start, size);
-	if (ring->virtual_start == NULL) {
+	ringbuf->virtual_start = ioremap_wc(start, size);
+	if (ringbuf->virtual_start == NULL) {
 		DRM_ERROR("can not ioremap virtual address for"
 			  " ring buffer\n");
 		return -ENOMEM;
@@ -2227,15 +2241,15 @@ void intel_init_rings_early(struct drm_device *dev)
 	dev_priv->ring[RCS].id = RCS;
 	dev_priv->ring[RCS].mmio_base = RENDER_RING_BASE;
 	dev_priv->ring[RCS].dev = dev;
-	dev_priv->ring[RCS].head = 0;
-	dev_priv->ring[RCS].tail = 0;
+	dev_priv->ring[RCS].default_ringbuf.head = 0;
+	dev_priv->ring[RCS].default_ringbuf.tail = 0;
 
 	dev_priv->ring[BCS].name = "blitter ring";
 	dev_priv->ring[BCS].id = BCS;
 	dev_priv->ring[BCS].mmio_base = BLT_RING_BASE;
 	dev_priv->ring[BCS].dev = dev;
-	dev_priv->ring[BCS].head = 0;
-	dev_priv->ring[BCS].tail = 0;
+	dev_priv->ring[BCS].default_ringbuf.head = 0;
+	dev_priv->ring[BCS].default_ringbuf.tail = 0;
 
 	dev_priv->ring[VCS].name = "bsd ring";
 	dev_priv->ring[VCS].id = VCS;
@@ -2244,13 +2258,13 @@ void intel_init_rings_early(struct drm_device *dev)
 	else
 		dev_priv->ring[VCS].mmio_base = BSD_RING_BASE;
 	dev_priv->ring[VCS].dev = dev;
-	dev_priv->ring[VCS].head = 0;
-	dev_priv->ring[VCS].tail = 0;
+	dev_priv->ring[VCS].default_ringbuf.head = 0;
+	dev_priv->ring[VCS].default_ringbuf.tail = 0;
 
 	dev_priv->ring[VECS].name = "video enhancement ring";
 	dev_priv->ring[VECS].id = VECS;
 	dev_priv->ring[VECS].mmio_base = VEBOX_RING_BASE;
 	dev_priv->ring[VECS].dev = dev;
-	dev_priv->ring[VECS].head = 0;
-	dev_priv->ring[VECS].tail = 0;
+	dev_priv->ring[VECS].default_ringbuf.head = 0;
+	dev_priv->ring[VECS].default_ringbuf.tail = 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index a7c40a8..2281228 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -55,6 +55,27 @@ struct intel_ring_hangcheck {
 
 struct i915_hw_context;
 
+struct intel_ringbuffer {
+	struct drm_i915_gem_object *obj;
+	void __iomem *virtual_start;
+
+	u32 head;
+	u32 tail;
+	int space;
+	int size;
+	int effective_size;
+
+	/** We track the position of the requests in the ring buffer, and
+	 * when each is retired we increment last_retired_head as the GPU
+	 * must have finished processing the request and so we know we
+	 * can advance the ringbuffer up to that position.
+	 *
+	 * last_retired_head is set to -1 after the value is consumed so
+	 * we can detect new retirements.
+	 */
+	u32 last_retired_head;
+};
+
 struct intel_engine {
 	const char	*name;
 	enum intel_ring_id {
@@ -63,29 +84,13 @@ struct intel_engine {
 		BCS,
 		VECS,
 	} id;
+	struct intel_ringbuffer default_ringbuf;
 #define I915_NUM_RINGS 4
 	u32		mmio_base;
-	void		__iomem *virtual_start;
 	struct		drm_device *dev;
-	struct		drm_i915_gem_object *obj;
 
-	u32		head;
-	u32		tail;
-	int		space;
-	int		size;
-	int		effective_size;
 	struct intel_hw_status_page status_page;
 
-	/** We track the position of the requests in the ring buffer, and
-	 * when each is retired we increment last_retired_head as the GPU
-	 * must have finished processing the request and so we know we
-	 * can advance the ringbuffer up to that position.
-	 *
-	 * last_retired_head is set to -1 after the value is consumed so
-	 * we can detect new retirements.
-	 */
-	u32		last_retired_head;
-
 	unsigned irq_refcount; /* protected by dev_priv->irq_lock */
 	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
 	u32		trace_irq_seqno;
@@ -128,7 +133,7 @@ struct intel_engine {
 
 	/**
 	 * List of objects currently involved in rendering from the
-	 * ringbuffer.
+	 * engine.
 	 *
 	 * Includes buffers having the contents of their GPU caches
 	 * flushed, not necessarily primitives.  last_rendering_seqno
@@ -202,10 +207,16 @@ struct intel_engine {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
+/* This is a temporary define to help us transition to per-context ringbuffers */
+static inline struct intel_ringbuffer *__get_ringbuf(struct intel_engine *ring)
+{
+	return &ring->default_ringbuf;
+}
+
 static inline bool
 intel_ring_initialized(struct intel_engine *ring)
 {
-	return ring->obj != NULL;
+	return __get_ringbuf(ring)->obj != NULL;
 }
 
 static inline unsigned
@@ -275,12 +286,16 @@ int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
 static inline void intel_ring_emit(struct intel_engine *ring,
 				   u32 data)
 {
-	iowrite32(data, ring->virtual_start + ring->tail);
-	ring->tail += 4;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	iowrite32(data, ringbuf->virtual_start + ringbuf->tail);
+	ringbuf->tail += 4;
 }
 static inline void intel_ring_advance(struct intel_engine *ring)
 {
-	ring->tail &= ring->size - 1;
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	ringbuf->tail &= ringbuf->size - 1;
 }
 void __intel_ring_advance(struct intel_engine *ring);
 
@@ -300,7 +315,7 @@ void intel_ring_setup_status_page(struct intel_engine *ring);
 
 static inline u32 intel_ring_get_tail(struct intel_engine *ring)
 {
-	return ring->tail;
+	return __get_ringbuf(ring)->tail;
 }
 
 static inline u32 intel_ring_get_seqno(struct intel_engine *ring)
-- 
1.9.0

* [PATCH 12/49] drm/i915: Rename functions that mention ringbuffers (meaning rings)
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (10 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 11/49] drm/i915: Split the ringbuffers and the rings oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 13/49] drm/i915/bdw: Execlists ring tail writing oscar.mateo
                   ` (37 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Continue the refactoring: do not init or clean a "ringbuffer" when you
actually mean a "ring".

Again, no functional changes.
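
For reference, these are the renames (collected from the diff below):

	intel_init_ring_buffer()        -> intel_init_ring()
	intel_init_render_ring_buffer() -> intel_init_render_ring()
	intel_init_bsd_ring_buffer()    -> intel_init_bsd_ring()
	intel_init_blt_ring_buffer()    -> intel_init_blt_ring()
	intel_init_vebox_ring_buffer()  -> intel_init_vebox_ring()
	intel_cleanup_ring_buffer()     -> intel_cleanup_ring()
	i915_gem_cleanup_ringbuffer()   -> i915_gem_cleanup_ring()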

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  6 +++---
 drivers/gpu/drm/i915/i915_drv.h         |  2 +-
 drivers/gpu/drm/i915/i915_gem.c         | 28 ++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 22 +++++++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h | 10 +++++-----
 5 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 288e1c9..d9d28f4 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -179,7 +179,7 @@ static int i915_dma_cleanup(struct drm_device * dev)
 
 	mutex_lock(&dev->struct_mutex);
 	for (i = 0; i < I915_NUM_RINGS; i++)
-		intel_cleanup_ring_buffer(&dev_priv->ring[i]);
+		intel_cleanup_ring(&dev_priv->ring[i]);
 	mutex_unlock(&dev->struct_mutex);
 
 	/* Clear the HWS virtual address at teardown */
@@ -1381,7 +1381,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 
 cleanup_gem:
 	mutex_lock(&dev->struct_mutex);
-	i915_gem_cleanup_ringbuffer(dev);
+	i915_gem_cleanup_ring(dev);
 	i915_gem_context_fini(dev);
 	mutex_unlock(&dev->struct_mutex);
 	WARN_ON(dev_priv->mm.aliasing_ppgtt);
@@ -1837,7 +1837,7 @@ int i915_driver_unload(struct drm_device *dev)
 
 		mutex_lock(&dev->struct_mutex);
 		i915_gem_free_all_phys_object(dev);
-		i915_gem_cleanup_ringbuffer(dev);
+		i915_gem_cleanup_ring(dev);
 		i915_gem_context_fini(dev);
 		WARN_ON(dev_priv->mm.aliasing_ppgtt);
 		mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c03c674..264ea67 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2163,7 +2163,7 @@ int __must_check i915_gem_init(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
 int i915_gem_l3_remap(struct intel_engine *ring, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
-void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
+void i915_gem_cleanup_ring(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 int __i915_add_request(struct intel_engine *ring,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 26b89e9..7ed56f7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2423,7 +2423,7 @@ void i915_gem_reset(struct drm_device *dev)
 	for_each_active_ring(ring, dev_priv, i)
 		i915_gem_reset_ring_cleanup(dev_priv, ring);
 
-	i915_gem_cleanup_ringbuffer(dev);
+	i915_gem_cleanup_ring(dev);
 
 	i915_gem_context_reset(dev);
 
@@ -4252,7 +4252,7 @@ i915_gem_suspend(struct drm_device *dev)
 		i915_gem_evict_everything(dev);
 
 	i915_kernel_lost_context(dev);
-	i915_gem_cleanup_ringbuffer(dev);
+	i915_gem_cleanup_ring(dev);
 
 	/* Hack!  Don't let anybody do execbuf while we don't control the chip.
 	 * We need to replace this with a semaphore, or something.
@@ -4350,24 +4350,24 @@ static int i915_gem_init_rings(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
-	ret = intel_init_render_ring_buffer(dev);
+	ret = intel_init_render_ring(dev);
 	if (ret)
 		return ret;
 
 	if (HAS_BSD(dev)) {
-		ret = intel_init_bsd_ring_buffer(dev);
+		ret = intel_init_bsd_ring(dev);
 		if (ret)
 			goto cleanup_render_ring;
 	}
 
 	if (intel_enable_blt(dev)) {
-		ret = intel_init_blt_ring_buffer(dev);
+		ret = intel_init_blt_ring(dev);
 		if (ret)
 			goto cleanup_bsd_ring;
 	}
 
 	if (HAS_VEBOX(dev)) {
-		ret = intel_init_vebox_ring_buffer(dev);
+		ret = intel_init_vebox_ring(dev);
 		if (ret)
 			goto cleanup_blt_ring;
 	}
@@ -4380,13 +4380,13 @@ static int i915_gem_init_rings(struct drm_device *dev)
 	return 0;
 
 cleanup_vebox_ring:
-	intel_cleanup_ring_buffer(&dev_priv->ring[VECS]);
+	intel_cleanup_ring(&dev_priv->ring[VECS]);
 cleanup_blt_ring:
-	intel_cleanup_ring_buffer(&dev_priv->ring[BCS]);
+	intel_cleanup_ring(&dev_priv->ring[BCS]);
 cleanup_bsd_ring:
-	intel_cleanup_ring_buffer(&dev_priv->ring[VCS]);
+	intel_cleanup_ring(&dev_priv->ring[VCS]);
 cleanup_render_ring:
-	intel_cleanup_ring_buffer(&dev_priv->ring[RCS]);
+	intel_cleanup_ring(&dev_priv->ring[RCS]);
 
 	return ret;
 }
@@ -4444,7 +4444,7 @@ i915_gem_init_hw(struct drm_device *dev)
 	return 0;
 
 err_out:
-	i915_gem_cleanup_ringbuffer(dev);
+	i915_gem_cleanup_ring(dev);
 	return ret;
 }
 
@@ -4494,14 +4494,14 @@ int i915_gem_init(struct drm_device *dev)
 }
 
 void
-i915_gem_cleanup_ringbuffer(struct drm_device *dev)
+i915_gem_cleanup_ring(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_engine *ring;
 	int i;
 
 	for_each_active_ring(ring, dev_priv, i)
-		intel_cleanup_ring_buffer(ring);
+		intel_cleanup_ring(ring);
 }
 
 int
@@ -4539,7 +4539,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
 
 cleanup_ringbuffer:
 	mutex_lock(&dev->struct_mutex);
-	i915_gem_cleanup_ringbuffer(dev);
+	i915_gem_cleanup_ring(dev);
 	dev_priv->ums.mm_suspended = 1;
 	mutex_unlock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0da4289..35e022f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1378,8 +1378,8 @@ static int alloc_ring_buffer(struct intel_engine *ring)
 	return 0;
 }
 
-static int intel_init_ring_buffer(struct drm_device *dev,
-				  struct intel_engine *ring)
+static int intel_init_ring(struct drm_device *dev,
+			   struct intel_engine *ring)
 {
 	struct drm_i915_gem_object *obj;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1445,7 +1445,7 @@ err_hws:
 	return ret;
 }
 
-void intel_cleanup_ring_buffer(struct intel_engine *ring)
+void intel_cleanup_ring(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
@@ -1902,7 +1902,7 @@ static int gen6_ring_flush(struct intel_engine *ring,
 	return 0;
 }
 
-int intel_init_render_ring_buffer(struct drm_device *dev)
+int intel_init_render_ring(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[RCS];
@@ -1996,7 +1996,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->scratch.gtt_offset = i915_gem_obj_ggtt_offset(obj);
 	}
 
-	return intel_init_ring_buffer(dev, ring);
+	return intel_init_ring(dev, ring);
 }
 
 int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
@@ -2064,7 +2064,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 	return 0;
 }
 
-int intel_init_bsd_ring_buffer(struct drm_device *dev)
+int intel_init_bsd_ring(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[VCS];
@@ -2119,10 +2119,10 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 	}
 	ring->init = init_ring_common;
 
-	return intel_init_ring_buffer(dev, ring);
+	return intel_init_ring(dev, ring);
 }
 
-int intel_init_blt_ring_buffer(struct drm_device *dev)
+int intel_init_blt_ring(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[BCS];
@@ -2155,10 +2155,10 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 	ring->signal_mbox[VECS] = GEN6_VEBSYNC;
 	ring->init = init_ring_common;
 
-	return intel_init_ring_buffer(dev, ring);
+	return intel_init_ring(dev, ring);
 }
 
-int intel_init_vebox_ring_buffer(struct drm_device *dev)
+int intel_init_vebox_ring(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[VECS];
@@ -2192,7 +2192,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 	ring->signal_mbox[VECS] = GEN6_NOSYNC;
 	ring->init = init_ring_common;
 
-	return intel_init_ring_buffer(dev, ring);
+	return intel_init_ring(dev, ring);
 }
 
 int
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2281228..a914348 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -279,7 +279,7 @@ intel_write_status_page(struct intel_engine *ring,
 #define I915_GEM_HWS_SCRATCH_INDEX	0x30
 #define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
 
-void intel_cleanup_ring_buffer(struct intel_engine *ring);
+void intel_cleanup_ring(struct intel_engine *ring);
 
 int __must_check intel_ring_begin(struct intel_engine *ring, int n);
 int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
@@ -305,10 +305,10 @@ int intel_ring_flush_all_caches(struct intel_engine *ring);
 int intel_ring_invalidate_all_caches(struct intel_engine *ring);
 
 void intel_init_rings_early(struct drm_device *dev);
-int intel_init_render_ring_buffer(struct drm_device *dev);
-int intel_init_bsd_ring_buffer(struct drm_device *dev);
-int intel_init_blt_ring_buffer(struct drm_device *dev);
-int intel_init_vebox_ring_buffer(struct drm_device *dev);
+int intel_init_render_ring(struct drm_device *dev);
+int intel_init_bsd_ring(struct drm_device *dev);
+int intel_init_blt_ring(struct drm_device *dev);
+int intel_init_vebox_ring(struct drm_device *dev);
 
 u32 intel_ring_get_active_head(struct intel_engine *ring);
 void intel_ring_setup_status_page(struct intel_engine *ring);
-- 
1.9.0


* [PATCH 13/49] drm/i915/bdw: Execlists ring tail writing
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (11 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 12/49] drm/i915: Rename functions that mention ringbuffers (meaning rings) oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:13   ` Mateo Lozano, Oscar
  2014-03-27 17:59 ` [PATCH 14/49] drm/i915/bdw: LR context ring init oscar.mateo
                   ` (36 subsequent siblings)
  49 siblings, 1 reply; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

The write tail function is a very special place for execlists: since
all access to the ring is mediated through requests (thanks to
Chris Wilson's "Write RING_TAIL once per-request" for that) and all
requests end up with a write tail, this is the place we are going to
use to submit contexts for execution.

For the moment, just mark the place (we still need to do a lot of
preparation before execlists are ready to start submitting things).
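
A sketch of why this hook is the natural submission point (call chain
per this series, simplified; not part of the patch): every request
funnels through __intel_ring_advance, which ends in the engine's
write_tail vfunc, so an execlists-aware hook there will see every
submission:

	__i915_add_request(ring, ...)
	  -> ring->add_request(ring)
	       -> __intel_ring_advance(ring)
	            -> ring->write_tail(ring, ringbuf->tail)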

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 35e022f..a18dcf7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -413,6 +413,12 @@ static void ring_write_tail(struct intel_engine *ring,
 	I915_WRITE_TAIL(ring, value);
 }
 
+static void gen8_write_tail_lrc(struct intel_engine *ring,
+				u32 value)
+{
+	DRM_ERROR("Execlists still not ready!\n");
+}
+
 u32 intel_ring_get_active_head(struct intel_engine *ring)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
@@ -1907,12 +1913,15 @@ int intel_init_render_ring(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[RCS];
 
+	ring->write_tail = ring_write_tail;
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
 		if (INTEL_INFO(dev)->gen == 6)
 			ring->flush = gen6_render_ring_flush;
 		if (INTEL_INFO(dev)->gen >= 8) {
+			if (dev_priv->lrc_enabled)
+				ring->write_tail = gen8_write_tail_lrc;
 			ring->flush = gen8_render_ring_flush;
 			ring->irq_get = gen8_ring_get_irq;
 			ring->irq_put = gen8_ring_put_irq;
@@ -1958,7 +1967,7 @@ int intel_init_render_ring(struct drm_device *dev)
 		}
 		ring->irq_enable_mask = I915_USER_INTERRUPT;
 	}
-	ring->write_tail = ring_write_tail;
+
 	if (IS_HASWELL(dev))
 		ring->dispatch_execbuffer = hsw_ring_dispatch_execbuffer;
 	else if (IS_GEN8(dev))
@@ -2079,6 +2088,8 @@ int intel_init_bsd_ring(struct drm_device *dev)
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
+			if (dev_priv->lrc_enabled)
+				ring->write_tail = gen8_write_tail_lrc;
 			ring->irq_enable_mask =
 				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 			ring->irq_get = gen8_ring_get_irq;
@@ -2133,6 +2144,8 @@ int intel_init_blt_ring(struct drm_device *dev)
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
+		if (dev_priv->lrc_enabled)
+			ring->write_tail = gen8_write_tail_lrc;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
@@ -2170,6 +2183,8 @@ int intel_init_vebox_ring(struct drm_device *dev)
 	ring->set_seqno = ring_set_seqno;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
+		if (dev_priv->lrc_enabled)
+			ring->write_tail = gen8_write_tail_lrc;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
-- 
1.9.0


* [PATCH 14/49] drm/i915/bdw: LR context ring init
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (12 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 13/49] drm/i915/bdw: Execlists ring tail writing oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 15/49] drm/i915/bdw: GEN8 semaphoreless ring add request oscar.mateo
                   ` (35 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

Logical ring contexts do not need most of the ring init: we just need
the pipe control object for the render ring and a few other things
(some of which will be added later).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 42 ++++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index a18dcf7..6e53ce1 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -528,6 +528,18 @@ out:
 	return ret;
 }
 
+static int init_ring_common_lrc(struct intel_engine *ring)
+{
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+
+	ringbuf->head = 0;
+	ringbuf->tail = 0;
+	ringbuf->space = ringbuf->size;
+	ringbuf->last_retired_head = -1;
+
+	return 0;
+}
+
 static int
 init_pipe_control(struct intel_engine *ring)
 {
@@ -630,6 +642,12 @@ static int init_render_ring(struct intel_engine *ring)
 	return ret;
 }
 
+static int init_render_ring_lrc(struct intel_engine *ring)
+{
+	init_ring_common_lrc(ring);
+	return init_pipe_control(ring);
+}
+
 static void render_ring_cleanup(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
@@ -1914,14 +1932,17 @@ int intel_init_render_ring(struct drm_device *dev)
 	struct intel_engine *ring = &dev_priv->ring[RCS];
 
 	ring->write_tail = ring_write_tail;
+	ring->init = init_render_ring;
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
 		if (INTEL_INFO(dev)->gen == 6)
 			ring->flush = gen6_render_ring_flush;
 		if (INTEL_INFO(dev)->gen >= 8) {
-			if (dev_priv->lrc_enabled)
+			if (dev_priv->lrc_enabled) {
 				ring->write_tail = gen8_write_tail_lrc;
+				ring->init = init_render_ring_lrc;
+			}
 			ring->flush = gen8_render_ring_flush;
 			ring->irq_get = gen8_ring_get_irq;
 			ring->irq_put = gen8_ring_put_irq;
@@ -1980,7 +2001,6 @@ int intel_init_render_ring(struct drm_device *dev)
 		ring->dispatch_execbuffer = i830_dispatch_execbuffer;
 	else
 		ring->dispatch_execbuffer = i915_dispatch_execbuffer;
-	ring->init = init_render_ring;
 	ring->cleanup = render_ring_cleanup;
 
 	/* Workaround batchbuffer to combat CS tlb bug. */
@@ -2079,6 +2099,7 @@ int intel_init_bsd_ring(struct drm_device *dev)
 	struct intel_engine *ring = &dev_priv->ring[VCS];
 
 	ring->write_tail = ring_write_tail;
+	ring->init = init_ring_common;
 	if (INTEL_INFO(dev)->gen >= 6) {
 		/* gen6 bsd needs a special wa for tail updates */
 		if (IS_GEN6(dev))
@@ -2088,8 +2109,10 @@ int intel_init_bsd_ring(struct drm_device *dev)
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
-			if (dev_priv->lrc_enabled)
+			if (dev_priv->lrc_enabled) {
 				ring->write_tail = gen8_write_tail_lrc;
+				ring->init = init_ring_common_lrc;
+			}
 			ring->irq_enable_mask =
 				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 			ring->irq_get = gen8_ring_get_irq;
@@ -2128,7 +2151,6 @@ int intel_init_bsd_ring(struct drm_device *dev)
 		}
 		ring->dispatch_execbuffer = i965_dispatch_execbuffer;
 	}
-	ring->init = init_ring_common;
 
 	return intel_init_ring(dev, ring);
 }
@@ -2139,13 +2161,16 @@ int intel_init_blt_ring(struct drm_device *dev)
 	struct intel_engine *ring = &dev_priv->ring[BCS];
 
 	ring->write_tail = ring_write_tail;
+	ring->init = init_ring_common;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
-		if (dev_priv->lrc_enabled)
+		if (dev_priv->lrc_enabled) {
 			ring->write_tail = gen8_write_tail_lrc;
+			ring->init = init_ring_common_lrc;
+		}
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
@@ -2166,7 +2191,6 @@ int intel_init_blt_ring(struct drm_device *dev)
 	ring->signal_mbox[VCS] = GEN6_VBSYNC;
 	ring->signal_mbox[BCS] = GEN6_NOSYNC;
 	ring->signal_mbox[VECS] = GEN6_VEBSYNC;
-	ring->init = init_ring_common;
 
 	return intel_init_ring(dev, ring);
 }
@@ -2177,14 +2201,17 @@ int intel_init_vebox_ring(struct drm_device *dev)
 	struct intel_engine *ring = &dev_priv->ring[VECS];
 
 	ring->write_tail = ring_write_tail;
+	ring->init = init_ring_common;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
-		if (dev_priv->lrc_enabled)
+		if (dev_priv->lrc_enabled) {
 			ring->write_tail = gen8_write_tail_lrc;
+			ring->init = init_ring_common_lrc;
+		}
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
@@ -2205,7 +2232,6 @@ int intel_init_vebox_ring(struct drm_device *dev)
 	ring->signal_mbox[VCS] = GEN6_VVESYNC;
 	ring->signal_mbox[BCS] = GEN6_BVESYNC;
 	ring->signal_mbox[VECS] = GEN6_NOSYNC;
-	ring->init = init_ring_common;
 
 	return intel_init_ring(dev, ring);
 }
-- 
1.9.0


* [PATCH 15/49] drm/i915/bdw: GEN8 semaphoreless ring add request
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (13 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 14/49] drm/i915/bdw: LR context ring init oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 16/49] drm/i915/bdw: GEN8 new ring flush oscar.mateo
                   ` (34 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

Semaphores have changed, so let's not submit useless commands to the
ring.
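
For the curious, a sketch of the consumer side (modeled on
gen6_ring_get_seqno, which the gen8 paths below keep using): the seqno
stored via MI_STORE_DWORD_INDEX is read back from the status page when
checking request completion:

	static u32 ring_get_seqno(struct intel_engine *ring,
				  bool lazy_coherency)
	{
		/* the breadcrumb written by gen8_add_request lands here */
		return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
	}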

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Several rebases.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6e53ce1..e4b4c57 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -722,6 +722,24 @@ gen6_add_request(struct intel_engine *ring)
 	return 0;
 }
 
+static int
+gen8_add_request(struct intel_engine *ring)
+{
+	int ret;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
+	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
+	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
+	intel_ring_emit(ring, MI_USER_INTERRUPT);
+	__intel_ring_advance(ring);
+
+	return 0;
+}
+
 static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
 					      u32 seqno)
 {
@@ -1943,6 +1961,7 @@ int intel_init_render_ring(struct drm_device *dev)
 				ring->write_tail = gen8_write_tail_lrc;
 				ring->init = init_render_ring_lrc;
 			}
+			ring->add_request = gen8_add_request;
 			ring->flush = gen8_render_ring_flush;
 			ring->irq_get = gen8_ring_get_irq;
 			ring->irq_put = gen8_ring_put_irq;
@@ -2113,6 +2132,7 @@ int intel_init_bsd_ring(struct drm_device *dev)
 				ring->write_tail = gen8_write_tail_lrc;
 				ring->init = init_ring_common_lrc;
 			}
+			ring->add_request = gen8_add_request;
 			ring->irq_enable_mask =
 				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 			ring->irq_get = gen8_ring_get_irq;
@@ -2171,6 +2191,7 @@ int intel_init_blt_ring(struct drm_device *dev)
 			ring->write_tail = gen8_write_tail_lrc;
 			ring->init = init_ring_common_lrc;
 		}
+		ring->add_request = gen8_add_request;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
@@ -2206,12 +2227,12 @@ int intel_init_vebox_ring(struct drm_device *dev)
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
-
 	if (INTEL_INFO(dev)->gen >= 8) {
 		if (dev_priv->lrc_enabled) {
 			ring->write_tail = gen8_write_tail_lrc;
 			ring->init = init_ring_common_lrc;
 		}
+		ring->add_request = gen8_add_request;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
-- 
1.9.0


* [PATCH 16/49] drm/i915/bdw: GEN8 new ring flush
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (14 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 15/49] drm/i915/bdw: GEN8 semaphoreless ring add request oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini oscar.mateo
                   ` (33 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

The BSD invalidate bit is no longer present, and we can consolidate the
blt and bsd ring flushes into one. This helps prep the code to more
easily handle logical ring contexts.
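
For reference, the two MI_FLUSH_DW layouts being untangled (dword
meanings per the comments in the patch; the gen8 form takes a 64-bit
post-sync address, hence the longer command in dword 0):

	dword	pre-gen8			gen8
	0	MI_FLUSH_DW			MI_FLUSH_DW + 1
	1	scratch addr | USE_GTT		scratch addr | USE_GTT
	2	0				0 (upper address bits)
	3	MI_NOOP				0 (post-sync value)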

This partially reverts:
commit 65ea32ce040a0a9a907362e9a362a842fd18cb21
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Dec 13 14:57:32 2012 -0800

    drm/i915/bdw: Update MI_FLUSH_DW

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Several rebases. Do not forget the VEBOX.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 58 +++++++++++++++++++++------------
 1 file changed, 37 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e4b4c57..240e86a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1801,6 +1801,30 @@ static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
 		   _MASKED_BIT_DISABLE(GEN6_BSD_SLEEP_MSG_DISABLE));
 }
 
+static int gen8_ring_flush(struct intel_engine *ring,
+			   u32 invalidate, u32 flush)
+{
+	uint32_t cmd;
+	int ret;
+
+	ret = intel_ring_begin(ring, 4);
+	if (ret)
+		return ret;
+
+	cmd = MI_FLUSH_DW + 1;
+
+	if (invalidate & I915_GEM_GPU_DOMAINS)
+		cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX |
+			MI_FLUSH_DW_OP_STOREDW;
+	intel_ring_emit(ring, cmd);
+	intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
+	intel_ring_emit(ring, 0); /* upper addr */
+	intel_ring_emit(ring, 0); /* value */
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
 static int gen6_bsd_ring_flush(struct intel_engine *ring,
 			       u32 invalidate, u32 flush)
 {
@@ -1812,8 +1836,7 @@ static int gen6_bsd_ring_flush(struct intel_engine *ring,
 		return ret;
 
 	cmd = MI_FLUSH_DW;
-	if (INTEL_INFO(ring->dev)->gen >= 8)
-		cmd += 1;
+
 	/*
 	 * Bspec vol 1c.5 - video engine command streamer:
 	 * "If ENABLED, all TLBs will be invalidated once the flush
@@ -1825,13 +1848,9 @@ static int gen6_bsd_ring_flush(struct intel_engine *ring,
 			MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
 	intel_ring_emit(ring, cmd);
 	intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
-	if (INTEL_INFO(ring->dev)->gen >= 8) {
-		intel_ring_emit(ring, 0); /* upper addr */
-		intel_ring_emit(ring, 0); /* value */
-	} else  {
-		intel_ring_emit(ring, 0);
-		intel_ring_emit(ring, MI_NOOP);
-	}
+	intel_ring_emit(ring, 0);
+	intel_ring_emit(ring, MI_NOOP);
+
 	intel_ring_advance(ring);
 	return 0;
 }
@@ -1916,8 +1935,7 @@ static int gen6_ring_flush(struct intel_engine *ring,
 		return ret;
 
 	cmd = MI_FLUSH_DW;
-	if (INTEL_INFO(ring->dev)->gen >= 8)
-		cmd += 1;
+
 	/*
 	 * Bspec vol 1c.3 - blitter engine command streamer:
 	 * "If ENABLED, all TLBs will be invalidated once the flush
@@ -1929,13 +1947,8 @@ static int gen6_ring_flush(struct intel_engine *ring,
 			MI_FLUSH_DW_OP_STOREDW;
 	intel_ring_emit(ring, cmd);
 	intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
-	if (INTEL_INFO(ring->dev)->gen >= 8) {
-		intel_ring_emit(ring, 0); /* upper addr */
-		intel_ring_emit(ring, 0); /* value */
-	} else  {
-		intel_ring_emit(ring, 0);
-		intel_ring_emit(ring, MI_NOOP);
-	}
+	intel_ring_emit(ring, 0);
+	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
 
 	if (IS_GEN7(dev) && !invalidate && flush)
@@ -2123,7 +2135,6 @@ int intel_init_bsd_ring(struct drm_device *dev)
 		/* gen6 bsd needs a special wa for tail updates */
 		if (IS_GEN6(dev))
 			ring->write_tail = gen6_bsd_ring_write_tail;
-		ring->flush = gen6_bsd_ring_flush;
 		ring->add_request = gen6_add_request;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
@@ -2132,6 +2143,7 @@ int intel_init_bsd_ring(struct drm_device *dev)
 				ring->write_tail = gen8_write_tail_lrc;
 				ring->init = init_ring_common_lrc;
 			}
+			ring->flush = gen8_ring_flush;
 			ring->add_request = gen8_add_request;
 			ring->irq_enable_mask =
 				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
@@ -2140,6 +2152,7 @@ int intel_init_bsd_ring(struct drm_device *dev)
 			ring->dispatch_execbuffer =
 				gen8_ring_dispatch_execbuffer;
 		} else {
+			ring->flush = gen6_bsd_ring_flush;
 			ring->irq_enable_mask = GT_BSD_USER_INTERRUPT;
 			ring->irq_get = gen6_ring_get_irq;
 			ring->irq_put = gen6_ring_put_irq;
@@ -2182,7 +2195,6 @@ int intel_init_blt_ring(struct drm_device *dev)
 
 	ring->write_tail = ring_write_tail;
 	ring->init = init_ring_common;
-	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
@@ -2191,6 +2203,7 @@ int intel_init_blt_ring(struct drm_device *dev)
 			ring->write_tail = gen8_write_tail_lrc;
 			ring->init = init_ring_common_lrc;
 		}
+		ring->flush = gen8_ring_flush;
 		ring->add_request = gen8_add_request;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
@@ -2198,6 +2211,7 @@ int intel_init_blt_ring(struct drm_device *dev)
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 	} else {
+		ring->flush = gen6_ring_flush;
 		ring->irq_enable_mask = GT_BLT_USER_INTERRUPT;
 		ring->irq_get = gen6_ring_get_irq;
 		ring->irq_put = gen6_ring_put_irq;
@@ -2223,7 +2237,6 @@ int intel_init_vebox_ring(struct drm_device *dev)
 
 	ring->write_tail = ring_write_tail;
 	ring->init = init_ring_common;
-	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
@@ -2232,6 +2245,7 @@ int intel_init_vebox_ring(struct drm_device *dev)
 			ring->write_tail = gen8_write_tail_lrc;
 			ring->init = init_ring_common_lrc;
 		}
+		ring->flush = gen8_ring_flush;
 		ring->add_request = gen8_add_request;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
@@ -2239,6 +2253,7 @@ int intel_init_vebox_ring(struct drm_device *dev)
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 	} else {
+		ring->flush = gen6_ring_flush;
 		ring->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
 		ring->irq_get = hsw_vebox_get_irq;
 		ring->irq_put = hsw_vebox_put_irq;
-- 
1.9.0


* [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (15 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 16/49] drm/i915/bdw: GEN8 new ring flush oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-04-01  0:38   ` Damien Lespiau
  2014-03-27 17:59 ` [PATCH 18/49] drm/i915/bdw: Allocate ringbuffer for LR contexts oscar.mateo
                   ` (32 subsequent siblings)
  49 siblings, 1 reply; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

There are a few big differences in context init and fini compared with
the previous implementation of hardware contexts. One of them is
demonstrated in this patch: we must do a context initialization for
every ring.

The patch will still fail at context setup, and therefore won't break
existing code or platform support.

Regarding the context size, reading the register to calculate the sizes
can work, I think; however, the docs are very clear about the actual
context sizes on GEN8, so just hardcode that and use it.
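
(As a sanity check on the hardcoded value, assuming 4K pages:
21 * 4096 = 86016 bytes, which is already page-aligned, so the
round_up(GEN8_LR_CONTEXT_SIZE, 4096) below only documents the
alignment requirement rather than changing the size.)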

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Rebased on top of the Full PPGTT series. It is important to notice
that at this point we have one global default context per engine, all
of them using the aliasing PPGTT (as opposed to the single global
default context we have with legacy HW contexts).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem_context.c |  5 +++++
 drivers/gpu/drm/i915/i915_lrc.c         | 40 ++++++++++++++++++++++++++++++++-
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 264ea67..ff6a33c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2316,6 +2316,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_lrc.c */
 int gen8_gem_context_init(struct drm_device *dev);
+void gen8_gem_context_fini(struct drm_device *dev);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e92b9c5..4a6f1b0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -440,6 +440,11 @@ void i915_gem_context_fini(struct drm_device *dev)
 	 * other code, leading to spurious errors. */
 	intel_gpu_reset(dev);
 
+	if (dev_priv->lrc_enabled) {
+		gen8_gem_context_fini(dev);
+		return;
+	}
+
 	/* When default context is created and switched to, base object refcount
 	 * will be 2 (+1 from object creation and +1 from do_switch()).
 	 * i915_gem_context_fini() will be called after gpu_idle() has switched
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 3a93e99..10e6dbc 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -41,7 +41,45 @@
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 
+#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
+
+void gen8_gem_context_fini(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine *ring;
+	int unused;
+
+	for_each_ring(ring, dev_priv, unused) {
+		if (ring->default_context) {
+			i915_gem_object_ggtt_unpin(ring->default_context->obj);
+			i915_gem_context_unreference(ring->default_context);
+			ring->default_context = NULL;
+		}
+	}
+
+	dev_priv->mm.aliasing_ppgtt = NULL;
+}
+
 int gen8_gem_context_init(struct drm_device *dev)
 {
-	return -ENOSYS;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine *ring;
+	int ret = -ENOSYS, ring_id;
+
+	dev_priv->hw_context_size = round_up(GEN8_LR_CONTEXT_SIZE, 4096);
+
+	for_each_ring(ring, dev_priv, ring_id) {
+		ring->default_context = i915_gem_create_context(dev,
+						NULL, (ring_id == RCS));
+		if (IS_ERR_OR_NULL(ring->default_context)) {
+			ret = PTR_ERR(ring->default_context);
+			DRM_DEBUG_DRIVER("Create ctx failed: %d\n", ret);
+			ring->default_context = NULL;
+			goto err_out;
+		}
+	}
+
+err_out:
+	gen8_gem_context_fini(dev);
+	return ret;
 }
-- 
1.9.0


* [PATCH 18/49] drm/i915/bdw: Allocate ringbuffer for LR contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (16 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
                   ` (31 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

With our setup in previous patches, we've allocated one default context
per ring. Now, each of those contexts holds a pointer to its engine's
default ringbuffer and makes its own allocation of the backing object.

To reiterate the TODO in the patch: the ringbuffer objects are in the
CPU mappable region. This will likely need to change at some point.
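
For context, the mapping that forces this placement (a sketch based on
the ringbuffer code of this era; exact field names may vary): the ring
contents are written through a write-combined ioremap of the buffer's
GGTT address, which only works inside the CPU-visible aperture:

	ringbuf->virtual_start =
		ioremap_wc(dev_priv->gtt.mappable_base +
			   i915_gem_obj_ggtt_offset(ring_obj),
			   ringbuf->size);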

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Place a ringbuffer pointer inside the context that, in the global
default context, just points to the engine's default ringbuffer. Update
the ringbuffer backing object early instead of waiting for the alloc &
destroy ringbuffer calls during ring initialization.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  4 ++
 drivers/gpu/drm/i915/i915_lrc.c         | 65 +++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.c |  8 ++++
 3 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ff6a33c..3a36e28 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -597,6 +597,7 @@ struct i915_hw_context {
 	struct drm_i915_file_private *file_priv;
 	struct intel_engine *last_ring;
 	struct drm_i915_gem_object *obj;
+	struct intel_ringbuffer *ringbuf;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_address_space *vm;
 
@@ -2317,6 +2318,9 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 /* i915_lrc.c */
 int gen8_gem_context_init(struct drm_device *dev);
 void gen8_gem_context_fini(struct drm_device *dev);
+struct i915_hw_context *gen8_gem_create_context(struct drm_device *dev,
+			struct intel_engine *ring,
+			struct drm_i915_file_private *file_priv, bool create_vm);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 10e6dbc..40dfa95 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -43,6 +43,56 @@
 
 #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
 
+struct i915_hw_context *
+gen8_gem_create_context(struct drm_device *dev,
+			struct intel_engine *ring,
+			struct drm_i915_file_private *file_priv,
+			bool create_vm)
+{
+	struct i915_hw_context *ctx = NULL;
+	struct drm_i915_gem_object *ring_obj = NULL;
+	int ret;
+
+	ctx = i915_gem_create_context(dev, file_priv, create_vm);
+	if (IS_ERR_OR_NULL(ctx))
+		return ctx;
+
+	ring_obj = i915_gem_alloc_object(dev, 32 * PAGE_SIZE);
+	if (!ring_obj) {
+		i915_gem_object_ggtt_unpin(ctx->obj);
+		i915_gem_context_unreference(ctx);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	/* TODO: For now we put this in the mappable region so that we can reuse
+	 * the existing ringbuffer code which ioremaps it. When we start
+	 * creating many contexts, this will no longer work and we must switch
+	 * to a kmapish interface.
+	 */
+	ret = i915_gem_obj_ggtt_pin(ring_obj, PAGE_SIZE, PIN_MAPPABLE);
+	if (ret) {
+		drm_gem_object_unreference(&ring_obj->base);
+		i915_gem_object_ggtt_unpin(ctx->obj);
+		i915_gem_context_unreference(ctx);
+		return ERR_PTR(ret);
+	}
+
+	/* Failure at this point is almost impossible */
+	ret = i915_gem_object_set_to_gtt_domain(ring_obj, true);
+	if (ret) {
+		i915_gem_object_ggtt_unpin(ring_obj);
+		drm_gem_object_unreference(&ring_obj->base);
+		i915_gem_object_ggtt_unpin(ctx->obj);
+		i915_gem_context_unreference(ctx);
+		return ERR_PTR(ret);
+	}
+
+	ctx->ringbuf = &ring->default_ringbuf;
+	ctx->ringbuf->obj = ring_obj;
+
+	return ctx;
+}
+
 void gen8_gem_context_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -50,9 +100,16 @@ void gen8_gem_context_fini(struct drm_device *dev)
 	int unused;
 
 	for_each_ring(ring, dev_priv, unused) {
-		if (ring->default_context) {
-			i915_gem_object_ggtt_unpin(ring->default_context->obj);
-			i915_gem_context_unreference(ring->default_context);
+		struct i915_hw_context *ctx = ring->default_context;
+		if (ctx) {
+			struct drm_i915_gem_object *ring_obj = ctx->ringbuf->obj;
+			if (ring_obj) {
+				i915_gem_object_ggtt_unpin(ring_obj);
+				drm_gem_object_unreference(&ring_obj->base);
+				ctx->ringbuf->obj = NULL;
+			}
+			i915_gem_object_ggtt_unpin(ctx->obj);
+			i915_gem_context_unreference(ctx);
 			ring->default_context = NULL;
 		}
 	}
@@ -69,7 +126,7 @@ int gen8_gem_context_init(struct drm_device *dev)
 	dev_priv->hw_context_size = round_up(GEN8_LR_CONTEXT_SIZE, 4096);
 
 	for_each_ring(ring, dev_priv, ring_id) {
-		ring->default_context = i915_gem_create_context(dev,
+		ring->default_context = gen8_gem_create_context(dev, ring,
 						NULL, (ring_id == RCS));
 		if (IS_ERR_OR_NULL(ring->default_context)) {
 			ret = PTR_ERR(ring->default_context);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 240e86a..a552c48 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1380,8 +1380,12 @@ static int init_phys_status_page(struct intel_engine *ring)
 
 static void destroy_ring_buffer(struct intel_engine *ring)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 
+	if (dev_priv->lrc_enabled)
+		return;
+
 	i915_gem_object_ggtt_unpin(ringbuf->obj);
 	drm_gem_object_unreference(&ringbuf->obj->base);
 	ringbuf->obj = NULL;
@@ -1390,10 +1394,14 @@ static void destroy_ring_buffer(struct intel_engine *ring)
 static int alloc_ring_buffer(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj = NULL;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
+	if (dev_priv->lrc_enabled)
+		return 0;
+
 	if (!HAS_LLC(dev))
 		obj = i915_gem_object_create_stolen(dev, ringbuf->size);
 	if (obj == NULL)
-- 
1.9.0


* [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (17 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 18/49] drm/i915/bdw: Allocate ringbuffer for LR contexts oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-04-01  0:00   ` Damien Lespiau
  2014-04-15 16:00   ` Jeff McGee
  2014-03-27 17:59 ` [PATCH 20/49] drm/i915/bdw: Status page for LR contexts oscar.mateo
                   ` (30 subsequent siblings)
  49 siblings, 2 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

For the most part, logical ring context objects are similar to hardware
contexts in that the backing object is meant to be opaque. There are
some exceptions where we need to poke certain offsets of the object for
initialization, updating the tail pointer or updating the PDPs.

For our basic execlist implementation we'll only need our PPGTT PDs,
and ringbuffer addresses in order to set up the context. With previous
patches, we have both, so start prepping the context to be loaded.

Before running a context for the first time you must populate some
fields in the context object. These fields begin at LRCA + 1 PAGE, i.e.
page 1 (in 0-based counting) of the context image. These same
fields will be read and written to as contexts are saved and restored
once the system is up and running.

Many of these fields are completely reused from previous global
registers: ringbuffer head/tail/control, context control matches some
previous MI_SET_CONTEXT flags, and page directories. There are other
fields which we don't touch which we may want in the future.
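
A minimal sketch of the layout convention the reg_state writes below
follow (CTX_* offsets are from the patch; the helper name is made up):
each MI_LOAD_REGISTER_IMM(n) header is followed by n (register offset,
value) pairs, which is why every CTX_* define is an even index and the
value always lives at +1:

	/* hypothetical helper showing the (reg, value) pairing */
	static void set_ctx_reg(u32 *reg_state, int idx, u32 reg, u32 val)
	{
		reg_state[idx] = reg;		/* MMIO offset */
		reg_state[idx + 1] = val;	/* immediate value */
	}

	/* e.g., mirroring one line of the patch: */
	set_ctx_reg(reg_state, CTX_RING_TAIL, RING_TAIL(ring->mmio_base), 0);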

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11)
for other engines.

Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>

v3: Several rebases and general changes to the code.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_lrc.c | 145 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 138 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 40dfa95..f0176ff 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -43,6 +43,38 @@
 
 #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
 
+#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
+#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
+
+#define CTX_LRI_HEADER_0		0x01
+#define CTX_CONTEXT_CONTROL		0x02
+#define CTX_RING_HEAD			0x04
+#define CTX_RING_TAIL			0x06
+#define CTX_RING_BUFFER_START		0x08
+#define CTX_RING_BUFFER_CONTROL	0x0a
+#define CTX_BB_HEAD_U			0x0c
+#define CTX_BB_HEAD_L			0x0e
+#define CTX_BB_STATE			0x10
+#define CTX_SECOND_BB_HEAD_U		0x12
+#define CTX_SECOND_BB_HEAD_L		0x14
+#define CTX_SECOND_BB_STATE		0x16
+#define CTX_BB_PER_CTX_PTR		0x18
+#define CTX_RCS_INDIRECT_CTX		0x1a
+#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
+#define CTX_LRI_HEADER_1		0x21
+#define CTX_CTX_TIMESTAMP		0x22
+#define CTX_PDP3_UDW			0x24
+#define CTX_PDP3_LDW			0x26
+#define CTX_PDP2_UDW			0x28
+#define CTX_PDP2_LDW			0x2a
+#define CTX_PDP1_UDW			0x2c
+#define CTX_PDP1_LDW			0x2e
+#define CTX_PDP0_UDW			0x30
+#define CTX_PDP0_LDW			0x32
+#define CTX_LRI_HEADER_2		0x41
+#define CTX_R_PWR_CLK_STATE		0x42
+#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
+
 struct i915_hw_context *
 gen8_gem_create_context(struct drm_device *dev,
 			struct intel_engine *ring,
@@ -51,6 +83,9 @@ gen8_gem_create_context(struct drm_device *dev,
 {
 	struct i915_hw_context *ctx = NULL;
 	struct drm_i915_gem_object *ring_obj = NULL;
+	struct i915_hw_ppgtt *ppgtt = NULL;
+	struct page *page;
+	uint32_t *reg_state;
 	int ret;
 
 	ctx = i915_gem_create_context(dev, file_priv, create_vm);
@@ -79,18 +114,114 @@ gen8_gem_create_context(struct drm_device *dev,
 
 	/* Failure at this point is almost impossible */
 	ret = i915_gem_object_set_to_gtt_domain(ring_obj, true);
-	if (ret) {
-		i915_gem_object_ggtt_unpin(ring_obj);
-		drm_gem_object_unreference(&ring_obj->base);
-		i915_gem_object_ggtt_unpin(ctx->obj);
-		i915_gem_context_unreference(ctx);
-		return ERR_PTR(ret);
-	}
+	if (ret)
+		goto destroy_ring_obj;
 
 	ctx->ringbuf = &ring->default_ringbuf;
 	ctx->ringbuf->obj = ring_obj;
 
+	ppgtt = ctx_to_ppgtt(ctx);
+
+	ret = i915_gem_object_set_to_cpu_domain(ctx->obj, true);
+	if (ret)
+		goto destroy_ring_obj;
+
+	ret = i915_gem_object_get_pages(ctx->obj);
+	if (ret)
+		goto destroy_ring_obj;
+
+	i915_gem_object_pin_pages(ctx->obj);
+
+	/* The second page of the context object contains some fields which must
+	 * be set up prior to the first execution.
+	 */
+	page = i915_gem_object_get_page(ctx->obj, 1);
+	reg_state = kmap_atomic(page);
+
+	if (ring->id == RCS)
+		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
+	else
+		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
+	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
+	reg_state[CTX_CONTEXT_CONTROL+1] = (1<<3) | MI_RESTORE_INHIBIT;
+	reg_state[CTX_CONTEXT_CONTROL+1] |= reg_state[CTX_CONTEXT_CONTROL+1] << 16;
+	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
+	reg_state[CTX_RING_HEAD+1] = 0;
+	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
+	reg_state[CTX_RING_TAIL+1] = 0;
+	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
+	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
+	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
+	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
+	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
+	reg_state[CTX_BB_HEAD_U+1] = 0;
+	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
+	reg_state[CTX_BB_HEAD_L+1] = 0;
+	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
+	reg_state[CTX_BB_STATE+1] = (1<<5);
+	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
+	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
+	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
+	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
+	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
+	reg_state[CTX_SECOND_BB_STATE+1] = 0;
+	if (ring->id == RCS) {
+		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
+		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
+		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
+		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
+		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
+		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
+	}
+
+	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
+	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
+	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
+	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
+	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
+	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
+	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
+	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
+	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
+	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
+	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
+	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
+	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
+	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
+	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
+	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
+	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
+	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
+	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
+	if (ring->id == RCS) {
+		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
+		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
+		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
+	}
+
+#if 0
+	/* Offsets not yet defined for these */
+	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS] = ;
+	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS+1] = 0;
+#endif
+
+	kunmap_atomic(reg_state);
+
+	ctx->obj->dirty = 1;
+	set_page_dirty(page);
+	i915_gem_object_unpin_pages(ctx->obj);
+
 	return ctx;
+
+destroy_ring_obj:
+	i915_gem_object_ggtt_unpin(ring_obj);
+	drm_gem_object_unreference(&ring_obj->base);
+	ctx->ringbuf->obj = NULL;
+	ctx->ringbuf = NULL;
+	i915_gem_object_ggtt_unpin(ctx->obj);
+	i915_gem_context_unreference(ctx);
+
+	return ERR_PTR(ret);
 }
 
 void gen8_gem_context_fini(struct drm_device *dev)
-- 
1.9.0


* [PATCH 20/49] drm/i915/bdw: Status page for LR contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (18 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 21/49] drm/i915/bdw: Enable execlists in the hardware oscar.mateo
                   ` (29 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

With logical ring contexts, the status page is already included in the
context object, at offset 0 from its start. Update the init and
cleanup functions to reflect that.
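
The status page accessors themselves keep working unchanged because
they only dereference page_addr (sketch per intel_ringbuffer.h of this
era):

	static inline u32 intel_read_status_page(struct intel_engine *ring,
						 int reg)
	{
		return ring->status_page.page_addr[reg];
	}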

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Several rebases.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index a552c48..d334f5a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1306,16 +1306,21 @@ i915_dispatch_execbuffer(struct intel_engine *ring,
 
 static void cleanup_status_page(struct intel_engine *ring)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct drm_i915_gem_object *obj;
 
 	obj = ring->status_page.obj;
 	if (obj == NULL)
 		return;
+	ring->status_page.obj = NULL;
 
 	kunmap(sg_page(obj->pages->sgl));
+
+	if (dev_priv->lrc_enabled)
+		return;
+
 	i915_gem_object_ggtt_unpin(obj);
 	drm_gem_object_unreference(&obj->base);
-	ring->status_page.obj = NULL;
 }
 
 static int init_status_page(struct intel_engine *ring)
@@ -1444,7 +1449,14 @@ static int intel_init_ring(struct drm_device *dev,
 
 	init_waitqueue_head(&ring->irq_queue);
 
-	if (I915_NEED_GFX_HWS(dev)) {
+	if (dev_priv->lrc_enabled) {
+		obj = ring->default_context->obj;
+		ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(obj);
+		ring->status_page.page_addr = kmap(sg_page(obj->pages->sgl));
+		if (ring->status_page.page_addr == NULL)
+			return -ENOMEM;
+		ring->status_page.obj = obj;
+	} else if (I915_NEED_GFX_HWS(dev)) {
 		ret = init_status_page(ring);
 		if (ret)
 			return ret;
-- 
1.9.0


* [PATCH 21/49] drm/i915/bdw: Enable execlists in the hardware
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (19 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 20/49] drm/i915/bdw: Status page for LR contexts oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 22/49] drm/i915/bdw: Plumbing for user LR context switching oscar.mateo
                   ` (28 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: Set Replay Mode to 0 per BSpec

Michel Thierry <michel.thierry@intel.com>

v3: Several rebases.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
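A note on the masked-bit write in this patch: RING_MODE_GEN7 is a
"masked" register, where the high 16 bits of the written value select
which of the low 16 bits take effect. The i915 helpers (as defined in
i915_reg.h of this era):

	#define _MASKED_BIT_ENABLE(a)	(((a) << 16) | (a))
	#define _MASKED_BIT_DISABLE(a)	((a) << 16)

so the single write below enables GFX_RUN_LIST_ENABLE and clears
GFX_REPLAY_MODE without a read-modify-write cycle.
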
 drivers/gpu/drm/i915/i915_lrc.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index f0176ff..a726b26 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -252,7 +252,7 @@ int gen8_gem_context_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine *ring;
-	int ret = -ENOSYS, ring_id;
+	int ret, ring_id;
 
 	dev_priv->hw_context_size = round_up(GEN8_LR_CONTEXT_SIZE, 4096);
 
@@ -265,8 +265,17 @@ int gen8_gem_context_init(struct drm_device *dev)
 			ring->default_context = NULL;
 			goto err_out;
 		}
+
+		I915_WRITE(RING_MODE_GEN7(ring),
+			_MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
+			_MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
+		POSTING_READ(RING_MODE_GEN7(ring));
+
+		DRM_DEBUG_DRIVER("Enabled default logical ring context for %s\n", ring->name);
 	}
 
+	return 0;
+
 err_out:
 	gen8_gem_context_fini(dev);
 	return ret;
-- 
1.9.0


* [PATCH 22/49] drm/i915/bdw: Plumbing for user LR context switching
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (20 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 21/49] drm/i915/bdw: Enable execlists in the hardware oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 23/49] drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit oscar.mateo
                   ` (27 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Plumb ring->write_tail with a context argument, which in turn
means plumbing ring->add_request, and so on up the call chain. The
idea is that, by the time we would usually update the tail
register, we know which context we are working with and,
therefore, we can send it to the execlist submit port.

No functional changes.
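
A sketch of where this plumbing is headed (hypothetical; the real
submit-port write only lands later in the series, and CTX_RING_TAIL /
RING_ELSP come from the LRC patches earlier in this thread):

	static void gen8_write_tail_lrc(struct intel_engine *ring,
					struct i915_hw_context *ctx,
					u32 value)
	{
		/* Sketch only: update the CTX_RING_TAIL slot in ctx's
		 * context image with 'value', then write the context
		 * descriptor to the submit port at RING_ELSP(ring).
		 */
	}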

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c            |   2 +-
 drivers/gpu/drm/i915/i915_drv.h            |   3 +-
 drivers/gpu/drm/i915/i915_gem.c            |   5 +-
 drivers/gpu/drm/i915/i915_gem_context.c    |   2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  23 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.c        |   7 +-
 drivers/gpu/drm/i915/intel_display.c       |  10 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 113 ++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  20 +++--
 9 files changed, 113 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index d9d28f4..76b47a6 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -56,7 +56,7 @@
 	intel_ring_emit(LP_RING(dev_priv), x)
 
 #define ADVANCE_LP_RING() \
-	__intel_ring_advance(LP_RING(dev_priv))
+	__intel_ring_advance(LP_RING(dev_priv), NULL)
 
 /**
  * Lock test for when it's just for synchronization of ring access.
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3a36e28..c0f0c3d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2168,11 +2168,12 @@ void i915_gem_cleanup_ring(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 int __i915_add_request(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *batch_obj,
 		       u32 *seqno);
 #define i915_add_request(ring, seqno) \
-	__i915_add_request(ring, NULL, NULL, seqno)
+	__i915_add_request(ring, NULL, NULL, NULL, seqno)
 int __must_check i915_wait_seqno(struct intel_engine *ring,
 				 uint32_t seqno);
 int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7ed56f7..0c7ba1f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2160,6 +2160,7 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
 }
 
 int __i915_add_request(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *obj,
 		       u32 *out_seqno)
@@ -2177,7 +2178,7 @@ int __i915_add_request(struct intel_engine *ring,
 	 * is that the flush _must_ happen before the next request, no matter
 	 * what.
 	 */
-	ret = intel_ring_flush_all_caches(ring);
+	ret = intel_ring_flush_all_caches(ring, ctx);
 	if (ret)
 		return ret;
 
@@ -2192,7 +2193,7 @@ int __i915_add_request(struct intel_engine *ring,
 	 */
 	request_ring_position = intel_ring_get_tail(ring);
 
-	ret = ring->add_request(ring);
+	ret = ring->add_request(ring, ctx);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4a6f1b0..cb43272 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -589,7 +589,7 @@ mi_set_context(struct intel_engine *ring,
 	 * itlb_before_ctx_switch.
 	 */
 	if (IS_GEN6(ring->dev) && ring->itlb_before_ctx_switch) {
-		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, 0);
+		ret = ring->flush(ring, NULL, I915_GEM_GPU_DOMAINS, 0);
 		if (ret)
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 73f8712..d2ef284 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -825,6 +825,7 @@ err:
 
 static int
 i915_gem_execbuffer_move_to_gpu(struct intel_engine *ring,
+				struct i915_hw_context *ctx,
 				struct list_head *vmas)
 {
 	struct i915_vma *vma;
@@ -853,7 +854,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine *ring,
 	/* Unconditionally invalidate gpu caches and ensure that we do flush
 	 * any residual writes from the previous batch.
 	 */
-	return intel_ring_invalidate_all_caches(ring);
+	return intel_ring_invalidate_all_caches(ring, ctx);
 }
 
 static bool
@@ -965,18 +966,20 @@ static void
 i915_gem_execbuffer_retire_commands(struct drm_device *dev,
 				    struct drm_file *file,
 				    struct intel_engine *ring,
+				    struct i915_hw_context *ctx,
 				    struct drm_i915_gem_object *obj)
 {
 	/* Unconditionally force add_request to emit a full flush. */
 	ring->gpu_caches_dirty = true;
 
 	/* Add a breadcrumb for the completion of the batch buffer */
-	(void)__i915_add_request(ring, file, obj, NULL);
+	(void)__i915_add_request(ring, ctx, file, obj, NULL);
 }
 
 static int
 i915_reset_gen7_sol_offsets(struct drm_device *dev,
-			    struct intel_engine *ring)
+			    struct intel_engine *ring,
+			    struct i915_hw_context *ctx)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	int ret, i;
@@ -984,7 +987,7 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 	if (!IS_GEN7(dev) || ring != &dev_priv->ring[RCS])
 		return 0;
 
-	ret = intel_ring_begin(ring, 4 * 3);
+	ret = intel_ringbuffer_begin(ring, ctx, 4 * 3);
 	if (ret)
 		return ret;
 
@@ -1217,7 +1220,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	else
 		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
-	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
+	ret = i915_gem_execbuffer_move_to_gpu(ring, ctx, &eb->vmas);
 	if (ret)
 		goto err;
 
@@ -1227,7 +1230,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	if (ring == &dev_priv->ring[RCS] &&
 	    mode != dev_priv->relative_constants_mode) {
-		ret = intel_ring_begin(ring, 4);
+		ret = intel_ringbuffer_begin(ring, ctx, 4);
 		if (ret)
 				goto err;
 
@@ -1241,7 +1244,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	}
 
 	if (args->flags & I915_EXEC_GEN7_SOL_RESET) {
-		ret = i915_reset_gen7_sol_offsets(dev, ring);
+		ret = i915_reset_gen7_sol_offsets(dev, ring, ctx);
 		if (ret)
 			goto err;
 	}
@@ -1255,14 +1258,14 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			if (ret)
 				goto err;
 
-			ret = ring->dispatch_execbuffer(ring,
+			ret = ring->dispatch_execbuffer(ring, ctx,
 							exec_start, exec_len,
 							flags);
 			if (ret)
 				goto err;
 		}
 	} else {
-		ret = ring->dispatch_execbuffer(ring,
+		ret = ring->dispatch_execbuffer(ring, ctx,
 						exec_start, exec_len,
 						flags);
 		if (ret)
@@ -1272,7 +1275,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 
 	i915_gem_execbuffer_move_to_active(&eb->vmas, ring);
-	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
+	i915_gem_execbuffer_retire_commands(dev, file, ring, ctx, batch_obj);
 
 err:
 	/* the request owns the ref now */
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5333319..e5911ec 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -711,7 +711,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	}
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(ring, NULL, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -755,7 +755,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	}
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(ring, NULL, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -773,7 +773,8 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 
 	/* XXX: RCS is the only one to auto invalidate the TLBs? */
 	if (ring->id != RCS) {
-		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+		ret = ring->flush(ring, NULL, I915_GEM_GPU_DOMAINS,
+				I915_GEM_GPU_DOMAINS);
 		if (ret)
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 30ab378..462c7ae 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8637,7 +8637,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8679,7 +8679,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8728,7 +8728,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8773,7 +8773,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8867,7 +8867,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d334f5a..4fbea79 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -43,7 +43,8 @@ static inline int ring_space(struct intel_engine *ring)
 	return space;
 }
 
-void __intel_ring_advance(struct intel_engine *ring)
+void __intel_ring_advance(struct intel_engine *ring,
+			  struct i915_hw_context *ctx)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
@@ -51,11 +52,13 @@ void __intel_ring_advance(struct intel_engine *ring)
 	ringbuf->tail &= ringbuf->size - 1;
 	if (dev_priv->gpu_error.stop_rings & intel_ring_flag(ring))
 		return;
-	ring->write_tail(ring, ringbuf->tail);
+
+	ring->write_tail(ring, ctx, ringbuf->tail);
 }
 
 static int
 gen2_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -82,6 +85,7 @@ gen2_render_ring_flush(struct intel_engine *ring,
 
 static int
 gen4_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -212,7 +216,8 @@ intel_emit_post_sync_nonzero_flush(struct intel_engine *ring)
 
 static int
 gen6_render_ring_flush(struct intel_engine *ring,
-                         u32 invalidate_domains, u32 flush_domains)
+		       struct i915_hw_context *ctx,
+		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
 	u32 scratch_addr = ring->scratch.gtt_offset + 128;
@@ -306,6 +311,7 @@ static int gen7_ring_fbc_flush(struct intel_engine *ring, u32 value)
 
 static int
 gen7_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -367,6 +373,7 @@ gen7_render_ring_flush(struct intel_engine *ring,
 
 static int
 gen8_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -390,7 +397,7 @@ gen8_render_ring_flush(struct intel_engine *ring,
 		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
 	}
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ringbuffer_begin(ring, ctx, 6);
 	if (ret)
 		return ret;
 
@@ -407,13 +414,14 @@ gen8_render_ring_flush(struct intel_engine *ring,
 }
 
 static void ring_write_tail(struct intel_engine *ring,
-			    u32 value)
+			    struct i915_hw_context *ctx, u32 value)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
 	I915_WRITE_TAIL(ring, value);
 }
 
 static void gen8_write_tail_lrc(struct intel_engine *ring,
+				struct i915_hw_context *ctx,
 				u32 value)
 {
 	DRM_ERROR("Execlists still not ready!\n");
@@ -453,7 +461,7 @@ static int init_ring_common(struct intel_engine *ring)
 	/* Stop the ring if it's running. */
 	I915_WRITE_CTL(ring, 0);
 	I915_WRITE_HEAD(ring, 0);
-	ring->write_tail(ring, 0);
+	ring->write_tail(ring, NULL, 0);
 	if (wait_for_atomic((I915_READ_MODE(ring) & MODE_IDLE) != 0, 1000))
 		DRM_ERROR("%s :timed out trying to stop ring\n", ring->name);
 
@@ -690,7 +698,8 @@ update_mboxes(struct intel_engine *ring,
  * This acts like a signal in the canonical semaphore.
  */
 static int
-gen6_add_request(struct intel_engine *ring)
+gen6_add_request(struct intel_engine *ring,
+		 struct i915_hw_context *ctx)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -717,17 +726,18 @@ gen6_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
 
 static int
-gen8_add_request(struct intel_engine *ring)
+gen8_add_request(struct intel_engine *ring,
+		 struct i915_hw_context *ctx)
 {
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ringbuffer_begin(ring, ctx, 4);
 	if (ret)
 		return ret;
 
@@ -735,7 +745,7 @@ gen8_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
@@ -806,7 +816,8 @@ do {									\
 } while (0)
 
 static int
-pc_render_add_request(struct intel_engine *ring)
+pc_render_add_request(struct intel_engine *ring,
+		      struct i915_hw_context *ctx)
 {
 	u32 scratch_addr = ring->scratch.gtt_offset + 128;
 	int ret;
@@ -848,7 +859,7 @@ pc_render_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, 0);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
@@ -1052,6 +1063,7 @@ void intel_ring_setup_status_page(struct intel_engine *ring)
 
 static int
 bsd_ring_flush(struct intel_engine *ring,
+	       struct i915_hw_context *ctx,
 	       u32     invalidate_domains,
 	       u32     flush_domains)
 {
@@ -1068,7 +1080,8 @@ bsd_ring_flush(struct intel_engine *ring,
 }
 
 static int
-i9xx_add_request(struct intel_engine *ring)
+i9xx_add_request(struct intel_engine *ring,
+		 struct i915_hw_context *ctx)
 {
 	int ret;
 
@@ -1080,7 +1093,7 @@ i9xx_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
@@ -1214,6 +1227,7 @@ gen8_ring_put_irq(struct intel_engine *ring)
 
 static int
 i965_dispatch_execbuffer(struct intel_engine *ring,
+			 struct i915_hw_context *ctx,
 			 u32 offset, u32 length,
 			 unsigned flags)
 {
@@ -1237,6 +1251,7 @@ i965_dispatch_execbuffer(struct intel_engine *ring,
 #define I830_BATCH_LIMIT (256*1024)
 static int
 i830_dispatch_execbuffer(struct intel_engine *ring,
+			 struct i915_hw_context *ctx,
 				u32 offset, u32 len,
 				unsigned flags)
 {
@@ -1288,6 +1303,7 @@ i830_dispatch_execbuffer(struct intel_engine *ring,
 
 static int
 i915_dispatch_execbuffer(struct intel_engine *ring,
+			 struct i915_hw_context *ctx,
 			 u32 offset, u32 len,
 			 unsigned flags)
 {
@@ -1540,16 +1556,16 @@ void intel_cleanup_ring(struct intel_engine *ring)
 static int intel_ring_wait_request(struct intel_engine *ring, int n)
 {
 	struct drm_i915_gem_request *request;
-	struct intel_ringbuffer *ring_buf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	u32 seqno = 0, tail;
 	int ret;
 
-	if (ring_buf->last_retired_head != -1) {
-		ring_buf->head = ring_buf->last_retired_head;
-		ring_buf->last_retired_head = -1;
+	if (ringbuf->last_retired_head != -1) {
+		ringbuf->head = ringbuf->last_retired_head;
+		ringbuf->last_retired_head = -1;
 
-		ring_buf->space = ring_space(ring);
-		if (ring_buf->space >= n)
+		ringbuf->space = ring_space(ring);
+		if (ringbuf->space >= n)
 			return 0;
 	}
 
@@ -1559,9 +1575,9 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 		if (request->tail == -1)
 			continue;
 
-		space = request->tail - (ring_buf->tail + I915_RING_FREE_SPACE);
+		space = request->tail - (ringbuf->tail + I915_RING_FREE_SPACE);
 		if (space < 0)
-			space += ring_buf->size;
+			space += ringbuf->size;
 		if (space >= n) {
 			seqno = request->seqno;
 			tail = request->tail;
@@ -1583,15 +1599,16 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 	if (ret)
 		return ret;
 
-	ring_buf->head = tail;
-	ring_buf->space = ring_space(ring);
-	if (WARN_ON(ring_buf->space < n))
+	ringbuf->head = tail;
+	ringbuf->space = ring_space(ring);
+	if (WARN_ON(ringbuf->space < n))
 		return -ENOSPC;
 
 	return 0;
 }
 
-static int ring_wait_for_space(struct intel_engine *ring, int n)
+static int ring_wait_for_space(struct intel_engine *ring,
+			       struct i915_hw_context *ctx, int n)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1604,7 +1621,7 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 		return ret;
 
 	/* force the tail write in case we have been skipping them */
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	trace_i915_ring_wait_begin(ring);
 	/* With GEM the hangcheck timer should kick us out of the loop,
@@ -1640,14 +1657,15 @@ static int ring_wait_for_space(struct intel_engine *ring, int n)
 	return -EBUSY;
 }
 
-static int intel_wrap_ring_buffer(struct intel_engine *ring)
+static int intel_wrap_ring_buffer(struct intel_engine *ring,
+				  struct i915_hw_context *ctx)
 {
 	uint32_t __iomem *virt;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int rem = ringbuf->size - ringbuf->tail;
 
 	if (ringbuf->space < rem) {
-		int ret = ring_wait_for_space(ring, rem);
+		int ret = ring_wait_for_space(ring, ctx, rem);
 		if (ret)
 			return ret;
 	}
@@ -1706,19 +1724,19 @@ intel_ring_alloc_seqno(struct intel_engine *ring)
 }
 
 static int __intel_ring_prepare(struct intel_engine *ring,
-				int bytes)
+				struct i915_hw_context *ctx, int bytes)
 {
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
-		ret = intel_wrap_ring_buffer(ring);
+		ret = intel_wrap_ring_buffer(ring, ctx);
 		if (unlikely(ret))
 			return ret;
 	}
 
 	if (unlikely(ringbuf->space < bytes)) {
-		ret = ring_wait_for_space(ring, bytes);
+		ret = ring_wait_for_space(ring, ctx, bytes);
 		if (unlikely(ret))
 			return ret;
 	}
@@ -1726,8 +1744,9 @@ static int __intel_ring_prepare(struct intel_engine *ring,
 	return 0;
 }
 
-int intel_ring_begin(struct intel_engine *ring,
-		     int num_dwords)
+int intel_ringbuffer_begin(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
+			      int num_dwords)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
 	int ret;
@@ -1737,7 +1756,7 @@ int intel_ring_begin(struct intel_engine *ring,
 	if (ret)
 		return ret;
 
-	ret = __intel_ring_prepare(ring, num_dwords * sizeof(uint32_t));
+	ret = __intel_ring_prepare(ring, ctx, num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
 
@@ -1789,7 +1808,7 @@ void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno)
 }
 
 static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
-				     u32 value)
+				     struct i915_hw_context *ctx, u32 value)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
 
@@ -1822,12 +1841,13 @@ static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
 }
 
 static int gen8_ring_flush(struct intel_engine *ring,
+			   struct i915_hw_context *ctx,
 			   u32 invalidate, u32 flush)
 {
 	uint32_t cmd;
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ringbuffer_begin(ring, ctx, 4);
 	if (ret)
 		return ret;
 
@@ -1846,6 +1866,7 @@ static int gen8_ring_flush(struct intel_engine *ring,
 }
 
 static int gen6_bsd_ring_flush(struct intel_engine *ring,
+			       struct i915_hw_context *ctx,
 			       u32 invalidate, u32 flush)
 {
 	uint32_t cmd;
@@ -1877,6 +1898,7 @@ static int gen6_bsd_ring_flush(struct intel_engine *ring,
 
 static int
 gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
 			      u32 offset, u32 len,
 			      unsigned flags)
 {
@@ -1885,7 +1907,7 @@ gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 		!(flags & I915_DISPATCH_SECURE);
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ringbuffer_begin(ring, ctx, 4);
 	if (ret)
 		return ret;
 
@@ -1901,6 +1923,7 @@ gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 
 static int
 hsw_ring_dispatch_execbuffer(struct intel_engine *ring,
+			     struct i915_hw_context *ctx,
 			      u32 offset, u32 len,
 			      unsigned flags)
 {
@@ -1922,6 +1945,7 @@ hsw_ring_dispatch_execbuffer(struct intel_engine *ring,
 
 static int
 gen6_ring_dispatch_execbuffer(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
 			      u32 offset, u32 len,
 			      unsigned flags)
 {
@@ -1944,6 +1968,7 @@ gen6_ring_dispatch_execbuffer(struct intel_engine *ring,
 /* Blitter support (SandyBridge+) */
 
 static int gen6_ring_flush(struct intel_engine *ring,
+			   struct i915_hw_context *ctx,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
@@ -2293,14 +2318,15 @@ int intel_init_vebox_ring(struct drm_device *dev)
 }
 
 int
-intel_ring_flush_all_caches(struct intel_engine *ring)
+intel_ring_flush_all_caches(struct intel_engine *ring,
+			    struct i915_hw_context *ctx)
 {
 	int ret;
 
 	if (!ring->gpu_caches_dirty)
 		return 0;
 
-	ret = ring->flush(ring, 0, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(ring, ctx, 0, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -2311,7 +2337,8 @@ intel_ring_flush_all_caches(struct intel_engine *ring)
 }
 
 int
-intel_ring_invalidate_all_caches(struct intel_engine *ring)
+intel_ring_invalidate_all_caches(struct intel_engine *ring,
+				 struct i915_hw_context *ctx)
 {
 	uint32_t flush_domains;
 	int ret;
@@ -2320,7 +2347,7 @@ intel_ring_invalidate_all_caches(struct intel_engine *ring)
 	if (ring->gpu_caches_dirty)
 		flush_domains = I915_GEM_GPU_DOMAINS;
 
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, flush_domains);
+	ret = ring->flush(ring, ctx, I915_GEM_GPU_DOMAINS, flush_domains);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index a914348..95e29e0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -101,11 +101,13 @@ struct intel_engine {
 	int		(*init)(struct intel_engine *ring);
 
 	void		(*write_tail)(struct intel_engine *ring,
-				      u32 value);
+				      struct i915_hw_context *ctx, u32 value);
 	int __must_check (*flush)(struct intel_engine *ring,
+				  struct i915_hw_context *ctx,
 				  u32	invalidate_domains,
 				  u32	flush_domains);
-	int		(*add_request)(struct intel_engine *ring);
+	int		(*add_request)(struct intel_engine *ring,
+				       struct i915_hw_context *ctx);
 	/* Some chipsets are not quite as coherent as advertised and need
 	 * an expensive kick to force a true read of the up-to-date seqno.
 	 * However, the up-to-date seqno is not always required and the last
@@ -117,6 +119,7 @@ struct intel_engine {
 	void		(*set_seqno)(struct intel_engine *ring,
 				     u32 seqno);
 	int		(*dispatch_execbuffer)(struct intel_engine *ring,
+					       struct i915_hw_context *ctx,
 					       u32 offset, u32 length,
 					       unsigned flags);
 #define I915_DISPATCH_SECURE 0x1
@@ -281,7 +284,9 @@ intel_write_status_page(struct intel_engine *ring,
 
 void intel_cleanup_ring(struct intel_engine *ring);
 
-int __must_check intel_ring_begin(struct intel_engine *ring, int n);
+int __must_check intel_ringbuffer_begin(struct intel_engine *ring,
+					   struct i915_hw_context *ctx, int n);
+#define intel_ring_begin(ring, n) intel_ringbuffer_begin(ring, NULL, n)
 int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
 static inline void intel_ring_emit(struct intel_engine *ring,
 				   u32 data)
@@ -297,12 +302,15 @@ static inline void intel_ring_advance(struct intel_engine *ring)
 
 	ringbuf->tail &= ringbuf->size - 1;
 }
-void __intel_ring_advance(struct intel_engine *ring);
+void __intel_ring_advance(struct intel_engine *ring,
+			  struct i915_hw_context *ctx);
 
 int __must_check intel_ring_idle(struct intel_engine *ring);
 void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
-int intel_ring_flush_all_caches(struct intel_engine *ring);
-int intel_ring_invalidate_all_caches(struct intel_engine *ring);
+int intel_ring_flush_all_caches(struct intel_engine *ring,
+				struct i915_hw_context *ctx);
+int intel_ring_invalidate_all_caches(struct intel_engine *ring,
+				     struct i915_hw_context *ctx);
 
 void intel_init_rings_early(struct drm_device *dev);
 int intel_init_render_ring(struct drm_device *dev);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 23/49] drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (21 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 22/49] drm/i915/bdw: Plumbing for user LR context switching oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 24/49] drm/i915/bdw: Write a new set of context-aware ringbuffer management functions oscar.mateo
                   ` (26 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

The "__" name is too confusing, specially given the refactoring patch
that comes soon with new ringbuffer management functions.
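
For reference, a condensed sketch of what the renamed function does
(this patch changes no behaviour, only the name): wrap the software
tail, then hand it to the engine's write_tail vfunc.

	/* sketch only, simplified from intel_ringbuffer.c */
	ringbuf->tail &= ringbuf->size - 1;
	ring->write_tail(ring, ctx, ringbuf->tail);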

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  2 +-
 drivers/gpu/drm/i915/intel_display.c    | 10 +++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.c | 14 +++++++-------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++--
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 76b47a6..29583da 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -56,7 +56,7 @@
 	intel_ring_emit(LP_RING(dev_priv), x)
 
 #define ADVANCE_LP_RING() \
-	__intel_ring_advance(LP_RING(dev_priv), NULL)
+	intel_ringbuffer_advance_and_submit(LP_RING(dev_priv), NULL)
 
 /**
  * Lock test for when it's just for synchronization of ring access.
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 462c7ae..22be556 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8637,7 +8637,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8679,7 +8679,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8728,7 +8728,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8773,7 +8773,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8867,7 +8867,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring, ring->default_context);
+	intel_ringbuffer_advance_and_submit(ring, ring->default_context);
 	return 0;
 
 err_unpin:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4fbea79..ac4e618 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -43,8 +43,8 @@ static inline int ring_space(struct intel_engine *ring)
 	return space;
 }
 
-void __intel_ring_advance(struct intel_engine *ring,
-			  struct i915_hw_context *ctx)
+void intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
+					 struct i915_hw_context *ctx)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
@@ -726,7 +726,7 @@ gen6_add_request(struct intel_engine *ring,
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring, ctx);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	return 0;
 }
@@ -745,7 +745,7 @@ gen8_add_request(struct intel_engine *ring,
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring, ctx);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	return 0;
 }
@@ -859,7 +859,7 @@ pc_render_add_request(struct intel_engine *ring,
 	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, 0);
-	__intel_ring_advance(ring, ctx);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	return 0;
 }
@@ -1093,7 +1093,7 @@ i9xx_add_request(struct intel_engine *ring,
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring, ctx);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	return 0;
 }
@@ -1621,7 +1621,7 @@ static int ring_wait_for_space(struct intel_engine *ring,
 		return ret;
 
 	/* force the tail write in case we have been skipping them */
-	__intel_ring_advance(ring, ctx);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	trace_i915_ring_wait_begin(ring);
 	/* With GEM the hangcheck timer should kick us out of the loop,
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 95e29e0..cd6c52a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -302,8 +302,8 @@ static inline void intel_ring_advance(struct intel_engine *ring)
 
 	ringbuf->tail &= ringbuf->size - 1;
 }
-void __intel_ring_advance(struct intel_engine *ring,
-			  struct i915_hw_context *ctx);
+void intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
+					 struct i915_hw_context *ctx);
 
 int __must_check intel_ring_idle(struct intel_engine *ring);
 void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 24/49] drm/i915/bdw: Write a new set of context-aware ringbuffer management functions
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (22 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 23/49] drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 25/49] drm/i915: Final touches to LR contexts plumbing and refactoring oscar.mateo
                   ` (25 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Since the ringbuffer can live in the ring (pre-GEN8) or in the context (GEN8+),
we need functions that are aware of this. After this commit and some of the
previous ones, the new ringbuffer functions are:

intel_ringbuffer_get
intel_ringbuffer_begin
intel_ringbuffer_cacheline_align
intel_ringbuffer_emit
intel_ringbuffer_advance
intel_ringbuffer_advance_and_submit
intel_ringbuffer_get_tail

Some of the old names remain after the refactoring as deprecated wrappers that
simply call the new functions on the engine's default ringbuffer (a usage
sketch of the new API follows the list below):

intel_ring_begin
intel_ring_emit
intel_ring_advance
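
As an illustration only (not part of the patch), this is roughly how a
context-aware caller is expected to use the new API; emit_noops() is a
made-up helper:

	static int emit_noops(struct intel_engine *ring,
			      struct i915_hw_context *ctx)
	{
		struct intel_ringbuffer *ringbuf;

		/* Reserve space in whichever ringbuffer backs this
		 * ring/context pair. */
		ringbuf = intel_ringbuffer_begin(ring, ctx, 2);
		if (IS_ERR(ringbuf))
			return PTR_ERR(ringbuf);

		intel_ringbuffer_emit(ringbuf, MI_NOOP);
		intel_ringbuffer_emit(ringbuf, MI_NOOP);
		intel_ringbuffer_advance(ringbuf);

		return 0;
	}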

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c            |  4 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 19 +++++----
 drivers/gpu/drm/i915/intel_display.c       |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 54 +++++++++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 67 +++++++++++++++++++++++-------
 5 files changed, 95 insertions(+), 51 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0c7ba1f..a052a80 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2170,7 +2170,7 @@ int __i915_add_request(struct intel_engine *ring,
 	u32 request_ring_position, request_start;
 	int ret;
 
-	request_start = intel_ring_get_tail(ring);
+	request_start = intel_ringbuffer_get_tail(ring, ctx);
 	/*
 	 * Emit any outstanding flushes - execbuf can fail to emit the flush
 	 * after having emitted the batchbuffer command. Hence we need to fix
@@ -2191,7 +2191,7 @@ int __i915_add_request(struct intel_engine *ring,
 	 * GPU processing the request, we never over-estimate the
 	 * position of the head.
 	 */
-	request_ring_position = intel_ring_get_tail(ring);
+	request_ring_position = intel_ringbuffer_get_tail(ring, ctx);
 
 	ret = ring->add_request(ring, ctx);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index d2ef284..c0a1032 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -982,14 +982,15 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 			    struct i915_hw_context *ctx)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	int ret, i;
+	struct intel_ringbuffer *ringbuf;
+	int i;
 
 	if (!IS_GEN7(dev) || ring != &dev_priv->ring[RCS])
 		return 0;
 
-	ret = intel_ringbuffer_begin(ring, ctx, 4 * 3);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 4 * 3);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return PTR_ERR(ringbuf);
 
 	for (i = 0; i < 4; i++) {
 		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
@@ -1230,9 +1231,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	if (ring == &dev_priv->ring[RCS] &&
 	    mode != dev_priv->relative_constants_mode) {
-		ret = intel_ringbuffer_begin(ring, ctx, 4);
-		if (ret)
-				goto err;
+		struct intel_ringbuffer *ringbuf;
+
+		ringbuf = intel_ringbuffer_begin(ring, ctx, 4);
+		if (IS_ERR_OR_NULL(ringbuf)) {
+			ret = (PTR_ERR(ringbuf));
+			goto err;
+		}
 
 		intel_ring_emit(ring, MI_NOOP);
 		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 22be556..70c844f 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8832,7 +8832,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	 * then do the cacheline alignment, and finally emit the
 	 * MI_DISPLAY_FLIP.
 	 */
-	ret = intel_ring_cacheline_align(ring);
+	ret = intel_ringbuffer_cacheline_align(ring, ring->default_context);
 	if (ret)
 		goto err_unpin;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ac4e618..54aba64 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -376,9 +376,9 @@ gen8_render_ring_flush(struct intel_engine *ring,
 		       struct i915_hw_context *ctx,
 		       u32 invalidate_domains, u32 flush_domains)
 {
+	struct intel_ringbuffer *ringbuf;
 	u32 flags = 0;
 	u32 scratch_addr = ring->scratch.gtt_offset + 128;
-	int ret;
 
 	flags |= PIPE_CONTROL_CS_STALL;
 
@@ -397,9 +397,9 @@ gen8_render_ring_flush(struct intel_engine *ring,
 		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
 	}
 
-	ret = intel_ringbuffer_begin(ring, ctx, 6);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 6);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return (PTR_ERR(ringbuf));
 
 	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(6));
 	intel_ring_emit(ring, flags);
@@ -735,11 +735,11 @@ static int
 gen8_add_request(struct intel_engine *ring,
 		 struct i915_hw_context *ctx)
 {
-	int ret;
+	struct intel_ringbuffer *ringbuf;
 
-	ret = intel_ringbuffer_begin(ring, ctx, 4);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 4);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return (PTR_ERR(ringbuf));
 
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
@@ -1744,33 +1744,37 @@ static int __intel_ring_prepare(struct intel_engine *ring,
 	return 0;
 }
 
-int intel_ringbuffer_begin(struct intel_engine *ring,
-			      struct i915_hw_context *ctx,
-			      int num_dwords)
+struct intel_ringbuffer *
+intel_ringbuffer_begin(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
+		       int num_dwords)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	int ret;
 
 	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
 				   dev_priv->mm.interruptible);
 	if (ret)
-		return ret;
+		return ERR_PTR(ret);
 
 	ret = __intel_ring_prepare(ring, ctx, num_dwords * sizeof(uint32_t));
 	if (ret)
-		return ret;
+		return ERR_PTR(ret);
 
 	/* Preallocate the olr before touching the ring */
 	ret = intel_ring_alloc_seqno(ring);
 	if (ret)
-		return ret;
+		return ERR_PTR(ret);
 
-	__get_ringbuf(ring)->space -= num_dwords * sizeof(uint32_t);
-	return 0;
+	ringbuf->space -= num_dwords * sizeof(uint32_t);
+
+	return ringbuf;
 }
 
 /* Align the ring tail to a cacheline boundary */
-int intel_ring_cacheline_align(struct intel_engine *ring)
+int intel_ringbuffer_cacheline_align(struct intel_engine *ring,
+				     struct i915_hw_context *ctx)
 {
 	int num_dwords = (64 - (__get_ringbuf(ring)->tail & 63)) / sizeof(uint32_t);
 	int ret;
@@ -1845,11 +1849,11 @@ static int gen8_ring_flush(struct intel_engine *ring,
 			   u32 invalidate, u32 flush)
 {
 	uint32_t cmd;
-	int ret;
+	struct intel_ringbuffer *ringbuf;
 
-	ret = intel_ringbuffer_begin(ring, ctx, 4);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 4);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return (PTR_ERR(ringbuf));
 
 	cmd = MI_FLUSH_DW + 1;
 
@@ -1905,11 +1909,11 @@ gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	bool ppgtt = dev_priv->mm.aliasing_ppgtt != NULL &&
 		!(flags & I915_DISPATCH_SECURE);
-	int ret;
+	struct intel_ringbuffer *ringbuf;
 
-	ret = intel_ringbuffer_begin(ring, ctx, 4);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 4);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return (PTR_ERR(ringbuf));
 
 	/* FIXME(BDW): Address space and security selectors. */
 	intel_ring_emit(ring, MI_BATCH_BUFFER_START_GEN8 | (ppgtt<<8));
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index cd6c52a..101d4d4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -284,26 +284,66 @@ intel_write_status_page(struct intel_engine *ring,
 
 void intel_cleanup_ring(struct intel_engine *ring);
 
-int __must_check intel_ringbuffer_begin(struct intel_engine *ring,
-					   struct i915_hw_context *ctx, int n);
-#define intel_ring_begin(ring, n) intel_ringbuffer_begin(ring, NULL, n)
-int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
-static inline void intel_ring_emit(struct intel_engine *ring,
-				   u32 data)
+struct intel_ringbuffer *
+intel_ringbuffer_get(struct intel_engine *ring,
+		struct i915_hw_context *ctx);
+
+struct intel_ringbuffer *
+intel_ringbuffer_begin(struct intel_engine *ring,
+		struct i915_hw_context *ctx, int n);
+
+static inline int __must_check
+intel_ring_begin(struct intel_engine *ring, u32 data)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf;
+
+	ringbuf = intel_ringbuffer_begin(ring, ring->default_context, data);
+	if (IS_ERR(ringbuf))
+		return (PTR_ERR(ringbuf));
+
+	return 0;
+}
+
+int __must_check
+intel_ringbuffer_cacheline_align(struct intel_engine *ring,
+				struct i915_hw_context *ctx);
 
+static inline void
+intel_ringbuffer_emit(struct intel_ringbuffer *ringbuf, u32 data)
+{
 	iowrite32(data, ringbuf->virtual_start + ringbuf->tail);
 	ringbuf->tail += 4;
 }
-static inline void intel_ring_advance(struct intel_engine *ring)
+
+static inline void
+intel_ring_emit(struct intel_engine *ring, u32 data)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	intel_ringbuffer_emit(&ring->default_ringbuf, data);
+}
 
+static inline void
+intel_ringbuffer_advance(struct intel_ringbuffer *ringbuf)
+{
 	ringbuf->tail &= ringbuf->size - 1;
 }
-void intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
-					 struct i915_hw_context *ctx);
+
+static inline void
+intel_ring_advance(struct intel_engine *ring)
+{
+	intel_ringbuffer_advance(&ring->default_ringbuf);
+}
+
+void
+intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
+				struct i915_hw_context *ctx);
+
+static inline u32
+intel_ringbuffer_get_tail(struct intel_engine *ring,
+			struct i915_hw_context *ctx)
+{
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
+	return ringbuf->tail;
+}
 
 int __must_check intel_ring_idle(struct intel_engine *ring);
 void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
@@ -321,11 +361,6 @@ int intel_init_vebox_ring(struct drm_device *dev);
 u32 intel_ring_get_active_head(struct intel_engine *ring);
 void intel_ring_setup_status_page(struct intel_engine *ring);
 
-static inline u32 intel_ring_get_tail(struct intel_engine *ring)
-{
-	return __get_ringbuf(ring)->tail;
-}
-
 static inline u32 intel_ring_get_seqno(struct intel_engine *ring)
 {
 	BUG_ON(ring->outstanding_lazy_seqno == 0);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 25/49] drm/i915: Final touches to LR contexts plumbing and refactoring
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (23 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 24/49] drm/i915/bdw: Write a new set of context-aware ringbuffer management functions oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 26/49] drm/i915/bdw: Set the request context information correctly in the LRC case oscar.mateo
                   ` (24 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Thanks to the previous functions and intel_ringbuffer_get(), every function
that needs to be context-aware gets the ringbuffer from the appropriate place
(be it the context or the engine itself). Others (either pre-GEN8 functions or
ones that clearly manipulate the ring's default ringbuffer) get it directly
from the engine.
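
Illustrative only: the selection rule that this patch centralizes in
intel_ringbuffer_get() boils down to the following (simplified from the
hunk below):

	/* Per-context ringbuffer for GEN8+ LR contexts, the engine's
	 * default ringbuffer for everything else. */
	if (dev_priv->lrc_enabled && ctx)
		ringbuf = ctx->ringbuf;
	else
		ringbuf = &ring->default_ringbuf;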

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c            |   2 +-
 drivers/gpu/drm/i915/i915_gem.c            |   7 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  20 +++---
 drivers/gpu/drm/i915/i915_gpu_error.c      |   6 +-
 drivers/gpu/drm/i915/i915_irq.c            |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 109 ++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   8 +--
 7 files changed, 82 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 29583da..ea5d965 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -241,7 +241,7 @@ static int i915_dma_resume(struct drm_device * dev)
 
 	DRM_DEBUG_DRIVER("%s\n", __func__);
 
-	if (__get_ringbuf(ring)->virtual_start == NULL) {
+	if (ring->default_ringbuf.virtual_start == NULL) {
 		DRM_ERROR("can not ioremap virtual address for"
 			  " ring buffer\n");
 		return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a052a80..e3c3c58 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2466,6 +2466,7 @@ i915_gem_retire_requests_ring(struct intel_engine *ring)
 
 	while (!list_empty(&ring->request_list)) {
 		struct drm_i915_gem_request *request;
+		struct intel_ringbuffer *ringbuf;
 
 		request = list_first_entry(&ring->request_list,
 					   struct drm_i915_gem_request,
@@ -2475,12 +2476,16 @@ i915_gem_retire_requests_ring(struct intel_engine *ring)
 			break;
 
 		trace_i915_gem_request_retire(ring, request->seqno);
+
+		/* TODO: request->ctx is not correctly updated for LR contexts */
+		ringbuf = intel_ringbuffer_get(ring, request->ctx);
+
 		/* We know the GPU must have read the request to have
 		 * sent us the seqno + interrupt, so use the position
 		 * of tail of the request to update the last known position
 		 * of the GPU head.
 		 */
-		__get_ringbuf(ring)->last_retired_head = request->tail;
+		ringbuf->last_retired_head = request->tail;
 
 		i915_gem_free_request(request);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c0a1032..fa5a439 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -990,15 +990,15 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 
 	ringbuf = intel_ringbuffer_begin(ring, ctx, 4 * 3);
 	if (IS_ERR_OR_NULL(ringbuf))
-		return PTR_ERR(ringbuf);
+		return (PTR_ERR(ringbuf));
 
 	for (i = 0; i < 4; i++) {
-		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
-		intel_ring_emit(ring, GEN7_SO_WRITE_OFFSET(i));
-		intel_ring_emit(ring, 0);
+		intel_ringbuffer_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+		intel_ringbuffer_emit(ringbuf, GEN7_SO_WRITE_OFFSET(i));
+		intel_ringbuffer_emit(ringbuf, 0);
 	}
 
-	intel_ring_advance(ring);
+	intel_ringbuffer_advance(ringbuf);
 
 	return 0;
 }
@@ -1239,11 +1239,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 		}
 
-		intel_ring_emit(ring, MI_NOOP);
-		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
-		intel_ring_emit(ring, INSTPM);
-		intel_ring_emit(ring, mask << 16 | mode);
-		intel_ring_advance(ring);
+		intel_ringbuffer_emit(ringbuf, MI_NOOP);
+		intel_ringbuffer_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+		intel_ringbuffer_emit(ringbuf, INSTPM);
+		intel_ringbuffer_emit(ringbuf, mask << 16 | mode);
+		intel_ringbuffer_advance(ringbuf);
 
 		dev_priv->relative_constants_mode = mode;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 67a1fc7..0238efe 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -828,8 +828,8 @@ static void i915_record_ring_state(struct drm_device *dev,
 		ering->hws = I915_READ(mmio);
 	}
 
-	ering->cpu_ring_head = __get_ringbuf(ring)->head;
-	ering->cpu_ring_tail = __get_ringbuf(ring)->tail;
+	ering->cpu_ring_head = ring->default_ringbuf.head;
+	ering->cpu_ring_tail = ring->default_ringbuf.tail;
 
 	ering->hangcheck_score = ring->hangcheck.score;
 	ering->hangcheck_action = ring->hangcheck.action;
@@ -936,7 +936,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 		}
 
 		error->ring[i].ringbuffer =
-			i915_error_ggtt_object_create(dev_priv, __get_ringbuf(ring)->obj);
+			i915_error_ggtt_object_create(dev_priv, ring->default_ringbuf.obj);
 
 		if (ring->status_page.obj)
 			error->ring[i].hws_page =
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 340cf34..1ba8bb3 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2593,7 +2593,7 @@ static struct intel_engine *
 semaphore_waits_for(struct intel_engine *ring, u32 *seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	u32 cmd, ipehr, head;
 	int i;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 54aba64..fba9b05 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -33,10 +33,8 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 
-static inline int ring_space(struct intel_engine *ring)
+static inline int ring_space(struct intel_ringbuffer *ringbuf)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
-
 	int space = (ringbuf->head & HEAD_ADDR) - (ringbuf->tail + I915_RING_FREE_SPACE);
 	if (space < 0)
 		space += ringbuf->size;
@@ -47,7 +45,7 @@ void intel_ringbuffer_advance_and_submit(struct intel_engine *ring,
 					 struct i915_hw_context *ctx)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 
 	ringbuf->tail &= ringbuf->size - 1;
 	if (dev_priv->gpu_error.stop_rings & intel_ring_flag(ring))
@@ -401,13 +399,13 @@ gen8_render_ring_flush(struct intel_engine *ring,
 	if (IS_ERR_OR_NULL(ringbuf))
 		return (PTR_ERR(ringbuf));
 
-	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(6));
-	intel_ring_emit(ring, flags);
-	intel_ring_emit(ring, scratch_addr);
-	intel_ring_emit(ring, 0);
-	intel_ring_emit(ring, 0);
-	intel_ring_emit(ring, 0);
-	intel_ring_advance(ring);
+	intel_ringbuffer_emit(ringbuf, GFX_OP_PIPE_CONTROL(6));
+	intel_ringbuffer_emit(ringbuf, flags);
+	intel_ringbuffer_emit(ringbuf, scratch_addr);
+	intel_ringbuffer_emit(ringbuf, 0);
+	intel_ringbuffer_emit(ringbuf, 0);
+	intel_ringbuffer_emit(ringbuf, 0);
+	intel_ringbuffer_advance(ringbuf);
 
 	return 0;
 
@@ -451,7 +449,7 @@ static int init_ring_common(struct intel_engine *ring)
 {
 	struct drm_device *dev = ring->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	struct drm_i915_gem_object *obj = ringbuf->obj;
 	int ret = 0;
 	u32 head;
@@ -524,7 +522,7 @@ static int init_ring_common(struct intel_engine *ring)
 	else {
 		ringbuf->head = I915_READ_HEAD(ring);
 		ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
-		ringbuf->space = ring_space(ring);
+		ringbuf->space = ring_space(ringbuf);
 		ringbuf->last_retired_head = -1;
 	}
 
@@ -538,7 +536,7 @@ out:
 
 static int init_ring_common_lrc(struct intel_engine *ring)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 
 	ringbuf->head = 0;
 	ringbuf->tail = 0;
@@ -741,10 +739,10 @@ gen8_add_request(struct intel_engine *ring,
 	if (IS_ERR_OR_NULL(ringbuf))
 		return (PTR_ERR(ringbuf));
 
-	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
-	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
-	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
-	intel_ring_emit(ring, MI_USER_INTERRUPT);
+	intel_ringbuffer_emit(ringbuf, MI_STORE_DWORD_INDEX);
+	intel_ringbuffer_emit(ringbuf, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
+	intel_ringbuffer_emit(ringbuf, ring->outstanding_lazy_seqno);
+	intel_ringbuffer_emit(ringbuf, MI_USER_INTERRUPT);
 	intel_ringbuffer_advance_and_submit(ring, ctx);
 
 	return 0;
@@ -1402,7 +1400,7 @@ static int init_phys_status_page(struct intel_engine *ring)
 static void destroy_ring_buffer(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 
 	if (dev_priv->lrc_enabled)
 		return;
@@ -1417,7 +1415,7 @@ static int alloc_ring_buffer(struct intel_engine *ring)
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj = NULL;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	int ret;
 
 	if (dev_priv->lrc_enabled)
@@ -1454,7 +1452,7 @@ static int intel_init_ring(struct drm_device *dev,
 {
 	struct drm_i915_gem_object *obj;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	int ret;
 
 	ring->dev = dev;
@@ -1526,7 +1524,7 @@ err_hws:
 void intel_cleanup_ring(struct intel_engine *ring)
 {
 	struct drm_i915_private *dev_priv;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	int ret;
 
 	if (ringbuf->obj == NULL)
@@ -1553,10 +1551,11 @@ void intel_cleanup_ring(struct intel_engine *ring)
 	cleanup_status_page(ring);
 }
 
-static int intel_ring_wait_request(struct intel_engine *ring, int n)
+static int intel_ring_wait_request(struct intel_engine *ring,
+				   struct i915_hw_context *ctx, int n)
 {
 	struct drm_i915_gem_request *request;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	u32 seqno = 0, tail;
 	int ret;
 
@@ -1564,7 +1563,7 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 		ringbuf->head = ringbuf->last_retired_head;
 		ringbuf->last_retired_head = -1;
 
-		ringbuf->space = ring_space(ring);
+		ringbuf->space = ring_space(ringbuf);
 		if (ringbuf->space >= n)
 			return 0;
 	}
@@ -1600,7 +1599,7 @@ static int intel_ring_wait_request(struct intel_engine *ring, int n)
 		return ret;
 
 	ringbuf->head = tail;
-	ringbuf->space = ring_space(ring);
+	ringbuf->space = ring_space(ringbuf);
 	if (WARN_ON(ringbuf->space < n))
 		return -ENOSPC;
 
@@ -1612,11 +1611,11 @@ static int ring_wait_for_space(struct intel_engine *ring,
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	unsigned long end;
 	int ret;
 
-	ret = intel_ring_wait_request(ring, n);
+	ret = intel_ring_wait_request(ring, ctx, n);
 	if (ret != -ENOSPC)
 		return ret;
 
@@ -1633,7 +1632,7 @@ static int ring_wait_for_space(struct intel_engine *ring,
 
 	do {
 		ringbuf->head = I915_READ_HEAD(ring);
-		ringbuf->space = ring_space(ring);
+		ringbuf->space = ring_space(ringbuf);
 		if (ringbuf->space >= n) {
 			trace_i915_ring_wait_end(ring);
 			return 0;
@@ -1661,7 +1660,7 @@ static int intel_wrap_ring_buffer(struct intel_engine *ring,
 				  struct i915_hw_context *ctx)
 {
 	uint32_t __iomem *virt;
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	int rem = ringbuf->size - ringbuf->tail;
 
 	if (ringbuf->space < rem) {
@@ -1676,7 +1675,7 @@ static int intel_wrap_ring_buffer(struct intel_engine *ring,
 		iowrite32(MI_NOOP, virt++);
 
 	ringbuf->tail = 0;
-	ringbuf->space = ring_space(ring);
+	ringbuf->space = ring_space(ringbuf);
 
 	return 0;
 }
@@ -1726,7 +1725,7 @@ intel_ring_alloc_seqno(struct intel_engine *ring)
 static int __intel_ring_prepare(struct intel_engine *ring,
 				struct i915_hw_context *ctx, int bytes)
 {
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
 	int ret;
 
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
@@ -1745,6 +1744,17 @@ static int __intel_ring_prepare(struct intel_engine *ring,
 }
 
 struct intel_ringbuffer *
+intel_ringbuffer_get(struct intel_engine *ring, struct i915_hw_context *ctx)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
+	if (dev_priv->lrc_enabled && ctx)
+		return ctx->ringbuf;
+	else
+		return &ring->default_ringbuf;
+}
+
+struct intel_ringbuffer *
 intel_ringbuffer_begin(struct intel_engine *ring,
 		       struct i915_hw_context *ctx,
 		       int num_dwords)
@@ -1776,20 +1786,21 @@ intel_ringbuffer_begin(struct intel_engine *ring,
 int intel_ringbuffer_cacheline_align(struct intel_engine *ring,
 				     struct i915_hw_context *ctx)
 {
-	int num_dwords = (64 - (__get_ringbuf(ring)->tail & 63)) / sizeof(uint32_t);
-	int ret;
+	struct intel_ringbuffer *ringbuf = intel_ringbuffer_get(ring, ctx);
+	int num_dwords;
 
+	num_dwords = (64 - (ringbuf->tail & 63)) / sizeof(uint32_t);
 	if (num_dwords == 0)
 		return 0;
 
-	ret = intel_ring_begin(ring, num_dwords);
-	if (ret)
-		return ret;
+	ringbuf = intel_ringbuffer_begin(ring, ctx, num_dwords);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return PTR_ERR(ringbuf);
 
 	while (num_dwords--)
 		intel_ring_emit(ring, MI_NOOP);
 
-	intel_ring_advance(ring);
+	intel_ringbuffer_advance(ringbuf);
 
 	return 0;
 }
@@ -1860,11 +1871,11 @@ static int gen8_ring_flush(struct intel_engine *ring,
 	if (invalidate & I915_GEM_GPU_DOMAINS)
 		cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX |
 			MI_FLUSH_DW_OP_STOREDW;
-	intel_ring_emit(ring, cmd);
-	intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
-	intel_ring_emit(ring, 0); /* upper addr */
-	intel_ring_emit(ring, 0); /* value */
-	intel_ring_advance(ring);
+	intel_ringbuffer_emit(ringbuf, cmd);
+	intel_ringbuffer_emit(ringbuf, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
+	intel_ringbuffer_emit(ringbuf, 0); /* upper addr */
+	intel_ringbuffer_emit(ringbuf, 0); /* value */
+	intel_ringbuffer_advance(ringbuf);
 
 	return 0;
 }
@@ -1916,11 +1927,11 @@ gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 		return (PTR_ERR(ringbuf));
 
 	/* FIXME(BDW): Address space and security selectors. */
-	intel_ring_emit(ring, MI_BATCH_BUFFER_START_GEN8 | (ppgtt<<8));
-	intel_ring_emit(ring, offset);
-	intel_ring_emit(ring, 0);
-	intel_ring_emit(ring, MI_NOOP);
-	intel_ring_advance(ring);
+	intel_ringbuffer_emit(ringbuf, MI_BATCH_BUFFER_START_GEN8 | (ppgtt<<8));
+	intel_ringbuffer_emit(ringbuf, offset);
+	intel_ringbuffer_emit(ringbuf, 0);
+	intel_ringbuffer_emit(ringbuf, MI_NOOP);
+	intel_ringbuffer_advance(ringbuf);
 
 	return 0;
 }
@@ -2112,7 +2123,7 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_engine *ring = &dev_priv->ring[RCS];
-	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
+	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 	int ret;
 
 	if (INTEL_INFO(dev)->gen >= 6) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 101d4d4..3b0f28b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -210,16 +210,10 @@ struct intel_engine {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
-/* This is a temporary define to help us transition to per-context ringbuffers */
-static inline struct intel_ringbuffer *__get_ringbuf(struct intel_engine *ring)
-{
-	return &ring->default_ringbuf;
-}
-
 static inline bool
 intel_ring_initialized(struct intel_engine *ring)
 {
-	return __get_ringbuf(ring)->obj != NULL;
+	return ring->default_ringbuf.obj != NULL;
 }
 
 static inline unsigned
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 26/49] drm/i915/bdw: Set the request context information correctly in the LRC case
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (24 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 25/49] drm/i915: Final touches to LR contexts plumbing and refactoring oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 27/49] drm/i915/bdw: Prepare for user-created LR contexts oscar.mateo
                   ` (23 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

We need it (at least) to properly update the last retired head.
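
Sketch only (simplified from the retire path touched below): with
request->ctx set to the right context, retiring a request updates the
head of the ringbuffer that the request actually lived in:

	ringbuf = intel_ringbuffer_get(ring, request->ctx);
	ringbuf->last_retired_head = request->tail;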

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e3c3c58..e844c50 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2213,7 +2213,10 @@ int __i915_add_request(struct intel_engine *ring,
 	/* Hold a reference to the current context so that we can inspect
 	 * it later in case a hangcheck error event fires.
 	 */
-	request->ctx = ring->last_context;
+	if (dev_priv->lrc_enabled)
+		request->ctx = ctx;
+	else
+		request->ctx = ring->last_context;
 	if (request->ctx)
 		i915_gem_context_reference(request->ctx);
 
@@ -2477,7 +2480,6 @@ i915_gem_retire_requests_ring(struct intel_engine *ring)
 
 		trace_i915_gem_request_retire(ring, request->seqno);
 
-		/* TODO: request->ctx is not correctly updated for LR contexts */
 		ringbuf = intel_ringbuffer_get(ring, request->ctx);
 
 		/* We know the GPU must have read the request to have
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 27/49] drm/i915/bdw: Prepare for user-created LR contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (25 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 26/49] drm/i915/bdw: Set the request context information correctly in the LRC case oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 28/49] drm/i915/bdw: Start creating & destroying user " oscar.mateo
                   ` (22 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Now that our global default contexts are correctly created and we have
finished the refactoring, it's time to allow other kinds of contexts.

As we said earlier, logical ring contexts created by the user have their
own ringbuffer: not only the backing pages, but the whole management struct.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_lrc.c | 69 ++++++++++++++++++++++++++++++++++++-----
 2 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c0f0c3d..91b0886 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2322,6 +2322,7 @@ void gen8_gem_context_fini(struct drm_device *dev);
 struct i915_hw_context *gen8_gem_create_context(struct drm_device *dev,
 			struct intel_engine *ring,
 			struct drm_i915_file_private *file_priv, bool create_vm);
+void gen8_gem_context_free(struct i915_hw_context *ctx);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index a726b26..d5afb0d 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -75,15 +75,31 @@
 #define CTX_R_PWR_CLK_STATE		0x42
 #define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
 
+void gen8_gem_context_free(struct i915_hw_context *ctx)
+{
+	/* The ringbuffers of global default contexts are taken care of
+	 * in the fini cleanup code */
+	if (ctx->file_priv) {
+		iounmap(ctx->ringbuf->virtual_start);
+		i915_gem_object_ggtt_unpin(ctx->ringbuf->obj);
+		drm_gem_object_unreference(&ctx->ringbuf->obj->base);
+		ctx->ringbuf->obj = NULL;
+		kfree(ctx->ringbuf);
+		ctx->ringbuf = NULL;
+	}
+}
+
 struct i915_hw_context *
 gen8_gem_create_context(struct drm_device *dev,
 			struct intel_engine *ring,
 			struct drm_i915_file_private *file_priv,
 			bool create_vm)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *ctx = NULL;
 	struct drm_i915_gem_object *ring_obj = NULL;
 	struct i915_hw_ppgtt *ppgtt = NULL;
+	struct intel_ringbuffer *ringbuf = NULL;
 	struct page *page;
 	uint32_t *reg_state;
 	int ret;
@@ -94,7 +110,8 @@ gen8_gem_create_context(struct drm_device *dev,
 
 	ring_obj = i915_gem_alloc_object(dev, 32 * PAGE_SIZE);
 	if (!ring_obj) {
-		i915_gem_object_ggtt_unpin(ctx->obj);
+		if (!file_priv)
+			i915_gem_object_ggtt_unpin(ctx->obj);
 		i915_gem_context_unreference(ctx);
 		return ERR_PTR(-ENOMEM);
 	}
@@ -107,7 +124,8 @@ gen8_gem_create_context(struct drm_device *dev,
 	ret = i915_gem_obj_ggtt_pin(ring_obj, PAGE_SIZE, PIN_MAPPABLE);
 	if (ret) {
 		drm_gem_object_unreference(&ring_obj->base);
-		i915_gem_object_ggtt_unpin(ctx->obj);
+		if (!file_priv)
+			i915_gem_object_ggtt_unpin(ctx->obj);
 		i915_gem_context_unreference(ctx);
 		return ERR_PTR(ret);
 	}
@@ -117,18 +135,44 @@ gen8_gem_create_context(struct drm_device *dev,
 	if (ret)
 		goto destroy_ring_obj;
 
-	ctx->ringbuf = &ring->default_ringbuf;
+	if (file_priv) {
+		ringbuf = kzalloc(sizeof(struct intel_ringbuffer), GFP_KERNEL);
+		if (!ringbuf) {
+			DRM_ERROR("Failed to allocate ringbuffer\n");
+			ret = -ENOMEM;
+			goto destroy_ring_obj;
+		}
+
+		ringbuf->size = 32 * PAGE_SIZE;
+		ringbuf->effective_size = ringbuf->size;
+		ringbuf->head = 0;
+		ringbuf->tail = 0;
+		ringbuf->space = ringbuf->size;
+		ringbuf->last_retired_head = -1;
+		ringbuf->virtual_start = ioremap_wc(dev_priv->gtt.mappable_base +
+						i915_gem_obj_ggtt_offset(ring_obj),
+						ringbuf->size);
+		if (ringbuf->virtual_start == NULL) {
+			DRM_ERROR("Failed to map ringbuffer\n");
+			ret = -EINVAL;
+			goto destroy_ringbuf;
+		}
+
+		ctx->ringbuf = ringbuf;
+	} else {
+		ctx->ringbuf = &ring->default_ringbuf;
+	}
 	ctx->ringbuf->obj = ring_obj;
 
 	ppgtt = ctx_to_ppgtt(ctx);
 
 	ret = i915_gem_object_set_to_cpu_domain(ctx->obj, true);
 	if (ret)
-		goto destroy_ring_obj;
+		goto unmap_ringbuf;
 
 	ret = i915_gem_object_get_pages(ctx->obj);
 	if (ret)
-		goto destroy_ring_obj;
+		goto unmap_ringbuf;
 
 	i915_gem_object_pin_pages(ctx->obj);
 
@@ -213,12 +257,23 @@ gen8_gem_create_context(struct drm_device *dev,
 
 	return ctx;
 
+unmap_ringbuf:
+	if (ringbuf)
+		iounmap(ringbuf->virtual_start);
+
+destroy_ringbuf:
+	if (ringbuf) {
+		ringbuf->obj = NULL;
+		kfree(ringbuf);
+	}
+	ctx->ringbuf = NULL;
+
 destroy_ring_obj:
 	i915_gem_object_ggtt_unpin(ring_obj);
 	drm_gem_object_unreference(&ring_obj->base);
 	ctx->ringbuf->obj = NULL;
-	ctx->ringbuf = NULL;
-	i915_gem_object_ggtt_unpin(ctx->obj);
+	if (!file_priv)
+		i915_gem_object_ggtt_unpin(ctx->obj);
 	i915_gem_context_unreference(ctx);
 
 	return ERR_PTR(ret);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 28/49] drm/i915/bdw: Start creating & destroying user LR contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (26 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 27/49] drm/i915/bdw: Prepare for user-created LR contexts oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 29/49] drm/i915/bdw: Pin context pages at context create time oscar.mateo
                   ` (21 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Users can create contexts either implicitly by opening an fd, or
explicitly by the context create ioctl. In either case, this context
needs to be correctly populated with "advanced" stuff. For the moment
we consider all the user contexts to be of the render type (probably
true for the create context ioctl at this moment, but a gross
oversimplification for the fd case that we will fix later on).

Also, we need to make sure the corresponding ringbuffer and backing
pages are correctly destroyed (note that, in the case of the global
default contexts, freeing was already taken care of by the ring
cleanup code).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cb43272..6baa5ab 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -184,6 +184,11 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	struct i915_hw_context *ctx = container_of(ctx_ref,
 						   typeof(*ctx), ref);
 	struct i915_hw_ppgtt *ppgtt = NULL;
+	struct drm_device *dev = ctx->obj->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (dev_priv->lrc_enabled)
+		gen8_gem_context_free(ctx);
 
 	/* We refcount even the aliasing PPGTT to keep the code symmetric */
 	if (USES_PPGTT(ctx->obj->base.dev))
@@ -535,8 +540,13 @@ int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
 	idr_init(&file_priv->context_idr);
 
 	mutex_lock(&dev->struct_mutex);
-	file_priv->private_default_ctx =
-		i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev));
+	if (dev_priv->lrc_enabled)
+		file_priv->private_default_ctx = gen8_gem_create_context(dev,
+						&dev_priv->ring[RCS], file_priv,
+						USES_FULL_PPGTT(dev));
+	else
+		file_priv->private_default_ctx = i915_gem_create_context(dev,
+						file_priv, USES_FULL_PPGTT(dev));
 	mutex_unlock(&dev->struct_mutex);
 
 	if (IS_ERR(file_priv->private_default_ctx)) {
@@ -782,6 +792,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_gem_context_create *args = data;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *ctx;
 	int ret;
 
@@ -792,7 +803,12 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
-	ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev));
+	if (dev_priv->lrc_enabled)
+		ctx = gen8_gem_create_context(dev, &dev_priv->ring[RCS],
+					file_priv, USES_FULL_PPGTT(dev));
+	else
+		ctx = i915_gem_create_context(dev, file_priv,
+					USES_FULL_PPGTT(dev));
 	mutex_unlock(&dev->struct_mutex);
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 29/49] drm/i915/bdw: Pin context pages at context create time
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (27 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 28/49] drm/i915/bdw: Start creating & destroying user " oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 17:59 ` [PATCH 30/49] drm/i915/bdw: Extract LR context object populating oscar.mateo
                   ` (20 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

For the moment, this is simple and works all right. When we start
having a lot of contexts, this is going to become problematic.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_lrc.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index d5afb0d..74ab558 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -41,6 +41,8 @@
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 
+#define GEN8_CONTEXT_ALIGN 4096
+
 #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
 
 #define RING_ELSP(ring)			((ring)->mmio_base+0x230)
@@ -86,6 +88,7 @@ void gen8_gem_context_free(struct i915_hw_context *ctx)
 		ctx->ringbuf->obj = NULL;
 		kfree(ctx->ringbuf);
 		ctx->ringbuf = NULL;
+		i915_gem_object_ggtt_unpin(ctx->obj);
 	}
 }
 
@@ -108,10 +111,18 @@ gen8_gem_create_context(struct drm_device *dev,
 	if (IS_ERR_OR_NULL(ctx))
 		return ctx;
 
+	if (file_priv) {
+		ret = i915_gem_obj_ggtt_pin(ctx->obj, GEN8_CONTEXT_ALIGN, 0);
+		if (ret) {
+			DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
+			i915_gem_context_unreference(ctx);
+			return ERR_PTR(ret);
+		}
+	}
+
 	ring_obj = i915_gem_alloc_object(dev, 32 * PAGE_SIZE);
 	if (!ring_obj) {
-		if (!file_priv)
-			i915_gem_object_ggtt_unpin(ctx->obj);
+		i915_gem_object_ggtt_unpin(ctx->obj);
 		i915_gem_context_unreference(ctx);
 		return ERR_PTR(-ENOMEM);
 	}
@@ -124,8 +135,7 @@ gen8_gem_create_context(struct drm_device *dev,
 	ret = i915_gem_obj_ggtt_pin(ring_obj, PAGE_SIZE, PIN_MAPPABLE);
 	if (ret) {
 		drm_gem_object_unreference(&ring_obj->base);
-		if (!file_priv)
-			i915_gem_object_ggtt_unpin(ctx->obj);
+		i915_gem_object_ggtt_unpin(ctx->obj);
 		i915_gem_context_unreference(ctx);
 		return ERR_PTR(ret);
 	}
@@ -272,8 +282,7 @@ destroy_ring_obj:
 	i915_gem_object_ggtt_unpin(ring_obj);
 	drm_gem_object_unreference(&ring_obj->base);
 	ctx->ringbuf->obj = NULL;
-	if (!file_priv)
-		i915_gem_object_ggtt_unpin(ctx->obj);
+	i915_gem_object_ggtt_unpin(ctx->obj);
 	i915_gem_context_unreference(ctx);
 
 	return ERR_PTR(ret);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 30/49] drm/i915/bdw: Extract LR context object populating
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (28 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 29/49] drm/i915/bdw: Pin context pages at context create time oscar.mateo
@ 2014-03-27 17:59 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 31/49] drm/i915/bdw: Introduce dependent contexts oscar.mateo
                   ` (19 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 17:59 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

As a separate function, we can decide whether we want a context with real
information about which engine it uses, or a "blank" context for which to
make a deferred decision.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_lrc.c | 175 ++++++++++++++++++++++------------------
 1 file changed, 95 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 74ab558..124e5f2 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -92,97 +92,24 @@ void gen8_gem_context_free(struct i915_hw_context *ctx)
 	}
 }
 
-struct i915_hw_context *
-gen8_gem_create_context(struct drm_device *dev,
-			struct intel_engine *ring,
-			struct drm_i915_file_private *file_priv,
-			bool create_vm)
+static int
+intel_populate_lrc(struct i915_hw_context *ctx,
+		   struct intel_engine *ring)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_hw_context *ctx = NULL;
-	struct drm_i915_gem_object *ring_obj = NULL;
-	struct i915_hw_ppgtt *ppgtt = NULL;
-	struct intel_ringbuffer *ringbuf = NULL;
 	struct page *page;
 	uint32_t *reg_state;
+	struct i915_hw_ppgtt *ppgtt = NULL;
 	int ret;
 
-	ctx = i915_gem_create_context(dev, file_priv, create_vm);
-	if (IS_ERR_OR_NULL(ctx))
-		return ctx;
-
-	if (file_priv) {
-		ret = i915_gem_obj_ggtt_pin(ctx->obj, GEN8_CONTEXT_ALIGN, 0);
-		if (ret) {
-			DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
-			i915_gem_context_unreference(ctx);
-			return ERR_PTR(ret);
-		}
-	}
-
-	ring_obj = i915_gem_alloc_object(dev, 32 * PAGE_SIZE);
-	if (!ring_obj) {
-		i915_gem_object_ggtt_unpin(ctx->obj);
-		i915_gem_context_unreference(ctx);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	/* TODO: For now we put this in the mappable region so that we can reuse
-	 * the existing ringbuffer code which ioremaps it. When we start
-	 * creating many contexts, this will no longer work and we must switch
-	 * to a kmapish interface.
-	 */
-	ret = i915_gem_obj_ggtt_pin(ring_obj, PAGE_SIZE, PIN_MAPPABLE);
-	if (ret) {
-		drm_gem_object_unreference(&ring_obj->base);
-		i915_gem_object_ggtt_unpin(ctx->obj);
-		i915_gem_context_unreference(ctx);
-		return ERR_PTR(ret);
-	}
-
-	/* Failure at this point is almost impossible */
-	ret = i915_gem_object_set_to_gtt_domain(ring_obj, true);
-	if (ret)
-		goto destroy_ring_obj;
-
-	if (file_priv) {
-		ringbuf = kzalloc(sizeof(struct intel_ringbuffer), GFP_KERNEL);
-		if (!ringbuf) {
-			DRM_ERROR("Failed to allocate ringbuffer\n");
-			ret = -ENOMEM;
-			goto destroy_ring_obj;
-		}
-
-		ringbuf->size = 32 * PAGE_SIZE;
-		ringbuf->effective_size = ringbuf->size;
-		ringbuf->head = 0;
-		ringbuf->tail = 0;
-		ringbuf->space = ringbuf->size;
-		ringbuf->last_retired_head = -1;
-		ringbuf->virtual_start = ioremap_wc(dev_priv->gtt.mappable_base +
-						i915_gem_obj_ggtt_offset(ring_obj),
-						ringbuf->size);
-		if (ringbuf->virtual_start == NULL) {
-			DRM_ERROR("Failed to map ringbuffer\n");
-			ret = -EINVAL;
-			goto destroy_ringbuf;
-		}
-
-		ctx->ringbuf = ringbuf;
-	} else {
-		ctx->ringbuf = &ring->default_ringbuf;
-	}
-	ctx->ringbuf->obj = ring_obj;
-
 	ppgtt = ctx_to_ppgtt(ctx);
 
 	ret = i915_gem_object_set_to_cpu_domain(ctx->obj, true);
 	if (ret)
-		goto unmap_ringbuf;
+		return ret;
 
 	ret = i915_gem_object_get_pages(ctx->obj);
 	if (ret)
-		goto unmap_ringbuf;
+		return ret;
 
 	i915_gem_object_pin_pages(ctx->obj);
 
@@ -204,7 +131,7 @@ gen8_gem_create_context(struct drm_device *dev,
 	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
 	reg_state[CTX_RING_TAIL+1] = 0;
 	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
-	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
+	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ctx->ringbuf->obj);
 	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
 	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
 	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
@@ -265,6 +192,94 @@ gen8_gem_create_context(struct drm_device *dev,
 	set_page_dirty(page);
 	i915_gem_object_unpin_pages(ctx->obj);
 
+	return 0;
+}
+
+struct i915_hw_context *
+gen8_gem_create_context(struct drm_device *dev,
+			struct intel_engine *ring,
+			struct drm_i915_file_private *file_priv,
+			bool create_vm)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_hw_context *ctx = NULL;
+	struct drm_i915_gem_object *ring_obj = NULL;
+	struct intel_ringbuffer *ringbuf = NULL;
+	int ret;
+
+	ctx = i915_gem_create_context(dev, file_priv, create_vm);
+	if (IS_ERR_OR_NULL(ctx))
+		return ctx;
+
+	if (file_priv) {
+		ret = i915_gem_obj_ggtt_pin(ctx->obj, GEN8_CONTEXT_ALIGN, 0);
+		if (ret) {
+			DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
+			i915_gem_context_unreference(ctx);
+			return ERR_PTR(ret);
+		}
+	}
+
+	ring_obj = i915_gem_alloc_object(dev, 32 * PAGE_SIZE);
+	if (!ring_obj) {
+		i915_gem_object_ggtt_unpin(ctx->obj);
+		i915_gem_context_unreference(ctx);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	/* TODO: For now we put this in the mappable region so that we can reuse
+	 * the existing ringbuffer code which ioremaps it. When we start
+	 * creating many contexts, this will no longer work and we must switch
+	 * to a kmapish interface.
+	 */
+	ret = i915_gem_obj_ggtt_pin(ring_obj, PAGE_SIZE, PIN_MAPPABLE);
+	if (ret) {
+		drm_gem_object_unreference(&ring_obj->base);
+		i915_gem_object_ggtt_unpin(ctx->obj);
+		i915_gem_context_unreference(ctx);
+		return ERR_PTR(ret);
+	}
+
+	/* Failure at this point is almost impossible */
+	ret = i915_gem_object_set_to_gtt_domain(ring_obj, true);
+	if (ret)
+		goto destroy_ring_obj;
+
+	if (file_priv) {
+		ringbuf = kzalloc(sizeof(struct intel_ringbuffer), GFP_KERNEL);
+		if (!ringbuf) {
+			DRM_ERROR("Failed to allocate ringbuffer\n");
+			ret = -ENOMEM;
+			goto destroy_ring_obj;
+		}
+
+		ringbuf->size = 32 * PAGE_SIZE;
+		ringbuf->effective_size = ringbuf->size;
+		ringbuf->head = 0;
+		ringbuf->tail = 0;
+		ringbuf->space = ringbuf->size;
+		ringbuf->last_retired_head = -1;
+		ringbuf->virtual_start = ioremap_wc(dev_priv->gtt.mappable_base +
+						i915_gem_obj_ggtt_offset(ring_obj),
+						ringbuf->size);
+		if (ringbuf->virtual_start == NULL) {
+			DRM_ERROR("Failed to map ringbuffer\n");
+			ret = -EINVAL;
+			goto destroy_ringbuf;
+		}
+
+		ctx->ringbuf = ringbuf;
+	} else {
+		ctx->ringbuf = &ring->default_ringbuf;
+	}
+	ctx->ringbuf->obj = ring_obj;
+
+	if (ring) {
+		ret = intel_populate_lrc(ctx, ring);
+		if (ret)
+			goto unmap_ringbuf;
+	}
+
 	return ctx;
 
 unmap_ringbuf:
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 31/49] drm/i915/bdw: Introduce dependent contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (29 preceding siblings ...)
  2014-03-27 17:59 ` [PATCH 30/49] drm/i915/bdw: Extract LR context object populating oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 17:21   ` Mateo Lozano, Oscar
  2014-03-27 18:00 ` [PATCH 32/49] drm/i915/bdw: Create stand-alone and " oscar.mateo
                   ` (18 subsequent siblings)
  49 siblings, 1 reply; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

From here on, we define a stand-alone context as the first context with
a given ID to be created for a new fd or a new context create ioctl. This
is the one we can easily find using integer ID management. On the other
hand, dependent contexts are subsequently created with the same ID and
simply hang from the stand-alone one.

This patch, together with the two previous and the next, is meant to
solve a big problem we have: with execlists, we need contexts to work with
all engines, and we cannot reuse one context for more than one engine.

Because, on a new fd or a context create ioctl, we really don't know which
engine is going to be used later on, we are going to create at that point
a "blank" context and assign it to an engine in a deferred way (during the
execbuffer, to be precise). If we later execbuffer on a different engine,
we create a new dependent context hanging from the previous one.

Note: I have tried to colour this patch in a different way, using a different
struct (a "context group") to hold the context ID from where the per-engine
contexts hang, but it makes legacy contexts unnecessarily complex.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  6 +++++-
 drivers/gpu/drm/i915/i915_gem_context.c | 17 +++++++++++++--
 drivers/gpu/drm/i915/i915_lrc.c         | 37 ++++++++++++++++++++++++++++++---
 3 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 91b0886..d9470a4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -602,6 +602,9 @@ struct i915_hw_context {
 	struct i915_address_space *vm;
 
 	struct list_head link;
+
+	/* Advanced contexts only */
+	struct list_head dependent_contexts;
 };
 
 struct i915_fbc {
@@ -2321,7 +2324,8 @@ int gen8_gem_context_init(struct drm_device *dev);
 void gen8_gem_context_fini(struct drm_device *dev);
 struct i915_hw_context *gen8_gem_create_context(struct drm_device *dev,
 			struct intel_engine *ring,
-			struct drm_i915_file_private *file_priv, bool create_vm);
+			struct drm_i915_file_private *file_priv,
+			struct i915_hw_context *standalone_ctx, bool create_vm);
 void gen8_gem_context_free(struct i915_hw_context *ctx);
 
 /* i915_gem_evict.c */
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 6baa5ab..17015b2 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -271,6 +271,8 @@ __create_hw_context(struct drm_device *dev,
 	 * is no remap info, it will be a NOP. */
 	ctx->remap_slice = (1 << NUM_L3_SLICES(dev)) - 1;
 
+	INIT_LIST_HEAD(&ctx->dependent_contexts);
+
 	return ctx;
 
 err_out:
@@ -511,6 +513,12 @@ int i915_gem_context_enable(struct drm_i915_private *dev_priv)
 static int context_idr_cleanup(int id, void *p, void *data)
 {
 	struct i915_hw_context *ctx = p;
+	struct i915_hw_context *cursor, *tmp;
+
+	list_for_each_entry_safe(cursor, tmp, &ctx->dependent_contexts, dependent_contexts) {
+		list_del(&cursor->dependent_contexts);
+		i915_gem_context_unreference(cursor);
+	}
 
 	/* Ignore the default context because close will handle it */
 	if (i915_gem_context_is_default(ctx))
@@ -543,7 +551,7 @@ int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
 	if (dev_priv->lrc_enabled)
 		file_priv->private_default_ctx = gen8_gem_create_context(dev,
 						&dev_priv->ring[RCS], file_priv,
-						USES_FULL_PPGTT(dev));
+						NULL, USES_FULL_PPGTT(dev));
 	else
 		file_priv->private_default_ctx = i915_gem_create_context(dev,
 						file_priv, USES_FULL_PPGTT(dev));
@@ -805,7 +813,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 
 	if (dev_priv->lrc_enabled)
 		ctx = gen8_gem_create_context(dev, &dev_priv->ring[RCS],
-					file_priv, USES_FULL_PPGTT(dev));
+					file_priv, NULL, USES_FULL_PPGTT(dev));
 	else
 		ctx = i915_gem_create_context(dev, file_priv,
 					USES_FULL_PPGTT(dev));
@@ -825,6 +833,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	struct drm_i915_gem_context_destroy *args = data;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct i915_hw_context *ctx;
+	struct i915_hw_context *cursor, *tmp;
 	int ret;
 
 	if (args->ctx_id == DEFAULT_CONTEXT_ID)
@@ -841,6 +850,10 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	}
 
 	idr_remove(&ctx->file_priv->context_idr, ctx->id);
+	list_for_each_entry_safe(cursor, tmp, &ctx->dependent_contexts, dependent_contexts) {
+		list_del(&cursor->dependent_contexts);
+		i915_gem_context_unreference(cursor);
+	}
 	i915_gem_context_unreference(ctx);
 	mutex_unlock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 124e5f2..99011cc 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -195,23 +195,54 @@ intel_populate_lrc(struct i915_hw_context *ctx,
 	return 0;
 }
 
+static void assert_on_ppgtt_release(struct kref *kref)
+{
+	WARN(1, "Are we trying to free the aliasing PPGTT?\n");
+}
+
 struct i915_hw_context *
 gen8_gem_create_context(struct drm_device *dev,
 			struct intel_engine *ring,
 			struct drm_i915_file_private *file_priv,
+			struct i915_hw_context *standalone_ctx,
 			bool create_vm)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *ctx = NULL;
 	struct drm_i915_gem_object *ring_obj = NULL;
 	struct intel_ringbuffer *ringbuf = NULL;
+	bool is_dependent;
 	int ret;
 
-	ctx = i915_gem_create_context(dev, file_priv, create_vm);
+	/* NB: a standalone context is the first context with a given id to be
+	 * created for a new fd. Dependent contexts simply hang from the stand-alone,
+	 * sharing their ID and their PPGTT */
+	is_dependent = (file_priv != NULL) && (standalone_ctx != NULL);
+
+	ctx = i915_gem_create_context(dev, is_dependent? NULL : file_priv,
+					is_dependent? false : create_vm);
 	if (IS_ERR_OR_NULL(ctx))
 		return ctx;
 
-	if (file_priv) {
+	if (is_dependent) {
+		struct i915_hw_ppgtt *ppgtt;
+
+		/* We take the same PPGTT as the standalone */
+		ppgtt = ctx_to_ppgtt(ctx);
+		kref_put(&ppgtt->ref, assert_on_ppgtt_release);
+		ppgtt = ctx_to_ppgtt(standalone_ctx);
+		ctx->vm = &ppgtt->base;
+		kref_get(&ppgtt->ref);
+
+		ctx->file_priv = file_priv;
+		ctx->id = standalone_ctx->id;
+		ctx->remap_slice = standalone_ctx->remap_slice;
+
+		list_add_tail(&ctx->dependent_contexts,
+				&standalone_ctx->dependent_contexts);
+	}
+
+	if (file_priv && !is_dependent) {
 		ret = i915_gem_obj_ggtt_pin(ctx->obj, GEN8_CONTEXT_ALIGN, 0);
 		if (ret) {
 			DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
@@ -337,7 +368,7 @@ int gen8_gem_context_init(struct drm_device *dev)
 
 	for_each_ring(ring, dev_priv, ring_id) {
 		ring->default_context = gen8_gem_create_context(dev, ring,
-						NULL, (ring_id == RCS));
+					NULL, NULL, (ring_id == RCS));
 		if (IS_ERR_OR_NULL(ring->default_context)) {
 			ret = PTR_ERR(ring->default_context);
 			DRM_DEBUG_DRIVER("Create ctx failed: %d\n", ret);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 32/49] drm/i915/bdw: Create stand-alone and dependent contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (30 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 31/49] drm/i915/bdw: Introduce dependent contexts oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 33/49] drm/i915/bdw: Allow non-default, non-render user LR contexts oscar.mateo
                   ` (17 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

On execbuffer, either...:

A) there is no standalone context (and we error out: the user-provided ctx id is invalid).
B) the standalone context is the one we are looking for (and we return it).
C) the standalone context is blank (and we populate and return it).
D) one of the dependent contexts is the one we want (and we return it).
E) none of the above (and we create and populate a new dependent context).

Note that, historically, HW contexts other than the default one have only worked
for the render ring (in Full PPGTT, contexts were used for other engines in a
tricky way: the render ring would really receive MI_SET_CONTEXTs while the other
engines would simply use their contexts to know which PPGTT to switch to). Now we
are going to really save and restore contexts for other engines, but we still don't
allow the context create ioctl to work with them (since it changes the ABI).
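
For illustration, the lookup order above (cases A through E) can be sketched
in plain C; the types and helpers below are simplified stand-ins, not the
driver's real API:

#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for the driver's real types and helpers. */
struct ctx {
	int ring_id;                /* engine this context is populated for */
	bool is_blank;              /* created but not yet tied to an engine */
	struct ctx *next_dependent; /* simplified dependent-contexts list */
};

static void populate_lrc(struct ctx *c, int ring_id)
{
	c->ring_id = ring_id;
	c->is_blank = false;
}

static struct ctx *create_dependent(struct ctx *standalone, int ring_id)
{
	struct ctx *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	populate_lrc(c, ring_id);
	c->next_dependent = standalone->next_dependent;
	standalone->next_dependent = c;
	return c;
}

/* The execbuffer-time lookup, cases A through E above. */
static struct ctx *validate_ctx(struct ctx *standalone, int ring_id)
{
	struct ctx *cursor;

	if (!standalone)                    /* A: invalid user ctx id */
		return NULL;
	if (standalone->is_blank) {         /* C: populate the blank one */
		populate_lrc(standalone, ring_id);
		return standalone;
	}
	if (standalone->ring_id == ring_id) /* B: it is the one we want */
		return standalone;
	for (cursor = standalone->next_dependent; cursor;
	     cursor = cursor->next_dependent)
		if (cursor->ring_id == ring_id) /* D: a dependent matches */
			return cursor;
	return create_dependent(standalone, ring_id); /* E: new dependent */
}

int main(void)
{
	struct ctx standalone = { .ring_id = 0, .is_blank = true };

	validate_ctx(&standalone, 0); /* case C */
	validate_ctx(&standalone, 2); /* case E */
	return validate_ctx(&standalone, 2) != standalone.next_dependent;
}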

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  5 +++
 drivers/gpu/drm/i915/i915_gem_context.c    |  8 ++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  5 ++-
 drivers/gpu/drm/i915/i915_lrc.c            | 59 ++++++++++++++++++++++++++++++
 4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d9470a4..a9f807b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -605,6 +605,8 @@ struct i915_hw_context {
 
 	/* Advanced contexts only */
 	struct list_head dependent_contexts;
+	bool is_blank;
+	enum intel_ring_id ring_id;
 };
 
 struct i915_fbc {
@@ -2322,6 +2324,9 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 /* i915_lrc.c */
 int gen8_gem_context_init(struct drm_device *dev);
 void gen8_gem_context_fini(struct drm_device *dev);
+struct i915_hw_context *gen8_gem_validate_context(struct drm_device *dev,
+			struct drm_file *file,
+			struct intel_engine *ring, const u32 ctx_id);
 struct i915_hw_context *gen8_gem_create_context(struct drm_device *dev,
 			struct intel_engine *ring,
 			struct drm_i915_file_private *file_priv,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 17015b2..a4e878e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -550,8 +550,8 @@ int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
 	mutex_lock(&dev->struct_mutex);
 	if (dev_priv->lrc_enabled)
 		file_priv->private_default_ctx = gen8_gem_create_context(dev,
-						&dev_priv->ring[RCS], file_priv,
-						NULL, USES_FULL_PPGTT(dev));
+						NULL, file_priv, NULL,
+						USES_FULL_PPGTT(dev));
 	else
 		file_priv->private_default_ctx = i915_gem_create_context(dev,
 						file_priv, USES_FULL_PPGTT(dev));
@@ -812,8 +812,8 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return ret;
 
 	if (dev_priv->lrc_enabled)
-		ctx = gen8_gem_create_context(dev, &dev_priv->ring[RCS],
-					file_priv, NULL, USES_FULL_PPGTT(dev));
+		ctx = gen8_gem_create_context(dev, NULL, file_priv, NULL,
+						USES_FULL_PPGTT(dev));
 	else
 		ctx = i915_gem_create_context(dev, file_priv,
 					USES_FULL_PPGTT(dev));
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index fa5a439..72bda74 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1131,7 +1131,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		goto pre_mutex_err;
 	}
 
-	ctx = i915_gem_validate_context(dev, file, ring, ctx_id);
+	if (dev_priv->lrc_enabled)
+		ctx = gen8_gem_validate_context(dev, file, ring, ctx_id);
+	else
+		ctx = i915_gem_validate_context(dev, file, ring, ctx_id);
 	if (IS_ERR(ctx)) {
 		mutex_unlock(&dev->struct_mutex);
 		ret = PTR_ERR(ctx);
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 99011cc..3065bca 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -192,6 +192,9 @@ intel_populate_lrc(struct i915_hw_context *ctx,
 	set_page_dirty(page);
 	i915_gem_object_unpin_pages(ctx->obj);
 
+	ctx->ring_id = ring->id;
+	ctx->is_blank = false;
+
 	return 0;
 }
 
@@ -309,6 +312,9 @@ gen8_gem_create_context(struct drm_device *dev,
 		ret = intel_populate_lrc(ctx, ring);
 		if (ret)
 			goto unmap_ringbuf;
+	} else {
+		WARN(is_dependent, "Dependent but blank context!\n");
+		ctx->is_blank = true;
 	}
 
 	return ctx;
@@ -334,6 +340,59 @@ destroy_ring_obj:
 	return ERR_PTR(ret);
 }
 
+struct i915_hw_context *
+gen8_gem_validate_context(struct drm_device *dev, struct drm_file *file,
+			  struct intel_engine *ring, const u32 ctx_id)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_hw_context *ctx = NULL;
+	struct i915_hw_context *cursor = NULL;
+	struct i915_ctx_hang_stats *hs;
+	bool found = false;
+	int ret;
+
+	/* There is no reason why we cannot accept non-default, non-render contexts,
+	 * other than it changes the ABI (these kind of custom contexts have not been
+	 * allowed before) */
+	if (ring->id != RCS && ctx_id != DEFAULT_CONTEXT_ID)
+		return ERR_PTR(-EINVAL);
+
+	ctx = (struct i915_hw_context *)idr_find(&file_priv->context_idr, ctx_id);
+	if (!ctx)
+		return ERR_PTR(-ENOENT);
+
+	if (ctx->is_blank) {
+		ret = intel_populate_lrc(ctx, ring);
+		if (ret)
+			return ERR_PTR(ret);
+	}
+	else if (ctx->ring_id != ring->id) {
+		list_for_each_entry(cursor, &ctx->dependent_contexts, dependent_contexts) {
+			if (cursor->ring_id == ring->id) {
+				found = true;
+				break;
+			}
+		}
+
+		if (found)
+			ctx = cursor;
+		else {
+			ctx = gen8_gem_create_context(dev, ring, file_priv,
+					ctx, USES_FULL_PPGTT(dev));
+			if (!ctx)
+				return ERR_PTR(-ENOENT);
+		}
+	}
+
+	hs = &ctx->hang_stats;
+	if (hs->banned) {
+		DRM_DEBUG("Context %u tried to submit while banned\n", ctx_id);
+		return ERR_PTR(-EIO);
+	}
+
+	return ctx;
+}
+
 void gen8_gem_context_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 33/49] drm/i915/bdw: Allow non-default, non-render user LR contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (31 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 32/49] drm/i915/bdw: Create stand-alone and " oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 34/49] drm/i915/bdw: Fix reset stats ioctl with " oscar.mateo
                   ` (16 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

This commit changes the ABI, so it is provided separately so that it can be
dropped by the maintainer if he so wishes.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_lrc.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 3065bca..9d1e7f3 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -351,12 +351,6 @@ gen8_gem_validate_context(struct drm_device *dev, struct drm_file *file,
 	bool found = false;
 	int ret;
 
-	/* There is no reason why we cannot accept non-default, non-render contexts,
-	 * other than it changes the ABI (these kind of custom contexts have not been
-	 * allowed before) */
-	if (ring->id != RCS && ctx_id != DEFAULT_CONTEXT_ID)
-		return ERR_PTR(-EINVAL);
-
 	ctx = (struct i915_hw_context *)idr_find(&file_priv->context_idr, ctx_id);
 	if (!ctx)
 		return ERR_PTR(-ENOENT);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 34/49] drm/i915/bdw: Fix reset stats ioctl with LR contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (32 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 33/49] drm/i915/bdw: Allow non-default, non-render user LR contexts oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 35/49] drm/i915: Allocate an integer ID for each new file descriptor oscar.mateo
                   ` (15 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Since we cannot tell apart which specific context the user refers to,
get stats from all the per-engine contexts with the same ID.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index c3832d9..4fe3e4a 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -931,6 +931,21 @@ int i915_get_reset_stats_ioctl(struct drm_device *dev,
 	args->batch_active = hs->batch_active;
 	args->batch_pending = hs->batch_pending;
 
+	if (dev_priv->lrc_enabled) {
+		struct i915_hw_context *cursor = NULL;
+		list_for_each_entry(cursor, &ctx->dependent_contexts, dependent_contexts) {
+			hs = &cursor->hang_stats;
+
+			if (capable(CAP_SYS_ADMIN))
+				args->reset_count += i915_reset_count(&dev_priv->gpu_error);
+			else
+				args->reset_count = 0;
+
+			args->batch_active += hs->batch_active;
+			args->batch_pending += hs->batch_pending;
+		}
+	}
+
 	mutex_unlock(&dev->struct_mutex);
 
 	return 0;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 35/49] drm/i915: Allocate an integer ID for each new file descriptor
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (33 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 34/49] drm/i915/bdw: Fix reset stats ioctl with " oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 36/49] drm/i915/bdw: Prepare for a 20-bits globally unique submission ID oscar.mateo
                   ` (14 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Since context IDs are not globally unique anymore (they are only unique
for a given file descriptor), we can use the new file_priv ID in
combination with the context ID to unequivocally refer to a context.

The ID 0 remains reserved for internal i915 use (meaning no file_priv).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c | 14 +++++++++++---
 drivers/gpu/drm/i915/i915_drv.h |  3 +++
 drivers/gpu/drm/i915/i915_gem.c |  9 +++++++++
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index ea5d965..7188403 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1567,6 +1567,8 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	device_info = (struct intel_device_info *)&dev_priv->info;
 	*device_info = *info;
 
+	idr_init(&dev_priv->filepriv_idr);
+
 	spin_lock_init(&dev_priv->irq_lock);
 	spin_lock_init(&dev_priv->gpu_error.lock);
 	spin_lock_init(&dev_priv->backlight_lock);
@@ -1867,6 +1869,8 @@ int i915_driver_unload(struct drm_device *dev)
 	if (dev_priv->slab)
 		kmem_cache_destroy(dev_priv->slab);
 
+	idr_destroy(&dev_priv->filepriv_idr);
+
 	pci_dev_put(dev_priv->bridge_dev);
 	kfree(dev->dev_private);
 
@@ -1917,11 +1921,15 @@ void i915_driver_lastclose(struct drm_device * dev)
 	i915_dma_cleanup(dev);
 }
 
-void i915_driver_preclose(struct drm_device * dev, struct drm_file *file_priv)
+void i915_driver_preclose(struct drm_device * dev, struct drm_file *file)
 {
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+
 	mutex_lock(&dev->struct_mutex);
-	i915_gem_context_close(dev, file_priv);
-	i915_gem_release(dev, file_priv);
+	idr_remove(&dev_priv->filepriv_idr, file_priv->id);
+	i915_gem_context_close(dev, file);
+	i915_gem_release(dev, file);
 	mutex_unlock(&dev->struct_mutex);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a9f807b..2f6e55d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1268,6 +1268,8 @@ typedef struct drm_i915_private {
 
 	const struct intel_device_info info;
 
+	struct idr filepriv_idr;
+
 	int relative_constants_mode;
 
 	void __iomem *regs;
@@ -1676,6 +1678,7 @@ struct drm_i915_gem_request {
 struct drm_i915_file_private {
 	struct drm_i915_private *dev_priv;
 	struct drm_file *file;
+	int id;
 
 	struct {
 		spinlock_t lock;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e844c50..07d88847 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4879,6 +4879,7 @@ i915_gem_file_idle_work_handler(struct work_struct *work)
 
 int i915_gem_open(struct drm_device *dev, struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_file_private *file_priv;
 	int ret;
 
@@ -4892,6 +4893,14 @@ int i915_gem_open(struct drm_device *dev, struct drm_file *file)
 	file_priv->dev_priv = dev->dev_private;
 	file_priv->file = file;
 
+	ret = idr_alloc(&dev_priv->filepriv_idr, file_priv, 0, 0, GFP_KERNEL);
+	if (ret < 0) {
+		kfree(file_priv);
+		return ret;
+	}
+
+	file_priv->id = ret;
+
 	spin_lock_init(&file_priv->mm.lock);
 	INIT_LIST_HEAD(&file_priv->mm.request_list);
 	INIT_DELAYED_WORK(&file_priv->mm.idle_work,
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 36/49] drm/i915/bdw: Prepare for a 20-bits globally unique submission ID
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (34 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 35/49] drm/i915: Allocate an integer ID for each new file descriptor oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 37/49] drm/i915/bdw: Implement context switching (somewhat) oscar.mateo
                   ` (13 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

It consists of 12 bits for the filepriv ID, 5 bits for the context
ID and 3 bits for the ring ID.

Note: this changes the ABI (only 4096 file descriptors are now allowed,
with 8 contexts per fd) and will break some IGT tests (those that open
a large number of fds). If required, I can try to rewrite this so that
only legacy ring contexts are affected (as it stands now, legacy hw
context limits are also modified).
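
As a quick illustration of the packing, here is a self-contained sketch that
mirrors the shifts in the patch below (the example values are made up):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define RING_ID_BITS		3
#define CONTEXT_ID_BITS		5
#define FILEPRIV_ID_BITS	12

/* Pack the 20-bit submission ID: filepriv[19:8] | ctx[7:3] | ring[2:0]. */
static uint32_t pack_submission_id(uint32_t filepriv_id, uint32_t ctx_id,
				   uint32_t ring_id)
{
	assert(ring_id < (1u << RING_ID_BITS));
	assert(ctx_id < (1u << CONTEXT_ID_BITS));
	assert(filepriv_id < (1u << FILEPRIV_ID_BITS));

	return ring_id |
	       (ctx_id << RING_ID_BITS) |
	       (filepriv_id << (CONTEXT_ID_BITS + RING_ID_BITS));
}

int main(void)
{
	/* e.g. fd ID 5, context ID 2, ring ID 1 (VCS) -> 0x00511 */
	printf("0x%05x\n", pack_submission_id(5, 2, 1));
	return 0;
}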

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  5 ++++-
 drivers/gpu/drm/i915/i915_gem.c         |  3 ++-
 drivers/gpu/drm/i915/i915_gem_context.c |  4 ++--
 drivers/gpu/drm/i915/i915_lrc.c         | 18 ++++++++++++++++++
 4 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2f6e55d..ba3d262 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -588,7 +588,8 @@ struct i915_ctx_hang_stats {
 };
 
 /* This must match up with the value previously used for execbuf2.rsvd1. */
-#define DEFAULT_CONTEXT_ID 0
+#define DEFAULT_CONTEXT_ID	0
+#define CONTEXT_ID_BITS		5
 struct i915_hw_context {
 	struct kref ref;
 	int id;
@@ -1262,6 +1263,8 @@ struct intel_pipe_crc {
 	wait_queue_head_t wq;
 };
 
+#define MIN_FILEPRIV_ID		1
+#define FILEPRIV_ID_BITS	12
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 07d88847..10bb50f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4893,7 +4893,8 @@ int i915_gem_open(struct drm_device *dev, struct drm_file *file)
 	file_priv->dev_priv = dev->dev_private;
 	file_priv->file = file;
 
-	ret = idr_alloc(&dev_priv->filepriv_idr, file_priv, 0, 0, GFP_KERNEL);
+	ret = idr_alloc(&dev_priv->filepriv_idr, file_priv, MIN_FILEPRIV_ID,
+			(1 << FILEPRIV_ID_BITS) - 1, GFP_KERNEL);
 	if (ret < 0) {
 		kfree(file_priv);
 		return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a4e878e..1322e00 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -259,8 +259,8 @@ __create_hw_context(struct drm_device *dev,
 	if (file_priv == NULL)
 		return ctx;
 
-	ret = idr_alloc(&file_priv->context_idr, ctx, DEFAULT_CONTEXT_ID, 0,
-			GFP_KERNEL);
+	ret = idr_alloc(&file_priv->context_idr, ctx, DEFAULT_CONTEXT_ID,
+			(1 << CONTEXT_ID_BITS) - 1, GFP_KERNEL);
 	if (ret < 0)
 		goto err_out;
 
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 9d1e7f3..91e7ea6 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -77,6 +77,24 @@
 #define CTX_R_PWR_CLK_STATE		0x42
 #define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
 
+static inline u32 get_submission_id(struct i915_hw_context *ctx)
+{
+	struct drm_i915_file_private *file_priv = ctx->file_priv;
+	u32 submission_id;
+
+	if (file_priv) {
+		WARN(ctx->ring_id & ~0x7, "Ring ID > 3 bits!\n");
+		submission_id = ctx->ring_id;
+		submission_id |= (ctx->id << 3);
+		submission_id |= (file_priv->id << (CONTEXT_ID_BITS + 3));
+	} else {
+		submission_id = ctx->ring_id;
+		submission_id |= (ctx->id << 3);
+	}
+
+	return submission_id;
+}
+
 void gen8_gem_context_free(struct i915_hw_context *ctx)
 {
 	/* Global default contexts' ringbuffers are taken care of
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 37/49] drm/i915/bdw: Implement context switching (somewhat)
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (35 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 36/49] drm/i915/bdw: Prepare for a 20-bits globally unique submission ID oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 38/49] drm/i915/bdw: Add forcewake lock around ELSP writes oscar.mateo
                   ` (12 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

A context switch occurs by submitting a context descriptor to the
ExecList Submission Port. Given that we can now initialize a context,
it's possible to begin implementing the context switch by creating the
descriptor and submitting it to ELSP (actually two, since the ELSP
has two ports).

The context object must be mapped in the GGTT, which means it must exist
in the 0-4GB graphics VA range.
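
As a reference for the layout, here is a minimal userspace sketch of the
descriptor packing (the constants are taken from the patch below; the context
address is a made-up, page-aligned example):

#include <stdint.h>
#include <stdio.h>

#define GEN8_CTX_VALID		(1 << 0)
#define GEN8_CTX_L3LLC_COHERENT	(1 << 5)
#define GEN8_CTX_PRIVILEGE	(1 << 8)
#define GEN8_CTX_MODE_SHIFT	3
#define LEGACY_CONTEXT		1
#define GEN8_CTX_UNUSED_SHIFT	32

/* Lower dword: flags plus the 4K-aligned GGTT address of the context
 * object (the LRCA); upper dword: the globally unique submission ID. */
static uint64_t make_descriptor(uint64_t lrca, uint32_t submission_id)
{
	uint64_t desc = GEN8_CTX_VALID;

	desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
	desc |= GEN8_CTX_L3LLC_COHERENT;
	desc |= GEN8_CTX_PRIVILEGE;
	desc |= lrca; /* must be page aligned and below 4GB */
	desc |= (uint64_t)submission_id << GEN8_CTX_UNUSED_SHIFT;

	return desc;
}

int main(void)
{
	/* Hypothetical LRCA -> prints 0x0000051112345129 */
	printf("%#018llx\n",
	       (unsigned long long)make_descriptor(0x12345000ull, 0x511));
	return 0;
}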

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: This code has changed quite a lot in various rebases. Of particular
importance is that now we use the globally unique Submission ID to send
to the hardware. Also, context pages are now pinned unconditionally to
GGTT, so there is no need to bind them.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_lrc.c | 84 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 91e7ea6..aa190a2 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -77,6 +77,28 @@
 #define CTX_R_PWR_CLK_STATE		0x42
 #define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
 
+#define GEN8_CTX_VALID (1<<0)
+#define GEN8_CTX_FORCE_PD_RESTORE (1<<1)
+#define GEN8_CTX_FORCE_RESTORE (1<<2)
+#define GEN8_CTX_L3LLC_COHERENT (1<<5)
+#define GEN8_CTX_PRIVILEGE (1<<8)
+enum {
+	ADVANCED_CONTEXT=0,
+	LEGACY_CONTEXT,
+	ADVANCED_AD_CONTEXT,
+	LEGACY_64B_CONTEXT
+};
+#define GEN8_CTX_MODE_SHIFT 3
+enum {
+	FAULT_AND_HANG=0,
+	FAULT_AND_HALT, /* Debug only */
+	FAULT_AND_STREAM,
+	FAULT_AND_CONTINUE /* Unsupported */
+};
+#define GEN8_CTX_FAULT_SHIFT 6
+#define GEN8_CTX_LRCA_SHIFT 12
+#define GEN8_CTX_UNUSED_SHIFT 32
+
 static inline u32 get_submission_id(struct i915_hw_context *ctx)
 {
 	struct drm_i915_file_private *file_priv = ctx->file_priv;
@@ -95,6 +117,68 @@ static inline u32 get_submission_id(struct i915_hw_context *ctx)
 	return submission_id;
 }
 
+static inline uint64_t get_descriptor(struct i915_hw_context *ctx)
+{
+	uint64_t desc;
+	u32 submission_id = get_submission_id(ctx);
+
+	BUG_ON(i915_gem_obj_ggtt_offset(ctx->obj) & 0xFFFFFFFF00000000ULL);
+
+	desc = GEN8_CTX_VALID;
+	desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
+	desc |= i915_gem_obj_ggtt_offset(ctx->obj);
+	desc |= GEN8_CTX_L3LLC_COHERENT;
+	desc |= (u64)submission_id << GEN8_CTX_UNUSED_SHIFT;
+	desc |= GEN8_CTX_PRIVILEGE;
+
+	/* TODO: WaDisableLiteRestore when we start using semaphore
+	 * signalling between Command Streamers */
+	/* desc |= GEN8_CTX_FORCE_RESTORE; */
+
+	return desc;
+}
+
+static void submit_execlist(struct intel_engine *ring,
+			    struct i915_hw_context *ctx0,
+			    struct i915_hw_context *ctx1)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	uint64_t temp = 0;
+	uint32_t desc[4];
+
+	/* XXX: You must always write both descriptors in the order below. */
+	if (ctx1)
+		temp = get_descriptor(ctx1);
+	else
+		temp = 0;
+	desc[1] = (u32)(temp >> 32);
+	desc[0] = (u32)temp;
+
+	temp = get_descriptor(ctx0);
+	desc[3] = (u32)(temp >> 32);
+	desc[2] = (u32)temp;
+
+	I915_WRITE(RING_ELSP(ring), desc[1]);
+	I915_WRITE(RING_ELSP(ring), desc[0]);
+	I915_WRITE(RING_ELSP(ring), desc[3]);
+	/* The context is automatically loaded after the following */
+	I915_WRITE(RING_ELSP(ring), desc[2]);
+}
+
+static int gen8_switch_context(struct intel_engine *ring,
+		struct i915_hw_context *to0, u32 tail0,
+		struct i915_hw_context *to1, u32 tail1)
+{
+	BUG_ON(!i915_gem_obj_is_pinned(to0->obj));
+
+	if (to1)
+		BUG_ON(!i915_gem_obj_is_pinned(to1->obj));
+
+	submit_execlist(ring, to0, to1);
+
+	return 0;
+}
+
 void gen8_gem_context_free(struct i915_hw_context *ctx)
 {
 	/* Global default contexts' ringbuffers are taken care of
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 38/49] drm/i915/bdw: Add forcewake lock around ELSP writes
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (36 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 37/49] drm/i915/bdw: Implement context switching (somewhat) oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style oscar.mateo
                   ` (11 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Thomas Daniel <thomas.daniel@intel.com>

BSPEC says: SW must set the Force Wakeup bit to prevent GT from entering
C6 while ELSP writes are in progress. Since we now take forcewake
explicitly around the whole sequence, the ELSP writes themselves use a
new raw register accessor (I915_RAW_WRITE) that bypasses the per-access
forcewake handling in the uncore mmio helpers.

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Acked-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_lrc.c | 15 +++++++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ba3d262..b28b785 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2602,6 +2602,7 @@ void vlv_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine);
 
 #define I915_READ(reg)		dev_priv->uncore.funcs.mmio_readl(dev_priv, (reg), true)
 #define I915_WRITE(reg, val)	dev_priv->uncore.funcs.mmio_writel(dev_priv, (reg), (val), true)
+#define I915_RAW_WRITE(reg, val)	writel(val, dev_priv->regs + reg)
 #define I915_READ_NOTRACE(reg)		dev_priv->uncore.funcs.mmio_readl(dev_priv, (reg), false)
 #define I915_WRITE_NOTRACE(reg, val)	dev_priv->uncore.funcs.mmio_writel(dev_priv, (reg), (val), false)
 
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index aa190a2..6948df1 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -158,11 +158,18 @@ static void submit_execlist(struct intel_engine *ring,
 	desc[3] = (u32)(temp >> 32);
 	desc[2] = (u32)temp;
 
-	I915_WRITE(RING_ELSP(ring), desc[1]);
-	I915_WRITE(RING_ELSP(ring), desc[0]);
-	I915_WRITE(RING_ELSP(ring), desc[3]);
+	/* Set Force Wakeup bit to prevent GT from entering C6 while
+	 * ELSP writes are in progress */
+	gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
+
+	I915_RAW_WRITE(RING_ELSP(ring), desc[1]);
+	I915_RAW_WRITE(RING_ELSP(ring), desc[0]);
+	I915_RAW_WRITE(RING_ELSP(ring), desc[3]);
 	/* The context is automatically loaded after the following */
-	I915_WRITE(RING_ELSP(ring), desc[2]);
+	I915_RAW_WRITE(RING_ELSP(ring), desc[2]);
+
+	/* Release Force Wakeup */
+	gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
 }
 
 static int gen8_switch_context(struct intel_engine *ring,
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (37 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 38/49] drm/i915/bdw: Add forcewake lock around ELSP writes oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-31 16:42   ` Damien Lespiau
  2014-04-02 13:47   ` Damien Lespiau
  2014-03-27 18:00 ` [PATCH 40/49] drm/i915/bdw: Write the tail pointer, " oscar.mateo
                   ` (10 subsequent siblings)
  49 siblings, 2 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Each logical ring context has the PDPs in the context object, so update
them before submission. This should work both for Aliasing PPGTT
(nothing will be changed) and Full PPGTT.

Also, don't write PDP in the legacy way when using logical ring contexts
(this is mostly for correctness so that we know we are running the
LR context correctly).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c |  4 ++++
 drivers/gpu/drm/i915/i915_lrc.c     | 34 +++++++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index e5911ec..9f39b7f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -220,11 +220,15 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 			  struct intel_engine *ring,
 			  bool synchronous)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	int i, ret;
 
 	/* bit of a hack to find the actual last used pd */
 	int used_pd = ppgtt->num_pd_entries / GEN8_PDES_PER_PAGE;
 
+	if (dev_priv->lrc_enabled)
+		return 0;
+
 	for (i = used_pd - 1; i >= 0; i--) {
 		dma_addr_t addr = ppgtt->pd_dma_addr[i];
 		ret = gen8_write_pdp(ring, i, addr, synchronous);
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 6948df1..9984a54 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -172,15 +172,47 @@ static void submit_execlist(struct intel_engine *ring,
 	gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
 }
 
+static int gen8_write_pdp_ctx(struct i915_hw_context *ctx,
+				   struct i915_hw_ppgtt *ppgtt)
+{
+	struct page *page;
+	uint32_t *reg_state;
+
+	page = i915_gem_object_get_page(ctx->obj, 1);
+	reg_state = kmap_atomic(page);
+
+	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
+	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
+	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
+	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
+	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
+	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
+	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
+	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
+
+	kunmap_atomic(reg_state);
+
+	return 0;
+}
+
 static int gen8_switch_context(struct intel_engine *ring,
 		struct i915_hw_context *to0, u32 tail0,
 		struct i915_hw_context *to1, u32 tail1)
 {
+	struct i915_hw_ppgtt *ppgtt;
+
 	BUG_ON(!i915_gem_obj_is_pinned(to0->obj));
 
-	if (to1)
+	ppgtt = ctx_to_ppgtt(to0);
+	gen8_write_pdp_ctx(to0, ppgtt);
+
+	if (to1) {
 		BUG_ON(!i915_gem_obj_is_pinned(to1->obj));
 
+		ppgtt = ctx_to_ppgtt(to1);
+		gen8_write_pdp_ctx(to1, ppgtt);
+	}
+
 	submit_execlist(ring, to0, to1);
 
 	return 0;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 40/49] drm/i915/bdw: Write the tail pointer, LRC style
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (38 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 41/49] drm/i915/bdw: LR context switch interrupts oscar.mateo
                   ` (9 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Writing the tail pointer for the context ringbuffer is quite similar to
the legacy ringbuffers. The primary difference is that each context has
the ringbuffer pointers in the context object.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_lrc.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 9984a54..e564bac 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -172,6 +172,19 @@ static void submit_execlist(struct intel_engine *ring,
 	gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
 }
 
+static void gen8_write_tail_ctx(struct i915_hw_context *ctx, u32 value)
+{
+	struct page *page;
+	uint32_t *reg_state;
+
+	page = i915_gem_object_get_page(ctx->obj, 1);
+	reg_state = kmap_atomic(page);
+
+	reg_state[CTX_RING_TAIL+1] = value;
+
+	kunmap_atomic(reg_state);
+}
+
 static int gen8_write_pdp_ctx(struct i915_hw_context *ctx,
 				   struct i915_hw_ppgtt *ppgtt)
 {
@@ -205,12 +218,14 @@ static int gen8_switch_context(struct intel_engine *ring,
 
 	ppgtt = ctx_to_ppgtt(to0);
 	gen8_write_pdp_ctx(to0, ppgtt);
+	gen8_write_tail_ctx(to0, tail0);
 
 	if (to1) {
 		BUG_ON(!i915_gem_obj_is_pinned(to1->obj));
 
 		ppgtt = ctx_to_ppgtt(to1);
 		gen8_write_pdp_ctx(to1, ppgtt);
+		gen8_write_tail_ctx(to1, tail1);
 	}
 
 	submit_execlist(ring, to0, to1);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 41/49] drm/i915/bdw: LR context switch interrupts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (39 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 40/49] drm/i915/bdw: Write the tail pointer, " oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-04-02 11:42   ` Damien Lespiau
  2014-03-27 18:00 ` [PATCH 42/49] drm/i915/bdw: Get prepared for a two-stage execlist submit process oscar.mateo
                   ` (8 subsequent siblings)
  49 siblings, 1 reply; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Thomas Daniel <thomas.daniel@intel.com>

We need to attend to context switch interrupts from all rings. This also fixes
the IMR/IER writes and adds HWSTAM programming at ring init time.

Notice that, if added to irq_enable_mask, the context switch interrupts would
be incorrectly masked out whenever user interrupts are disabled because no one
is waiting on a sequence number. Therefore, this commit adds a bitmask of
interrupts to be kept unmasked at all times.
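
A self-contained sketch of the masking arithmetic (the interrupt bit
positions here are illustrative, not taken from this patch):

#include <stdint.h>
#include <stdio.h>

#define GT_RENDER_USER_INTERRUPT	(1u << 0)
#define GT_CONTEXT_SWITCH_INTERRUPT	(1u << 8) /* illustrative position */

/* IMR is an inverted mask: a 0 bit unmasks the interrupt. Keeping the
 * context switch bit in a separate "keep" mask means that toggling user
 * interrupts on and off never masks it out by accident. */
static uint32_t imr_value(uint32_t enable_mask, uint32_t keep_mask,
			  int user_irq_on)
{
	uint32_t unmasked = keep_mask;

	if (user_irq_on)
		unmasked |= enable_mask;
	return ~unmasked;
}

int main(void)
{
	uint32_t en = GT_RENDER_USER_INTERRUPT;
	uint32_t keep = GT_CONTEXT_SWITCH_INTERRUPT;

	printf("waiters:    IMR=%#010x\n", imr_value(en, keep, 1));
	printf("no waiters: IMR=%#010x\n", imr_value(en, keep, 0));
	return 0;
}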

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Acked-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c         | 28 ++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_reg.h         |  2 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 33 +++++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 4 files changed, 42 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1ba8bb3..56657b5 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1334,7 +1334,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 				       struct drm_i915_private *dev_priv,
 				       u32 master_ctl)
 {
-	u32 rcs, bcs, vcs;
+	u32 rcs, bcs, vcs, vecs;
 	uint32_t tmp = 0;
 	irqreturn_t ret = IRQ_NONE;
 
@@ -1348,6 +1348,8 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 				notify_ring(dev, &dev_priv->ring[RCS]);
 			if (bcs & GT_RENDER_USER_INTERRUPT)
 				notify_ring(dev, &dev_priv->ring[BCS]);
+			if ((rcs | bcs) & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
 			I915_WRITE(GEN8_GT_IIR(0), tmp);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT0)!\n");
@@ -1360,6 +1362,8 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 			vcs = tmp >> GEN8_VCS1_IRQ_SHIFT;
 			if (vcs & GT_RENDER_USER_INTERRUPT)
 				notify_ring(dev, &dev_priv->ring[VCS]);
+			if (vcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
 			I915_WRITE(GEN8_GT_IIR(1), tmp);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT1)!\n");
@@ -1369,9 +1373,11 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 		tmp = I915_READ(GEN8_GT_IIR(3));
 		if (tmp) {
 			ret = IRQ_HANDLED;
-			vcs = tmp >> GEN8_VECS_IRQ_SHIFT;
-			if (vcs & GT_RENDER_USER_INTERRUPT)
+			vecs = tmp >> GEN8_VECS_IRQ_SHIFT;
+			if (vecs & GT_RENDER_USER_INTERRUPT)
 				notify_ring(dev, &dev_priv->ring[VECS]);
+			if (vecs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
 			I915_WRITE(GEN8_GT_IIR(3), tmp);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT3)!\n");
@@ -3244,12 +3250,17 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 	/* These are interrupts we'll toggle with the ring mask register */
 	uint32_t gt_interrupts[] = {
 		GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
-			GT_RENDER_L3_PARITY_ERROR_INTERRUPT |
-			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT,
+		GT_RENDER_L3_PARITY_ERROR_INTERRUPT |
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
+		GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT |
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT,
 		GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
-			GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT,
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
+		GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT |
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT,
 		0,
-		GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT
+		GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT |
+		GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT,
 		};
 
 	for (i = 0; i < ARRAY_SIZE(gt_interrupts); i++) {
@@ -3258,9 +3269,10 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 			DRM_ERROR("Interrupt (%d) should have been masked in pre-install 0x%08x\n",
 				  i, tmp);
 		I915_WRITE(GEN8_GT_IMR(i), ~gt_interrupts[i]);
+		POSTING_READ(GEN8_GT_IMR(i));
 		I915_WRITE(GEN8_GT_IER(i), gt_interrupts[i]);
+		POSTING_READ(GEN8_GT_IER(i));
 	}
-	POSTING_READ(GEN8_GT_IER(0));
 }
 
 static void gen8_de_irq_postinstall(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index adcb9c7..117825e 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -709,6 +709,7 @@ enum punit_power_well {
 #define RING_ACTHD(base)	((base)+0x74)
 #define RING_NOPID(base)	((base)+0x94)
 #define RING_IMR(base)		((base)+0xa8)
+#define RING_HWSTAM(base)	((base)+0x98)
 #define RING_TIMESTAMP(base)	((base)+0x358)
 #define   TAIL_ADDR		0x001FFFF8
 #define   HEAD_WRAP_COUNT	0xFFE00000
@@ -4100,6 +4101,7 @@ enum punit_power_well {
 #define GEN8_GT_IMR(which) (0x44304 + (0x10 * (which)))
 #define GEN8_GT_IIR(which) (0x44308 + (0x10 * (which)))
 #define GEN8_GT_IER(which) (0x4430c + (0x10 * (which)))
+#define   GEN8_GT_CONTEXT_SWITCH_INTERRUPT	(1<<8)
 
 #define GEN8_BCS_IRQ_SHIFT 16
 #define GEN8_RCS_IRQ_SHIFT 0
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index fba9b05..230740e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -536,6 +536,8 @@ out:
 
 static int init_ring_common_lrc(struct intel_engine *ring)
 {
+	struct drm_device *dev = ring->dev;
+	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_ringbuffer *ringbuf = &ring->default_ringbuf;
 
 	ringbuf->head = 0;
@@ -543,6 +545,9 @@ static int init_ring_common_lrc(struct intel_engine *ring)
 	ringbuf->space = ringbuf->size;
 	ringbuf->last_retired_head = -1;
 
+	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
+	I915_WRITE(RING_HWSTAM(ring->mmio_base), ~(ring->irq_enable_mask | ring->irq_keep_mask));
+
 	return 0;
 }
 
@@ -1189,13 +1194,7 @@ gen8_ring_get_irq(struct intel_engine *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (ring->irq_refcount++ == 0) {
-		if (HAS_L3_DPF(dev) && ring->id == RCS) {
-			I915_WRITE_IMR(ring,
-				       ~(ring->irq_enable_mask |
-					 GT_RENDER_L3_PARITY_ERROR_INTERRUPT));
-		} else {
-			I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
-		}
+		I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
 		POSTING_READ(RING_IMR(ring->mmio_base));
 	}
 	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
@@ -1212,12 +1211,7 @@ gen8_ring_put_irq(struct intel_engine *ring)
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
 	if (--ring->irq_refcount == 0) {
-		if (HAS_L3_DPF(dev) && ring->id == RCS) {
-			I915_WRITE_IMR(ring,
-				       ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
-		} else {
-			I915_WRITE_IMR(ring, ~0);
-		}
+		I915_WRITE_IMR(ring, ~ring->irq_keep_mask);
 		POSTING_READ(RING_IMR(ring->mmio_base));
 	}
 	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
@@ -2032,16 +2026,21 @@ int intel_init_render_ring(struct drm_device *dev)
 			if (dev_priv->lrc_enabled) {
 				ring->write_tail = gen8_write_tail_lrc;
 				ring->init = init_render_ring_lrc;
+				ring->irq_keep_mask =
+				GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT;
 			}
 			ring->add_request = gen8_add_request;
 			ring->flush = gen8_render_ring_flush;
 			ring->irq_get = gen8_ring_get_irq;
 			ring->irq_put = gen8_ring_put_irq;
+			if (HAS_L3_DPF(dev))
+				ring->irq_keep_mask |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
 		} else {
 			ring->irq_get = gen6_ring_get_irq;
 			ring->irq_put = gen6_ring_put_irq;
 		}
-		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
+		ring->irq_enable_mask =
+			GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		ring->sync_to = gen6_ring_sync;
@@ -2202,6 +2201,8 @@ int intel_init_bsd_ring(struct drm_device *dev)
 			if (dev_priv->lrc_enabled) {
 				ring->write_tail = gen8_write_tail_lrc;
 				ring->init = init_ring_common_lrc;
+				ring->irq_keep_mask =
+				GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 			}
 			ring->flush = gen8_ring_flush;
 			ring->add_request = gen8_add_request;
@@ -2262,6 +2263,8 @@ int intel_init_blt_ring(struct drm_device *dev)
 		if (dev_priv->lrc_enabled) {
 			ring->write_tail = gen8_write_tail_lrc;
 			ring->init = init_ring_common_lrc;
+			ring->irq_keep_mask =
+			GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 		}
 		ring->flush = gen8_ring_flush;
 		ring->add_request = gen8_add_request;
@@ -2304,6 +2307,8 @@ int intel_init_vebox_ring(struct drm_device *dev)
 		if (dev_priv->lrc_enabled) {
 			ring->write_tail = gen8_write_tail_lrc;
 			ring->init = init_ring_common_lrc;
+			ring->irq_keep_mask =
+			GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 		}
 		ring->flush = gen8_ring_flush;
 		ring->add_request = gen8_add_request;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 3b0f28b..9fbb2d5 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -93,6 +93,7 @@ struct intel_engine {
 
 	unsigned irq_refcount; /* protected by dev_priv->irq_lock */
 	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
+	u32		irq_keep_mask;		/* bitmask for interrupts that should not be masked */
 	u32		trace_irq_seqno;
 	u32		sync_seqno[I915_NUM_RINGS-1];
 	bool __must_check (*irq_get)(struct intel_engine *ring);
-- 
1.9.0

* [PATCH 42/49] drm/i915/bdw: Get prepared for a two-stage execlist submit process
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (40 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 41/49] drm/i915/bdw: LR context switch interrupts oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-04-04 11:12   ` Damien Lespiau
  2014-03-27 18:00 ` [PATCH 43/49] drm/i915/bdw: Handle context switch events oscar.mateo
                   ` (7 subsequent siblings)
  49 siblings, 1 reply; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Michel Thierry <michel.thierry@intel.com>

Context switches (and execlist submissions) should happen only when
other contexts are not active; otherwise, preemption occurs.

To ensure this, we place context switch requests in a queue, and those
requests are later consumed when the right context switch interrupt is
received.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>

v2: Use a spinlock, do not remove the requests on unqueue (wait for
context switch completion).
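
The queueing side then looks roughly like this (a sketch; note the
allocation failure check, which the final version should have too):

	struct drm_i915_gem_request *req;
	unsigned long flags;
	bool was_empty;

	req = kmalloc(sizeof(*req), GFP_KERNEL);
	if (req == NULL)
		return -ENOMEM;
	req->ring = ring;
	req->ctx = to;
	i915_gem_context_reference(req->ctx);
	req->tail = tail;

	spin_lock_irqsave(&ring->execlist_lock, flags);
	was_empty = list_empty(&ring->execlist_queue);
	list_add_tail(&req->execlist_link, &ring->execlist_queue);
	if (was_empty)
		/* The GPU is idle: submit right away. */
		gen8_switch_context_unqueue(ring);
	spin_unlock_irqrestore(&ring->execlist_lock, flags);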

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>

v3: Several rebases and code changes. Use unique ID.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  6 ++++
 drivers/gpu/drm/i915/i915_lrc.c         | 56 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  8 +++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +++
 4 files changed, 74 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b28b785..2607664 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1676,6 +1676,9 @@ struct drm_i915_gem_request {
 	struct drm_i915_file_private *file_priv;
 	/** file_priv list entry for this request */
 	struct list_head client_list;
+
+	/** execlist queue entry for this request */
+	struct list_head execlist_link;
 };
 
 struct drm_i915_file_private {
@@ -2338,6 +2341,9 @@ struct i915_hw_context *gen8_gem_create_context(struct drm_device *dev,
 			struct drm_i915_file_private *file_priv,
 			struct i915_hw_context *standalone_ctx, bool create_vm);
 void gen8_gem_context_free(struct i915_hw_context *ctx);
+int gen8_switch_context_queue(struct intel_engine *ring,
+			      struct i915_hw_context *to,
+			      u32 tail);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index e564bac..4cacabb 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -233,6 +233,62 @@ static int gen8_switch_context(struct intel_engine *ring,
 	return 0;
 }
 
+static void gen8_switch_context_unqueue(struct intel_engine *ring)
+{
+	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
+	struct drm_i915_gem_request *cursor = NULL, *tmp = NULL;
+
+	if (list_empty(&ring->execlist_queue))
+		return;
+
+	/* Try to read in pairs */
+	list_for_each_entry_safe(cursor, tmp, &ring->execlist_queue, execlist_link) {
+		if (!req0)
+			req0 = cursor;
+		else if (get_submission_id(req0->ctx) == get_submission_id(cursor->ctx)) {
+			/* Same ID: ignore first request, as second request
+			 * will update tail past first request's workload */
+			list_del(&req0->execlist_link);
+			i915_gem_context_unreference(req0->ctx);
+			kfree(req0);
+			req0 = cursor;
+		} else {
+			req1 = cursor;
+			break;
+		}
+	}
+
+	BUG_ON(gen8_switch_context(ring, req0->ctx, req0->tail,
+			req1? req1->ctx : NULL, req1? req1->tail : 0));
+}
+
+int gen8_switch_context_queue(struct intel_engine *ring,
+			      struct i915_hw_context *to,
+			      u32 tail)
+{
+	struct drm_i915_gem_request *req = NULL;
+	unsigned long flags;
+	bool was_empty;
+
+	req = (struct drm_i915_gem_request *)
+		kmalloc(sizeof(struct drm_i915_gem_request), GFP_KERNEL);
+	req->ring = ring;
+	req->ctx = to;
+	i915_gem_context_reference(req->ctx);
+	req->tail = tail;
+
+	spin_lock_irqsave(&ring->execlist_lock, flags);
+
+	was_empty = list_empty(&ring->execlist_queue);
+	list_add_tail(&req->execlist_link, &ring->execlist_queue);
+	if (was_empty)
+		gen8_switch_context_unqueue(ring);
+
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
+
+	return 0;
+}
+
 void gen8_gem_context_free(struct i915_hw_context *ctx)
 {
 	/* Global default contexts ringbuffers are taken care of
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 230740e..a92bede 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2387,6 +2387,8 @@ void intel_init_rings_early(struct drm_device *dev)
 	dev_priv->ring[RCS].dev = dev;
 	dev_priv->ring[RCS].default_ringbuf.head = 0;
 	dev_priv->ring[RCS].default_ringbuf.tail = 0;
+	INIT_LIST_HEAD(&dev_priv->ring[RCS].execlist_queue);
+	spin_lock_init(&dev_priv->ring[RCS].execlist_lock);
 
 	dev_priv->ring[BCS].name = "blitter ring";
 	dev_priv->ring[BCS].id = BCS;
@@ -2394,6 +2396,8 @@ void intel_init_rings_early(struct drm_device *dev)
 	dev_priv->ring[BCS].dev = dev;
 	dev_priv->ring[BCS].default_ringbuf.head = 0;
 	dev_priv->ring[BCS].default_ringbuf.tail = 0;
+	INIT_LIST_HEAD(&dev_priv->ring[BCS].execlist_queue);
+	spin_lock_init(&dev_priv->ring[BCS].execlist_lock);
 
 	dev_priv->ring[VCS].name = "bsd ring";
 	dev_priv->ring[VCS].id = VCS;
@@ -2404,6 +2408,8 @@ void intel_init_rings_early(struct drm_device *dev)
 	dev_priv->ring[VCS].dev = dev;
 	dev_priv->ring[VCS].default_ringbuf.head = 0;
 	dev_priv->ring[VCS].default_ringbuf.tail = 0;
+	INIT_LIST_HEAD(&dev_priv->ring[VCS].execlist_queue);
+	spin_lock_init(&dev_priv->ring[VCS].execlist_lock);
 
 	dev_priv->ring[VECS].name = "video enhancement ring";
 	dev_priv->ring[VECS].id = VECS;
@@ -2411,4 +2417,6 @@ void intel_init_rings_early(struct drm_device *dev)
 	dev_priv->ring[VECS].dev = dev;
 	dev_priv->ring[VECS].default_ringbuf.head = 0;
 	dev_priv->ring[VECS].default_ringbuf.tail = 0;
+	INIT_LIST_HEAD(&dev_priv->ring[VECS].execlist_queue);
+	spin_lock_init(&dev_priv->ring[VECS].execlist_lock);
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9fbb2d5..5f4fd3c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -167,6 +167,10 @@ struct intel_engine {
 	 * Do an explicit TLB flush before MI_SET_CONTEXT
 	 */
 	bool itlb_before_ctx_switch;
+
+	spinlock_t execlist_lock;
+	struct list_head execlist_queue;
+
 	struct i915_hw_context *default_context;
 	struct i915_hw_context *last_context;
 
-- 
1.9.0

* [PATCH 43/49] drm/i915/bdw: Handle context switch events
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (41 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 42/49] drm/i915/bdw: Get prepared for a two-stage execlist submit process oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-04-03 14:24   ` Damien Lespiau
  2014-04-26  0:53   ` Robert Beckett
  2014-03-27 18:00 ` [PATCH 44/49] drm/i915/bdw: Display execlists info in debugfs oscar.mateo
                   ` (6 subsequent siblings)
  49 siblings, 2 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Thomas Daniel <thomas.daniel@intel.com>

Handle all context status events in the context status buffer on every
context switch interrupt. We only remove work from the execlist queue
after the context status buffer reports that a context has completed,
and we only attempt to schedule new contexts on interrupt when a
previously submitted context completes (unless no contexts are queued,
in which case the GPU is idle).
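
The context status buffer (CSB) is a ring of six two-dword entries, so
the pointer handling in the interrupt handler boils down to this sketch
(modulo-6 unwrap; register names as used in this patch):

	u32 status, status_id;
	u8 read_pointer = ring->next_context_status_buffer;
	u8 write_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring)) & 0x07;

	if (read_pointer > write_pointer)
		write_pointer += 6;	/* unwrap */

	while (read_pointer < write_pointer) {
		read_pointer++;
		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
				(read_pointer % 6) * 8);
		status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
				(read_pointer % 6) * 8 + 4);
		/* ...retire the head request if status_id matches... */
	}

	ring->next_context_status_buffer = write_pointer % 6;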

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>

v2: Unreferencing the context when we are freeing the request might free
the backing bo, which requires the struct_mutex to be grabbed, so defer
unreferencing and freeing to a bottom half.
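
In other words, the interrupt handler only does (sketch):

	queue_work(dev_priv->wq, &req->work);

and the work function, running in process context, takes struct_mutex,
unreferences the context and frees the request.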

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |   3 +
 drivers/gpu/drm/i915/i915_irq.c         |  28 ++++++---
 drivers/gpu/drm/i915/i915_lrc.c         | 101 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.c |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
 5 files changed, 123 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2607664..4c8cf52 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1679,6 +1679,8 @@ struct drm_i915_gem_request {
 
 	/** execlist queue entry for this request */
 	struct list_head execlist_link;
+	/** Struct to handle this request in the bottom half of an interrupt */
+	struct work_struct work;
 };
 
 struct drm_i915_file_private {
@@ -2344,6 +2346,7 @@ void gen8_gem_context_free(struct i915_hw_context *ctx);
 int gen8_switch_context_queue(struct intel_engine *ring,
 			      struct i915_hw_context *to,
 			      u32 tail);
+void gen8_handle_context_events(struct intel_engine *ring);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 56657b5..6e0f456 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1334,6 +1334,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 				       struct drm_i915_private *dev_priv,
 				       u32 master_ctl)
 {
+	struct intel_engine *ring;
 	u32 rcs, bcs, vcs, vecs;
 	uint32_t tmp = 0;
 	irqreturn_t ret = IRQ_NONE;
@@ -1342,14 +1343,21 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 		tmp = I915_READ(GEN8_GT_IIR(0));
 		if (tmp) {
 			ret = IRQ_HANDLED;
+
 			rcs = tmp >> GEN8_RCS_IRQ_SHIFT;
-			bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
+			ring = &dev_priv->ring[RCS];
 			if (rcs & GT_RENDER_USER_INTERRUPT)
-				notify_ring(dev, &dev_priv->ring[RCS]);
+				notify_ring(dev, ring);
+			if (rcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+				gen8_handle_context_events(ring);
+
+			bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
+			ring = &dev_priv->ring[BCS];
 			if (bcs & GT_RENDER_USER_INTERRUPT)
-				notify_ring(dev, &dev_priv->ring[BCS]);
-			if ((rcs | bcs) & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
-			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
+				notify_ring(dev, ring);
+			if (bcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
+				gen8_handle_context_events(ring);
+
 			I915_WRITE(GEN8_GT_IIR(0), tmp);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT0)!\n");
@@ -1360,10 +1368,11 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 		if (tmp) {
 			ret = IRQ_HANDLED;
 			vcs = tmp >> GEN8_VCS1_IRQ_SHIFT;
+			ring = &dev_priv->ring[VCS];
 			if (vcs & GT_RENDER_USER_INTERRUPT)
-				notify_ring(dev, &dev_priv->ring[VCS]);
+				notify_ring(dev, ring);
 			if (vcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
-			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
+				gen8_handle_context_events(ring);
 			I915_WRITE(GEN8_GT_IIR(1), tmp);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT1)!\n");
@@ -1374,10 +1383,11 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
 		if (tmp) {
 			ret = IRQ_HANDLED;
 			vecs = tmp >> GEN8_VECS_IRQ_SHIFT;
+			ring = &dev_priv->ring[VECS];
 			if (vecs & GT_RENDER_USER_INTERRUPT)
-				notify_ring(dev, &dev_priv->ring[VECS]);
+				notify_ring(dev, ring);
 			if (vecs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
-			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
+				gen8_handle_context_events(ring);
 			I915_WRITE(GEN8_GT_IIR(3), tmp);
 		} else
 			DRM_ERROR("The master control interrupt lied (GT3)!\n");
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 4cacabb..440da11 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -46,7 +46,24 @@
 #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
 
 #define RING_ELSP(ring)			((ring)->mmio_base+0x230)
+#define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
 #define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
+#define RING_CONTEXT_STATUS_BUF(ring)	((ring)->mmio_base+0x370)
+#define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
+
+#define RING_EXECLIST_QFULL		(1 << 0x2)
+#define RING_EXECLIST1_VALID		(1 << 0x3)
+#define RING_EXECLIST0_VALID		(1 << 0x4)
+#define RING_EXECLIST_ACTIVE_STATUS	(3 << 0xE)
+#define RING_EXECLIST1_ACTIVE		(1 << 0x11)
+#define RING_EXECLIST0_ACTIVE		(1 << 0x12)
+
+#define GEN8_CTX_STATUS_IDLE_ACTIVE	(1 << 0)
+#define GEN8_CTX_STATUS_PREEMPTED	(1 << 1)
+#define GEN8_CTX_STATUS_ELEMENT_SWITCH	(1 << 2)
+#define GEN8_CTX_STATUS_ACTIVE_IDLE	(1 << 3)
+#define GEN8_CTX_STATUS_COMPLETE	(1 << 4)
+#define GEN8_CTX_STATUS_LITE_RESTORE	(1 << 15)
 
 #define CTX_LRI_HEADER_0		0x01
 #define CTX_CONTEXT_CONTROL		0x02
@@ -237,6 +254,9 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
 {
 	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
 	struct drm_i915_gem_request *cursor = NULL, *tmp = NULL;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
+	assert_spin_locked(&ring->execlist_lock);
 
 	if (list_empty(&ring->execlist_queue))
 		return;
@@ -249,8 +269,7 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
 			/* Same ID: ignore first request, as second request
 			 * will update tail past first request's workload */
 			list_del(&req0->execlist_link);
-			i915_gem_context_unreference(req0->ctx);
-			kfree(req0);
+			queue_work(dev_priv->wq, &req0->work);
 			req0 = cursor;
 		} else {
 			req1 = cursor;
@@ -262,6 +281,83 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
 			req1? req1->ctx : NULL, req1? req1->tail : 0));
 }
 
+static bool check_remove_request(struct intel_engine *ring, u32 request_id)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_request *head_req;
+
+	assert_spin_locked(&ring->execlist_lock);
+
+	head_req = list_first_entry_or_null(&ring->execlist_queue,
+			struct drm_i915_gem_request, execlist_link);
+	if (head_req != NULL) {
+		if (get_submission_id(head_req->ctx) == request_id) {
+			list_del(&head_req->execlist_link);
+			queue_work(dev_priv->wq, &head_req->work);
+			return true;
+		}
+	}
+
+	return false;
+}
+
+void gen8_handle_context_events(struct intel_engine *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	u32 status_pointer;
+	u8 read_pointer;
+	u8 write_pointer;
+	u32 status;
+	u32 status_id;
+	u32 submit_contexts = 0;
+
+	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
+
+	read_pointer = ring->next_context_status_buffer;
+	write_pointer = status_pointer & 0x07;
+	if (read_pointer > write_pointer)
+		write_pointer += 6;
+
+	spin_lock(&ring->execlist_lock);
+
+	while (read_pointer < write_pointer) {
+		read_pointer++;
+		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
+				(read_pointer % 6) * 8);
+		status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
+				(read_pointer % 6) * 8 + 4);
+
+		if (status & GEN8_CTX_STATUS_ELEMENT_SWITCH) {
+			if (check_remove_request(ring, status_id))
+				submit_contexts++;
+		} else if (status & GEN8_CTX_STATUS_COMPLETE) {
+			if (check_remove_request(ring, status_id))
+				submit_contexts++;
+		}
+	}
+
+	if (submit_contexts != 0)
+		gen8_switch_context_unqueue(ring);
+
+	spin_unlock(&ring->execlist_lock);
+
+	WARN(submit_contexts > 2, "More than two context complete events?\n");
+	ring->next_context_status_buffer = write_pointer % 6;
+}
+
+static void free_request_task(struct work_struct *work)
+{
+	struct drm_i915_gem_request *req =
+			container_of(work, struct drm_i915_gem_request, work);
+	struct drm_device *dev = req->ring->dev;
+
+	mutex_lock(&dev->struct_mutex);
+	i915_gem_context_unreference(req->ctx);
+	mutex_unlock(&dev->struct_mutex);
+
+	kfree(req);
+}
+
 int gen8_switch_context_queue(struct intel_engine *ring,
 			      struct i915_hw_context *to,
 			      u32 tail)
@@ -276,6 +372,7 @@ int gen8_switch_context_queue(struct intel_engine *ring,
 	req->ctx = to;
 	i915_gem_context_reference(req->ctx);
 	req->tail = tail;
+	INIT_WORK(&req->work, free_request_task);
 
 	spin_lock_irqsave(&ring->execlist_lock, flags);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index a92bede..ee5a220 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1464,6 +1464,7 @@ static int intel_init_ring(struct drm_device *dev,
 		if (ring->status_page.page_addr == NULL)
 			return -ENOMEM;
 		ring->status_page.obj = obj;
+		ring->next_context_status_buffer = 0;
 	} else if (I915_NEED_GFX_HWS(dev)) {
 		ret = init_status_page(ring);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 5f4fd3c..daca04e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -173,6 +173,7 @@ struct intel_engine {
 
 	struct i915_hw_context *default_context;
 	struct i915_hw_context *last_context;
+	u8 next_context_status_buffer;
 
 	struct intel_ring_hangcheck hangcheck;
 
-- 
1.9.0

* [PATCH 44/49] drm/i915/bdw: Display execlists info in debugfs
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (42 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 43/49] drm/i915/bdw: Handle context switch events oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-04-07 19:19   ` Damien Lespiau
  2014-03-27 18:00 ` [PATCH 45/49] drm/i915/bdw: Display context ringbuffer " oscar.mateo
                   ` (5 subsequent siblings)
  49 siblings, 1 reply; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 65 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h     |  1 +
 drivers/gpu/drm/i915/i915_lrc.c     |  8 +----
 drivers/gpu/drm/i915/i915_reg.h     |  7 ++++
 4 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 8b06acb..226b630 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1690,6 +1690,70 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	return 0;
 }
 
+static int i915_execlists(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct intel_engine *ring;
+	u32 status_pointer;
+	u8 read_pointer;
+	u8 write_pointer;
+	u32 status;
+	u32 ctx_id;
+	struct list_head *cursor;
+	struct drm_i915_gem_request *head_req;
+	int unused, i;
+
+	for_each_active_ring(ring, dev_priv, unused) {
+		int count = 0;
+
+		seq_printf(m, "%s\n", ring->name);
+
+		status = I915_READ(RING_EXECLIST_STATUS(ring));
+		ctx_id = I915_READ(RING_EXECLIST_STATUS(ring) + 4);
+		seq_printf(m, "\tExeclist status: 0x%08X, context: %u\n",
+				status, ctx_id);
+
+		status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
+		seq_printf(m, "\tStatus pointer: 0x%08X\n", status_pointer);
+
+		read_pointer = ring->next_context_status_buffer;
+		write_pointer = status_pointer & 0x07;
+		if (read_pointer > write_pointer)
+			write_pointer += 6;
+		seq_printf(m, "\tRead pointer: 0x%08X, write pointer 0x%08X\n",
+				read_pointer, write_pointer);
+
+		for (i = 0; i < 6; i++) {
+			status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) + 8*i);
+			ctx_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) + 8*i + 4);
+
+			seq_printf(m, "\tStatus buffer %d: 0x%08X, context: %u\n",
+					i, status, ctx_id);
+		}
+
+		list_for_each(cursor, &ring->execlist_queue) {
+			count++;
+		}
+		seq_printf(m, "\t%d requests in queue\n", count);
+
+		if (count > 0) {
+			head_req = list_first_entry(&ring->execlist_queue,
+					struct drm_i915_gem_request, execlist_link);
+			seq_printf(m, "\tHead request id: %u\n",
+					get_submission_id(head_req->ctx));
+			seq_printf(m, "\tHead request seqno: %u\n", head_req->seqno);
+			seq_printf(m, "\tHead request tail: %u\n", head_req->tail);
+
+		}
+
+		seq_putc(m, '\n');
+	}
+
+	return 0;
+}
+
 static int i915_gen6_forcewake_count_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
@@ -3785,6 +3849,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_opregion", i915_opregion, 0},
 	{"i915_gem_framebuffer", i915_gem_framebuffer_info, 0},
 	{"i915_context_status", i915_context_status, 0},
+	{"i915_execlists", i915_execlists, 0},
 	{"i915_gen6_forcewake_count", i915_gen6_forcewake_count_info, 0},
 	{"i915_swizzle_info", i915_swizzle_info, 0},
 	{"i915_ppgtt_info", i915_ppgtt_info, 0},
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4c8cf52..5164f84 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2347,6 +2347,7 @@ int gen8_switch_context_queue(struct intel_engine *ring,
 			      struct i915_hw_context *to,
 			      u32 tail);
 void gen8_handle_context_events(struct intel_engine *ring);
+inline u32 get_submission_id(struct i915_hw_context *ctx);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 440da11..025dae7 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -45,12 +45,6 @@
 
 #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
 
-#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
-#define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
-#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
-#define RING_CONTEXT_STATUS_BUF(ring)	((ring)->mmio_base+0x370)
-#define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
-
 #define RING_EXECLIST_QFULL		(1 << 0x2)
 #define RING_EXECLIST1_VALID		(1 << 0x3)
 #define RING_EXECLIST0_VALID		(1 << 0x4)
@@ -116,7 +110,7 @@ enum {
 #define GEN8_CTX_LRCA_SHIFT 12
 #define GEN8_CTX_UNUSED_SHIFT 32
 
-static inline u32 get_submission_id(struct i915_hw_context *ctx)
+inline u32 get_submission_id(struct i915_hw_context *ctx)
 {
 	struct drm_i915_file_private *file_priv = ctx->file_priv;
 	u32 submission_id;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 117825e..b36da4f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -113,6 +113,13 @@
 #define GEN8_RING_PDP_UDW(ring, n)	((ring)->mmio_base+0x270 + ((n) * 8 + 4))
 #define GEN8_RING_PDP_LDW(ring, n)	((ring)->mmio_base+0x270 + (n) * 8)
 
+/* Execlists regs */
+#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
+#define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
+#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
+#define RING_CONTEXT_STATUS_BUF(ring)	((ring)->mmio_base+0x370)
+#define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
+
 #define GAM_ECOCHK			0x4090
 #define   ECOCHK_SNB_BIT		(1<<10)
 #define   HSW_ECOCHK_ARB_PRIO_SOL	(1<<6)
-- 
1.9.0

* [PATCH 45/49] drm/i915/bdw: Display context ringbuffer info in debugfs
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (43 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 44/49] drm/i915/bdw: Display execlists info in debugfs oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 46/49] drm/i915/bdw: Start queueing contexts to be submitted oscar.mateo
                   ` (4 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 226b630..c52108d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1648,6 +1648,14 @@ static int i915_gem_framebuffer_info(struct seq_file *m, void *data)
 
 	return 0;
 }
+static void describe_ctx_ringbuf(struct seq_file *m, struct i915_hw_context *ctx)
+{
+	struct intel_ringbuffer *ringbuf = ctx->ringbuf;
+
+	seq_printf(m, " (ringbuffer type: %d, space: %d, head: %u, tail: %u, last head: %d)",
+			ctx->ring_id, ringbuf->space, ringbuf->head, ringbuf->tail,
+			ringbuf->last_retired_head);
+}
 
 static int i915_context_status(struct seq_file *m, void *unused)
 {
@@ -1682,6 +1690,8 @@ static int i915_context_status(struct seq_file *m, void *unused)
 				seq_printf(m, "(default context %s) ", ring->name);
 
 		describe_obj(m, ctx->obj);
+		if (dev_priv->lrc_enabled)
+			describe_ctx_ringbuf(m, ctx);
 		seq_putc(m, '\n');
 	}
 
-- 
1.9.0

* [PATCH 46/49] drm/i915/bdw: Start queueing contexts to be submitted
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (44 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 45/49] drm/i915/bdw: Display context ringbuffer " oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 47/49] drm/i915/bdw: Always write seqno to default context oscar.mateo
                   ` (3 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

Finally, start queueing requests on write_tail. Also, bypass the
remaining legacy context switches when execlists are enabled.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c            |  9 ++++++---
 drivers/gpu/drm/i915/i915_gem_context.c    | 10 ++++++----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  8 +++++---
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  5 ++++-
 4 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 10bb50f..6b8be10 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2798,9 +2798,12 @@ int i915_gpu_idle(struct drm_device *dev)
 
 	/* Flush everything onto the inactive list. */
 	for_each_active_ring(ring, dev_priv, i) {
-		ret = i915_switch_context(ring, NULL, ring->default_context);
-		if (ret)
-			return ret;
+		if (!dev_priv->lrc_enabled) {
+			ret = i915_switch_context(ring, NULL,
+					ring->default_context);
+			if (ret)
+				return ret;
+		}
 
 		ret = intel_ring_idle(ring);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 1322e00..828c2a4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -501,10 +501,12 @@ int i915_gem_context_enable(struct drm_i915_private *dev_priv)
 
 	BUG_ON(!dev_priv->ring[RCS].default_context);
 
-	for_each_active_ring(ring, dev_priv, i) {
-		ret = do_switch(ring, ring->default_context);
-		if (ret)
-			return ret;
+	if (!dev_priv->lrc_enabled) {
+		for_each_active_ring(ring, dev_priv, i) {
+			ret = do_switch(ring, ring->default_context);
+			if (ret)
+				return ret;
+		}
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 72bda74..fae55c1 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1228,9 +1228,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	if (ret)
 		goto err;
 
-	ret = i915_switch_context(ring, file, ctx);
-	if (ret)
-		goto err;
+	if (!dev_priv->lrc_enabled) {
+		ret = i915_switch_context(ring, file, ctx);
+		if (ret)
+			goto err;
+	}
 
 	if (ring == &dev_priv->ring[RCS] &&
 	    mode != dev_priv->relative_constants_mode) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ee5a220..9a6775d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -422,7 +422,10 @@ static void gen8_write_tail_lrc(struct intel_engine *ring,
 				struct i915_hw_context *ctx,
 				u32 value)
 {
-	DRM_ERROR("Execlists still not ready!\n");
+	if (WARN_ON(ctx == NULL))
+		ctx = ring->default_context;
+
+	gen8_switch_context_queue(ring, ctx, value);
 }
 
 u32 intel_ring_get_active_head(struct intel_engine *ring)
-- 
1.9.0

* [PATCH 47/49] drm/i915/bdw: Always write seqno to default context
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (45 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 46/49] drm/i915/bdw: Start queueing contexts to be submitted oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 48/49] drm/i915/bdw: Enable logical ring contexts oscar.mateo
                   ` (2 subsequent siblings)
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Oscar Mateo <oscar.mateo@intel.com>

Even though we have one Hardware Status Page per context, we are still
managing the seqnos per engine. Therefore, the sequence number must be
written to a consistent place for all contexts: one of the global
default contexts.
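
Concretely, every add_request variant below computes the seqno store
address from the engine's default context (a sketch using names from
this patch):

	u32 hws_addr = i915_gem_obj_ggtt_offset(ring->default_context->obj) +
		       (I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);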

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>

v2: Since get_seqno and set_seqno now look for the seqno in the engine's
status page, they don't need to be changed.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c | 68 +++++++++++++++++++++++++++++++--
 2 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b36da4f..002b513 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -261,6 +261,7 @@
 #define   MI_FORCE_RESTORE		(1<<1)
 #define   MI_RESTORE_INHIBIT		(1<<0)
 #define MI_STORE_DWORD_IMM	MI_INSTR(0x20, 1)
+#define MI_STORE_DWORD_IMM_GEN8	MI_INSTR(0x20, 2)
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 965+ only */
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
 #define   MI_STORE_DWORD_INDEX_SHIFT 2
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 9a6775d..824c0859 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -738,6 +738,62 @@ gen6_add_request(struct intel_engine *ring,
 }
 
 static int
+gen8_nonrender_add_request_lrc(struct intel_engine *ring,
+			       struct i915_hw_context *ctx)
+{
+	struct intel_ringbuffer *ringbuf;
+	u32 cmd;
+
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 6);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return ringbuf ? PTR_ERR(ringbuf) : -ENOMEM;
+
+	cmd = MI_FLUSH_DW + 1;
+	cmd |= MI_INVALIDATE_TLB;
+	cmd |= MI_FLUSH_DW_OP_STOREDW;
+
+	intel_ringbuffer_emit(ringbuf, cmd);
+	intel_ringbuffer_emit(ringbuf,
+			((i915_gem_obj_ggtt_offset(ring->default_context->obj)) +
+			(I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)) |
+			MI_FLUSH_DW_USE_GTT);
+	intel_ringbuffer_emit(ringbuf, 0); /* upper addr */
+	intel_ringbuffer_emit(ringbuf, ring->outstanding_lazy_seqno);
+	intel_ringbuffer_emit(ringbuf, MI_USER_INTERRUPT);
+	intel_ringbuffer_emit(ringbuf, MI_NOOP);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
+
+	return 0;
+}
+
+static int
+gen8_add_request_lrc(struct intel_engine *ring,
+		     struct i915_hw_context *ctx)
+{
+	struct intel_ringbuffer *ringbuf;
+	u32 cmd;
+
+	ringbuf = intel_ringbuffer_begin(ring, ctx, 6);
+	if (IS_ERR_OR_NULL(ringbuf))
+		return ringbuf ? PTR_ERR(ringbuf) : -ENOMEM;
+
+	cmd = MI_STORE_DWORD_IMM_GEN8;
+	cmd |= (1 << 22); /* use global GTT */
+
+	intel_ringbuffer_emit(ringbuf, cmd);
+	intel_ringbuffer_emit(ringbuf,
+			((i915_gem_obj_ggtt_offset(ring->default_context->obj)) +
+			(I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)));
+	intel_ringbuffer_emit(ringbuf, 0); /* upper addr */
+	intel_ringbuffer_emit(ringbuf, ring->outstanding_lazy_seqno);
+	intel_ringbuffer_emit(ringbuf, MI_USER_INTERRUPT);
+	intel_ringbuffer_emit(ringbuf, MI_NOOP);
+	intel_ringbuffer_advance_and_submit(ring, ctx);
+
+	return 0;
+}
+
+static int
 gen8_add_request(struct intel_engine *ring,
 		 struct i915_hw_context *ctx)
 {
@@ -2027,13 +2083,14 @@ int intel_init_render_ring(struct drm_device *dev)
 		if (INTEL_INFO(dev)->gen == 6)
 			ring->flush = gen6_render_ring_flush;
 		if (INTEL_INFO(dev)->gen >= 8) {
+			ring->add_request = gen8_add_request;
 			if (dev_priv->lrc_enabled) {
 				ring->write_tail = gen8_write_tail_lrc;
 				ring->init = init_render_ring_lrc;
 				ring->irq_keep_mask =
 				GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT;
+				ring->add_request = gen8_add_request_lrc;
 			}
-			ring->add_request = gen8_add_request;
 			ring->flush = gen8_render_ring_flush;
 			ring->irq_get = gen8_ring_get_irq;
 			ring->irq_put = gen8_ring_put_irq;
@@ -2202,14 +2259,15 @@ int intel_init_bsd_ring(struct drm_device *dev)
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
+			ring->add_request = gen8_add_request;
 			if (dev_priv->lrc_enabled) {
 				ring->write_tail = gen8_write_tail_lrc;
 				ring->init = init_ring_common_lrc;
 				ring->irq_keep_mask =
 				GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
+				ring->add_request = gen8_nonrender_add_request_lrc;
 			}
 			ring->flush = gen8_ring_flush;
-			ring->add_request = gen8_add_request;
 			ring->irq_enable_mask =
 				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 			ring->irq_get = gen8_ring_get_irq;
@@ -2264,14 +2322,15 @@ int intel_init_blt_ring(struct drm_device *dev)
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
+		ring->add_request = gen8_add_request;
 		if (dev_priv->lrc_enabled) {
 			ring->write_tail = gen8_write_tail_lrc;
 			ring->init = init_ring_common_lrc;
 			ring->irq_keep_mask =
 			GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
+			ring->add_request = gen8_nonrender_add_request_lrc;
 		}
 		ring->flush = gen8_ring_flush;
-		ring->add_request = gen8_add_request;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
@@ -2308,14 +2367,15 @@ int intel_init_vebox_ring(struct drm_device *dev)
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
+		ring->add_request = gen8_add_request;
 		if (dev_priv->lrc_enabled) {
 			ring->write_tail = gen8_write_tail_lrc;
 			ring->init = init_ring_common_lrc;
 			ring->irq_keep_mask =
 			GEN8_GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
+			ring->add_request = gen8_nonrender_add_request_lrc;
 		}
 		ring->flush = gen8_ring_flush;
-		ring->add_request = gen8_add_request;
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 		ring->irq_get = gen8_ring_get_irq;
-- 
1.9.0

* [PATCH 48/49] drm/i915/bdw: Enable logical ring contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (46 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 47/49] drm/i915/bdw: Always write seqno to default context oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-03-27 18:00 ` [PATCH 49/49] drm/i915/bdw: Document execlists and " oscar.mateo
  2014-04-07 18:12 ` [PATCH 00/49] Execlists Damien Lespiau
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

The time has come, the Walrus said, to talk of many things.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5164f84..bf03ea5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1852,7 +1852,7 @@ struct drm_i915_cmd_table {
 #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
 
 #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
-#define HAS_LOGICAL_RING_CONTEXTS(dev)	0
+#define HAS_LOGICAL_RING_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 8)
 #define HAS_ALIASING_PPGTT(dev)	(INTEL_INFO(dev)->gen >= 6 && !IS_VALLEYVIEW(dev))
 #define HAS_PPGTT(dev)		(INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) \
 				 && !IS_BROADWELL(dev))
-- 
1.9.0

* [PATCH 49/49] drm/i915/bdw: Document execlists and logical ring contexts
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (47 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 48/49] drm/i915/bdw: Enable logical ring contexts oscar.mateo
@ 2014-03-27 18:00 ` oscar.mateo
  2014-04-07 18:12 ` [PATCH 00/49] Execlists Damien Lespiau
  49 siblings, 0 replies; 85+ messages in thread
From: oscar.mateo @ 2014-03-27 18:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Daniel

From: Oscar Mateo <oscar.mateo@intel.com>

Explain i915_lrc.c with some execlists notes
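
For reference, the ELSP submission sequence that these notes describe
looks roughly like this (a sketch; desc0/desc1 stand for the two 64-bit
context descriptors, and the write ordering is assumed from the
hardware documentation):

	I915_WRITE(RING_ELSP(ring), upper_32_bits(desc1));
	I915_WRITE(RING_ELSP(ring), lower_32_bits(desc1));
	I915_WRITE(RING_ELSP(ring), upper_32_bits(desc0));
	/* The last write triggers the actual submission. */
	I915_WRITE(RING_ELSP(ring), lower_32_bits(desc0));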

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>

v2: Add notes on logical ring context creation.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_lrc.c | 78 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
index 025dae7..521abe9 100644
--- a/drivers/gpu/drm/i915/i915_lrc.c
+++ b/drivers/gpu/drm/i915/i915_lrc.c
@@ -33,8 +33,86 @@
  * These expanded contexts enable a number of new abilities, especially
  * "Execlists" (also implemented in this file).
  *
+ * One of the main differences from the legacy HW contexts is that logical
+ * ring contexts incorporate many more things into the context's state, like
+ * PDPs or ringbuffer control registers.
+ *
+ * Regarding the creation of contexts, we had before:
+ *
+ * - One global default context.
+ * - One local default context for each opened fd.
+ * - One extra context for each context create ioctl call.
+ *
+ * Now that ringbuffers are per-context (and not per-engine, like before) and
+ * that contexts are uniquely tied to a given engine (and not reusable, like
+ * before) we need:
+ *
+ * - One global default context for each engine.
+ * - Up to "no. of engines" local default contexts for each opened fd.
+ * - Up to "no. of engines" extra local contexts for each context create ioctl.
+ *
+ * Given that at creation time of a non-global context we don't know which
+ * engine is going to use it, we have implemented a deferred creation of
+ * LR contexts: the local default context starts its life as a hollow or
+ * blank holder that gets populated once we receive an execbuffer ioctl on
+ * that fd. If later on we receive another execbuffer ioctl for a different
+ * engine, we create a second local default context and so on. The same rules
+ * apply to the context create ioctl.
+ *
  * Execlists are the new method by which, on gen8+ hardware, workloads are
  * submitted for execution (as opposed to the legacy, ringbuffer-based, method).
+ * This method works as follows:
+ *
+ * When a request is committed, its commands (the BB start and any leading or
+ * trailing commands, like the seqno breadcrumbs) are placed in the ringbuffer
+ * for the appropriate context. The tail pointer in the hardware context is not
+ * updated at this time but is instead kept by the driver in the ringbuffer
+ * structure. A structure representing this request is added to a request queue
+ * for the appropriate engine: this structure contains a copy of the context's
+ * tail after the request was written to the ring buffer and a pointer to the
+ * context itself.
+ *
+ * If the engine's request queue was empty before the request was added, the
+ * queue is processed immediately. Otherwise the queue will be processed during
+ * a context switch interrupt. In any case, elements on the queue will get sent
+ * (in pairs) to the GPU's ExecLists Submit Port (ELSP, for short) with a
+ * globally unique 20-bit context ID (constructed with the fd's ID, plus our
+ * own context ID, plus the engine's ID).
+ *
+ * When execution of a request completes, the GPU updates the context status
+ * buffer with a context complete event and generates a context switch interrupt.
+ * During context switch interrupt handling, the driver examines the context
+ * status events in the context status buffer: for each context complete event,
+ * if the announced ID matches that on the head of the request queue then that
+ * request is retired and removed from the queue.
+ *
+ * After processing, if any requests were retired and the queue is not empty
+ * then a new execution list can be submitted. The two requests at the front of
+ * the queue are next to be submitted but since a context may not occur twice in
+ * an execution list, if subsequent requests have the same ID as the first then
+ * the two requests must be combined. This is done simply by discarding requests
+ * at the head of the queue until either only one request is left (in which case
+ * we use a NULL second context) or the first two requests have unique IDs.
+ *
+ * By always executing the first two requests in the queue the driver ensures
+ * that the GPU is kept as busy as possible. In the case where a single context
+ * completes but a second context is still executing, the request for the second
+ * context will be at the head of the queue when we remove the first one. This
+ * request will then be resubmitted along with a new request for a different context,
+ * which will cause the hardware to continue executing the second request and queue
+ * the new request (the GPU detects the condition of a context getting preempted
+ * with the same context and optimizes the context switch flow by not doing
+ * preemption, but just sampling the new tail pointer).
+ *
+ * Because the GPU continues to execute while the context switch interrupt is being
+ * handled, there is a race condition where a second context completes while
+ * handling the completion of the previous one. This results in the second context being
+ * resubmitted (potentially along with a third), and an extra context complete event
+ * for that context will occur. The request will be removed from the queue at the
+ * first context complete event, and the second context complete event will not
+ * result in removal of a request from the queue because the IDs of the request
+ * and the event will not match.
+ *
  */
 
 #include <drm/drmP.h>
-- 
1.9.0

* Re: [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style
  2014-03-27 18:00 ` [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style oscar.mateo
@ 2014-03-31 16:42   ` Damien Lespiau
  2014-04-01 13:42     ` Mateo Lozano, Oscar
  2014-04-02 13:47   ` Damien Lespiau
  1 sibling, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-03-31 16:42 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx

On Thu, Mar 27, 2014 at 06:00:08PM +0000, oscar.mateo@intel.com wrote:
> +	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
> +	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
> +	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
> +	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
> +	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
> +	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
> +	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
> +	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];

Compiling a 32-bit kernel without HIGHMEM64G gives:

drivers/gpu/drm/i915/i915_lrc.c: In function ‘gen8_write_pdp_ctx’:
drivers/gpu/drm/i915/i915_lrc.c:286:2: warning: right shift count >=
width of type [enabled by default]
  reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;

Turns out dma_addr_t can be 32 bits if configured without 64-bit support
on 32-bit kernels:

#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
typedef u64 dma_addr_t;
#else           
typedef u32 dma_addr_t;
#endif /* dma_addr_t */

and

config ARCH_DMA_ADDR_T_64BIT
        def_bool y
        depends on X86_64 || HIGHMEM64G
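
One way to make the shift well-defined regardless of the dma_addr_t
width would be (a sketch):

	reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[3]);
	reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[3]);

since upper_32_bits()/lower_32_bits() are defined to avoid the
right-shift warning on 32-bit.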

-- 
Damien

* Re: [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-03-27 17:59 ` [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
@ 2014-04-01  0:00   ` Damien Lespiau
  2014-04-01 13:33     ` Mateo Lozano, Oscar
  2014-04-15 16:00   ` Jeff McGee
  1 sibling, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-01  0:00 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx, Ben Widawsky, Ben Widawsky

On Thu, Mar 27, 2014 at 05:59:48PM +0000, oscar.mateo@intel.com wrote:
> +	if (ring->id == RCS)
> +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
> +	else
> +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);

In the "Register State Context", this header is actually given in hex,
but has bit 12 set, shouldn't we do the same here?

-- 
Damien

* Re: [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini
  2014-03-27 17:59 ` [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini oscar.mateo
@ 2014-04-01  0:38   ` Damien Lespiau
  2014-04-01 13:47     ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-01  0:38 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx, Ben Widawsky, Ben Widawsky

On Thu, Mar 27, 2014 at 05:59:46PM +0000, oscar.mateo@intel.com wrote:
> --- a/drivers/gpu/drm/i915/i915_lrc.c
> +++ b/drivers/gpu/drm/i915/i915_lrc.c
> @@ -41,7 +41,45 @@
>  #include <drm/i915_drm.h>
>  #include "i915_drv.h"
>  
> +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)

I'm a bit puzzled by that number:
  - I found a sentence saying: "the Context Image for the rendering
    engine consists of 20 4K pages", which seems to include the
    HWS page (on the same page it says context layout = HWS Page +
    register state context).
  - When looking at the register state context for the render engine:
    18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add
    the HWS Page)
  - Clearly I must be missing something :)
  - That's only for the render engine, other engines have a much smaller
    context, small enough that it's worth looking at their exact sizes.
  - It'd be nice to work out the real size from the *CXT_*SIZE
    registers, something like the sketch below.
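
  A sketch of that refinement (GEN8_CXT_SIZE is a hypothetical name;
  the real register and its field layout need checking against the
  spec):

	u32 cxt_size = I915_READ(GEN8_CXT_SIZE);	/* hypothetical */
	u32 context_size = round_up(cxt_size, PAGE_SIZE);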

All of this can be refinement patches on top I guess, might have a look
at it.

-- 
Damien

* Re: [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-04-01  0:00   ` Damien Lespiau
@ 2014-04-01 13:33     ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-04-01 13:33 UTC (permalink / raw)
  To: Lespiau, Damien; +Cc: intel-gfx, Ben Widawsky, Widawsky, Benjamin

You are right on the money: it looks like I am missing the "Force Posted" bit. I'll add it in the next patch series version.
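
Something like this, I suppose (a sketch; the MI_LRI_FORCE_POSTED name
is made up for bit 12):

	#define MI_LRI_FORCE_POSTED	(1<<12)

	reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14) |
				      MI_LRI_FORCE_POSTED;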

Thanks,
Oscar

> -----Original Message-----
> From: Lespiau, Damien
> Sent: Tuesday, April 01, 2014 1:01 AM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org; Ben Widawsky; Widawsky, Benjamin
> Subject: Re: [Intel-gfx] [PATCH 19/49] drm/i915/bdw: Populate LR contexts
> (somewhat)
> 
> On Thu, Mar 27, 2014 at 05:59:48PM +0000, oscar.mateo@intel.com wrote:
> > +	if (ring->id == RCS)
> > +		reg_state[CTX_LRI_HEADER_0] =
> MI_LOAD_REGISTER_IMM(14);
> > +	else
> > +		reg_state[CTX_LRI_HEADER_0] =
> MI_LOAD_REGISTER_IMM(11);
> 
> In the "Register State Context", this header is actually given in hex, but has bit
> 12 set, shouldn't we do the same here?
> 
> --
> Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style
  2014-03-31 16:42   ` Damien Lespiau
@ 2014-04-01 13:42     ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-04-01 13:42 UTC (permalink / raw)
  To: Lespiau, Damien; +Cc: intel-gfx

Bummer. I'll fix it in the next version.

Thanks!
Oscar


> -----Original Message-----
> From: Lespiau, Damien
> Sent: Monday, March 31, 2014 5:43 PM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs,
> LRC style
> 
> On Thu, Mar 27, 2014 at 06:00:08PM +0000, oscar.mateo@intel.com wrote:
> > +	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
> > +	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
> > +	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
> > +	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
> > +	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
> > +	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
> > +	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
> > +	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
> 
> Compiling a 32-bit kernel without HIGHMEM64G gives:
> 
> drivers/gpu/drm/i915/i915_lrc.c: In function ‘gen8_write_pdp_ctx’:
> drivers/gpu/drm/i915/i915_lrc.c:286:2: warning: right shift count >= width of
> type [enabled by default]
>   reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
> 
> Turns out dma_addr_t can be 32 bits wide if configured without 64-bit
> support on 32-bit kernels:
> 
> #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> typedef u64 dma_addr_t;
> #else
> typedef u32 dma_addr_t;
> #endif /* dma_addr_t */
> 
> and
> 
> config ARCH_DMA_ADDR_T_64BIT
>         def_bool y
>         depends on X86_64 || HIGHMEM64G
> 
> --
> Damien
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini
  2014-04-01  0:38   ` Damien Lespiau
@ 2014-04-01 13:47     ` Mateo Lozano, Oscar
  2014-04-01 13:51       ` Damien Lespiau
  0 siblings, 1 reply; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-04-01 13:47 UTC (permalink / raw)
  To: Lespiau, Damien; +Cc: intel-gfx, Ben Widawsky, Widawsky, Benjamin

> > --- a/drivers/gpu/drm/i915/i915_lrc.c
> > +++ b/drivers/gpu/drm/i915/i915_lrc.c
> > @@ -41,7 +41,45 @@
> >  #include <drm/i915_drm.h>
> >  #include "i915_drv.h"
> >
> > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
> 
> I'm a bit puzzled by that number:
>   - I found a sentence saying: "the Context Image for the rendering
>     engine consists of 20 4K pages", which seems that it includes the
>     HWS page (on the same page it says context layout = HWS Page +
>     register state context).
>   - When looking at the register state context for the render engine:
>     18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add
>     the HWS Page)
>   - Clearly I must be missing something :)
>   - That's only for the render engine; other engines have a much smaller
>     context, enough smaller that it's worth looking at their exact size.
>   - It'd be nice to work out the real size from the *CXT_*SIZE
>     registers.

Hmmmm... I'll try to get the real context sizes from the registers and compare. At least for RCS, VCS and BCS since there doesn't seem to be a register for VECS?

-- Oscar

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini
  2014-04-01 13:47     ` Mateo Lozano, Oscar
@ 2014-04-01 13:51       ` Damien Lespiau
  2014-04-01 19:18         ` Ben Widawsky
  0 siblings, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-01 13:51 UTC (permalink / raw)
  To: Mateo Lozano, Oscar; +Cc: intel-gfx, Ben Widawsky, Widawsky, Benjamin

On Tue, Apr 01, 2014 at 02:47:19PM +0100, Mateo Lozano, Oscar wrote:
> > > --- a/drivers/gpu/drm/i915/i915_lrc.c
> > > +++ b/drivers/gpu/drm/i915/i915_lrc.c
> > > @@ -41,7 +41,45 @@
> > >  #include <drm/i915_drm.h>
> > >  #include "i915_drv.h"
> > >
> > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
> > 
> > I'm a bit puzzled by that number:
> >   - I found a sentence saying: "the Context Image for the rendering
> >     engine consists of 20 4K pages", which seems that it includes the
> >     HWS page (on the same page it says context layout = HWS Page +
> >     register state context).
> >   - When looking at the register state context for the render engine:
> >     18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add
> >     the HWS Page)
> >   - Clearly I must be missing something :)
> >   - That's only for the render engine; other engines have a much smaller
> >     context, enough smaller that it's worth looking at their exact size.
> >   - It'd be nice to work out the real size from the *CXT_*SIZE
> >     registers.
> 
> Hmmmm... I'll try to get the real context sizes from the registers and
> compare. At least for RCS, VCS and BCS since there doesn't seem to be
> a register for VECS?

Couldn't find it either. I guess we'll need to ask the help of a friend.
Or the 50/50 joker maybe.

-- 
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini
  2014-04-01 13:51       ` Damien Lespiau
@ 2014-04-01 19:18         ` Ben Widawsky
  2014-04-01 21:05           ` Damien Lespiau
  0 siblings, 1 reply; 85+ messages in thread
From: Ben Widawsky @ 2014-04-01 19:18 UTC (permalink / raw)
  To: Damien Lespiau; +Cc: intel-gfx, Widawsky, Benjamin

On Tue, Apr 01, 2014 at 02:51:27PM +0100, Damien Lespiau wrote:
> On Tue, Apr 01, 2014 at 02:47:19PM +0100, Mateo Lozano, Oscar wrote:
> > > > --- a/drivers/gpu/drm/i915/i915_lrc.c
> > > > +++ b/drivers/gpu/drm/i915/i915_lrc.c
> > > > @@ -41,7 +41,45 @@
> > > >  #include <drm/i915_drm.h>
> > > >  #include "i915_drv.h"
> > > >
> > > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
> > > 
> > > I'm a bit puzzled by that number:
> > >   - I found a sentence saying: "the Context Image for the rendering
> > >     engine consists of 20 4K pages", which seems that it includes the
> > >     HWS page (on the same page it says context layout = HWS Page +
> > >     register state context).
> > >   - When looking at the register state context for the render engine:
> > >     18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add
> > >     the HWS Page)
> > >   - Clearly I must be missing something :)
> > >   - That's only for the render engine; other engines have a much smaller
> > >     context, enough smaller that it's worth looking at their exact size.
> > >   - It'd be nice to work out the real size from the *CXT_*SIZE
> > >     registers.
> > 
> > Hmmmm... I'll try to get the real context sizes from the registers and
> > compare. At least for RCS, VCS and BCS since there doesn't seem to be
> > a register for VECS?
> 
> Couldn't find it either. I guess we'll need to ask the help of a friend.
> Or the 50/50 joker maybe.
> 
> -- 
> Damien

CXT_SIZE is total garbage on anything past Ivybridge. That's why we
don't use it for HSW either... I know, right? We should request the spec
get updated. I have no excuse for not requesting that sooner.

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini
  2014-04-01 19:18         ` Ben Widawsky
@ 2014-04-01 21:05           ` Damien Lespiau
  2014-04-02  4:07             ` Ben Widawsky
  0 siblings, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-01 21:05 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Widawsky, Benjamin

On Tue, Apr 01, 2014 at 12:18:24PM -0700, Ben Widawsky wrote:
> On Tue, Apr 01, 2014 at 02:51:27PM +0100, Damien Lespiau wrote:
> > On Tue, Apr 01, 2014 at 02:47:19PM +0100, Mateo Lozano, Oscar wrote:
> > > > > --- a/drivers/gpu/drm/i915/i915_lrc.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_lrc.c
> > > > > @@ -41,7 +41,45 @@
> > > > >  #include <drm/i915_drm.h>
> > > > >  #include "i915_drv.h"
> > > > >
> > > > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
> > > > 
> > > > I'm a bit puzzled by that number:
> > > >   - I found a sentence saying: "the Context Image for the rendering
> > > >     engine consists of 20 4K pages", which seems that it includes the
> > > >     HWS page (on the same page it says context layout = HWS Page +
> > > >     register state context).
> > > >   - When looking at the register state context for the render engine:
> > > >     18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add
> > > >     the HWS Page)
> > > >   - Clearly I must be missing something :)
> > > >   - That's only for the render engine; other engines have a much smaller
> > > >     context, enough smaller that it's worth looking at their exact size.
> > > >   - It'd be nice to work out the real size from the *CXT_*SIZE
> > > >     registers.
> > > 
> > > Hmmmm... I'll try to get the real context sizes from the registers and
> > > compare. At least for RCS, VCS and BCS since there doesn't seem to be
> > > a register for VECS?
> > 
> > Couldn't find it either. I guess we'll need to ask the help of a friend.
> > Or the 50/50 joker maybe.
> > 
> > -- 
> > Damien
> 
> CXT_SIZE is total garbage on anything past Ivybridge. That's why we
> don't use it for HSW either... I know, right? We should request the spec
> get updated. I have no excuse for not requesting that sooner.

(talking about BDW only)

For the render ring:

HWSP: 4KB
Ring context: CTX_SIZE[26:24] 5 cache lines -> offsets (in DW) 0x0 to 0x4f (= 5 * 64 / 4)
Render context: CTX_SIZE[23:16] -> 0x65 cache lines -> offsets (in DW) 0x50 to 0x69f (= 0x50 + 0x65 * 64 / 4 - 1)
VF/VFE context CTX_SIZE[7:0] -> 0x82 cache lines -> offsets (in DW) 0x6A0 to 0xebf (= 0x6a0 + 0x82*64/4 - 1)
Atomic storage is the max that you can allocate, 32KB, i.e. 8192 DWords

So we're almost there. What's missing here is the RS context size; I couldn't
find it in the spec :/ Maybe because that is a "well known" value.

Note that I don't actually know what we read back from hw.

Considering that the BCS context size seems to be 2 pages, I think it's worth
digging a bit more to save ~66KB per BCS context (for instance). Even if we
have to hardcode the different context sizes.
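
To make the arithmetic concrete, the decode would be something like
this (register name and field placement as I read the spec, so please
double-check before relying on it):

	u32 cxt_size = I915_READ(CXT_SIZE);	/* render engine's register */
	u32 ring_cl = (cxt_size >> 24) & 0x7;	/* [26:24] = 5 -> DW 0x000-0x04f */
	u32 render_cl = (cxt_size >> 16) & 0xff; /* [23:16] = 0x65 -> DW 0x050-0x69f */
	u32 vf_cl = cxt_size & 0xff;		/* [7:0] = 0x82 -> DW 0x6a0-0xebf */
	/* a cache line is 64 bytes, i.e. 16 DWords */
	u32 total_dw = (ring_cl + render_cl + vf_cl) * 16; /* = 0xec0 DWords */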

-- 
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini
  2014-04-01 21:05           ` Damien Lespiau
@ 2014-04-02  4:07             ` Ben Widawsky
  0 siblings, 0 replies; 85+ messages in thread
From: Ben Widawsky @ 2014-04-02  4:07 UTC (permalink / raw)
  To: Damien Lespiau; +Cc: intel-gfx, Widawsky, Benjamin

On Tue, Apr 01, 2014 at 10:05:12PM +0100, Damien Lespiau wrote:
> On Tue, Apr 01, 2014 at 12:18:24PM -0700, Ben Widawsky wrote:
> > On Tue, Apr 01, 2014 at 02:51:27PM +0100, Damien Lespiau wrote:
> > > On Tue, Apr 01, 2014 at 02:47:19PM +0100, Mateo Lozano, Oscar wrote:
> > > > > > --- a/drivers/gpu/drm/i915/i915_lrc.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_lrc.c
> > > > > > @@ -41,7 +41,45 @@
> > > > > >  #include <drm/i915_drm.h>
> > > > > >  #include "i915_drv.h"
> > > > > >
> > > > > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
> > > > > 
> > > > > I'm a bit puzzled by that number:
> > > > >   - I found a sentence saying: "the Context Image for the rendering
> > > > >     engine consists of 20 4K pages", which seems that it includes the
> > > > >     HWS page (on the same page it says context layout = HWS Page +
> > > > >     register state context).
> > > > >   - When looking at the register state context for the render engine:
> > > > >     18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add
> > > > >     the HWS Page)
> > > > >   - Clearly I must be missing something :)
> > > > >   - That's only for the render engine; other engines have a much smaller
> > > > >     context, enough smaller that it's worth looking at their exact size.
> > > > >   - It'd be nice to work out the real size from the *CXT_*SIZE
> > > > >     registers.
> > > > 
> > > > Hmmmm... I'll try to get the real context sizes from the registers and
> > > > compare. At least for RCS, VCS and BCS since there doesn't seem to be
> > > > a register for VECS?
> > > 
> > > Couldn't find it either. I guess we'll need to ask the help of a friend.
> > > Or the 50/50 joker maybe.
> > > 
> > > -- 
> > > Damien
> > 
> > CXT_SIZE is total garbage on anything past Ivybridge. That's why we
> > don't use it for HSW either... I know, right? We should request the spec
> > get updated. I have no excuse for not requesting that sooner.
> 
> (talking about BDW only)
> 
> For the render ring:
> 
> HWSP: 4KB
> Ring context: CTX_SIZE[26:24] 5 cache lines -> offsets (in DW) 0x0 to 0x4f (= 5 * 64 / 4)
> Render context: CTX_SIZE[23:16] -> 0x65 cache lines -> offsets (in DW) 0x50 to 0x69f (= 0x50 + 0x65 * 64 / 4 - 1)
> VF/VFE context CTX_SIZE[7:0] -> 0x82 cache lines -> offsets (in DW) 0x6A0 to 0xebf (= 0x6a0 + 0x82*64/4 - 1)
> Atomic storage is the max that you can allocate, 32KB, i.e. 8192 DWords
> 
> So we're almost there. What's missing here is the RS context size; I couldn't
> find it in the spec :/ Maybe because that is a "well known" value.
> 
> Note that I don't actually know what we read back from hw.
> 
> Considering that the BCS context size seems to be 2 pages, I think it's worth
> digging a bit more to save ~66KB per BCS context (for instance). Even if we
> have to hardcode the different context sizes.
> 
> -- 
> Damien

I guess I should have checked first. Looks like there are actually quite
a few changes since I wrote the code originally.

Carry on.

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 41/49] drm/i915/bdw: LR context switch interrupts
  2014-03-27 18:00 ` [PATCH 41/49] drm/i915/bdw: LR context switch interrupts oscar.mateo
@ 2014-04-02 11:42   ` Damien Lespiau
  2014-04-02 11:49     ` Daniel Vetter
  0 siblings, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-02 11:42 UTC (permalink / raw)
  To: oscar.mateo; +Cc: Thomas Daniel, intel-gfx

On Thu, Mar 27, 2014 at 06:00:10PM +0000, oscar.mateo@intel.com wrote:
> @@ -543,6 +545,9 @@ static int init_ring_common_lrc(struct intel_engine *ring)
>  	ringbuf->space = ringbuf->size;
>  	ringbuf->last_retired_head = -1;
>  
> +	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
> +	I915_WRITE(RING_HWSTAM(ring->mmio_base), ~(ring->irq_enable_mask | ring->irq_keep_mask));
> +
>  	return 0;
>  }
>  

Two little things:

  - I don't see any place where we look at the interrupt reporting in
    the HWS page, so we could just initialize HWSTAM to 0xffffffff

  - There's a programming note we don't seem to respect: "At most 1 bit
    can be unmasked at any given time". That would be solved by the
    first point, if I did not miss a call site using it.
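
For the first point, the init would then simply become (sketch):

	/* Mask every interrupt from being reported to the HWS page; the
	 * IMR above still controls what reaches the CPU. */
	I915_WRITE(RING_HWSTAM(ring->mmio_base), 0xffffffff);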

-- 
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 41/49] drm/i915/bdw: LR context switch interrupts
  2014-04-02 11:42   ` Damien Lespiau
@ 2014-04-02 11:49     ` Daniel Vetter
  2014-04-02 12:56       ` Damien Lespiau
  0 siblings, 1 reply; 85+ messages in thread
From: Daniel Vetter @ 2014-04-02 11:49 UTC (permalink / raw)
  To: Damien Lespiau; +Cc: Thomas Daniel, intel-gfx

On Wed, Apr 02, 2014 at 12:42:11PM +0100, Damien Lespiau wrote:
> On Thu, Mar 27, 2014 at 06:00:10PM +0000, oscar.mateo@intel.com wrote:
> > @@ -543,6 +545,9 @@ static int init_ring_common_lrc(struct intel_engine *ring)
> >  	ringbuf->space = ringbuf->size;
> >  	ringbuf->last_retired_head = -1;
> >  
> > +	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
> > +	I915_WRITE(RING_HWSTAM(ring->mmio_base), ~(ring->irq_enable_mask | ring->irq_keep_mask));
> > +
> >  	return 0;
> >  }
> >  
> 
> Two little things:
> 
>   - I don't see any place where we look at the interrupt reporting in
>     the HWS page, so we could just initialize HWSTAM to 0xffffffff

It's an old w/a to make interrupt signalling a little bit more coherent.
No idea whether we still need it since we don't really have a good
testcase for interrupts ... I guess we could give it a shot with a patch
and a big commit message citing all the history.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 41/49] drm/i915/bdw: LR context switch interrupts
  2014-04-02 11:49     ` Daniel Vetter
@ 2014-04-02 12:56       ` Damien Lespiau
  0 siblings, 0 replies; 85+ messages in thread
From: Damien Lespiau @ 2014-04-02 12:56 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Thomas Daniel, intel-gfx

On Wed, Apr 02, 2014 at 01:49:38PM +0200, Daniel Vetter wrote:
> On Wed, Apr 02, 2014 at 12:42:11PM +0100, Damien Lespiau wrote:
> > On Thu, Mar 27, 2014 at 06:00:10PM +0000, oscar.mateo@intel.com wrote:
> > > @@ -543,6 +545,9 @@ static int init_ring_common_lrc(struct intel_engine *ring)
> > >  	ringbuf->space = ringbuf->size;
> > >  	ringbuf->last_retired_head = -1;
> > >  
> > > +	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
> > > +	I915_WRITE(RING_HWSTAM(ring->mmio_base), ~(ring->irq_enable_mask | ring->irq_keep_mask));
> > > +
> > >  	return 0;
> > >  }
> > >  
> > 
> > Two little things:
> > 
> >   - I don't see any place where we look at the interrupt reporting in
> >     the HWS page, so we could just initialize HWSTAM to 0xffffffff
> 
> It's an old w/a to make interrupt signalling a little bit more coherent.
> No idea whether we still need it since we don't really have a good
> testcase for interrupts ... I guess we could give it a shot with a patch
> and a big commit message citing all the history.

Another detail is that we're unmasking interrupts in HWSTAM that are
not supposed to be unmasked, because the HW doesn't support reporting
them to the HWSP (e.g. the Context Switch Interrupt).

-- 
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style
  2014-03-27 18:00 ` [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style oscar.mateo
  2014-03-31 16:42   ` Damien Lespiau
@ 2014-04-02 13:47   ` Damien Lespiau
  2014-04-09  7:56     ` Mateo Lozano, Oscar
  1 sibling, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-02 13:47 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx

On Thu, Mar 27, 2014 at 06:00:08PM +0000, oscar.mateo@intel.com wrote:
> +static int gen8_write_pdp_ctx(struct i915_hw_context *ctx,
> +				   struct i915_hw_ppgtt *ppgtt)
> +{
> +	struct page *page;
> +	uint32_t *reg_state;
> +
> +	page = i915_gem_object_get_page(ctx->obj, 1);
> +	reg_state = kmap_atomic(page);
> +
> +	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
> +	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
> +	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
> +	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
> +	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
> +	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
> +	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
> +	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
> +
> +	kunmap_atomic(reg_state);
> +
> +	return 0;
> +}
> +
>  static int gen8_switch_context(struct intel_engine *ring,
>  		struct i915_hw_context *to0, u32 tail0,
>  		struct i915_hw_context *to1, u32 tail1)
>  {
> +	struct i915_hw_ppgtt *ppgtt;
> +
>  	BUG_ON(!i915_gem_obj_is_pinned(to0->obj));
>  
> -	if (to1)
> +	ppgtt = ctx_to_ppgtt(to0);
> +	gen8_write_pdp_ctx(to0, ppgtt);
> +
> +	if (to1) {
>  		BUG_ON(!i915_gem_obj_is_pinned(to1->obj));
>  
> +		ppgtt = ctx_to_ppgtt(to1);
> +		gen8_write_pdp_ctx(to1, ppgtt);
> +	}
> +

You're always calling gen8_write_pdp_ctx() and gen8_write_tail_ctx()
together; kmapping the page twice is a bit wasteful.

--
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 43/49] drm/i915/bdw: Handle context switch events
  2014-03-27 18:00 ` [PATCH 43/49] drm/i915/bdw: Handle context switch events oscar.mateo
@ 2014-04-03 14:24   ` Damien Lespiau
  2014-04-09  8:15     ` Mateo Lozano, Oscar
  2014-04-26  0:53   ` Robert Beckett
  1 sibling, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-03 14:24 UTC (permalink / raw)
  To: oscar.mateo; +Cc: Thomas Daniel, intel-gfx

On Thu, Mar 27, 2014 at 06:00:12PM +0000, oscar.mateo@intel.com wrote:
> +void gen8_handle_context_events(struct intel_engine *ring)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	u32 status_pointer;
> +	u8 read_pointer;
> +	u8 write_pointer;
> +	u32 status;
> +	u32 status_id;
> +	u32 submit_contexts = 0;
> +
> +	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
> +
> +	read_pointer = ring->next_context_status_buffer;
> +	write_pointer = status_pointer & 0x07;
> +	if (read_pointer > write_pointer)
> +		write_pointer += 6;
> +
> +	spin_lock(&ring->execlist_lock);
> +
> +	while (read_pointer < write_pointer) {
> +		read_pointer++;
> +		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> +				(read_pointer % 6) * 8);
> +		status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> +				(read_pointer % 6) * 8 + 4);
> +
> +		if (status & GEN8_CTX_STATUS_ELEMENT_SWITCH) {
> +			if (check_remove_request(ring, status_id))
> +				submit_contexts++;
> +		} else if (status & GEN8_CTX_STATUS_COMPLETE) {
> +			if (check_remove_request(ring, status_id))
> +				submit_contexts++;
> +		}
> +	}
> +
> +	if (submit_contexts != 0)
> +		gen8_switch_context_unqueue(ring);
> +
> +	spin_unlock(&ring->execlist_lock);
> +
> +	WARN(submit_contexts > 2, "More than two context complete events?\n");
> +	ring->next_context_status_buffer = write_pointer % 6;
> +}

I'm a bit surprised that we never update the read pointer in the
CONTEXT_STATUS_PTR when we consume entries from CONTEXT_STATUS_BUF.

Are we sure this field isn't used by hw at all to figure out if the
circular buffer has some free space?
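
If it is, a sketch of writing it back once the entries are consumed
(I'm assuming the read pointer lives at bits 10:8 and that the upper
16 bits act as a write mask, so double-check the field placement):

	/* report our new read pointer back to the HW */
	I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
		   ((0x7 << 8) << 16) | ((write_pointer % 6) << 8));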

-- 
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 42/49] drm/i915/bdw: Get prepared for a two-stage execlist submit process
  2014-03-27 18:00 ` [PATCH 42/49] drm/i915/bdw: Get prepared for a two-stage execlist submit process oscar.mateo
@ 2014-04-04 11:12   ` Damien Lespiau
  2014-04-04 13:24     ` Damien Lespiau
  0 siblings, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-04 11:12 UTC (permalink / raw)
  To: oscar.mateo; +Cc: Thomas Daniel, intel-gfx

On Thu, Mar 27, 2014 at 06:00:11PM +0000, oscar.mateo@intel.com wrote:
> +int gen8_switch_context_queue(struct intel_engine *ring,
> +			      struct i915_hw_context *to,
> +			      u32 tail)
> +{
> +	struct drm_i915_gem_request *req = NULL;
> +	unsigned long flags;
> +	bool was_empty;
> +
> +	req = (struct drm_i915_gem_request *)
> +		kmalloc(sizeof(struct drm_i915_gem_request), GFP_KERNEL);
> +	req->ring = ring;
> +	req->ctx = to;
> +	i915_gem_context_reference(req->ctx);
> +	req->tail = tail;

Need to test if the allocation has succeeded and return an error if not.

-- 
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 42/49] drm/i915/bdw: Get prepared for a two-stage execlist submit process
  2014-04-04 11:12   ` Damien Lespiau
@ 2014-04-04 13:24     ` Damien Lespiau
  2014-04-09  7:57       ` Mateo Lozano, Oscar
  0 siblings, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-04 13:24 UTC (permalink / raw)
  To: oscar.mateo; +Cc: Thomas Daniel, intel-gfx

On Fri, Apr 04, 2014 at 12:12:35PM +0100, Damien Lespiau wrote:
> On Thu, Mar 27, 2014 at 06:00:11PM +0000, oscar.mateo@intel.com wrote:
> > +int gen8_switch_context_queue(struct intel_engine *ring,
> > +			      struct i915_hw_context *to,
> > +			      u32 tail)
> > +{
> > +	struct drm_i915_gem_request *req = NULL;
> > +	unsigned long flags;
> > +	bool was_empty;
> > +
> > +	req = (struct drm_i915_gem_request *)
> > +		kmalloc(sizeof(struct drm_i915_gem_request), GFP_KERNEL);
> > +	req->ring = ring;
> > +	req->ctx = to;
> > +	i915_gem_context_reference(req->ctx);
> > +	req->tail = tail;
> 
> Need to test if the allocation has succeeded and return an error if not.

Also, no need for the cast, kmalloc returns a void *.

We usually use sizeof(*req) to be safe against variables changing type
without updating the allocation size (something the compiler wouldn't warn
about).
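
Put together, the allocation should end up looking something like:

	req = kmalloc(sizeof(*req), GFP_KERNEL);
	if (req == NULL)
		return -ENOMEM;
	req->ring = ring;
	req->ctx = to;
	i915_gem_context_reference(req->ctx);
	req->tail = tail;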

-- 
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 00/49] Execlists
  2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
                   ` (48 preceding siblings ...)
  2014-03-27 18:00 ` [PATCH 49/49] drm/i915/bdw: Document execlists and " oscar.mateo
@ 2014-04-07 18:12 ` Damien Lespiau
  2014-04-07 21:32   ` Daniel Vetter
  49 siblings, 1 reply; 85+ messages in thread
From: Damien Lespiau @ 2014-04-07 18:12 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx

On Thu, Mar 27, 2014 at 05:59:29PM +0000, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> Hi all,
> 
> This patch series implements execlists for GEN8+. Before continuing, it
> is important to mention that I might have taken it upon myself to
> assemble the series and rewrite it for upstreaming, but many people
> have worked on this series before me.

Two more things before I forget:

  - The error reporting seems broken and doesn't report the ring buffers'
    content nor the correct per-ring registers, from a cursory glance
    (fairly important, we can't go live without good error state
    support)

  - We could allocate the ring buffer in the same allocation as the
    context; that'd save one allocation (small detail)

-- 
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 44/49] drm/i915/bdw: Display execlists info in debugfs
  2014-03-27 18:00 ` [PATCH 44/49] drm/i915/bdw: Display execlists info in debugfs oscar.mateo
@ 2014-04-07 19:19   ` Damien Lespiau
  0 siblings, 0 replies; 85+ messages in thread
From: Damien Lespiau @ 2014-04-07 19:19 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx

On Thu, Mar 27, 2014 at 06:00:13PM +0000, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2347,6 +2347,7 @@ int gen8_switch_context_queue(struct intel_engine *ring,
>  			      struct i915_hw_context *to,
>  			      u32 tail);
>  void gen8_handle_context_events(struct intel_engine *ring);
> +inline u32 get_submission_id(struct i915_hw_context *ctx);

More verbose warnings tell us we can't inline this function outside of
its compilation unit as its body is only accessible in i915_lrc.c. I'd
just scrap the inline here.

Also, we need to namespace that symbol.
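
I.e. something like this in the header (the prefix is just an
example, pick whatever fits):

	u32 intel_lrc_get_submission_id(struct i915_hw_context *ctx);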

-- 
Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 00/49] Execlists
  2014-04-07 18:12 ` [PATCH 00/49] Execlists Damien Lespiau
@ 2014-04-07 21:32   ` Daniel Vetter
  0 siblings, 0 replies; 85+ messages in thread
From: Daniel Vetter @ 2014-04-07 21:32 UTC (permalink / raw)
  To: Damien Lespiau; +Cc: intel-gfx

On Mon, Apr 7, 2014 at 8:12 PM, Damien Lespiau <damien.lespiau@intel.com> wrote:
>> This patch series implements execlists for GEN8+. Before continuing, it
>> is important to mention that I might have taken it upon myself to
>> assemble the series and rewrite it for upstreaming, but many people
>> have worked on this series before me.
>
> Two more things before I forget:
>
>   - The error reporting seems broken and doesn't report the ring buffers'
>     content nor the correct per-ring registers, from a cursory glance
>     (fairly important, we can't go live without good error state
>     support)

I think we should finally go ahead and have a somewhat functional
testcase for the error state dumper. Here's what I have in mind, which
at least guarantees that shit works at a _very_ basic level:

1. Allocate gem object, use it as a noop batch.
2. Take note of the presumed offset returned by the kernel
3. Write some cookie value at the end of the noop batch.
4. Hang the rings with the igt/debugfs stop_rings infrastructure.
5. Launch batch, block on completion.
6. Check that the hang was properly detected (i.e. stop_rings cleared).
7. Read the error state and check that the batch object for your ring
contains the cookie, and that the ring object contains a reloc
with MI_BB_START for your presumed batch object's address. The kernel
shouldn't have moved the object around.

Repeat as subtest for all rings like we do with all the other per-ring tests.
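
A rough sketch of steps 1-3, for whoever picks this up (helper names
from the igt library as I remember them, treat it as pseudocode):

	/* 1. + 3.: a noop batch with a cookie value after the end */
	uint32_t batch[4] = { MI_NOOP, MI_BATCH_BUFFER_END, 0, 0xc00c1e5 };
	uint32_t handle = gem_create(fd, 4096);
	gem_write(fd, handle, 0, batch, sizeof(batch));
	/* 2.: stash the presumed_offset the kernel hands back in the
	 * execbuf relocation entries, for the check in step 7 */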

I'll do a JIRA out of this tomorrow blocking execlists, so consider
yourself signed up ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style
  2014-04-02 13:47   ` Damien Lespiau
@ 2014-04-09  7:56     ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-04-09  7:56 UTC (permalink / raw)
  To: Lespiau, Damien; +Cc: intel-gfx

> You're always calling gen8_write_pdp_ctx() and gen8_write_tail_ctx()
> together; kmapping the page twice is a bit wasteful.

You are totally right... I'll join them in the next version.
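
Something like this, probably (a quick sketch, not tested; using
upper_32_bits()/lower_32_bits() should also take care of the 32-bit
build warning you reported against this patch):

	static int gen8_write_pdp_and_tail_ctx(struct i915_hw_context *ctx,
					       struct i915_hw_ppgtt *ppgtt,
					       u32 tail)
	{
		struct page *page;
		uint32_t *reg_state;

		/* map the second page of the context object only once */
		page = i915_gem_object_get_page(ctx->obj, 1);
		reg_state = kmap_atomic(page);

		reg_state[CTX_RING_TAIL+1] = tail;
		reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[3]);
		reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[3]);
		reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[2]);
		reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[2]);
		reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[1]);
		reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[1]);
		reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[0]);
		reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[0]);

		kunmap_atomic(reg_state);

		return 0;
	}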

-- Oscar

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 42/49] drm/i915/bdw: Get prepared for a two-stage execlist submit process
  2014-04-04 13:24     ` Damien Lespiau
@ 2014-04-09  7:57       ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-04-09  7:57 UTC (permalink / raw)
  To: Lespiau, Damien; +Cc: Daniel, Thomas, intel-gfx

> > > +	req = (struct drm_i915_gem_request *)
> > > +		kmalloc(sizeof(struct drm_i915_gem_request), GFP_KERNEL);
> > > +	req->ring = ring;
> > > +	req->ctx = to;
> > > +	i915_gem_context_reference(req->ctx);
> > > +	req->tail = tail;
> >
> > Need to test if the allocation has succeeded and return an error if not.
> 
> Also, no need for the cast, kmalloc returns a void *.
> 
> We usually use sizeof(*req) to be safe against variables changing type without
> updating the allocation size (something the compiler wouldn't warn about).

Ok, will do.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 43/49] drm/i915/bdw: Handle context switch events
  2014-04-03 14:24   ` Damien Lespiau
@ 2014-04-09  8:15     ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-04-09  8:15 UTC (permalink / raw)
  To: Lespiau, Damien; +Cc: Daniel, Thomas, intel-gfx

It seems to be completely managed by SW, for SW (or, at least, it does not seem to have any visible effect on the HW). But you are right, it is probably worth updating.

-- Oscar

> -----Original Message-----
> From: Lespiau, Damien
> Sent: Thursday, April 03, 2014 3:25 PM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org; Daniel, Thomas
> Subject: Re: [Intel-gfx] [PATCH 43/49] drm/i915/bdw: Handle context switch
> events
> 
> On Thu, Mar 27, 2014 at 06:00:12PM +0000, oscar.mateo@intel.com wrote:
> > +void gen8_handle_context_events(struct intel_engine *ring) {
> > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > +	u32 status_pointer;
> > +	u8 read_pointer;
> > +	u8 write_pointer;
> > +	u32 status;
> > +	u32 status_id;
> > +	u32 submit_contexts = 0;
> > +
> > +	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
> > +
> > +	read_pointer = ring->next_context_status_buffer;
> > +	write_pointer = status_pointer & 0x07;
> > +	if (read_pointer > write_pointer)
> > +		write_pointer += 6;
> > +
> > +	spin_lock(&ring->execlist_lock);
> > +
> > +	while (read_pointer < write_pointer) {
> > +		read_pointer++;
> > +		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> > +				(read_pointer % 6) * 8);
> > +		status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> > +				(read_pointer % 6) * 8 + 4);
> > +
> > +		if (status & GEN8_CTX_STATUS_ELEMENT_SWITCH) {
> > +			if (check_remove_request(ring, status_id))
> > +				submit_contexts++;
> > +		} else if (status & GEN8_CTX_STATUS_COMPLETE) {
> > +			if (check_remove_request(ring, status_id))
> > +				submit_contexts++;
> > +		}
> > +	}
> > +
> > +	if (submit_contexts != 0)
> > +		gen8_switch_context_unqueue(ring);
> > +
> > +	spin_unlock(&ring->execlist_lock);
> > +
> > +	WARN(submit_contexts > 2, "More than two context complete
> events?\n");
> > +	ring->next_context_status_buffer = write_pointer % 6; }
> 
> I'm a bit surprised that we never update the read pointer in the
> CONTEXT_STATUS_PTR when we consume entries from
> CONTEXT_STATUS_BUF.
> 
> Are we sure this field isn't used by hw at all to figure out if the circular buffer
> has some free space?
> 
> --
> Damien

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 31/49] drm/i915/bdw: Introduce dependent contexts
  2014-03-27 17:21   ` Mateo Lozano, Oscar
@ 2014-04-09 16:54     ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-04-09 16:54 UTC (permalink / raw)
  To: Lespiau, Damien; +Cc: intel-gfx

Hey Damien,

> I already got a fair review comment from Brad Volkin on this: he proposes to
> do this instead
> 
> 	struct i915_hw_context {
> 		struct i915_address_space *vm;
> 		struct {
> 			struct drm_i915_gem_object *ctx_obj;
> 			struct intel_ringbuffer *ringbuf;
> 		} engine[I915_MAX_RINGS];
> 		...
> 	};
> 
> This is: instead of creating extra contexts with the same Context ID, modify the
> current i915_hw_context to work with all engines. I agree this alternative looks
> less *hackish*, but I want to get eyes on it (several things need careful
> consideration if we do this, e.g.: should the hang_stats also be per-engine?)
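
With that layout, the call sites would end up looking something like
this (just to illustrate the shape of the change):

	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].ctx_obj;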

After looking at the execlists code, does this also make sense to you?

Cheers,
Oscar

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-03-27 17:59 ` [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
  2014-04-01  0:00   ` Damien Lespiau
@ 2014-04-15 16:00   ` Jeff McGee
  2014-04-15 16:10     ` Jeff McGee
  1 sibling, 1 reply; 85+ messages in thread
From: Jeff McGee @ 2014-04-15 16:00 UTC (permalink / raw)
  To: oscar.mateo; +Cc: intel-gfx, Ben Widawsky, Ben Widawsky

On Thu, Mar 27, 2014 at 05:59:48PM +0000, oscar.mateo@intel.com wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> For the most part, logical ring context objects are similar to hardware
> contexts in that the backing object is meant to be opaque. There are
> some exceptions where we need to poke certain offsets of the object for
> initialization, updating the tail pointer or updating the PDPs.
> 
> For our basic execlist implementation we'll only need our PPGTT PDs,
> and ringbuffer addresses in order to set up the context. With previous
> patches, we have both, so start prepping the context to be loaded.
> 
> Before running a context for the first time you must populate some
> fields in the context object. These fields begin 1 PAGE + LRCA, i.e. the
> first page (in 0-based counting) of the context image. These same
> fields will be read and written to as contexts are saved and restored
> once the system is up and running.
> 
> Many of these fields are completely reused from previous global
> registers: ringbuffer head/tail/control, context control matches some
> previous MI_SET_CONTEXT flags, and page directories. There are other
> fields which we don't touch which we may want in the future.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11)
> for other engines.
> 
> Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>
> 
> v3: Several rebases and general changes to the code.
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_lrc.c | 145 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 138 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
> index 40dfa95..f0176ff 100644
> --- a/drivers/gpu/drm/i915/i915_lrc.c
> +++ b/drivers/gpu/drm/i915/i915_lrc.c
> @@ -43,6 +43,38 @@
>  
>  #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
>  
> +#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
> +#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
> +
> +#define CTX_LRI_HEADER_0		0x01
> +#define CTX_CONTEXT_CONTROL		0x02
> +#define CTX_RING_HEAD			0x04
> +#define CTX_RING_TAIL			0x06
> +#define CTX_RING_BUFFER_START		0x08
> +#define CTX_RING_BUFFER_CONTROL	0x0a
> +#define CTX_BB_HEAD_U			0x0c
> +#define CTX_BB_HEAD_L			0x0e
> +#define CTX_BB_STATE			0x10
> +#define CTX_SECOND_BB_HEAD_U		0x12
> +#define CTX_SECOND_BB_HEAD_L		0x14
> +#define CTX_SECOND_BB_STATE		0x16
> +#define CTX_BB_PER_CTX_PTR		0x18
> +#define CTX_RCS_INDIRECT_CTX		0x1a
> +#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
> +#define CTX_LRI_HEADER_1		0x21
> +#define CTX_CTX_TIMESTAMP		0x22
> +#define CTX_PDP3_UDW			0x24
> +#define CTX_PDP3_LDW			0x26
> +#define CTX_PDP2_UDW			0x28
> +#define CTX_PDP2_LDW			0x2a
> +#define CTX_PDP1_UDW			0x2c
> +#define CTX_PDP1_LDW			0x2e
> +#define CTX_PDP0_UDW			0x30
> +#define CTX_PDP0_LDW			0x32
> +#define CTX_LRI_HEADER_2		0x41
> +#define CTX_R_PWR_CLK_STATE		0x42
> +#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
> +
>  struct i915_hw_context *
>  gen8_gem_create_context(struct drm_device *dev,
>  			struct intel_engine *ring,
> @@ -51,6 +83,9 @@ gen8_gem_create_context(struct drm_device *dev,
>  {
>  	struct i915_hw_context *ctx = NULL;
>  	struct drm_i915_gem_object *ring_obj = NULL;
> +	struct i915_hw_ppgtt *ppgtt = NULL;
> +	struct page *page;
> +	uint32_t *reg_state;
>  	int ret;
>  
>  	ctx = i915_gem_create_context(dev, file_priv, create_vm);
> @@ -79,18 +114,114 @@ gen8_gem_create_context(struct drm_device *dev,
>  
>  	/* Failure at this point is almost impossible */
>  	ret = i915_gem_object_set_to_gtt_domain(ring_obj, true);
> -	if (ret) {
> -		i915_gem_object_ggtt_unpin(ring_obj);
> -		drm_gem_object_unreference(&ring_obj->base);
> -		i915_gem_object_ggtt_unpin(ctx->obj);
> -		i915_gem_context_unreference(ctx);
> -		return ERR_PTR(ret);
> -	}
> +	if (ret)
> +		goto destroy_ring_obj;
>  
>  	ctx->ringbuf = &ring->default_ringbuf;
>  	ctx->ringbuf->obj = ring_obj;
>  
> +	ppgtt = ctx_to_ppgtt(ctx);
> +
> +	ret = i915_gem_object_set_to_cpu_domain(ctx->obj, true);
> +	if (ret)
> +		goto destroy_ring_obj;
> +
> +	ret = i915_gem_object_get_pages(ctx->obj);
> +	if (ret)
> +		goto destroy_ring_obj;
> +
> +	i915_gem_object_pin_pages(ctx->obj);
> +
> +	/* The second page of the context object contains some fields which must
> +	 * be set up prior to the first execution.
> +	 */
> +	page = i915_gem_object_get_page(ctx->obj, 1);
> +	reg_state = kmap_atomic(page);
> +
> +	if (ring->id == RCS)
> +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
> +	else
> +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
> +	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
> +	reg_state[CTX_CONTEXT_CONTROL+1] = (1<<3) | MI_RESTORE_INHIBIT;
> +	reg_state[CTX_CONTEXT_CONTROL+1] |= reg_state[CTX_CONTEXT_CONTROL+1] << 16;
> +	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
> +	reg_state[CTX_RING_HEAD+1] = 0;
> +	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
> +	reg_state[CTX_RING_TAIL+1] = 0;
> +	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
> +	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
> +	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
> +	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
> +	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
> +	reg_state[CTX_BB_HEAD_U+1] = 0;
> +	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
> +	reg_state[CTX_BB_HEAD_L+1] = 0;
> +	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
> +	reg_state[CTX_BB_STATE+1] = (1<<5);
> +	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
> +	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
> +	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
> +	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
> +	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
> +	reg_state[CTX_SECOND_BB_STATE+1] = 0;
> +	if (ring->id == RCS) {
> +		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
> +		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
> +		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
> +		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
> +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
> +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
> +	}
> +
> +	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
> +	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
> +	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
> +	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
> +	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
> +	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
> +	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
> +	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
> +	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
> +	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
> +	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
> +	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
> +	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
> +	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
> +	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
> +	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
> +	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
> +	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
> +	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
> +	if (ring->id == RCS) {
> +		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> +		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;

You're writing the MMIO address for the R_PWR_CLK_STATE register to this
field. Shouldn't this receive the value we want programmed to the register?

> +		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
> +	}
> +
> +#if 0
> +	/* Offsets not yet defined for these */
> +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS[] = ;
> +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS+1] = 0;
> +#endif
> +
> +	kunmap_atomic(reg_state);
> +
> +	ctx->obj->dirty = 1;
> +	set_page_dirty(page);
> +	i915_gem_object_unpin_pages(ctx->obj);
> +
>  	return ctx;
> +
> +destroy_ring_obj:
> +	i915_gem_object_ggtt_unpin(ring_obj);
> +	drm_gem_object_unreference(&ring_obj->base);
> +	ctx->ringbuf->obj = NULL;
> +	ctx->ringbuf = NULL;
> +	i915_gem_object_ggtt_unpin(ctx->obj);
> +	i915_gem_context_unreference(ctx);
> +
> +	return ERR_PTR(ret);
>  }
>  
>  void gen8_gem_context_fini(struct drm_device *dev)
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-04-15 16:00   ` Jeff McGee
@ 2014-04-15 16:10     ` Jeff McGee
  2014-04-15 19:51       ` Daniel Vetter
  2014-04-15 20:43       ` Jeff McGee
  0 siblings, 2 replies; 85+ messages in thread
From: Jeff McGee @ 2014-04-15 16:10 UTC (permalink / raw)
  To: oscar.mateo, intel-gfx, Ben Widawsky, Ben Widawsky

On Tue, Apr 15, 2014 at 11:00:33AM -0500, Jeff McGee wrote:
> On Thu, Mar 27, 2014 at 05:59:48PM +0000, oscar.mateo@intel.com wrote:
> > From: Ben Widawsky <benjamin.widawsky@intel.com>
> > 
> > For the most part, logical ring context objects are similar to hardware
> > contexts in that the backing object is meant to be opaque. There are
> > some exceptions where we need to poke certain offsets of the object for
> > initialization, updating the tail pointer or updating the PDPs.
> > 
> > For our basic execlist implementation we'll only need our PPGTT PDs,
> > and ringbuffer addresses in order to set up the context. With previous
> > patches, we have both, so start prepping the context to be loaded.
> > 
> > Before running a context for the first time you must populate some
> > fields in the context object. These fields begin 1 PAGE + LRCA, i.e. the
> > first page (in 0-based counting) of the context image. These same
> > fields will be read and written to as contexts are saved and restored
> > once the system is up and running.
> > 
> > Many of these fields are completely reused from previous global
> > registers: ringbuffer head/tail/control, context control matches some
> > previous MI_SET_CONTEXT flags, and page directories. There are other
> > fields which we don't touch which we may want in the future.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11)
> > for other engines.
> > 
> > Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>
> > 
> > v3: Several rebases and general changes to the code.
> > 
> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_lrc.c | 145 ++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 138 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
> > index 40dfa95..f0176ff 100644
> > --- a/drivers/gpu/drm/i915/i915_lrc.c
> > +++ b/drivers/gpu/drm/i915/i915_lrc.c
> > @@ -43,6 +43,38 @@
> >  
> >  #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
> >  
> > +#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
> > +#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
> > +
> > +#define CTX_LRI_HEADER_0		0x01
> > +#define CTX_CONTEXT_CONTROL		0x02
> > +#define CTX_RING_HEAD			0x04
> > +#define CTX_RING_TAIL			0x06
> > +#define CTX_RING_BUFFER_START		0x08
> > +#define CTX_RING_BUFFER_CONTROL	0x0a
> > +#define CTX_BB_HEAD_U			0x0c
> > +#define CTX_BB_HEAD_L			0x0e
> > +#define CTX_BB_STATE			0x10
> > +#define CTX_SECOND_BB_HEAD_U		0x12
> > +#define CTX_SECOND_BB_HEAD_L		0x14
> > +#define CTX_SECOND_BB_STATE		0x16
> > +#define CTX_BB_PER_CTX_PTR		0x18
> > +#define CTX_RCS_INDIRECT_CTX		0x1a
> > +#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
> > +#define CTX_LRI_HEADER_1		0x21
> > +#define CTX_CTX_TIMESTAMP		0x22
> > +#define CTX_PDP3_UDW			0x24
> > +#define CTX_PDP3_LDW			0x26
> > +#define CTX_PDP2_UDW			0x28
> > +#define CTX_PDP2_LDW			0x2a
> > +#define CTX_PDP1_UDW			0x2c
> > +#define CTX_PDP1_LDW			0x2e
> > +#define CTX_PDP0_UDW			0x30
> > +#define CTX_PDP0_LDW			0x32
> > +#define CTX_LRI_HEADER_2		0x41
> > +#define CTX_R_PWR_CLK_STATE		0x42
> > +#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
> > +
> >  struct i915_hw_context *
> >  gen8_gem_create_context(struct drm_device *dev,
> >  			struct intel_engine *ring,
> > @@ -51,6 +83,9 @@ gen8_gem_create_context(struct drm_device *dev,
> >  {
> >  	struct i915_hw_context *ctx = NULL;
> >  	struct drm_i915_gem_object *ring_obj = NULL;
> > +	struct i915_hw_ppgtt *ppgtt = NULL;
> > +	struct page *page;
> > +	uint32_t *reg_state;
> >  	int ret;
> >  
> >  	ctx = i915_gem_create_context(dev, file_priv, create_vm);
> > @@ -79,18 +114,114 @@ gen8_gem_create_context(struct drm_device *dev,
> >  
> >  	/* Failure at this point is almost impossible */
> >  	ret = i915_gem_object_set_to_gtt_domain(ring_obj, true);
> > -	if (ret) {
> > -		i915_gem_object_ggtt_unpin(ring_obj);
> > -		drm_gem_object_unreference(&ring_obj->base);
> > -		i915_gem_object_ggtt_unpin(ctx->obj);
> > -		i915_gem_context_unreference(ctx);
> > -		return ERR_PTR(ret);
> > -	}
> > +	if (ret)
> > +		goto destroy_ring_obj;
> >  
> >  	ctx->ringbuf = &ring->default_ringbuf;
> >  	ctx->ringbuf->obj = ring_obj;
> >  
> > +	ppgtt = ctx_to_ppgtt(ctx);
> > +
> > +	ret = i915_gem_object_set_to_cpu_domain(ctx->obj, true);
> > +	if (ret)
> > +		goto destroy_ring_obj;
> > +
> > +	ret = i915_gem_object_get_pages(ctx->obj);
> > +	if (ret)
> > +		goto destroy_ring_obj;
> > +
> > +	i915_gem_object_pin_pages(ctx->obj);
> > +
> > +	/* The second page of the context object contains some fields which must
> > +	 * be set up prior to the first execution.
> > +	 */
> > +	page = i915_gem_object_get_page(ctx->obj, 1);
> > +	reg_state = kmap_atomic(page);
> > +
> > +	if (ring->id == RCS)
> > +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
> > +	else
> > +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
> > +	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
> > +	reg_state[CTX_CONTEXT_CONTROL+1] = (1<<3) | MI_RESTORE_INHIBIT;
> > +	reg_state[CTX_CONTEXT_CONTROL+1] |= reg_state[CTX_CONTEXT_CONTROL+1] << 16;
> > +	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
> > +	reg_state[CTX_RING_HEAD+1] = 0;
> > +	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
> > +	reg_state[CTX_RING_TAIL+1] = 0;
> > +	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
> > +	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
> > +	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
> > +	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
> > +	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
> > +	reg_state[CTX_BB_HEAD_U+1] = 0;
> > +	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
> > +	reg_state[CTX_BB_HEAD_L+1] = 0;
> > +	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
> > +	reg_state[CTX_BB_STATE+1] = (1<<5);
> > +	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
> > +	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
> > +	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
> > +	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
> > +	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
> > +	reg_state[CTX_SECOND_BB_STATE+1] = 0;
> > +	if (ring->id == RCS) {
> > +		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
> > +		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
> > +		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
> > +		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
> > +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
> > +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
> > +	}
> > +
> > +	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
> > +	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
> > +	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
> > +	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
> > +	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
> > +	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
> > +	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
> > +	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
> > +	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
> > +	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
> > +	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
> > +	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
> > +	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
> > +	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
> > +	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
> > +	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
> > +	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
> > +	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
> > +	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
> > +	if (ring->id == RCS) {
> > +		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> > +		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
> 
> You're writing the MMIO address for the R_PWR_CLK_STATE register to this
> field. Shouldn't this receive the value we want programmed to the register?
> 

Oh, never mind. I understand now.
-Jeff

> > +		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
> > +	}
> > +
> > +#if 0
> > +	/* Offsets not yet defined for these */
> > +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS[] = ;
> > +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS+1] = 0;
> > +#endif
> > +
> > +	kunmap_atomic(reg_state);
> > +
> > +	ctx->obj->dirty = 1;
> > +	set_page_dirty(page);
> > +	i915_gem_object_unpin_pages(ctx->obj);
> > +
> >  	return ctx;
> > +
> > +destroy_ring_obj:
> > +	i915_gem_object_ggtt_unpin(ring_obj);
> > +	drm_gem_object_unreference(&ring_obj->base);
> > +	ctx->ringbuf->obj = NULL;
> > +	ctx->ringbuf = NULL;
> > +	i915_gem_object_ggtt_unpin(ctx->obj);
> > +	i915_gem_context_unreference(ctx);
> > +
> > +	return ERR_PTR(ret);
> >  }
> >  
> >  void gen8_gem_context_fini(struct drm_device *dev)
> > -- 
> > 1.9.0
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-04-15 16:10     ` Jeff McGee
@ 2014-04-15 19:51       ` Daniel Vetter
  2014-04-15 20:43       ` Jeff McGee
  1 sibling, 0 replies; 85+ messages in thread
From: Daniel Vetter @ 2014-04-15 19:51 UTC (permalink / raw)
  To: oscar.mateo, intel-gfx, Ben Widawsky, Ben Widawsky

On Tue, Apr 15, 2014 at 11:10:34AM -0500, Jeff McGee wrote:
> Oh, nevermind. I understand now.

Quickly explaining your understanding for everyone else's benefit would be
nice ... In general, be explicit and verbose on mailing lists to make sure
you don't miss some important bit of context that people from a different
geo lack.

And if you discussed something off-list (either off or chat) please
summarize the conclusions and participants in an on-list message.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-04-15 16:10     ` Jeff McGee
  2014-04-15 19:51       ` Daniel Vetter
@ 2014-04-15 20:43       ` Jeff McGee
  2014-04-15 21:08         ` Daniel Vetter
  1 sibling, 1 reply; 85+ messages in thread
From: Jeff McGee @ 2014-04-15 20:43 UTC (permalink / raw)
  To: oscar.mateo, intel-gfx, Ben Widawsky, Ben Widawsky

On Tue, Apr 15, 2014 at 11:10:34AM -0500, Jeff McGee wrote:
> On Tue, Apr 15, 2014 at 11:00:33AM -0500, Jeff McGee wrote:
> > On Thu, Mar 27, 2014 at 05:59:48PM +0000, oscar.mateo@intel.com wrote:
> > > From: Ben Widawsky <benjamin.widawsky@intel.com>
> > > 
> > > For the most part, logical ring context objects are similar to hardware
> > > contexts in that the backing object is meant to be opaque. There are
> > > some exceptions where we need to poke certain offsets of the object for
> > > initialization, updating the tail pointer or updating the PDPs.
> > > 
> > > For our basic execlist implementation we'll only need our PPGTT PDs,
> > > and ringbuffer addresses in order to set up the context. With previous
> > > patches, we have both, so start prepping the context to be loaded.
> > > 
> > > Before running a context for the first time you must populate some
> > > fields in the context object. These fields begin 1 PAGE + LRCA, i.e. the
> > > first page (in 0-based counting) of the context image. These same
> > > fields will be read and written to as contexts are saved and restored
> > > once the system is up and running.
> > > 
> > > Many of these fields are completely reused from previous global
> > > registers: ringbuffer head/tail/control, context control matches some
> > > previous MI_SET_CONTEXT flags, and page directories. There are other
> > > fields which we don't touch now but may want to use in the future.
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > 
> > > v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11)
> > > for other engines.
> > > 
> > > Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>
> > > 
> > > v3: Several rebases and general changes to the code.
> > > 
> > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_lrc.c | 145 ++++++++++++++++++++++++++++++++++++++--
> > >  1 file changed, 138 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
> > > index 40dfa95..f0176ff 100644
> > > --- a/drivers/gpu/drm/i915/i915_lrc.c
> > > +++ b/drivers/gpu/drm/i915/i915_lrc.c
> > > @@ -43,6 +43,38 @@
> > >  
> > >  #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
> > >  
> > > +#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
> > > +#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
> > > +
> > > +#define CTX_LRI_HEADER_0		0x01
> > > +#define CTX_CONTEXT_CONTROL		0x02
> > > +#define CTX_RING_HEAD			0x04
> > > +#define CTX_RING_TAIL			0x06
> > > +#define CTX_RING_BUFFER_START		0x08
> > > +#define CTX_RING_BUFFER_CONTROL	0x0a
> > > +#define CTX_BB_HEAD_U			0x0c
> > > +#define CTX_BB_HEAD_L			0x0e
> > > +#define CTX_BB_STATE			0x10
> > > +#define CTX_SECOND_BB_HEAD_U		0x12
> > > +#define CTX_SECOND_BB_HEAD_L		0x14
> > > +#define CTX_SECOND_BB_STATE		0x16
> > > +#define CTX_BB_PER_CTX_PTR		0x18
> > > +#define CTX_RCS_INDIRECT_CTX		0x1a
> > > +#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
> > > +#define CTX_LRI_HEADER_1		0x21
> > > +#define CTX_CTX_TIMESTAMP		0x22
> > > +#define CTX_PDP3_UDW			0x24
> > > +#define CTX_PDP3_LDW			0x26
> > > +#define CTX_PDP2_UDW			0x28
> > > +#define CTX_PDP2_LDW			0x2a
> > > +#define CTX_PDP1_UDW			0x2c
> > > +#define CTX_PDP1_LDW			0x2e
> > > +#define CTX_PDP0_UDW			0x30
> > > +#define CTX_PDP0_LDW			0x32
> > > +#define CTX_LRI_HEADER_2		0x41
> > > +#define CTX_R_PWR_CLK_STATE		0x42
> > > +#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
> > > +
> > >  struct i915_hw_context *
> > >  gen8_gem_create_context(struct drm_device *dev,
> > >  			struct intel_engine *ring,
> > > @@ -51,6 +83,9 @@ gen8_gem_create_context(struct drm_device *dev,
> > >  {
> > >  	struct i915_hw_context *ctx = NULL;
> > >  	struct drm_i915_gem_object *ring_obj = NULL;
> > > +	struct i915_hw_ppgtt *ppgtt = NULL;
> > > +	struct page *page;
> > > +	uint32_t *reg_state;
> > >  	int ret;
> > >  
> > >  	ctx = i915_gem_create_context(dev, file_priv, create_vm);
> > > @@ -79,18 +114,114 @@ gen8_gem_create_context(struct drm_device *dev,
> > >  
> > >  	/* Failure at this point is almost impossible */
> > >  	ret = i915_gem_object_set_to_gtt_domain(ring_obj, true);
> > > -	if (ret) {
> > > -		i915_gem_object_ggtt_unpin(ring_obj);
> > > -		drm_gem_object_unreference(&ring_obj->base);
> > > -		i915_gem_object_ggtt_unpin(ctx->obj);
> > > -		i915_gem_context_unreference(ctx);
> > > -		return ERR_PTR(ret);
> > > -	}
> > > +	if (ret)
> > > +		goto destroy_ring_obj;
> > >  
> > >  	ctx->ringbuf = &ring->default_ringbuf;
> > >  	ctx->ringbuf->obj = ring_obj;
> > >  
> > > +	ppgtt = ctx_to_ppgtt(ctx);
> > > +
> > > +	ret = i915_gem_object_set_to_cpu_domain(ctx->obj, true);
> > > +	if (ret)
> > > +		goto destroy_ring_obj;
> > > +
> > > +	ret = i915_gem_object_get_pages(ctx->obj);
> > > +	if (ret)
> > > +		goto destroy_ring_obj;
> > > +
> > > +	i915_gem_object_pin_pages(ctx->obj);
> > > +
> > > +	/* The second page of the context object contains some fields which must
> > > +	 * be set up prior to the first execution.
> > > +	 */
> > > +	page = i915_gem_object_get_page(ctx->obj, 1);
> > > +	reg_state = kmap_atomic(page);
> > > +
> > > +	if (ring->id == RCS)
> > > +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
> > > +	else
> > > +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
> > > +	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
> > > +	reg_state[CTX_CONTEXT_CONTROL+1] = (1<<3) | MI_RESTORE_INHIBIT;
> > > +	reg_state[CTX_CONTEXT_CONTROL+1] |= reg_state[CTX_CONTEXT_CONTROL+1] << 16;
> > > +	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
> > > +	reg_state[CTX_RING_HEAD+1] = 0;
> > > +	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
> > > +	reg_state[CTX_RING_TAIL+1] = 0;
> > > +	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
> > > +	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
> > > +	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
> > > +	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
> > > +	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
> > > +	reg_state[CTX_BB_HEAD_U+1] = 0;
> > > +	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
> > > +	reg_state[CTX_BB_HEAD_L+1] = 0;
> > > +	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
> > > +	reg_state[CTX_BB_STATE+1] = (1<<5);
> > > +	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
> > > +	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
> > > +	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
> > > +	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
> > > +	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
> > > +	reg_state[CTX_SECOND_BB_STATE+1] = 0;
> > > +	if (ring->id == RCS) {
> > > +		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
> > > +		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
> > > +		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
> > > +		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
> > > +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
> > > +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
> > > +	}
> > > +
> > > +	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
> > > +	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
> > > +	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
> > > +	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
> > > +	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
> > > +	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
> > > +	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
> > > +	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
> > > +	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
> > > +	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
> > > +	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
> > > +	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
> > > +	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
> > > +	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
> > > +	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
> > > +	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
> > > +	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
> > > +	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
> > > +	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
> > > +	if (ring->id == RCS) {
> > > +		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> > > +		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
> > 
> > You're writing the MMIO address for the R_PWR_CLK_STATE register to this
> > field. Shouldn't this receive the value we want programmed to the register?
> > 
> 
> Oh, nevermind. I understand now.
> -Jeff
> 
To clarify my comments...I was at first confused by the need to specify the
R_PWR_CLK_STATE register address in the logical context, thinking that only
the desired value needed to be specified. But I see now that the programming
model is to specify the MI_LOAD_REGISTER_IMM command, followed by the address
at which to load, followed by the value to load.
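
To make that layout concrete, here is the R_PWR_CLK_STATE triplet from the
patch, annotated with the role each dword plays:

	reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);	/* the LRI command */
	reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;		/* register address to load */
	reg_state[CTX_R_PWR_CLK_STATE+1] = 0;			/* immediate value to load */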

Reflecting on my initial confusion, would it be clearer to provide names for
each dword position in the context image, rather than using an unnamed offset
like CTX_R_PWR_CLK_STATE+1? Example:

reg_state[CTX_R_PWR_CLK_STATE_ADDR] = 0x20c8;
reg_state[CTX_R_PWR_CLK_STATE_DATA] = 0;
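
Given the existing offsets (CTX_R_PWR_CLK_STATE is 0x42), the matching
defines would presumably look something like:

#define CTX_R_PWR_CLK_STATE_ADDR	0x42
#define CTX_R_PWR_CLK_STATE_DATA	0x43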

Jeff
> > > +		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
> > > +	}
> > > +
> > > +#if 0
> > > +	/* Offsets not yet defined for these */
> > > +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS] = ;
> > > +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS+1] = 0;
> > > +#endif
> > > +
> > > +	kunmap_atomic(reg_state);
> > > +
> > > +	ctx->obj->dirty = 1;
> > > +	set_page_dirty(page);
> > > +	i915_gem_object_unpin_pages(ctx->obj);
> > > +
> > >  	return ctx;
> > > +
> > > +destroy_ring_obj:
> > > +	i915_gem_object_ggtt_unpin(ring_obj);
> > > +	drm_gem_object_unreference(&ring_obj->base);
> > > +	ctx->ringbuf->obj = NULL;
> > > +	ctx->ringbuf = NULL;
> > > +	i915_gem_object_ggtt_unpin(ctx->obj);
> > > +	i915_gem_context_unreference(ctx);
> > > +
> > > +	return ERR_PTR(ret);
> > >  }
> > >  
> > >  void gen8_gem_context_fini(struct drm_device *dev)
> > > -- 
> > > 1.9.0
> > > 

* Re: [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-04-15 20:43       ` Jeff McGee
@ 2014-04-15 21:08         ` Daniel Vetter
  2014-04-15 22:32           ` Jeff McGee
  0 siblings, 1 reply; 85+ messages in thread
From: Daniel Vetter @ 2014-04-15 21:08 UTC (permalink / raw)
  To: oscar.mateo, intel-gfx, Ben Widawsky, Ben Widawsky

On Tue, Apr 15, 2014 at 03:43:23PM -0500, Jeff McGee wrote:
> On Tue, Apr 15, 2014 at 11:10:34AM -0500, Jeff McGee wrote:
> > On Tue, Apr 15, 2014 at 11:00:33AM -0500, Jeff McGee wrote:
> > > On Thu, Mar 27, 2014 at 05:59:48PM +0000, oscar.mateo@intel.com wrote:
> > > > From: Ben Widawsky <benjamin.widawsky@intel.com>
> > > > 
> > > > For the most part, logical ring context objects are similar to hardware
> > > > contexts in that the backing object is meant to be opaque. There are
> > > > some exceptions where we need to poke certain offsets of the object for
> > > > initialization, updating the tail pointer or updating the PDPs.
> > > > 
> > > > For our basic execlist implementation we'll only need our PPGTT PDs
> > > > and ringbuffer addresses in order to set up the context. With previous
> > > > patches, we have both, so start prepping the context to be loaded.
> > > > 
> > > > Before running a context for the first time you must populate some
> > > > fields in the context object. These fields begin at LRCA + 1 PAGE, i.e.
> > > > the first page (in 0-based counting) of the context image. These same
> > > > fields will be read and written to as contexts are saved and restored
> > > > once the system is up and running.
> > > > 
> > > > Many of these fields are completely reused from previous global
> > > > registers: ringbuffer head/tail/control, context control matches some
> > > > previous MI_SET_CONTEXT flags, and page directories. There are other
> > > > fields which we don't touch now but may want to use in the future.
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > 
> > > > v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11)
> > > > for other engines.
> > > > 
> > > > Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>
> > > > 
> > > > v3: Several rebases and general changes to the code.
> > > > 
> > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_lrc.c | 145 ++++++++++++++++++++++++++++++++++++++--
> > > >  1 file changed, 138 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
> > > > index 40dfa95..f0176ff 100644
> > > > --- a/drivers/gpu/drm/i915/i915_lrc.c
> > > > +++ b/drivers/gpu/drm/i915/i915_lrc.c
> > > > @@ -43,6 +43,38 @@
> > > >  
> > > >  #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
> > > >  
> > > > +#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
> > > > +#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
> > > > +
> > > > +#define CTX_LRI_HEADER_0		0x01
> > > > +#define CTX_CONTEXT_CONTROL		0x02
> > > > +#define CTX_RING_HEAD			0x04
> > > > +#define CTX_RING_TAIL			0x06
> > > > +#define CTX_RING_BUFFER_START		0x08
> > > > +#define CTX_RING_BUFFER_CONTROL	0x0a
> > > > +#define CTX_BB_HEAD_U			0x0c
> > > > +#define CTX_BB_HEAD_L			0x0e
> > > > +#define CTX_BB_STATE			0x10
> > > > +#define CTX_SECOND_BB_HEAD_U		0x12
> > > > +#define CTX_SECOND_BB_HEAD_L		0x14
> > > > +#define CTX_SECOND_BB_STATE		0x16
> > > > +#define CTX_BB_PER_CTX_PTR		0x18
> > > > +#define CTX_RCS_INDIRECT_CTX		0x1a
> > > > +#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
> > > > +#define CTX_LRI_HEADER_1		0x21
> > > > +#define CTX_CTX_TIMESTAMP		0x22
> > > > +#define CTX_PDP3_UDW			0x24
> > > > +#define CTX_PDP3_LDW			0x26
> > > > +#define CTX_PDP2_UDW			0x28
> > > > +#define CTX_PDP2_LDW			0x2a
> > > > +#define CTX_PDP1_UDW			0x2c
> > > > +#define CTX_PDP1_LDW			0x2e
> > > > +#define CTX_PDP0_UDW			0x30
> > > > +#define CTX_PDP0_LDW			0x32
> > > > +#define CTX_LRI_HEADER_2		0x41
> > > > +#define CTX_R_PWR_CLK_STATE		0x42
> > > > +#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
> > > > +
> > > >  struct i915_hw_context *
> > > >  gen8_gem_create_context(struct drm_device *dev,
> > > >  			struct intel_engine *ring,
> > > > @@ -51,6 +83,9 @@ gen8_gem_create_context(struct drm_device *dev,
> > > >  {
> > > >  	struct i915_hw_context *ctx = NULL;
> > > >  	struct drm_i915_gem_object *ring_obj = NULL;
> > > > +	struct i915_hw_ppgtt *ppgtt = NULL;
> > > > +	struct page *page;
> > > > +	uint32_t *reg_state;
> > > >  	int ret;
> > > >  
> > > >  	ctx = i915_gem_create_context(dev, file_priv, create_vm);
> > > > @@ -79,18 +114,114 @@ gen8_gem_create_context(struct drm_device *dev,
> > > >  
> > > >  	/* Failure at this point is almost impossible */
> > > >  	ret = i915_gem_object_set_to_gtt_domain(ring_obj, true);
> > > > -	if (ret) {
> > > > -		i915_gem_object_ggtt_unpin(ring_obj);
> > > > -		drm_gem_object_unreference(&ring_obj->base);
> > > > -		i915_gem_object_ggtt_unpin(ctx->obj);
> > > > -		i915_gem_context_unreference(ctx);
> > > > -		return ERR_PTR(ret);
> > > > -	}
> > > > +	if (ret)
> > > > +		goto destroy_ring_obj;
> > > >  
> > > >  	ctx->ringbuf = &ring->default_ringbuf;
> > > >  	ctx->ringbuf->obj = ring_obj;
> > > >  
> > > > +	ppgtt = ctx_to_ppgtt(ctx);
> > > > +
> > > > +	ret = i915_gem_object_set_to_cpu_domain(ctx->obj, true);
> > > > +	if (ret)
> > > > +		goto destroy_ring_obj;
> > > > +
> > > > +	ret = i915_gem_object_get_pages(ctx->obj);
> > > > +	if (ret)
> > > > +		goto destroy_ring_obj;
> > > > +
> > > > +	i915_gem_object_pin_pages(ctx->obj);
> > > > +
> > > > +	/* The second page of the context object contains some fields which must
> > > > +	 * be set up prior to the first execution.
> > > > +	 */
> > > > +	page = i915_gem_object_get_page(ctx->obj, 1);
> > > > +	reg_state = kmap_atomic(page);
> > > > +
> > > > +	if (ring->id == RCS)
> > > > +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
> > > > +	else
> > > > +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
> > > > +	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
> > > > +	reg_state[CTX_CONTEXT_CONTROL+1] = (1<<3) | MI_RESTORE_INHIBIT;
> > > > +	reg_state[CTX_CONTEXT_CONTROL+1] |= reg_state[CTX_CONTEXT_CONTROL+1] << 16;
> > > > +	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
> > > > +	reg_state[CTX_RING_HEAD+1] = 0;
> > > > +	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
> > > > +	reg_state[CTX_RING_TAIL+1] = 0;
> > > > +	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
> > > > +	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
> > > > +	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
> > > > +	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
> > > > +	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
> > > > +	reg_state[CTX_BB_HEAD_U+1] = 0;
> > > > +	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
> > > > +	reg_state[CTX_BB_HEAD_L+1] = 0;
> > > > +	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
> > > > +	reg_state[CTX_BB_STATE+1] = (1<<5);
> > > > +	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
> > > > +	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
> > > > +	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
> > > > +	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
> > > > +	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
> > > > +	reg_state[CTX_SECOND_BB_STATE+1] = 0;
> > > > +	if (ring->id == RCS) {
> > > > +		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
> > > > +		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
> > > > +		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
> > > > +		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
> > > > +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
> > > > +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
> > > > +	}
> > > > +
> > > > +	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
> > > > +	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
> > > > +	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
> > > > +	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
> > > > +	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
> > > > +	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
> > > > +	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
> > > > +	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
> > > > +	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
> > > > +	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
> > > > +	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
> > > > +	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
> > > > +	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
> > > > +	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
> > > > +	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
> > > > +	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
> > > > +	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
> > > > +	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
> > > > +	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
> > > > +	if (ring->id == RCS) {
> > > > +		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> > > > +		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
> > > 
> > > You're writing the MMIO address for the R_PWR_CLK_STATE register to this
> > > field. Shouldn't this receive the value we want programmed to the register?
> > > 
> > 
> > Oh, nevermind. I understand now.
> > -Jeff
> > 
> To clarify my comments...I was at first confused by the need to specify the
> R_PWR_CLK_STATE register address in the logical context, thinking that only
> the desired value needed to be specified. But I see now that the programming
> model is to specify the MI_LOAD_REGISTER_IMM command, followed by the address
> at which to load, followed by the value to load.
> 
> Reflecting on my initial confusion, would it be clearer to provide names for
> each dword position in the context image, rather than using an unnamed offset
> like CTX_R_PWR_CLK_STATE+1? Example:
> 
> reg_state[CTX_R_PWR_CLK_STATE_ADDR] = 0x20c8;
> reg_state[CTX_R_PWR_CLK_STATE_DATA] = 0;

Usually when we emit batches in userspace (and the context is nothing else
really) we have some OUT_BATCH macro which writes the dword and increments
the pointer. Since MI_LOAD_REGISTER_IMM is multi-length we could add an
OUT_BATCH_REG_WRITE(reg, value) which does both dword emissions.

That should clarify a lot what's going on here. We might even completely
drop all the offset #defines and replace them with a few comments or so.
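
A minimal sketch of the idea (the names and helpers here are illustrative
only, nothing like this exists in the series yet):

#define OUT_BATCH(reg_state, pos, dword) \
	((reg_state)[(pos)++] = (dword))

#define OUT_BATCH_REG_WRITE(reg_state, pos, reg, value) do { \
	OUT_BATCH(reg_state, pos, reg); \
	OUT_BATCH(reg_state, pos, value); \
} while (0)

	u32 pos = CTX_LRI_HEADER_0;	/* 0x01: dword 0 stays untouched, as today */
	OUT_BATCH(reg_state, pos, MI_LOAD_REGISTER_IMM(14));
	OUT_BATCH_REG_WRITE(reg_state, pos, RING_CONTEXT_CONTROL(ring),
			    (1<<3) | MI_RESTORE_INHIBIT);
	...

Note that the gaps between LRI blocks in the current layout (e.g. between
0x1d and CTX_LRI_HEADER_1 at 0x21) would need explicit padding emissions
with this scheme.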
-Daniel

> 
> Jeff
> > > > +		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
> > > > +	}
> > > > +
> > > > +#if 0
> > > > +	/* Offsets not yet defined for these */
> > > > +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS] = ;
> > > > +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS+1] = 0;
> > > > +#endif
> > > > +
> > > > +	kunmap_atomic(reg_state);
> > > > +
> > > > +	ctx->obj->dirty = 1;
> > > > +	set_page_dirty(page);
> > > > +	i915_gem_object_unpin_pages(ctx->obj);
> > > > +
> > > >  	return ctx;
> > > > +
> > > > +destroy_ring_obj:
> > > > +	i915_gem_object_ggtt_unpin(ring_obj);
> > > > +	drm_gem_object_unreference(&ring_obj->base);
> > > > +	ctx->ringbuf->obj = NULL;
> > > > +	ctx->ringbuf = NULL;
> > > > +	i915_gem_object_ggtt_unpin(ctx->obj);
> > > > +	i915_gem_context_unreference(ctx);
> > > > +
> > > > +	return ERR_PTR(ret);
> > > >  }
> > > >  
> > > >  void gen8_gem_context_fini(struct drm_device *dev)
> > > > -- 
> > > > 1.9.0
> > > > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-04-15 21:08         ` Daniel Vetter
@ 2014-04-15 22:32           ` Jeff McGee
  2014-04-16  6:04             ` Daniel Vetter
  0 siblings, 1 reply; 85+ messages in thread
From: Jeff McGee @ 2014-04-15 22:32 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Ben Widawsky, Ben Widawsky

On Tue, Apr 15, 2014 at 11:08:02PM +0200, Daniel Vetter wrote:
> On Tue, Apr 15, 2014 at 03:43:23PM -0500, Jeff McGee wrote:
> > On Tue, Apr 15, 2014 at 11:10:34AM -0500, Jeff McGee wrote:
> > > On Tue, Apr 15, 2014 at 11:00:33AM -0500, Jeff McGee wrote:
> > > > On Thu, Mar 27, 2014 at 05:59:48PM +0000, oscar.mateo@intel.com wrote:
> > > > > From: Ben Widawsky <benjamin.widawsky@intel.com>
> > > > > 
> > > > > For the most part, logical ring context objects are similar to hardware
> > > > > contexts in that the backing object is meant to be opaque. There are
> > > > > some exceptions where we need to poke certain offsets of the object for
> > > > > initialization, updating the tail pointer or updating the PDPs.
> > > > > 
> > > > > For our basic execlist implementation we'll only need our PPGTT PDs
> > > > > and ringbuffer addresses in order to set up the context. With previous
> > > > > patches, we have both, so start prepping the context to be loaded.
> > > > > 
> > > > > Before running a context for the first time you must populate some
> > > > > fields in the context object. These fields begin at LRCA + 1 PAGE, i.e.
> > > > > the first page (in 0-based counting) of the context image. These same
> > > > > fields will be read and written to as contexts are saved and restored
> > > > > once the system is up and running.
> > > > > 
> > > > > Many of these fields are completely reused from previous global
> > > > > registers: ringbuffer head/tail/control, context control matches some
> > > > > previous MI_SET_CONTEXT flags, and page directories. There are other
> > > > > fields which we don't touch now but may want to use in the future.
> > > > > 
> > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > > 
> > > > > v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11)
> > > > > for other engines.
> > > > > 
> > > > > Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>
> > > > > 
> > > > > v3: Several rebases and general changes to the code.
> > > > > 
> > > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_lrc.c | 145 ++++++++++++++++++++++++++++++++++++++--
> > > > >  1 file changed, 138 insertions(+), 7 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
> > > > > index 40dfa95..f0176ff 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_lrc.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_lrc.c
> > > > > @@ -43,6 +43,38 @@
> > > > >  
> > > > >  #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
> > > > >  
> > > > > +#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
> > > > > +#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
> > > > > +
> > > > > +#define CTX_LRI_HEADER_0		0x01
> > > > > +#define CTX_CONTEXT_CONTROL		0x02
> > > > > +#define CTX_RING_HEAD			0x04
> > > > > +#define CTX_RING_TAIL			0x06
> > > > > +#define CTX_RING_BUFFER_START		0x08
> > > > > +#define CTX_RING_BUFFER_CONTROL	0x0a
> > > > > +#define CTX_BB_HEAD_U			0x0c
> > > > > +#define CTX_BB_HEAD_L			0x0e
> > > > > +#define CTX_BB_STATE			0x10
> > > > > +#define CTX_SECOND_BB_HEAD_U		0x12
> > > > > +#define CTX_SECOND_BB_HEAD_L		0x14
> > > > > +#define CTX_SECOND_BB_STATE		0x16
> > > > > +#define CTX_BB_PER_CTX_PTR		0x18
> > > > > +#define CTX_RCS_INDIRECT_CTX		0x1a
> > > > > +#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
> > > > > +#define CTX_LRI_HEADER_1		0x21
> > > > > +#define CTX_CTX_TIMESTAMP		0x22
> > > > > +#define CTX_PDP3_UDW			0x24
> > > > > +#define CTX_PDP3_LDW			0x26
> > > > > +#define CTX_PDP2_UDW			0x28
> > > > > +#define CTX_PDP2_LDW			0x2a
> > > > > +#define CTX_PDP1_UDW			0x2c
> > > > > +#define CTX_PDP1_LDW			0x2e
> > > > > +#define CTX_PDP0_UDW			0x30
> > > > > +#define CTX_PDP0_LDW			0x32
> > > > > +#define CTX_LRI_HEADER_2		0x41
> > > > > +#define CTX_R_PWR_CLK_STATE		0x42
> > > > > +#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
> > > > > +
> > > > >  struct i915_hw_context *
> > > > >  gen8_gem_create_context(struct drm_device *dev,
> > > > >  			struct intel_engine *ring,
> > > > > @@ -51,6 +83,9 @@ gen8_gem_create_context(struct drm_device *dev,
> > > > >  {
> > > > >  	struct i915_hw_context *ctx = NULL;
> > > > >  	struct drm_i915_gem_object *ring_obj = NULL;
> > > > > +	struct i915_hw_ppgtt *ppgtt = NULL;
> > > > > +	struct page *page;
> > > > > +	uint32_t *reg_state;
> > > > >  	int ret;
> > > > >  
> > > > >  	ctx = i915_gem_create_context(dev, file_priv, create_vm);
> > > > > @@ -79,18 +114,114 @@ gen8_gem_create_context(struct drm_device *dev,
> > > > >  
> > > > >  	/* Failure at this point is almost impossible */
> > > > >  	ret = i915_gem_object_set_to_gtt_domain(ring_obj, true);
> > > > > -	if (ret) {
> > > > > -		i915_gem_object_ggtt_unpin(ring_obj);
> > > > > -		drm_gem_object_unreference(&ring_obj->base);
> > > > > -		i915_gem_object_ggtt_unpin(ctx->obj);
> > > > > -		i915_gem_context_unreference(ctx);
> > > > > -		return ERR_PTR(ret);
> > > > > -	}
> > > > > +	if (ret)
> > > > > +		goto destroy_ring_obj;
> > > > >  
> > > > >  	ctx->ringbuf = &ring->default_ringbuf;
> > > > >  	ctx->ringbuf->obj = ring_obj;
> > > > >  
> > > > > +	ppgtt = ctx_to_ppgtt(ctx);
> > > > > +
> > > > > +	ret = i915_gem_object_set_to_cpu_domain(ctx->obj, true);
> > > > > +	if (ret)
> > > > > +		goto destroy_ring_obj;
> > > > > +
> > > > > +	ret = i915_gem_object_get_pages(ctx->obj);
> > > > > +	if (ret)
> > > > > +		goto destroy_ring_obj;
> > > > > +
> > > > > +	i915_gem_object_pin_pages(ctx->obj);
> > > > > +
> > > > > +	/* The second page of the context object contains some fields which must
> > > > > +	 * be set up prior to the first execution.
> > > > > +	 */
> > > > > +	page = i915_gem_object_get_page(ctx->obj, 1);
> > > > > +	reg_state = kmap_atomic(page);
> > > > > +
> > > > > +	if (ring->id == RCS)
> > > > > +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
> > > > > +	else
> > > > > +		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
> > > > > +	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
> > > > > +	reg_state[CTX_CONTEXT_CONTROL+1] = (1<<3) | MI_RESTORE_INHIBIT;
> > > > > +	reg_state[CTX_CONTEXT_CONTROL+1] |= reg_state[CTX_CONTEXT_CONTROL+1] << 16;
> > > > > +	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
> > > > > +	reg_state[CTX_RING_HEAD+1] = 0;
> > > > > +	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
> > > > > +	reg_state[CTX_RING_TAIL+1] = 0;
> > > > > +	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
> > > > > +	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
> > > > > +	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
> > > > > +	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
> > > > > +	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
> > > > > +	reg_state[CTX_BB_HEAD_U+1] = 0;
> > > > > +	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
> > > > > +	reg_state[CTX_BB_HEAD_L+1] = 0;
> > > > > +	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
> > > > > +	reg_state[CTX_BB_STATE+1] = (1<<5);
> > > > > +	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
> > > > > +	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
> > > > > +	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
> > > > > +	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
> > > > > +	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
> > > > > +	reg_state[CTX_SECOND_BB_STATE+1] = 0;
> > > > > +	if (ring->id == RCS) {
> > > > > +		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
> > > > > +		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
> > > > > +		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
> > > > > +		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
> > > > > +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
> > > > > +		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
> > > > > +	}
> > > > > +
> > > > > +	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
> > > > > +	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
> > > > > +	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
> > > > > +	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
> > > > > +	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
> > > > > +	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
> > > > > +	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
> > > > > +	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
> > > > > +	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
> > > > > +	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
> > > > > +	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
> > > > > +	reg_state[CTX_PDP3_UDW+1] = ppgtt->pd_dma_addr[3] >> 32;
> > > > > +	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
> > > > > +	reg_state[CTX_PDP2_UDW+1] = ppgtt->pd_dma_addr[2] >> 32;
> > > > > +	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
> > > > > +	reg_state[CTX_PDP1_UDW+1] = ppgtt->pd_dma_addr[1] >> 32;
> > > > > +	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
> > > > > +	reg_state[CTX_PDP0_UDW+1] = ppgtt->pd_dma_addr[0] >> 32;
> > > > > +	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
> > > > > +	if (ring->id == RCS) {
> > > > > +		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> > > > > +		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
> > > > 
> > > > You're writing the MMIO address for the R_PWR_CLK_STATE register to this
> > > > field. Shouldn't this receive the value we want programmed to the register?
> > > > 
> > > 
> > > Oh, nevermind. I understand now.
> > > -Jeff
> > > 
> > To clarify my comments...I was at first confused by the need to specify the
> > R_PWR_CLK_STATE register address in the logical context, thinking that only
> > the desired value needed to be specified. But I see now that the programming
> > model is to specify the MI_LOAD_REGISTER_IMM command, followed by the address
> > at which to load, followed by the value to load.
> > 
> > Reflecting on my initial confusion, would it be clearer to provide names for
> > each dword position in the context image, rather than using an unnamed offset
> > like CTX_R_PWR_CLK_STATE+1? Example:
> > 
> > reg_state[CTX_R_PWR_CLK_STATE_ADDR] = 0x20c8;
> > reg_state[CTX_R_PWR_CLK_STATE_DATA] = 0;
> 
> Usually when we emit batches in userspace (and the context is nothing else
> really) we have some OUT_BATCH macro which writes the dword and increments
> the pointer. Since MI_LOAD_REGISTER_IMM is multi-length we could add an
> OUT_BATCH_REG_WRITE(reg, value) which does both dword emissions.
> 
> That should clarify a lot what's going on here. We might even completely
> drop all the offset #defines and replace them with a few comments or so.
> -Daniel
> 
OK, now I get it. My mistake was in thinking the context image is just pure
state that hardware already knows how to restore. But as you say it is more
like a batch which includes the state *and* the MI_LOAD_REGISTER_IMM commands
required to restore. So in that sense I understand that the approach here to
initialize the context is much like constructing a batch. But later when we
want to update the value of a context field we have (in a later patch of this
series): 

reg_state[CTX_RING_TAIL+1] = value;

This is a bit obscure when occurring by itself and not in the flow of
initializing the context (batch). The same will be true when we add management
of the CTX_R_PWR_CLK_STATE value dword.
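
A small inline helper would make those one-off updates read better, e.g.
(hypothetical name, not part of the series):

	static inline void lrc_set_ring_tail(u32 *reg_state, u32 tail)
	{
		/* the dword after CTX_RING_TAIL holds the value the LRI loads */
		reg_state[CTX_RING_TAIL+1] = tail;
	}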
-Jeff

> > 
> > Jeff
> > > > > +		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
> > > > > +	}
> > > > > +
> > > > > +#if 0
> > > > > +	/* Offsets not yet defined for these */
> > > > > +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS] = ;
> > > > > +	reg_state[CTX_GPGPU_CSR_BASE_ADDRESS+1] = 0;
> > > > > +#endif
> > > > > +
> > > > > +	kunmap_atomic(reg_state);
> > > > > +
> > > > > +	ctx->obj->dirty = 1;
> > > > > +	set_page_dirty(page);
> > > > > +	i915_gem_object_unpin_pages(ctx->obj);
> > > > > +
> > > > >  	return ctx;
> > > > > +
> > > > > +destroy_ring_obj:
> > > > > +	i915_gem_object_ggtt_unpin(ring_obj);
> > > > > +	drm_gem_object_unreference(&ring_obj->base);
> > > > > +	ctx->ringbuf->obj = NULL;
> > > > > +	ctx->ringbuf = NULL;
> > > > > +	i915_gem_object_ggtt_unpin(ctx->obj);
> > > > > +	i915_gem_context_unreference(ctx);
> > > > > +
> > > > > +	return ERR_PTR(ret);
> > > > >  }
> > > > >  
> > > > >  void gen8_gem_context_fini(struct drm_device *dev)
> > > > > -- 
> > > > > 1.9.0
> > > > > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat)
  2014-04-15 22:32           ` Jeff McGee
@ 2014-04-16  6:04             ` Daniel Vetter
  0 siblings, 0 replies; 85+ messages in thread
From: Daniel Vetter @ 2014-04-16  6:04 UTC (permalink / raw)
  To: Daniel Vetter, Mateo Lozano, Oscar, intel-gfx, Ben Widawsky,
	Ben Widawsky

On Wed, Apr 16, 2014 at 12:32 AM, Jeff McGee <jeff.mcgee@intel.com> wrote:
>> > Reflecting on my initial confusion, would it be clearer to provide names for
>> > each dword position in the context image, rather than using an unnamed offset
>> > like CTX_R_PWR_CLK_STATE+1? Example:
>> >
>> > reg_state[CTX_R_PWR_CLK_STATE_ADDR] = 0x20c8;
>> > reg_state[CTX_R_PWR_CLK_STATE_DATA] = 0;
>>
>> Usually when we emit batches in userspace (and the context is nothing else
>> really) we have some OUT_BATCH macro which writes the dword and increments
>> the pointer. Since MI_LOAD_REGISTER_IMM is multi-length we could add an
>> OUT_BATCH_REG_WRITE(reg, value) which does both dword emissions.
>>
>> That should clarify a lot what's going on here. We might even completely
>> drop all the offset #defines and replace them with a few comments or so.
>> -Daniel
>>
> OK, now I get it. My mistake was in thinking the context image is just pure
> state that hardware already knows how to restore. But as you say it is more
> like a batch which includes the state *and* the MI_LOAD_REGISTER_IMM commands
> required to restore. So in that sense I understand that the approach here to
> initialize the context is much like constructing a batch. But later when we
> want to update the value of a context field we have (in a later patch of this
> series):
>
> reg_state[CTX_RING_TAIL+1] = value;
>
> This is a bit obscure when occurring by itself and not in the flow of
> initializing the context (batch). The same will be true when we add management
> of the CTX_R_PWR_CLK_STATE value dword.

The approach thus far for this stuff has been to save the hidden
offset value when you get around to writing this dword somewhere, e.g.
for relocation processing or similar. Or maybe add an assert that the
constant we #defined matches the running offset when we expect it to.
Dunno really whether going with an OUT_BATCH approach really makes
sense, was just an idea.
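
For the assert flavour, something along these lines could work (again
purely illustrative, building on the hypothetical OUT_BATCH helpers above):

	u32 pos = CTX_LRI_HEADER_0;
	...
	/* the running offset must agree with the #define before we emit */
	WARN_ON(pos != CTX_RING_TAIL);
	OUT_BATCH_REG_WRITE(reg_state, pos, RING_TAIL(ring->mmio_base), 0);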
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 43/49] drm/i915/bdw: Handle context switch events
  2014-03-27 18:00 ` [PATCH 43/49] drm/i915/bdw: Handle context switch events oscar.mateo
  2014-04-03 14:24   ` Damien Lespiau
@ 2014-04-26  0:53   ` Robert Beckett
  2014-04-28 14:43     ` Mateo Lozano, Oscar
  1 sibling, 1 reply; 85+ messages in thread
From: Robert Beckett @ 2014-04-26  0:53 UTC (permalink / raw)
  To: oscar.mateo, intel-gfx; +Cc: Thomas Daniel

On 27/03/2014 18:00, oscar.mateo@intel.com wrote:
> From: Thomas Daniel <thomas.daniel@intel.com>
>
> Handle all context status events in the context status buffer on every
> context switch interrupt. We only remove work from the execlist queue
> after a context status buffer reports that it has completed and we only
> attempt to schedule new contexts on interrupt when a previously submitted
> context completes (unless no contexts are queued, which means the GPU is
> free).
>
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
>
> v2: Unreferencing the context when we are freeing the request might free
> the backing bo, which requires the struct_mutex to be grabbed, so defer
> unreferencing and freeing to a bottom half.
>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         |   3 +
>   drivers/gpu/drm/i915/i915_irq.c         |  28 ++++++---
>   drivers/gpu/drm/i915/i915_lrc.c         | 101 +++++++++++++++++++++++++++++++-
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   1 +
>   drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
>   5 files changed, 123 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2607664..4c8cf52 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1679,6 +1679,8 @@ struct drm_i915_gem_request {
>
>   	/** execlist queue entry for this request */
>   	struct list_head execlist_link;
> +	/** Struct to handle this request in the bottom half of an interrupt */
> +	struct work_struct work;
>   };
>
>   struct drm_i915_file_private {
> @@ -2344,6 +2346,7 @@ void gen8_gem_context_free(struct i915_hw_context *ctx);
>   int gen8_switch_context_queue(struct intel_engine *ring,
>   			      struct i915_hw_context *to,
>   			      u32 tail);
> +void gen8_handle_context_events(struct intel_engine *ring);
>
>   /* i915_gem_evict.c */
>   int __must_check i915_gem_evict_something(struct drm_device *dev,
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 56657b5..6e0f456 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1334,6 +1334,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
>   				       struct drm_i915_private *dev_priv,
>   				       u32 master_ctl)
>   {
> +	struct intel_engine *ring;
>   	u32 rcs, bcs, vcs, vecs;
>   	uint32_t tmp = 0;
>   	irqreturn_t ret = IRQ_NONE;
> @@ -1342,14 +1343,21 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
>   		tmp = I915_READ(GEN8_GT_IIR(0));
>   		if (tmp) {
>   			ret = IRQ_HANDLED;
> +
>   			rcs = tmp >> GEN8_RCS_IRQ_SHIFT;
> -			bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
> +			ring = &dev_priv->ring[RCS];
>   			if (rcs & GT_RENDER_USER_INTERRUPT)
> -				notify_ring(dev, &dev_priv->ring[RCS]);
> +				notify_ring(dev, ring);
> +			if (rcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> +				gen8_handle_context_events(ring);

Handling the context events here can generate a new execlist submission,
which, for a small enough workload, can finish and generate a new context
event interrupt before we ack this interrupt.

When we ack this interrupt, we clear the new one too, losing an interrupt.

Moving the

I915_WRITE(GEN8_GT_IIR(0), tmp);

to just inside the if (tmp) { conditional (or anywhere before this call) 
fixes this issue. There is no harm in acking the interrupt immediately 
as we have the read stored in tmp.
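
In other words, sketching the fix against the hunk above:

	tmp = I915_READ(GEN8_GT_IIR(0));
	if (tmp) {
		/* Ack first: a context event raised from here on latches a
		 * fresh IIR bit instead of being cleared by this write. */
		I915_WRITE(GEN8_GT_IIR(0), tmp);
		ret = IRQ_HANDLED;

		rcs = tmp >> GEN8_RCS_IRQ_SHIFT;
		...
	}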


> +
> +			bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
> +			ring = &dev_priv->ring[BCS];
>   			if (bcs & GT_RENDER_USER_INTERRUPT)
> -				notify_ring(dev, &dev_priv->ring[BCS]);
> -			if ((rcs | bcs) & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> -			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
> +				notify_ring(dev, ring);
> +			if (bcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> +				gen8_handle_context_events(ring);
> +
>   			I915_WRITE(GEN8_GT_IIR(0), tmp);
>   		} else
>   			DRM_ERROR("The master control interrupt lied (GT0)!\n");
> @@ -1360,10 +1368,11 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
>   		if (tmp) {
>   			ret = IRQ_HANDLED;
>   			vcs = tmp >> GEN8_VCS1_IRQ_SHIFT;
> +			ring = &dev_priv->ring[VCS];
>   			if (vcs & GT_RENDER_USER_INTERRUPT)
> -				notify_ring(dev, &dev_priv->ring[VCS]);
> +				notify_ring(dev, ring);
>   			if (vcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> -			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
> +				gen8_handle_context_events(ring);
>   			I915_WRITE(GEN8_GT_IIR(1), tmp);
>   		} else
>   			DRM_ERROR("The master control interrupt lied (GT1)!\n");
> @@ -1374,10 +1383,11 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
>   		if (tmp) {
>   			ret = IRQ_HANDLED;
>   			vecs = tmp >> GEN8_VECS_IRQ_SHIFT;
> +			ring = &dev_priv->ring[VECS];
>   			if (vecs & GT_RENDER_USER_INTERRUPT)
> -				notify_ring(dev, &dev_priv->ring[VECS]);
> +				notify_ring(dev, ring);
>   			if (vecs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> -			     DRM_DEBUG_DRIVER("TODO: Context switch\n");
> +				gen8_handle_context_events(ring);
>   			I915_WRITE(GEN8_GT_IIR(3), tmp);
>   		} else
>   			DRM_ERROR("The master control interrupt lied (GT3)!\n");
> diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c
> index 4cacabb..440da11 100644
> --- a/drivers/gpu/drm/i915/i915_lrc.c
> +++ b/drivers/gpu/drm/i915/i915_lrc.c
> @@ -46,7 +46,24 @@
>   #define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE)
>
>   #define RING_ELSP(ring)			((ring)->mmio_base+0x230)
> +#define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
>   #define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
> +#define RING_CONTEXT_STATUS_BUF(ring)	((ring)->mmio_base+0x370)
> +#define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
> +
> +#define RING_EXECLIST_QFULL		(1 << 0x2)
> +#define RING_EXECLIST1_VALID		(1 << 0x3)
> +#define RING_EXECLIST0_VALID		(1 << 0x4)
> +#define RING_EXECLIST_ACTIVE_STATUS	(3 << 0xE)
> +#define RING_EXECLIST1_ACTIVE		(1 << 0x11)
> +#define RING_EXECLIST0_ACTIVE		(1 << 0x12)
> +
> +#define GEN8_CTX_STATUS_IDLE_ACTIVE	(1 << 0)
> +#define GEN8_CTX_STATUS_PREEMPTED	(1 << 1)
> +#define GEN8_CTX_STATUS_ELEMENT_SWITCH	(1 << 2)
> +#define GEN8_CTX_STATUS_ACTIVE_IDLE	(1 << 3)
> +#define GEN8_CTX_STATUS_COMPLETE	(1 << 4)
> +#define GEN8_CTX_STATUS_LITE_RESTORE	(1 << 15)
>
>   #define CTX_LRI_HEADER_0		0x01
>   #define CTX_CONTEXT_CONTROL		0x02
> @@ -237,6 +254,9 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
>   {
>   	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
>   	struct drm_i915_gem_request *cursor = NULL, *tmp = NULL;
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +
> +	assert_spin_locked(&ring->execlist_lock);
>
>   	if (list_empty(&ring->execlist_queue))
>   		return;
> @@ -249,8 +269,7 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
>   			/* Same ID: ignore first request, as second request
>   			 * will update tail past first request's workload */
>   			list_del(&req0->execlist_link);
> -			i915_gem_context_unreference(req0->ctx);
> -			kfree(req0);
> +			queue_work(dev_priv->wq, &req0->work);
>   			req0 = cursor;
>   		} else {
>   			req1 = cursor;
> @@ -262,6 +281,83 @@ static void gen8_switch_context_unqueue(struct intel_engine *ring)
>   			req1? req1->ctx : NULL, req1? req1->tail : 0));
>   }
>
> +static bool check_remove_request(struct intel_engine *ring, u32 request_id)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	struct drm_i915_gem_request *head_req;
> +
> +	assert_spin_locked(&ring->execlist_lock);
> +
> +	head_req = list_first_entry_or_null(&ring->execlist_queue,
> +			struct drm_i915_gem_request, execlist_link);
> +	if (head_req != NULL) {
> +		if (get_submission_id(head_req->ctx) == request_id) {
> +			list_del(&head_req->execlist_link);
> +			queue_work(dev_priv->wq, &head_req->work);
> +			return true;
> +		}
> +	}
> +
> +	return false;
> +}
> +
> +void gen8_handle_context_events(struct intel_engine *ring)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	u32 status_pointer;
> +	u8 read_pointer;
> +	u8 write_pointer;
> +	u32 status;
> +	u32 status_id;
> +	u32 submit_contexts = 0;
> +
> +	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
> +
> +	read_pointer = ring->next_context_status_buffer;
> +	write_pointer = status_pointer & 0x07;
> +	if (read_pointer > write_pointer)
> +		write_pointer += 6;
> +
> +	spin_lock(&ring->execlist_lock);
> +
> +	while (read_pointer < write_pointer) {
> +		read_pointer++;
> +		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> +				(read_pointer % 6) * 8);
> +		status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> +				(read_pointer % 6) * 8 + 4);
> +
> +		if (status & GEN8_CTX_STATUS_ELEMENT_SWITCH) {
> +			if (check_remove_request(ring, status_id))
> +				submit_contexts++;
> +		} else if (status & GEN8_CTX_STATUS_COMPLETE) {
> +			if (check_remove_request(ring, status_id))
> +				submit_contexts++;
> +		}
> +	}
> +
> +	if (submit_contexts != 0)
> +		gen8_switch_context_unqueue(ring);
> +
> +	spin_unlock(&ring->execlist_lock);
> +
> +	WARN(submit_contexts > 2, "More than two context complete events?\n");
> +	ring->next_context_status_buffer = write_pointer % 6;
> +}
> +
> +static void free_request_task(struct work_struct *work)
> +{
> +	struct drm_i915_gem_request *req =
> +			container_of(work, struct drm_i915_gem_request, work);
> +	struct drm_device *dev = req->ring->dev;
> +
> +	mutex_lock(&dev->struct_mutex);
> +	i915_gem_context_unreference(req->ctx);
> +	mutex_unlock(&dev->struct_mutex);
> +
> +	kfree(req);
> +}
> +
>   int gen8_switch_context_queue(struct intel_engine *ring,
>   			      struct i915_hw_context *to,
>   			      u32 tail)
> @@ -276,6 +372,7 @@ int gen8_switch_context_queue(struct intel_engine *ring,
>   	req->ctx = to;
>   	i915_gem_context_reference(req->ctx);
>   	req->tail = tail;
> +	INIT_WORK(&req->work, free_request_task);
>
>   	spin_lock_irqsave(&ring->execlist_lock, flags);
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index a92bede..ee5a220 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1464,6 +1464,7 @@ static int intel_init_ring(struct drm_device *dev,
>   		if (ring->status_page.page_addr == NULL)
>   			return -ENOMEM;
>   		ring->status_page.obj = obj;
> +		ring->next_context_status_buffer = 0;
>   	} else if (I915_NEED_GFX_HWS(dev)) {
>   		ret = init_status_page(ring);
>   		if (ret)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 5f4fd3c..daca04e 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -173,6 +173,7 @@ struct intel_engine {
>
>   	struct i915_hw_context *default_context;
>   	struct i915_hw_context *last_context;
> +	u8 next_context_status_buffer;
>
>   	struct intel_ring_hangcheck hangcheck;
>
>

* Re: [PATCH 43/49] drm/i915/bdw: Handle context switch events
  2014-04-26  0:53   ` Robert Beckett
@ 2014-04-28 14:43     ` Mateo Lozano, Oscar
  0 siblings, 0 replies; 85+ messages in thread
From: Mateo Lozano, Oscar @ 2014-04-28 14:43 UTC (permalink / raw)
  To: Beckett, Robert, intel-gfx; +Cc: Daniel, Thomas

> > >   		tmp = I915_READ(GEN8_GT_IIR(0));
> > >   		if (tmp) {
> > >   			ret = IRQ_HANDLED;
> > > +
> > >   			rcs = tmp >> GEN8_RCS_IRQ_SHIFT;
> > > -			bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
> > > +			ring = &dev_priv->ring[RCS];
> > >   			if (rcs & GT_RENDER_USER_INTERRUPT)
> > > -				notify_ring(dev, &dev_priv->ring[RCS]);
> > > +				notify_ring(dev, ring);
> > > +			if (rcs & GEN8_GT_CONTEXT_SWITCH_INTERRUPT)
> > > +				gen8_handle_context_events(ring);
> > 
> > Handling the context events here can generate a new execlist submission,
> > which, for a small enough workload, can finish and generate a new context event
> > interrupt before we ack this interrupt.
> > 
> > When we ack this interrupt, we clear the new one too, losing an interrupt.
> > 
> > Moving the
> > 
> > I915_WRITE(GEN8_GT_IIR(0), tmp);
> > 
> > to just inside the if (tmp) { conditional (or anywhere before this call) fixes this
> > issue. There is no harm in acking the interrupt immediately as we have the
> > read stored in tmp.
> > 
> -----Original Message-----
> From: Daniel, Thomas
> Sent: Monday, April 28, 2014 10:58 AM
> To: Beckett, Robert; Mateo Lozano, Oscar; Barbalho, Rafael; Ewins, Jon
> Subject: RE: Re: [Intel-gfx] [PATCH 43/49] drm/i915/bdw: Handle context switch
> events
> 
> Hi Bob,
> 
> Looks like a good catch, and a sensible fix.
> 
> Thomas.

I agree with Thomas. Will add to the next revision of the series.

Thanks!
Oscar

Thread overview: 85+ messages
2014-03-27 17:59 [PATCH 00/49] Execlists oscar.mateo
2014-03-27 17:59 ` [PATCH 01/49] drm/i915/bdw: Macro to distinguish LRCs (Logical Ring Contexts) oscar.mateo
2014-03-27 17:59 ` [PATCH 02/49] drm/i915: s/for_each_ring/for_each_active_ring oscar.mateo
2014-03-27 17:59 ` [PATCH 03/49] drm/i915: for_each_ring oscar.mateo
2014-03-27 17:59 ` [PATCH 04/49] drm/i915: Simplify a couple of functions thanks to for_each_ring oscar.mateo
2014-03-27 17:59 ` [PATCH 05/49] drm/i915: Extract trivial parts of ring init (early init) oscar.mateo
2014-03-27 17:59 ` [PATCH 06/49] drm/i915/bdw: New file for logical ring contexts and execlists oscar.mateo
2014-03-27 17:59 ` [PATCH 07/49] drm/i915/bdw: Rework init code for gen8 contexts oscar.mateo
2014-03-27 17:59 ` [PATCH 08/49] drm/i915: Make i915_gem_create_context outside accessible oscar.mateo
2014-03-27 17:59 ` [PATCH 09/49] drm/i915: Extract ringbuffer obj alloc & destroy oscar.mateo
2014-03-27 17:59 ` [PATCH 10/49] drm/i915: s/intel_ring_buffer/intel_engine oscar.mateo
2014-03-27 17:59 ` [PATCH 11/49] drm/i915: Split the ringbuffers and the rings oscar.mateo
2014-03-27 17:59 ` [PATCH 12/49] drm/i915: Rename functions that mention ringbuffers (meaning rings) oscar.mateo
2014-03-27 17:59 ` [PATCH 13/49] drm/i915/bdw: Execlists ring tail writing oscar.mateo
2014-03-27 17:13   ` Mateo Lozano, Oscar
2014-03-27 17:59 ` [PATCH 14/49] drm/i915/bdw: LR context ring init oscar.mateo
2014-03-27 17:59 ` [PATCH 15/49] drm/i915/bdw: GEN8 semaphoreless ring add request oscar.mateo
2014-03-27 17:59 ` [PATCH 16/49] drm/i915/bdw: GEN8 new ring flush oscar.mateo
2014-03-27 17:59 ` [PATCH 17/49] drm/i915/bdw: A bit more advanced context init/fini oscar.mateo
2014-04-01  0:38   ` Damien Lespiau
2014-04-01 13:47     ` Mateo Lozano, Oscar
2014-04-01 13:51       ` Damien Lespiau
2014-04-01 19:18         ` Ben Widawsky
2014-04-01 21:05           ` Damien Lespiau
2014-04-02  4:07             ` Ben Widawsky
2014-03-27 17:59 ` [PATCH 18/49] drm/i915/bdw: Allocate ringbuffer for LR contexts oscar.mateo
2014-03-27 17:59 ` [PATCH 19/49] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
2014-04-01  0:00   ` Damien Lespiau
2014-04-01 13:33     ` Mateo Lozano, Oscar
2014-04-15 16:00   ` Jeff McGee
2014-04-15 16:10     ` Jeff McGee
2014-04-15 19:51       ` Daniel Vetter
2014-04-15 20:43       ` Jeff McGee
2014-04-15 21:08         ` Daniel Vetter
2014-04-15 22:32           ` Jeff McGee
2014-04-16  6:04             ` Daniel Vetter
2014-03-27 17:59 ` [PATCH 20/49] drm/i915/bdw: Status page for LR contexts oscar.mateo
2014-03-27 17:59 ` [PATCH 21/49] drm/i915/bdw: Enable execlists in the hardware oscar.mateo
2014-03-27 17:59 ` [PATCH 22/49] drm/i915/bdw: Plumbing for user LR context switching oscar.mateo
2014-03-27 17:59 ` [PATCH 23/49] drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit oscar.mateo
2014-03-27 17:59 ` [PATCH 24/49] drm/i915/bdw: Write a new set of context-aware ringbuffer management functions oscar.mateo
2014-03-27 17:59 ` [PATCH 25/49] drm/i915: Final touches to LR contexts plumbing and refactoring oscar.mateo
2014-03-27 17:59 ` [PATCH 26/49] drm/i915/bdw: Set the request context information correctly in the LRC case oscar.mateo
2014-03-27 17:59 ` [PATCH 27/49] drm/i915/bdw: Prepare for user-created LR contexts oscar.mateo
2014-03-27 17:59 ` [PATCH 28/49] drm/i915/bdw: Start creating & destroying user " oscar.mateo
2014-03-27 17:59 ` [PATCH 29/49] drm/i915/bdw: Pin context pages at context create time oscar.mateo
2014-03-27 17:59 ` [PATCH 30/49] drm/i915/bdw: Extract LR context object populating oscar.mateo
2014-03-27 18:00 ` [PATCH 31/49] drm/i915/bdw: Introduce dependent contexts oscar.mateo
2014-03-27 17:21   ` Mateo Lozano, Oscar
2014-04-09 16:54     ` Mateo Lozano, Oscar
2014-03-27 18:00 ` [PATCH 32/49] drm/i915/bdw: Create stand-alone and " oscar.mateo
2014-03-27 18:00 ` [PATCH 33/49] drm/i915/bdw: Allow non-default, non-render user LR contexts oscar.mateo
2014-03-27 18:00 ` [PATCH 34/49] drm/i915/bdw: Fix reset stats ioctl with " oscar.mateo
2014-03-27 18:00 ` [PATCH 35/49] drm/i915: Allocate an integer ID for each new file descriptor oscar.mateo
2014-03-27 18:00 ` [PATCH 36/49] drm/i915/bdw: Prepare for a 20-bits globally unique submission ID oscar.mateo
2014-03-27 18:00 ` [PATCH 37/49] drm/i915/bdw: Implement context switching (somewhat) oscar.mateo
2014-03-27 18:00 ` [PATCH 38/49] drm/i915/bdw: Add forcewake lock around ELSP writes oscar.mateo
2014-03-27 18:00 ` [PATCH 39/49] drm/i915/bdw: Swap the PPGTT PDPs, LRC style oscar.mateo
2014-03-31 16:42   ` Damien Lespiau
2014-04-01 13:42     ` Mateo Lozano, Oscar
2014-04-02 13:47   ` Damien Lespiau
2014-04-09  7:56     ` Mateo Lozano, Oscar
2014-03-27 18:00 ` [PATCH 40/49] drm/i915/bdw: Write the tail pointer, " oscar.mateo
2014-03-27 18:00 ` [PATCH 41/49] drm/i915/bdw: LR context switch interrupts oscar.mateo
2014-04-02 11:42   ` Damien Lespiau
2014-04-02 11:49     ` Daniel Vetter
2014-04-02 12:56       ` Damien Lespiau
2014-03-27 18:00 ` [PATCH 42/49] drm/i915/bdw: Get prepared for a two-stage execlist submit process oscar.mateo
2014-04-04 11:12   ` Damien Lespiau
2014-04-04 13:24     ` Damien Lespiau
2014-04-09  7:57       ` Mateo Lozano, Oscar
2014-03-27 18:00 ` [PATCH 43/49] drm/i915/bdw: Handle context switch events oscar.mateo
2014-04-03 14:24   ` Damien Lespiau
2014-04-09  8:15     ` Mateo Lozano, Oscar
2014-04-26  0:53   ` Robert Beckett
2014-04-28 14:43     ` Mateo Lozano, Oscar
2014-03-27 18:00 ` [PATCH 44/49] drm/i915/bdw: Display execlists info in debugfs oscar.mateo
2014-04-07 19:19   ` Damien Lespiau
2014-03-27 18:00 ` [PATCH 45/49] drm/i915/bdw: Display context ringbuffer " oscar.mateo
2014-03-27 18:00 ` [PATCH 46/49] drm/i915/bdw: Start queueing contexts to be submitted oscar.mateo
2014-03-27 18:00 ` [PATCH 47/49] drm/i915/bdw: Always write seqno to default context oscar.mateo
2014-03-27 18:00 ` [PATCH 48/49] drm/i915/bdw: Enable logical ring contexts oscar.mateo
2014-03-27 18:00 ` [PATCH 49/49] drm/i915/bdw: Document execlists and " oscar.mateo
2014-04-07 18:12 ` [PATCH 00/49] Execlists Damien Lespiau
2014-04-07 21:32   ` Daniel Vetter
