* [PATCH 00/55] Remove the outstanding_lazy_request
@ 2015-05-29 16:43 John.C.Harrison
  2015-05-29 16:43 ` [PATCH 01/55] drm/i915: Re-instate request->uniq because it is extremely useful John.C.Harrison
                   ` (56 more replies)
  0 siblings, 57 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The driver tracks GPU work using request structures. Unfortunately, this
tracking is not currently explicit but is done by means of a catch-all request
that floats around in the background hoovering up work until it gets submitted.
This background request (ring->outstanding_lazy_request or OLR) is created at
the point of actually writing to the ring rather than when a particular piece of
GPU work is started. This scheme sort of hangs together but causes a number of
issues. It can mean that multiple pieces of independent work are lumped together
in the same request or that work is not officially submitted until much later
than it was created.

This patch series completely removes the OLR and explicitly tracks each piece of
work with its own personal request structure from start to submission.
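
As a rough sketch of how the flow changes (illustrative call shapes only,
not actual driver code):

    /* Before: work is gathered up by the ring's catch-all OLR. */
    intel_ring_begin(ring, n);       /* creates ring->outstanding_lazy_request
                                      * on demand */
    /* ... emit commands to the ring ... */
    i915_add_request(ring);          /* submits whatever the OLR collected */

    /* After: each piece of work owns its request from the start. */
    ret = i915_gem_request_alloc(ring, ctx); /* explicit request up front */
    /* ... emit commands against that request ... */
    __i915_add_request(ring, file, obj);     /* submits exactly that request */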

The patch set seems to fix the "'gem_ringfill --r render' + ctrl-c straight
after boot" issue logged as BZ:88865. I haven't done any analysis of that
particular issue but the descriptions I've seen appear to blame an inconsistent
or mangled OLR.

Note also that by the end of this series, a number of differences between the
legacy and execlist code paths have been removed. For example, add_request() and
emit_request() now have the same signature and thus could be merged back into a
single function pointer. Merging some of these together would also allow the
removal of a bunch of 'if (execlists)' tests where the only difference is
whether the legacy function or the execlist one gets called.
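
As a hypothetical illustration (not a patch in this series), call sites
that by the end of the series look like:

    if (i915.enable_execlists)
        ret = ring->emit_request(request);
    else
        ret = ring->add_request(request);

could then collapse to a single function pointer call:

    ret = ring->add_request(request);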

v2: Rebased to newer nightly tree, fixed up a few minor issues, added two extra
patches - one to move the LRC ring begin around in the vein of other recent
reshuffles, the other to clean up some issues with i915_add_request().

v3: Large re-work due to feedback from code review. Some patches have been
removed, extra ones have been added and others have been changed significantly.
It is recommended that all patches be reviewed from scratch rather than
assuming only certain ones have changed and need re-inspecting. The exceptions
are where the 'Reviewed-by' tag has been kept because that patch was not
significantly affected.

v4: Further updates due to review feedback and rebasing on top of significant
changes to the underlying tree.

[Patches against drm-intel-nightly tree fetched 22/05/2015]

John Harrison (55):
  drm/i915: Re-instate request->uniq because it is extremely useful
  drm/i915: Reserve ring buffer space for i915_add_request() commands
  drm/i915: i915_add_request must not fail
  drm/i915: Early alloc request in execbuff
  drm/i915: Set context in request from creation even in legacy mode
  drm/i915: Merged the many do_execbuf() parameters into a structure
  drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters
  drm/i915: Update alloc_request to return the allocated request
  drm/i915: Add request to execbuf params and add explicit cleanup
  drm/i915: Update the dispatch tracepoint to use params->request
  drm/i915: Update move_to_gpu() to take a request structure
  drm/i915: Update execbuffer_move_to_active() to take a request structure
  drm/i915: Add flag to i915_add_request() to skip the cache flush
  drm/i915: Update i915_gpu_idle() to manage its own request
  drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring
  drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable()
  drm/i915: Don't tag kernel batches as user batches
  drm/i915: Add explicit request management to i915_gem_init_hw()
  drm/i915: Update ppgtt_init_ring() & context_enable() to take requests
  drm/i915: Update i915_switch_context() to take a request structure
  drm/i915: Update do_switch() to take a request structure
  drm/i915: Update deferred context creation to do explicit request management
  drm/i915: Update init_context() to take a request structure
  drm/i915: Update render_state_init() to take a request structure
  drm/i915: Update i915_gem_object_sync() to take a request structure
  drm/i915: Update overlay code to do explicit request management
  drm/i915: Update queue_flip() to take a request structure
  drm/i915: Update add_request() to take a request structure
  drm/i915: Update [vma|object]_move_to_active() to take request structures
  drm/i915: Update l3_remap to take a request structure
  drm/i915: Update mi_set_context() to take a request structure
  drm/i915: Update a bunch of execbuffer helpers to take request structures
  drm/i915: Update workarounds_emit() to take request structures
  drm/i915: Update flush_all_caches() to take request structures
  drm/i915: Update switch_mm() to take a request structure
  drm/i915: Update ring->flush() to take a requests structure
  drm/i915: Update some flush helpers to take request structures
  drm/i915: Update ring->emit_flush() to take a request structure
  drm/i915: Update ring->add_request() to take a request structure
  drm/i915: Update ring->emit_request() to take a request structure
  drm/i915: Update ring->dispatch_execbuffer() to take a request structure
  drm/i915: Update ring->emit_bb_start() to take a request structure
  drm/i915: Update ring->sync_to() to take a request structure
  drm/i915: Update ring->signal() to take a request structure
  drm/i915: Update cacheline_align() to take a request structure
  drm/i915: Update intel_ring_begin() to take a request structure
  drm/i915: Update intel_logical_ring_begin() to take a request structure
  drm/i915: Add *_ring_begin() to request allocation
  drm/i915: Remove the now obsolete intel_ring_get_request()
  drm/i915: Remove the now obsolete 'outstanding_lazy_request'
  drm/i915: Move the request/file and request/pid association to creation time
  drm/i915: Remove 'faked' request from LRC submission
  drm/i915: Update a bunch of LRC functions to take requests
  drm/i915: Remove the now obsolete 'i915_gem_check_olr()'
  drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()

 drivers/gpu/drm/i915/i915_drv.h              |   77 +++---
 drivers/gpu/drm/i915/i915_gem.c              |  368 ++++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_context.c      |   78 +++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  128 +++++----
 drivers/gpu/drm/i915/i915_gem_gtt.c          |   59 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.h          |    3 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c |   15 +-
 drivers/gpu/drm/i915/i915_gem_render_state.h |    2 +-
 drivers/gpu/drm/i915/i915_trace.h            |   41 +--
 drivers/gpu/drm/i915/intel_display.c         |   60 +++--
 drivers/gpu/drm/i915/intel_drv.h             |    3 +-
 drivers/gpu/drm/i915/intel_fbdev.c           |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c             |  265 +++++++++----------
 drivers/gpu/drm/i915/intel_lrc.h             |   16 +-
 drivers/gpu/drm/i915/intel_overlay.c         |   63 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.c      |  303 +++++++++++++--------
 drivers/gpu/drm/i915/intel_ringbuffer.h      |   53 ++--
 17 files changed, 876 insertions(+), 660 deletions(-)

-- 
1.7.9.5

* [PATCH 01/55] drm/i915: Re-instate request->uniq because it is extremely useful
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-06-03 11:14   ` Tomas Elf
  2015-05-29 16:43 ` [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands John.C.Harrison
                   ` (55 subsequent siblings)
  56 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The seqno value cannot always be used when debugging issues via trace
points. This is because it can be reset back to the start, especially
during TDR type tests. Also, once the scheduler arrives, the seqno will
only be valid while a given request is actually executing on the
hardware. While the request is simply queued waiting for submission, its
seqno value will be zero (meaning invalid).

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |    4 ++++
 drivers/gpu/drm/i915/i915_gem.c   |    1 +
 drivers/gpu/drm/i915/i915_trace.h |   13 +++++++++----
 drivers/gpu/drm/i915/intel_lrc.c  |    2 ++
 4 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1038f5c..0347eb9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1882,6 +1882,8 @@ struct drm_i915_private {
 
 	bool edp_low_vswing;
 
+	uint32_t request_uniq;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
@@ -2160,6 +2162,8 @@ struct drm_i915_gem_request {
 	/** process identifier submitting this request */
 	struct pid *pid;
 
+	uint32_t uniq;
+
 	/**
 	 * The ELSP only accepts two elements at a time, so we queue
 	 * context/tail pairs on a given queue (ring->execlist_queue) until the
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cc206f1..68f1d1e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2657,6 +2657,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 		goto err;
 
 	req->ring = ring;
+	req->uniq = dev_priv->request_uniq++;
 
 	if (i915.enable_execlists)
 		ret = intel_logical_ring_alloc_request_extras(req, ctx);
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 497cba5..6cbc280 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -504,6 +504,7 @@ DECLARE_EVENT_CLASS(i915_gem_request,
 	    TP_STRUCT__entry(
 			     __field(u32, dev)
 			     __field(u32, ring)
+			     __field(u32, uniq)
 			     __field(u32, seqno)
 			     ),
 
@@ -512,11 +513,13 @@ DECLARE_EVENT_CLASS(i915_gem_request,
 						i915_gem_request_get_ring(req);
 			   __entry->dev = ring->dev->primary->index;
 			   __entry->ring = ring->id;
+			   __entry->uniq = req ? req->uniq : 0;
 			   __entry->seqno = i915_gem_request_get_seqno(req);
 			   ),
 
-	    TP_printk("dev=%u, ring=%u, seqno=%u",
-		      __entry->dev, __entry->ring, __entry->seqno)
+	    TP_printk("dev=%u, ring=%u, uniq=%u, seqno=%u",
+		      __entry->dev, __entry->ring, __entry->uniq,
+		      __entry->seqno)
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
@@ -561,6 +564,7 @@ TRACE_EVENT(i915_gem_request_wait_begin,
 	    TP_STRUCT__entry(
 			     __field(u32, dev)
 			     __field(u32, ring)
+			     __field(u32, uniq)
 			     __field(u32, seqno)
 			     __field(bool, blocking)
 			     ),
@@ -576,13 +580,14 @@ TRACE_EVENT(i915_gem_request_wait_begin,
 						i915_gem_request_get_ring(req);
 			   __entry->dev = ring->dev->primary->index;
 			   __entry->ring = ring->id;
+			   __entry->uniq = req ? req->uniq : 0;
 			   __entry->seqno = i915_gem_request_get_seqno(req);
 			   __entry->blocking =
 				     mutex_is_locked(&ring->dev->struct_mutex);
 			   ),
 
-	    TP_printk("dev=%u, ring=%u, seqno=%u, blocking=%s",
-		      __entry->dev, __entry->ring,
+	    TP_printk("dev=%u, ring=%u, uniq=%u, seqno=%u, blocking=%s",
+		      __entry->dev, __entry->ring, __entry->uniq,
 		      __entry->seqno, __entry->blocking ?  "yes (NB)" : "no")
 );
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 96ae90a..6a5ed07 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -549,6 +549,7 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
 				   struct drm_i915_gem_request *request)
 {
 	struct drm_i915_gem_request *cursor;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	int num_elements = 0;
 
 	if (to != ring->default_context)
@@ -565,6 +566,7 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
 		request->ring = ring;
 		request->ctx = to;
 		kref_init(&request->ref);
+		request->uniq = dev_priv->request_uniq++;
 		i915_gem_context_reference(request->ctx);
 	} else {
 		i915_gem_request_reference(request);
-- 
1.7.9.5

* [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
  2015-05-29 16:43 ` [PATCH 01/55] drm/i915: Re-instate request->uniq because it is extremely useful John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-06-02 18:14   ` Tomas Elf
                     ` (2 more replies)
  2015-05-29 16:43 ` [PATCH 03/55] drm/i915: i915_add_request must not fail John.C.Harrison
                   ` (54 subsequent siblings)
  56 siblings, 3 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

It is a bad idea for i915_add_request() to fail. The work will already have been
sent to the ring and will be processed, but there will not be any tracking or
management of that work.

The only way the add request call can fail is if it can't write its epilogue
commands to the ring (cache flushing, seqno updates, interrupt signalling). The
reasons for that are mostly down to running out of ring buffer space and the
problems associated with trying to get some more. This patch prevents that
situation from happening in the first place.

When a request is created, it marks sufficient space as reserved for the
epilogue commands, thus guaranteeing that by the time the epilogue is written
there will be plenty of space for it. Note that a ring_begin() call is required
to actually reserve the space (and do any potential waiting). However, that is
not currently done at request creation time. This is because the ring_begin()
code can itself allocate a request; calling begin() from the request allocation
code would therefore lead to infinite recursion! Later patches in this series
remove the need for begin() to do the allocation. At that point, it becomes
safe for the allocation code to call begin() and really reserve the space.

Until then, there is a potential for insufficient space to be available at the
point of calling i915_add_request(). However, that could only happen if the
request was created and immediately submitted without ever calling ring_begin()
and adding any work to that request, which should never happen. And even if it
does, and that request happens to hit the tiny window of opportunity for
failing due to being out of ring space, does it really matter? The request
wasn't doing anything in the first place.
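
For reference, the lifecycle of the reservation tracking added below is
roughly (call flow only, slightly simplified):

    i915_gem_request_alloc()
        intel_ring_reserved_space_reserve(ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
            /* notes the reservation; a later *_ring_begin() makes it real */

    __i915_add_request()
        intel_ring_reserved_space_use(ringbuf);  /* start using the reserve */
        /* ... emit flushes, seqno update, interrupt ... */
        intel_ring_reserved_space_end(ringbuf);  /* sanity check size used */

    i915_gem_request_cancel()  /* request abandoned without submission */
        intel_ring_reserved_space_cancel(ringbuf);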

v2: Updated the 'reserved space too small' warning to include the offending
sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
re-initialisation of tracking state after a buffer wrap to keep the sanity
checks accurate.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    1 +
 drivers/gpu/drm/i915/i915_gem.c         |   37 +++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        |    9 ++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |   68 ++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |   10 +++++
 5 files changed, 123 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0347eb9..eba1857 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
 
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct intel_context *ctx);
+void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 
 static inline uint32_t
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 68f1d1e..6f51416 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	} else
 		ringbuf = ring->buffer;
 
+	/*
+	 * To ensure that this call will not fail, space for its emissions
+	 * should already have been reserved in the ring buffer. Let the ring
+	 * know that it is time to use that space up.
+	 */
+	intel_ring_reserved_space_use(ringbuf);
+
 	request_start = intel_ring_get_tail(ringbuf);
 	/*
 	 * Emit any outstanding flushes - execbuf can fail to emit the flush
@@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
 			   round_jiffies_up_relative(HZ));
 	intel_mark_busy(dev_priv->dev);
 
+	/* Sanity check that the reserved size was large enough. */
+	intel_ring_reserved_space_end(ringbuf);
+
 	return 0;
 }
 
@@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	if (ret)
 		goto err;
 
+	/*
+	 * Reserve space in the ring buffer for all the commands required to
+	 * eventually emit this request. This is to guarantee that the
+	 * i915_add_request() call can't fail. Note that the reserve may need
+	 * to be redone if the request is not actually submitted straight
+	 * away, e.g. because a GPU scheduler has deferred it.
+	 *
+	 * Note further that this call merely notes the reserve request. A
+	 * subsequent call to *_ring_begin() is required to actually ensure
+	 * that the reservation is available. Without the begin, if the
+	 * request creator immediately submitted the request without adding
+	 * any commands to it then there might not actually be sufficient
+	 * room for the submission commands. Unfortunately, the current
+	 * *_ring_begin() implementations potentially call back here to
+	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
+	 * infinite recursion! Until that back call path is removed, it is
+	 * necessary to do a manual _begin() outside.
+	 */
+	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
+
 	ring->outstanding_lazy_request = req;
 	return 0;
 
@@ -2674,6 +2704,13 @@ err:
 	return ret;
 }
 
+void i915_gem_request_cancel(struct drm_i915_gem_request *req)
+{
+	intel_ring_reserved_space_cancel(req->ringbuf);
+
+	i915_gem_request_unreference(req);
+}
+
 struct drm_i915_gem_request *
 i915_gem_find_active_request(struct intel_engine_cs *ring)
 {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6a5ed07..e62d396 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -687,6 +687,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
 	unsigned space;
 	int ret;
 
+	/* The whole point of reserving space is to not wait! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (intel_ring_space(ringbuf) >= bytes)
 		return 0;
 
@@ -747,6 +750,9 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
 	uint32_t __iomem *virt;
 	int rem = ringbuf->size - ringbuf->tail;
 
+	/* Can't wrap if space has already been reserved! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (ringbuf->space < rem) {
 		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
 
@@ -770,6 +776,9 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
 {
 	int ret;
 
+	if (!ringbuf->reserved_in_use)
+		bytes += ringbuf->reserved_size;
+
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
 		ret = logical_ring_wrap_buffer(ringbuf, ctx);
 		if (unlikely(ret))
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d934f85..74c2222 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2103,6 +2103,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
 	unsigned space;
 	int ret;
 
+	/* The whole point of reserving space is to not wait! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (intel_ring_space(ringbuf) >= n)
 		return 0;
 
@@ -2130,6 +2133,9 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
 	struct intel_ringbuffer *ringbuf = ring->buffer;
 	int rem = ringbuf->size - ringbuf->tail;
 
+	/* Can't wrap if space has already been reserved! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (ringbuf->space < rem) {
 		int ret = ring_wait_for_space(ring, rem);
 		if (ret)
@@ -2180,16 +2186,74 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
 	return 0;
 }
 
-static int __intel_ring_prepare(struct intel_engine_cs *ring,
-				int bytes)
+void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
+{
+	/* NB: Until request management is fully tidied up and the OLR is
+	 * removed, there are too many ways to get false hits on this
+	 * anti-recursion check! */
+	/*WARN_ON(ringbuf->reserved_size);*/
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_size = size;
+
+	/*
+	 * Really need to call _begin() here but that currently leads to
+	 * recursion problems! This will be fixed later but for now just
+	 * return and hope for the best. Note that there is only a real
+	 * problem if the create of the request never actually calls _begin()
+	 * but if they are not submitting any work then why did they create
+	 * the request in the first place?
+	 */
+}
+
+void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_size   = 0;
+	ringbuf->reserved_in_use = false;
+}
+
+void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_in_use = true;
+	ringbuf->reserved_tail   = ringbuf->tail;
+}
+
+void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(!ringbuf->reserved_in_use);
+	WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
+	     "request reserved size too small: %d vs %d!\n",
+	     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
+
+	ringbuf->reserved_size   = 0;
+	ringbuf->reserved_in_use = false;
+}
+
+static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
 {
 	struct intel_ringbuffer *ringbuf = ring->buffer;
 	int ret;
 
+	if (!ringbuf->reserved_in_use)
+		bytes += ringbuf->reserved_size;
+
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
+		WARN_ON(ringbuf->reserved_in_use);
+
 		ret = intel_wrap_ring_buffer(ring);
 		if (unlikely(ret))
 			return ret;
+
+		if (ringbuf->reserved_size) {
+			uint32_t size = ringbuf->reserved_size;
+
+			intel_ring_reserved_space_cancel(ringbuf);
+			intel_ring_reserved_space_reserve(ringbuf, size);
+		}
 	}
 
 	if (unlikely(ringbuf->space < bytes)) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 39f6dfc..39f795c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -105,6 +105,9 @@ struct intel_ringbuffer {
 	int space;
 	int size;
 	int effective_size;
+	int reserved_size;
+	int reserved_tail;
+	bool reserved_in_use;
 
 	/** We track the position of the requests in the ring buffer, and
 	 * when each is retired we increment last_retired_head as the GPU
@@ -450,4 +453,11 @@ intel_ring_get_request(struct intel_engine_cs *ring)
 	return ring->outstanding_lazy_request;
 }
 
+#define MIN_SPACE_FOR_ADD_REQUEST	128
+
+void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
+void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
+void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
+void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
1.7.9.5

* [PATCH 03/55] drm/i915: i915_add_request must not fail
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
  2015-05-29 16:43 ` [PATCH 01/55] drm/i915: Re-instate request->uniq because it is extremely useful John.C.Harrison
  2015-05-29 16:43 ` [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-06-02 18:16   ` Tomas Elf
  2015-06-23 10:16   ` Chris Wilson
  2015-05-29 16:43 ` [PATCH 04/55] drm/i915: Early alloc request in execbuff John.C.Harrison
                   ` (53 subsequent siblings)
  56 siblings, 2 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The i915_add_request() function is called to keep track of work that has been
written to the ring buffer. It adds epilogue commands to track progress (seqno
updates and such), moves the request structure onto the right list and performs
other such housekeeping tasks. However, the work itself has already been
written to the ring and will get executed whether or not the add request call
succeeds. So no matter what goes wrong, there isn't a whole lot of point in
failing the call.

At the moment, this is fine(ish). If the add request call bails out early
without doing the housekeeping, the request will still float around in the
ring->outstanding_lazy_request field and be picked up next time. It means
multiple pieces of work will be tagged as the same request and the driver can't
actually wait for the first piece of work until something else has been
submitted. But it all sort of hangs together.

This patch series is all about removing the OLR and guaranteeing that each piece
of work gets its own personal request. That means that there is no more
'hoovering up of forgotten requests'. If the request does not get tracked then
it will be leaked. Thus the add request call _must_ not fail. The previous patch
should have already ensured that it _will_ not fail by removing the potential
for running out of ring space. This patch enforces the rule by actually removing
the early exit paths and the return code.

Note that if something does manage to fail and the epilogue commands don't get
written to the ring, the driver will still hang together. The request will be
added to the tracking lists. And as in the old case, any subsequent work will
generate a new seqno which will suffice for marking the old one as complete.
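
The caller-side change is then simply of the form (cf. the
intel_overlay.c hunk below):

    /* Before: */
    ret = i915_add_request(ring);
    if (ret)
        return ret;

    /* After: */
    i915_add_request(ring);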

v2: Improved WARNings (Tomas Elf review request).

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h              |    6 ++--
 drivers/gpu/drm/i915/i915_gem.c              |   43 ++++++++++++--------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
 drivers/gpu/drm/i915/intel_overlay.c         |    8 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.c      |    8 ++---
 7 files changed, 31 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index eba1857..1be4a52 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2860,9 +2860,9 @@ void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
-int __i915_add_request(struct intel_engine_cs *ring,
-		       struct drm_file *file,
-		       struct drm_i915_gem_object *batch_obj);
+void __i915_add_request(struct intel_engine_cs *ring,
+			struct drm_file *file,
+			struct drm_i915_gem_object *batch_obj);
 #define i915_add_request(ring) \
 	__i915_add_request(ring, NULL, NULL)
 int __i915_wait_request(struct drm_i915_gem_request *req,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6f51416..dd39aa5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1155,15 +1155,12 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
 int
 i915_gem_check_olr(struct drm_i915_gem_request *req)
 {
-	int ret;
-
 	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
 
-	ret = 0;
 	if (req == req->ring->outstanding_lazy_request)
-		ret = i915_add_request(req->ring);
+		i915_add_request(req->ring);
 
-	return ret;
+	return 0;
 }
 
 static void fake_irq(unsigned long data)
@@ -2466,9 +2463,14 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
 	return 0;
 }
 
-int __i915_add_request(struct intel_engine_cs *ring,
-		       struct drm_file *file,
-		       struct drm_i915_gem_object *obj)
+/*
+ * NB: This function is not allowed to fail. Doing so would mean the
+ * request is not being tracked for completion but the work itself is
+ * going to happen on the hardware. This would be a Bad Thing(tm).
+ */
+void __i915_add_request(struct intel_engine_cs *ring,
+			struct drm_file *file,
+			struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct drm_i915_gem_request *request;
@@ -2478,7 +2480,7 @@ int __i915_add_request(struct intel_engine_cs *ring,
 
 	request = ring->outstanding_lazy_request;
 	if (WARN_ON(request == NULL))
-		return -ENOMEM;
+		return;
 
 	if (i915.enable_execlists) {
 		ringbuf = request->ctx->engine[ring->id].ringbuf;
@@ -2500,15 +2502,12 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	 * is that the flush _must_ happen before the next request, no matter
 	 * what.
 	 */
-	if (i915.enable_execlists) {
+	if (i915.enable_execlists)
 		ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
-		if (ret)
-			return ret;
-	} else {
+	else
 		ret = intel_ring_flush_all_caches(ring);
-		if (ret)
-			return ret;
-	}
+	/* Not allowed to fail! */
+	WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
 
 	/* Record the position of the start of the request so that
 	 * should we detect the updated seqno part-way through the
@@ -2517,17 +2516,15 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	 */
 	request->postfix = intel_ring_get_tail(ringbuf);
 
-	if (i915.enable_execlists) {
+	if (i915.enable_execlists)
 		ret = ring->emit_request(ringbuf, request);
-		if (ret)
-			return ret;
-	} else {
+	else {
 		ret = ring->add_request(ring);
-		if (ret)
-			return ret;
 
 		request->tail = intel_ring_get_tail(ringbuf);
 	}
+	/* Not allowed to fail! */
+	WARN(ret, "emit|add_request failed: %d!\n", ret);
 
 	request->head = request_start;
 
@@ -2576,8 +2573,6 @@ int __i915_add_request(struct intel_engine_cs *ring,
 
 	/* Sanity check that the reserved size was large enough. */
 	intel_ring_reserved_space_end(ringbuf);
-
-	return 0;
 }
 
 static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index bd0e4bd..2b48a31 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1061,7 +1061,7 @@ i915_gem_execbuffer_retire_commands(struct drm_device *dev,
 	ring->gpu_caches_dirty = true;
 
 	/* Add a breadcrumb for the completion of the batch buffer */
-	(void)__i915_add_request(ring, file, obj);
+	__i915_add_request(ring, file, obj);
 }
 
 static int
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 521548a..ce4788f 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
 
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
 
-	ret = __i915_add_request(ring, NULL, so.obj);
+	__i915_add_request(ring, NULL, so.obj);
 	/* __i915_add_request moves object to inactive if it fails */
 out:
 	i915_gem_render_state_fini(&so);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e62d396..7a75fc8 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1373,7 +1373,7 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
 
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
 
-	ret = __i915_add_request(ring, file, so.obj);
+	__i915_add_request(ring, file, so.obj);
 	/* intel_logical_ring_add_request moves object to inactive if it
 	 * fails */
 out:
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 25c8ec6..e7534b9 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -220,9 +220,7 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
 	WARN_ON(overlay->last_flip_req);
 	i915_gem_request_assign(&overlay->last_flip_req,
 					     ring->outstanding_lazy_request);
-	ret = i915_add_request(ring);
-	if (ret)
-		return ret;
+	i915_add_request(ring);
 
 	overlay->flip_tail = tail;
 	ret = i915_wait_request(overlay->last_flip_req);
@@ -291,7 +289,9 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
 	WARN_ON(overlay->last_flip_req);
 	i915_gem_request_assign(&overlay->last_flip_req,
 					     ring->outstanding_lazy_request);
-	return i915_add_request(ring);
+	i915_add_request(ring);
+
+	return 0;
 }
 
 static void intel_overlay_release_old_vid_tail(struct intel_overlay *overlay)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 74c2222..7061b07 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2156,14 +2156,10 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
 int intel_ring_idle(struct intel_engine_cs *ring)
 {
 	struct drm_i915_gem_request *req;
-	int ret;
 
 	/* We need to add any requests required to flush the objects and ring */
-	if (ring->outstanding_lazy_request) {
-		ret = i915_add_request(ring);
-		if (ret)
-			return ret;
-	}
+	if (ring->outstanding_lazy_request)
+		i915_add_request(ring);
 
 	/* Wait upon the last request to be completed */
 	if (list_empty(&ring->request_list))
-- 
1.7.9.5

* [PATCH 04/55] drm/i915: Early alloc request in execbuff
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (2 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 03/55] drm/i915: i915_add_request must not fail John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 05/55] drm/i915: Set context in request from creation even in legacy mode John.C.Harrison
                   ` (52 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Start of explicit request management in the execbuffer code path. This patch
adds a call to allocate a request structure before all the actual hardware work
is done, thus guaranteeing that all that work is tagged by a known request. At
present, nothing further is done with the request; the rest comes later in the
series.

The only noticeable change is that failure to get a request (e.g. due to lack
of memory) will be caught earlier in the sequence. It now occurs right at the
start, before any work has been done that cannot be undone.

v2: Simplified the error handling path.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 2b48a31..e2bccc7 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1603,10 +1603,16 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	} else
 		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
+	/* Allocate a request for this batch buffer nice and early. */
+	ret = i915_gem_request_alloc(ring, ctx);
+	if (ret)
+		goto err_batch_unpin;
+
 	ret = dev_priv->gt.execbuf_submit(dev, file, ring, ctx, args,
 					  &eb->vmas, batch_obj, exec_start,
 					  dispatch_flags);
 
+err_batch_unpin:
 	/*
 	 * FIXME: We crucially rely upon the active tracking for the (ppgtt)
 	 * batch vma for correctness. For less ugly and less fragility this
@@ -1615,6 +1621,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 */
 	if (dispatch_flags & I915_DISPATCH_SECURE)
 		i915_gem_object_ggtt_unpin(batch_obj);
+
 err:
 	/* the request owns the ref now */
 	i915_gem_context_unreference(ctx);
-- 
1.7.9.5

* [PATCH 05/55] drm/i915: Set context in request from creation even in legacy mode
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (3 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 04/55] drm/i915: Early alloc request in execbuff John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 06/55] drm/i915: Merged the many do_execbuf() parameters into a structure John.C.Harrison
                   ` (51 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

In execlist mode, the context object pointer is written into the request
structure (and reference counted) at the point of request creation. In legacy
mode, this only happens inside i915_add_request().

This patch updates the legacy code path to match the execlist version. This
allows all the intermediate code between request creation and request
submission to get at the context object given only a request structure, thus
removing the need to pass context pointers here, there and everywhere.
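
This is what allows later patches in this series to reduce functions such
as do_switch() to taking just a request, roughly:

    /* before */
    ret = do_switch(ring, to);

    /* after (the context travels along as req->ctx) */
    ret = do_switch(req);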

v2: Moved the context reference so it does not need to be undone if the
get_seqno() fails.

v3: Fixed execlist mode always hitting a warning about invalid last_contexts
(which don't exist in execlist mode).

v4: Updated for new i915_gem_request_alloc() scheme.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c  |   22 +++++++++-------------
 drivers/gpu/drm/i915/intel_lrc.c |   11 ++++-------
 drivers/gpu/drm/i915/intel_lrc.h |    3 +--
 3 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dd39aa5..458a449 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2536,14 +2536,7 @@ void __i915_add_request(struct intel_engine_cs *ring,
 	 */
 	request->batch_obj = obj;
 
-	if (!i915.enable_execlists) {
-		/* Hold a reference to the current context so that we can inspect
-		 * it later in case a hangcheck error event fires.
-		 */
-		request->ctx = ring->last_context;
-		if (request->ctx)
-			i915_gem_context_reference(request->ctx);
-	}
+	WARN_ON(!i915.enable_execlists && (request->ctx != ring->last_context));
 
 	request->emitted_jiffies = jiffies;
 	list_add_tail(&request->list, &ring->request_list);
@@ -2654,22 +2647,25 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	if (req == NULL)
 		return -ENOMEM;
 
-	kref_init(&req->ref);
-	req->i915 = dev_priv;
-
 	ret = i915_gem_get_seqno(ring->dev, &req->seqno);
 	if (ret)
 		goto err;
 
+	kref_init(&req->ref);
+	req->i915 = dev_priv;
 	req->ring = ring;
 	req->uniq = dev_priv->request_uniq++;
+	req->ctx  = ctx;
+	i915_gem_context_reference(req->ctx);
 
 	if (i915.enable_execlists)
-		ret = intel_logical_ring_alloc_request_extras(req, ctx);
+		ret = intel_logical_ring_alloc_request_extras(req);
 	else
 		ret = intel_ring_alloc_request_extras(req);
-	if (ret)
+	if (ret) {
+		i915_gem_context_unreference(req->ctx);
 		goto err;
+	}
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7a75fc8..881cb87 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -660,20 +660,17 @@ static int execlists_move_to_gpu(struct intel_ringbuffer *ringbuf,
 	return logical_ring_invalidate_all_caches(ringbuf, ctx);
 }
 
-int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request,
-					    struct intel_context *ctx)
+int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
 {
 	int ret;
 
-	if (ctx != request->ring->default_context) {
-		ret = intel_lr_context_pin(request->ring, ctx);
+	if (request->ctx != request->ring->default_context) {
+		ret = intel_lr_context_pin(request->ring, request->ctx);
 		if (ret)
 			return ret;
 	}
 
-	request->ringbuf = ctx->engine[request->ring->id].ringbuf;
-	request->ctx     = ctx;
-	i915_gem_context_reference(request->ctx);
+	request->ringbuf = request->ctx->engine[request->ring->id].ringbuf;
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 04d3a6d..4148de0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -36,8 +36,7 @@
 #define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
 
 /* Logical Rings */
-int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request,
-					    struct intel_context *ctx);
+int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request);
 void intel_logical_ring_stop(struct intel_engine_cs *ring);
 void intel_logical_ring_cleanup(struct intel_engine_cs *ring);
 int intel_logical_rings_init(struct drm_device *dev);
-- 
1.7.9.5

* [PATCH 06/55] drm/i915: Merged the many do_execbuf() parameters into a structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (4 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 05/55] drm/i915: Set context in request from creation even in legacy mode John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 07/55] drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters John.C.Harrison
                   ` (50 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The do_execbuf() function takes quite a few parameters. The actual set of
parameters is going to change with the conversion to passing requests around.
Further, it is due to grow massively with the arrival of the GPU scheduler.

This patch simplifies the prototype by passing a parameter structure instead.
Changing the parameter set in the future is then simply a matter of adding
items to or removing items from the structure.

Note that the structure does not contain absolutely everything that is passed
in. This is because the intention is to use this structure more extensively
later in this patch series and more especially in the GPU scheduler that is
coming soon. The latter requires hanging on to the structure as the final
hardware submission can be delayed until long after the execbuf IOCTL has
returned to user land. Thus it is unsafe to put anything in the structure that
is local to the IOCTL call itself - such as the 'args' parameter. All entries
must be copies of data or pointers to structures that are reference counted in
some way and guaranteed to exist for the duration of the batch buffer's life.
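
As an illustration of the intent, when a later patch in this series adds
the request to the parameters, it becomes a single new field rather than
a change to every prototype in the call chain (sketch only):

    struct i915_execbuffer_params {
        ...
        struct drm_i915_gem_request *request;
    };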

v2: Rebased to newer tree and updated for changes to the command parser.
Specifically, a code shuffle has required saving the batch start address in the
params structure.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |   28 +++++++------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   59 ++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_lrc.c           |   26 ++++++------
 drivers/gpu/drm/i915/intel_lrc.h           |    9 ++---
 4 files changed, 70 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1be4a52..de17814 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1605,6 +1605,17 @@ struct i915_virtual_gpu {
 	bool active;
 };
 
+struct i915_execbuffer_params {
+	struct drm_device               *dev;
+	struct drm_file                 *file;
+	uint32_t                        dispatch_flags;
+	uint32_t                        args_batch_start_offset;
+	uint32_t                        batch_obj_vm_offset;
+	struct intel_engine_cs          *ring;
+	struct drm_i915_gem_object      *batch_obj;
+	struct intel_context            *ctx;
+};
+
 struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *objects;
@@ -1868,13 +1879,9 @@ struct drm_i915_private {
 
 	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
 	struct {
-		int (*execbuf_submit)(struct drm_device *dev, struct drm_file *file,
-				      struct intel_engine_cs *ring,
-				      struct intel_context *ctx,
+		int (*execbuf_submit)(struct i915_execbuffer_params *params,
 				      struct drm_i915_gem_execbuffer2 *args,
-				      struct list_head *vmas,
-				      struct drm_i915_gem_object *batch_obj,
-				      u64 exec_start, u32 flags);
+				      struct list_head *vmas);
 		int (*init_rings)(struct drm_device *dev);
 		void (*cleanup_ring)(struct intel_engine_cs *ring);
 		void (*stop_ring)(struct intel_engine_cs *ring);
@@ -2662,14 +2669,9 @@ void i915_gem_execbuffer_retire_commands(struct drm_device *dev,
 					 struct drm_file *file,
 					 struct intel_engine_cs *ring,
 					 struct drm_i915_gem_object *obj);
-int i915_gem_ringbuffer_submission(struct drm_device *dev,
-				   struct drm_file *file,
-				   struct intel_engine_cs *ring,
-				   struct intel_context *ctx,
+int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 				   struct drm_i915_gem_execbuffer2 *args,
-				   struct list_head *vmas,
-				   struct drm_i915_gem_object *batch_obj,
-				   u64 exec_start, u32 flags);
+				   struct list_head *vmas);
 int i915_gem_execbuffer(struct drm_device *dev, void *data,
 			struct drm_file *file_priv);
 int i915_gem_execbuffer2(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e2bccc7..4e483c9 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1185,17 +1185,15 @@ err:
 }
 
 int
-i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file,
-			       struct intel_engine_cs *ring,
-			       struct intel_context *ctx,
+i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 			       struct drm_i915_gem_execbuffer2 *args,
-			       struct list_head *vmas,
-			       struct drm_i915_gem_object *batch_obj,
-			       u64 exec_start, u32 dispatch_flags)
+			       struct list_head *vmas)
 {
 	struct drm_clip_rect *cliprects = NULL;
+	struct drm_device *dev = params->dev;
+	struct intel_engine_cs *ring = params->ring;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	u64 exec_len;
+	u64 exec_start, exec_len;
 	int instp_mode;
 	u32 instp_mask;
 	int i, ret = 0;
@@ -1247,11 +1245,11 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file,
 	if (ret)
 		goto error;
 
-	ret = i915_switch_context(ring, ctx);
+	ret = i915_switch_context(ring, params->ctx);
 	if (ret)
 		goto error;
 
-	WARN(ctx->ppgtt && ctx->ppgtt->pd_dirty_rings & (1<<ring->id),
+	WARN(params->ctx->ppgtt && params->ctx->ppgtt->pd_dirty_rings & (1<<ring->id),
 	     "%s didn't clear reload\n", ring->name);
 
 	instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
@@ -1312,7 +1310,10 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file,
 			goto error;
 	}
 
-	exec_len = args->batch_len;
+	exec_len   = args->batch_len;
+	exec_start = params->batch_obj_vm_offset +
+		     params->args_batch_start_offset;
+
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
 			ret = i915_emit_box(ring, &cliprects[i],
@@ -1322,22 +1323,23 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file,
 
 			ret = ring->dispatch_execbuffer(ring,
 							exec_start, exec_len,
-							dispatch_flags);
+							params->dispatch_flags);
 			if (ret)
 				goto error;
 		}
 	} else {
 		ret = ring->dispatch_execbuffer(ring,
 						exec_start, exec_len,
-						dispatch_flags);
+						params->dispatch_flags);
 		if (ret)
 			return ret;
 	}
 
-	trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), dispatch_flags);
+	trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, ring);
-	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
+	i915_gem_execbuffer_retire_commands(params->dev, params->file, ring,
+					    params->batch_obj);
 
 error:
 	kfree(cliprects);
@@ -1407,8 +1409,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct intel_engine_cs *ring;
 	struct intel_context *ctx;
 	struct i915_address_space *vm;
+	struct i915_execbuffer_params params_master; /* XXX: will be removed later */
+	struct i915_execbuffer_params *params = &params_master;
 	const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
-	u64 exec_start = args->batch_start_offset;
 	u32 dispatch_flags;
 	int ret;
 	bool need_relocs;
@@ -1501,6 +1504,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	else
 		vm = &dev_priv->gtt.base;
 
+	memset(&params_master, 0x00, sizeof(params_master));
+
 	eb = eb_create(args);
 	if (eb == NULL) {
 		i915_gem_context_unreference(ctx);
@@ -1543,6 +1548,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		goto err;
 	}
 
+	params->args_batch_start_offset = args->batch_start_offset;
 	if (i915_needs_cmd_parser(ring) && args->batch_len) {
 		struct drm_i915_gem_object *parsed_batch_obj;
 
@@ -1574,7 +1580,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			 * command parser has accepted.
 			 */
 			dispatch_flags |= I915_DISPATCH_SECURE;
-			exec_start = 0;
+			params->args_batch_start_offset = 0;
 			batch_obj = parsed_batch_obj;
 		}
 	}
@@ -1599,18 +1605,29 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		if (ret)
 			goto err;
 
-		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
+		params->batch_obj_vm_offset = i915_gem_obj_ggtt_offset(batch_obj);
 	} else
-		exec_start += i915_gem_obj_offset(batch_obj, vm);
+		params->batch_obj_vm_offset = i915_gem_obj_offset(batch_obj, vm);
 
 	/* Allocate a request for this batch buffer nice and early. */
 	ret = i915_gem_request_alloc(ring, ctx);
 	if (ret)
 		goto err_batch_unpin;
 
-	ret = dev_priv->gt.execbuf_submit(dev, file, ring, ctx, args,
-					  &eb->vmas, batch_obj, exec_start,
-					  dispatch_flags);
+	/*
+	 * Save assorted stuff away to pass through to *_submission().
+	 * NB: This data should be 'persistent' and not local as it will
+	 * kept around beyond the duration of the IOCTL once the GPU
+	 * scheduler arrives.
+	 */
+	params->dev                     = dev;
+	params->file                    = file;
+	params->ring                    = ring;
+	params->dispatch_flags          = dispatch_flags;
+	params->batch_obj               = batch_obj;
+	params->ctx                     = ctx;
+
+	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
 
 err_batch_unpin:
 	/*
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 881cb87..86747f1 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -847,16 +847,15 @@ static int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
  *
  * Return: non-zero if the submission fails.
  */
-int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
-			       struct intel_engine_cs *ring,
-			       struct intel_context *ctx,
+int intel_execlists_submission(struct i915_execbuffer_params *params,
 			       struct drm_i915_gem_execbuffer2 *args,
-			       struct list_head *vmas,
-			       struct drm_i915_gem_object *batch_obj,
-			       u64 exec_start, u32 dispatch_flags)
+			       struct list_head *vmas)
 {
+	struct drm_device       *dev = params->dev;
+	struct intel_engine_cs  *ring = params->ring;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
+	struct intel_ringbuffer *ringbuf = params->ctx->engine[ring->id].ringbuf;
+	u64 exec_start;
 	int instp_mode;
 	u32 instp_mask;
 	int ret;
@@ -907,13 +906,13 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
 		return -EINVAL;
 	}
 
-	ret = execlists_move_to_gpu(ringbuf, ctx, vmas);
+	ret = execlists_move_to_gpu(ringbuf, params->ctx, vmas);
 	if (ret)
 		return ret;
 
 	if (ring == &dev_priv->ring[RCS] &&
 	    instp_mode != dev_priv->relative_constants_mode) {
-		ret = intel_logical_ring_begin(ringbuf, ctx, 4);
+		ret = intel_logical_ring_begin(ringbuf, params->ctx, 4);
 		if (ret)
 			return ret;
 
@@ -926,14 +925,17 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
 		dev_priv->relative_constants_mode = instp_mode;
 	}
 
-	ret = ring->emit_bb_start(ringbuf, ctx, exec_start, dispatch_flags);
+	exec_start = params->batch_obj_vm_offset +
+		     args->batch_start_offset;
+
+	ret = ring->emit_bb_start(ringbuf, params->ctx, exec_start, params->dispatch_flags);
 	if (ret)
 		return ret;
 
-	trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), dispatch_flags);
+	trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, ring);
-	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
+	i915_gem_execbuffer_retire_commands(params->dev, params->file, ring, params->batch_obj);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 4148de0..bf137c4 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -76,13 +76,10 @@ void intel_lr_context_reset(struct drm_device *dev,
 
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
-int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
-			       struct intel_engine_cs *ring,
-			       struct intel_context *ctx,
+struct i915_execbuffer_params;
+int intel_execlists_submission(struct i915_execbuffer_params *params,
 			       struct drm_i915_gem_execbuffer2 *args,
-			       struct list_head *vmas,
-			       struct drm_i915_gem_object *batch_obj,
-			       u64 exec_start, u32 dispatch_flags);
+			       struct list_head *vmas);
 u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
 
 void intel_lrc_irq_handler(struct intel_engine_cs *ring);
-- 
1.7.9.5

* [PATCH 07/55] drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (5 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 06/55] drm/i915: Merged the many do_execbuf() parameters into a structure John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 08/55] drm/i915: Update alloc_request to return the allocated request John.C.Harrison
                   ` (49 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Shrunk the parameter list of i915_gem_execbuffer_retire_commands() down to a
single structure pointer, as everything it requires is already available in the
i915_execbuffer_params object.
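
As an illustration, a call site goes from passing four separate values to
passing the one params block (call shapes taken from the hunks below):

	/* Before: each piece handed over individually */
	i915_gem_execbuffer_retire_commands(params->dev, params->file, ring,
					    params->batch_obj);

	/* After: the params structure already carries all of them */
	i915_gem_execbuffer_retire_commands(params);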

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |    5 +----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   12 ++++--------
 drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
 3 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index de17814..bb8544c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2665,10 +2665,7 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
 			     struct drm_file *file_priv);
 void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 					struct intel_engine_cs *ring);
-void i915_gem_execbuffer_retire_commands(struct drm_device *dev,
-					 struct drm_file *file,
-					 struct intel_engine_cs *ring,
-					 struct drm_i915_gem_object *obj);
+void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
 int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 				   struct drm_i915_gem_execbuffer2 *args,
 				   struct list_head *vmas);
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 4e483c9..0a43665 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1052,16 +1052,13 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 }
 
 void
-i915_gem_execbuffer_retire_commands(struct drm_device *dev,
-				    struct drm_file *file,
-				    struct intel_engine_cs *ring,
-				    struct drm_i915_gem_object *obj)
+i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
 {
 	/* Unconditionally force add_request to emit a full flush. */
-	ring->gpu_caches_dirty = true;
+	params->ring->gpu_caches_dirty = true;
 
 	/* Add a breadcrumb for the completion of the batch buffer */
-	__i915_add_request(ring, file, obj);
+	__i915_add_request(params->ring, params->file, params->batch_obj);
 }
 
 static int
@@ -1338,8 +1335,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 	trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, ring);
-	i915_gem_execbuffer_retire_commands(params->dev, params->file, ring,
-					    params->batch_obj);
+	i915_gem_execbuffer_retire_commands(params);
 
 error:
 	kfree(cliprects);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 86747f1..348e6b8 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -935,7 +935,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 	trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, ring);
-	i915_gem_execbuffer_retire_commands(params->dev, params->file, ring, params->batch_obj);
+	i915_gem_execbuffer_retire_commands(params);
 
 	return 0;
 }
-- 
1.7.9.5


* [PATCH 08/55] drm/i915: Update alloc_request to return the allocated request
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (6 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 07/55] drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 09/55] drm/i915: Add request to execbuf params and add explicit cleanup John.C.Harrison
                   ` (48 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The alloc_request() function does not actually return the newly allocated
request. Instead, it must be pulled from ring->outstanding_lazy_request. This
patch adds an output parameter so that calling code can create a request and
start using it, knowing exactly which request it owns.

v2: Updated for new i915_gem_request_alloc() scheme.
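
The new calling convention, sketched from the hunks below:

	struct drm_i915_gem_request *req;
	int ret;

	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
	if (ret)
		return ret;

	/* 'req' is now known to the caller rather than having to be fished
	 * out of ring->outstanding_lazy_request after the fact. */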

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |    3 ++-
 drivers/gpu/drm/i915/i915_gem.c            |   10 +++++++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    3 ++-
 drivers/gpu/drm/i915/intel_lrc.c           |    3 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |    3 ++-
 5 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bb8544c..ab1b3ec 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2193,7 +2193,8 @@ struct drm_i915_gem_request {
 };
 
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
-			   struct intel_context *ctx);
+			   struct intel_context *ctx,
+			   struct drm_i915_gem_request **req_out);
 void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 458a449..ba2e7f7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2634,13 +2634,17 @@ void i915_gem_request_free(struct kref *req_ref)
 }
 
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
-			   struct intel_context *ctx)
+			   struct intel_context *ctx,
+			   struct drm_i915_gem_request **req_out)
 {
 	struct drm_i915_private *dev_priv = to_i915(ring->dev);
 	struct drm_i915_gem_request *req;
 	int ret;
 
-	if (ring->outstanding_lazy_request)
+	if (!req_out)
+		return -EINVAL;
+
+	if ((*req_out = ring->outstanding_lazy_request) != NULL)
 		return 0;
 
 	req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
@@ -2687,7 +2691,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	 */
 	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
 
-	ring->outstanding_lazy_request = req;
+	*req_out = ring->outstanding_lazy_request = req;
 	return 0;
 
 err:
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0a43665..f2d19afe 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1407,6 +1407,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct i915_address_space *vm;
 	struct i915_execbuffer_params params_master; /* XXX: will be removed later */
 	struct i915_execbuffer_params *params = &params_master;
+	struct drm_i915_gem_request *request;
 	const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
 	u32 dispatch_flags;
 	int ret;
@@ -1606,7 +1607,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		params->batch_obj_vm_offset = i915_gem_obj_offset(batch_obj, vm);
 
 	/* Allocate a request for this batch buffer nice and early. */
-	ret = i915_gem_request_alloc(ring, ctx);
+	ret = i915_gem_request_alloc(ring, ctx, &request);
 	if (ret)
 		goto err_batch_unpin;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 348e6b8..1347601 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -807,6 +807,7 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
 static int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
 				    struct intel_context *ctx, int num_dwords)
 {
+	struct drm_i915_gem_request *req;
 	struct intel_engine_cs *ring = ringbuf->ring;
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -822,7 +823,7 @@ static int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
 		return ret;
 
 	/* Preallocate the olr before touching the ring */
-	ret = i915_gem_request_alloc(ring, ctx);
+	ret = i915_gem_request_alloc(ring, ctx, &req);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 7061b07..d766c1d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2264,6 +2264,7 @@ static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
 int intel_ring_begin(struct intel_engine_cs *ring,
 		     int num_dwords)
 {
+	struct drm_i915_gem_request *req;
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	int ret;
 
@@ -2277,7 +2278,7 @@ int intel_ring_begin(struct intel_engine_cs *ring,
 		return ret;
 
 	/* Preallocate the olr before touching the ring */
-	ret = i915_gem_request_alloc(ring, ring->default_context);
+	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
 	if (ret)
 		return ret;
 
-- 
1.7.9.5


* [PATCH 09/55] drm/i915: Add request to execbuf params and add explicit cleanup
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (7 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 08/55] drm/i915: Update alloc_request to return the allocated request John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 10/55] drm/i915: Update the dispatch tracepoint to use params->request John.C.Harrison
                   ` (47 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Rather than just having a local request variable in the execbuff code, the
request pointer is now stored in the execbuff params structure. Also added
explicit cleanup of the request (plus wiping the OLR to match) in the error
case. This means that the execbuff code is no longer dependent upon the OLR
keeping track of the request so as to not leak it when things do go wrong. Note
that in the success case, the i915_add_request() at the end of the submission
function will tidy up the request and clear the OLR.
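
The resulting ownership rule on the error path, as a sketch condensed from the
hunk below:

	/* Request created but never submitted: free it explicitly and wipe
	 * the OLR so nothing else picks up the stale pointer. */
	if (ret && params->request) {
		i915_gem_request_cancel(params->request);
		ring->outstanding_lazy_request = NULL;
	}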

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |    1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   13 +++++++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ab1b3ec..5f4916a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1614,6 +1614,7 @@ struct i915_execbuffer_params {
 	struct intel_engine_cs          *ring;
 	struct drm_i915_gem_object      *batch_obj;
 	struct intel_context            *ctx;
+	struct drm_i915_gem_request     *request;
 };
 
 struct drm_i915_private {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f2d19afe..9dc1c58 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1407,7 +1407,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct i915_address_space *vm;
 	struct i915_execbuffer_params params_master; /* XXX: will be removed later */
 	struct i915_execbuffer_params *params = &params_master;
-	struct drm_i915_gem_request *request;
 	const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
 	u32 dispatch_flags;
 	int ret;
@@ -1607,7 +1606,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		params->batch_obj_vm_offset = i915_gem_obj_offset(batch_obj, vm);
 
 	/* Allocate a request for this batch buffer nice and early. */
-	ret = i915_gem_request_alloc(ring, ctx, &request);
+	ret = i915_gem_request_alloc(ring, ctx, &params->request);
 	if (ret)
 		goto err_batch_unpin;
 
@@ -1641,6 +1640,16 @@ err:
 	i915_gem_context_unreference(ctx);
 	eb_destroy(eb);
 
+	/*
+	 * If the request was created but not successfully submitted then it
+	 * must be freed again. If it was submitted then it is being tracked
+	 * on the active request list and no clean up is required here.
+	 */
+	if (ret && params->request) {
+		i915_gem_request_cancel(params->request);
+		ring->outstanding_lazy_request = NULL;
+	}
+
 	mutex_unlock(&dev->struct_mutex);
 
 pre_mutex_err:
-- 
1.7.9.5


* [PATCH 10/55] drm/i915: Update the dispatch tracepoint to use params->request
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (8 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 09/55] drm/i915: Add request to execbuf params and add explicit cleanup John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 11/55] drm/i915: Update move_to_gpu() to take a request structure John.C.Harrison
                   ` (46 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated a couple of trace points to use the request pointer now cached in the
params structure rather than extracting it from the ring.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 9dc1c58..63b5dd3 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1332,7 +1332,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 			return ret;
 	}
 
-	trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags);
+	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, ring);
 	i915_gem_execbuffer_retire_commands(params);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1347601..21640b8 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -933,7 +933,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 	if (ret)
 		return ret;
 
-	trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags);
+	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, ring);
 	i915_gem_execbuffer_retire_commands(params);
-- 
1.7.9.5


* [PATCH 11/55] drm/i915: Update move_to_gpu() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (9 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 10/55] drm/i915: Update the dispatch tracepoint to use params->request John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 12/55] drm/i915: Update execbuffer_move_to_active() " John.C.Harrison
                   ` (45 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the move_to_gpu() code paths.
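
The pattern, as seen in the hunks below: callers hand over just the request
and the function derives everything else from it:

	/* Before */
	ret = i915_gem_execbuffer_move_to_gpu(ring, vmas);

	/* After: the engine is recovered from req->ring inside the callee */
	ret = i915_gem_execbuffer_move_to_gpu(params->request, vmas);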

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   12 ++++++------
 drivers/gpu/drm/i915/intel_lrc.c           |   12 +++++-------
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 63b5dd3..cc4d25f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -886,10 +886,10 @@ err:
 }
 
 static int
-i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring,
+i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req,
 				struct list_head *vmas)
 {
-	const unsigned other_rings = ~intel_ring_flag(ring);
+	const unsigned other_rings = ~intel_ring_flag(req->ring);
 	struct i915_vma *vma;
 	uint32_t flush_domains = 0;
 	bool flush_chipset = false;
@@ -899,7 +899,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring,
 		struct drm_i915_gem_object *obj = vma->obj;
 
 		if (obj->active & other_rings) {
-			ret = i915_gem_object_sync(obj, ring);
+			ret = i915_gem_object_sync(obj, req->ring);
 			if (ret)
 				return ret;
 		}
@@ -911,7 +911,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring,
 	}
 
 	if (flush_chipset)
-		i915_gem_chipset_flush(ring->dev);
+		i915_gem_chipset_flush(req->ring->dev);
 
 	if (flush_domains & I915_GEM_DOMAIN_GTT)
 		wmb();
@@ -919,7 +919,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring,
 	/* Unconditionally invalidate gpu caches and ensure that we do flush
 	 * any residual writes from the previous batch.
 	 */
-	return intel_ring_invalidate_all_caches(ring);
+	return intel_ring_invalidate_all_caches(req->ring);
 }
 
 static bool
@@ -1238,7 +1238,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 		}
 	}
 
-	ret = i915_gem_execbuffer_move_to_gpu(ring, vmas);
+	ret = i915_gem_execbuffer_move_to_gpu(params->request, vmas);
 	if (ret)
 		goto error;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 21640b8..a98c9ce 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -625,12 +625,10 @@ static int logical_ring_invalidate_all_caches(struct intel_ringbuffer *ringbuf,
 	return 0;
 }
 
-static int execlists_move_to_gpu(struct intel_ringbuffer *ringbuf,
-				 struct intel_context *ctx,
+static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
 				 struct list_head *vmas)
 {
-	struct intel_engine_cs *ring = ringbuf->ring;
-	const unsigned other_rings = ~intel_ring_flag(ring);
+	const unsigned other_rings = ~intel_ring_flag(req->ring);
 	struct i915_vma *vma;
 	uint32_t flush_domains = 0;
 	bool flush_chipset = false;
@@ -640,7 +638,7 @@ static int execlists_move_to_gpu(struct intel_ringbuffer *ringbuf,
 		struct drm_i915_gem_object *obj = vma->obj;
 
 		if (obj->active & other_rings) {
-			ret = i915_gem_object_sync(obj, ring);
+			ret = i915_gem_object_sync(obj, req->ring);
 			if (ret)
 				return ret;
 		}
@@ -657,7 +655,7 @@ static int execlists_move_to_gpu(struct intel_ringbuffer *ringbuf,
 	/* Unconditionally invalidate gpu caches and ensure that we do flush
 	 * any residual writes from the previous batch.
 	 */
-	return logical_ring_invalidate_all_caches(ringbuf, ctx);
+	return logical_ring_invalidate_all_caches(req->ringbuf, req->ctx);
 }
 
 int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
@@ -907,7 +905,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 		return -EINVAL;
 	}
 
-	ret = execlists_move_to_gpu(ringbuf, params->ctx, vmas);
+	ret = execlists_move_to_gpu(params->request, vmas);
 	if (ret)
 		return ret;
 
-- 
1.7.9.5


* [PATCH 12/55] drm/i915: Update execbuffer_move_to_active() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (10 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 11/55] drm/i915: Update move_to_gpu() to take a request structure John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 13/55] drm/i915: Add flag to i915_add_request() to skip the cache flush John.C.Harrison
                   ` (44 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the
execbuffer_move_to_active() code path.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |    2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    6 +++---
 drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5f4916a..cc2c45c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2666,7 +2666,7 @@ int i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
 			     struct drm_file *file_priv);
 void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
-					struct intel_engine_cs *ring);
+					struct drm_i915_gem_request *req);
 void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
 int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 				   struct drm_i915_gem_execbuffer2 *args,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index cc4d25f..a6532db 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1012,9 +1012,9 @@ i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
 
 void
 i915_gem_execbuffer_move_to_active(struct list_head *vmas,
-				   struct intel_engine_cs *ring)
+				   struct drm_i915_gem_request *req)
 {
-	struct drm_i915_gem_request *req = intel_ring_get_request(ring);
+	struct intel_engine_cs *ring = i915_gem_request_get_ring(req);
 	struct i915_vma *vma;
 
 	list_for_each_entry(vma, vmas, exec_list) {
@@ -1334,7 +1334,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 
 	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
-	i915_gem_execbuffer_move_to_active(vmas, ring);
+	i915_gem_execbuffer_move_to_active(vmas, params->request);
 	i915_gem_execbuffer_retire_commands(params);
 
 error:
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a98c9ce..6c0b16f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -933,7 +933,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 
 	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
-	i915_gem_execbuffer_move_to_active(vmas, ring);
+	i915_gem_execbuffer_move_to_active(vmas, params->request);
 	i915_gem_execbuffer_retire_commands(params);
 
 	return 0;
-- 
1.7.9.5


* [PATCH 13/55] drm/i915: Add flag to i915_add_request() to skip the cache flush
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (11 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 12/55] drm/i915: Update execbuffer_move_to_active() " John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-06-02 18:19   ` Tomas Elf
  2015-05-29 16:43 ` [PATCH 14/55] drm/i915: Update i915_gpu_idle() to manage its own request John.C.Harrison
                   ` (43 subsequent siblings)
  56 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

In order to explicitly track all GPU work (and completely remove the outstanding
lazy request), it is necessary to add extra i915_add_request() calls to various
places. Some of these do not need the implicit cache flush done as part of the
standard batch buffer submission process.

This patch adds a flag to _add_request() to specify whether the flush is
required or not.
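
Callers then pick between two variants (the macro definitions are in the hunk
below):

	i915_add_request(ring);			/* flush caches, then emit breadcrumb */
	i915_add_request_no_flush(ring);	/* emit breadcrumb only */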

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h              |    7 +++++--
 drivers/gpu/drm/i915/i915_gem.c              |   17 ++++++++++-------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
 5 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cc2c45c..f5a733b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2863,9 +2863,12 @@ int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 void __i915_add_request(struct intel_engine_cs *ring,
 			struct drm_file *file,
-			struct drm_i915_gem_object *batch_obj);
+			struct drm_i915_gem_object *batch_obj,
+			bool flush_caches);
 #define i915_add_request(ring) \
-	__i915_add_request(ring, NULL, NULL)
+	__i915_add_request(ring, NULL, NULL, true)
+#define i915_add_request_no_flush(ring) \
+	__i915_add_request(ring, NULL, NULL, false)
 int __i915_wait_request(struct drm_i915_gem_request *req,
 			unsigned reset_counter,
 			bool interruptible,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ba2e7f7..458b54e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2470,7 +2470,8 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
  */
 void __i915_add_request(struct intel_engine_cs *ring,
 			struct drm_file *file,
-			struct drm_i915_gem_object *obj)
+			struct drm_i915_gem_object *obj,
+			bool flush_caches)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct drm_i915_gem_request *request;
@@ -2502,12 +2503,14 @@ void __i915_add_request(struct intel_engine_cs *ring,
 	 * is that the flush _must_ happen before the next request, no matter
 	 * what.
 	 */
-	if (i915.enable_execlists)
-		ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
-	else
-		ret = intel_ring_flush_all_caches(ring);
-	/* Not allowed to fail! */
-	WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
+	if (flush_caches) {
+		if (i915.enable_execlists)
+			ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
+		else
+			ret = intel_ring_flush_all_caches(ring);
+		/* Not allowed to fail! */
+		WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
+	}
 
 	/* Record the position of the start of the request so that
 	 * should we detect the updated seqno part-way through the
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a6532db..e27f47f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1058,7 +1058,7 @@ i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
 	params->ring->gpu_caches_dirty = true;
 
 	/* Add a breadcrumb for the completion of the batch buffer */
-	__i915_add_request(params->ring, params->file, params->batch_obj);
+	__i915_add_request(params->ring, params->file, params->batch_obj, true);
 }
 
 static int
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index ce4788f..4418616 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
 
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
 
-	__i915_add_request(ring, NULL, so.obj);
+	__i915_add_request(ring, NULL, so.obj, true);
 	/* __i915_add_request moves object to inactive if it fails */
 out:
 	i915_gem_render_state_fini(&so);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6c0b16f..00bb335 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1371,7 +1371,7 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
 
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
 
-	__i915_add_request(ring, file, so.obj);
+	__i915_add_request(ring, file, so.obj, true);
 	/* intel_logical_ring_add_request moves object to inactive if it
 	 * fails */
 out:
-- 
1.7.9.5


* [PATCH 14/55] drm/i915: Update i915_gpu_idle() to manage its own request
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (12 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 13/55] drm/i915: Add flag to i915_add_request() to skip the cache flush John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 15/55] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring John.C.Harrison
                   ` (42 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Added explicit request creation and submission to the GPU idle code path.
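
The request lifecycle in the idle path, condensed from the hunk below, follows
the alloc / use / cancel-or-submit pattern used throughout this series:

	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
	if (ret)
		return ret;

	ret = i915_switch_context(req->ring, ring->default_context);
	if (ret) {
		i915_gem_request_cancel(req);	/* error: hand it back */
		return ret;
	}

	i915_add_request_no_flush(req->ring);	/* success: submit it */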

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 458b54e..bbd00e1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3309,11 +3309,23 @@ int i915_gpu_idle(struct drm_device *dev)
 	/* Flush everything onto the inactive list. */
 	for_each_ring(ring, dev_priv, i) {
 		if (!i915.enable_execlists) {
-			ret = i915_switch_context(ring, ring->default_context);
+			struct drm_i915_gem_request *req;
+
+			ret = i915_gem_request_alloc(ring, ring->default_context, &req);
 			if (ret)
 				return ret;
+
+			ret = i915_switch_context(req->ring, ring->default_context);
+			if (ret) {
+				i915_gem_request_cancel(req);
+				return ret;
+			}
+
+			i915_add_request_no_flush(req->ring);
 		}
 
+		WARN_ON(ring->outstanding_lazy_request);
+
 		ret = intel_ring_idle(ring);
 		if (ret)
 			return ret;
-- 
1.7.9.5


* [PATCH 15/55] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (13 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 14/55] drm/i915: Update i915_gpu_idle() to manage its own request John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-06-18 12:11   ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 16/55] drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable() John.C.Harrison
                   ` (41 subsequent siblings)
  56 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The i915_gem_init_hw() function calls a bunch of smaller initialisation
functions, several of which have both generic and per-ring sections. This means
multiple passes are made over the rings. Each pass writes data to the ring, and
that data floats around in the ring's OLR until some arbitrary point in the
future when an add_request() is done by some other, unrelated piece of code.

This patch breaks i915_ppgtt_init_hw() in two, with the per-ring initialisation
now being done in i915_ppgtt_init_ring(). The ring looping is now done at the
top level in i915_gem_init_hw().
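
The top level then takes roughly this shape (a condensed sketch of
i915_gem_init_hw() as of this patch, error cleanup trimmed):

	/* Generic, device-wide part runs once... */
	ret = i915_ppgtt_init_hw(dev);
	if (ret) {
		DRM_ERROR("PPGTT enable HW failed %d\n", ret);
		goto out;
	}

	/* ...while the per-ring part moves into the ring loop. */
	for_each_ring(ring, dev_priv, i) {
		ret = i915_ppgtt_init_ring(ring);
		if (ret && ret != -EIO)
			goto out;
	}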

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c     |   25 +++++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem_gtt.c |   28 +++++++++++++++-------------
 drivers/gpu/drm/i915/i915_gem_gtt.h |    1 +
 3 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bbd00e1..e6eb31d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5056,19 +5056,32 @@ i915_gem_init_hw(struct drm_device *dev)
 	 */
 	init_unused_rings(dev);
 
+	ret = i915_ppgtt_init_hw(dev);
+	if (ret) {
+		DRM_ERROR("PPGTT enable HW failed %d\n", ret);
+		goto out;
+	}
+
+	/* Need to do basic initialisation of all rings first: */
 	for_each_ring(ring, dev_priv, i) {
 		ret = ring->init_hw(ring);
 		if (ret)
 			goto out;
 	}
 
-	for (i = 0; i < NUM_L3_SLICES(dev); i++)
-		i915_gem_l3_remap(&dev_priv->ring[RCS], i);
+	/* Now it is safe to go back round and do everything else: */
+	for_each_ring(ring, dev_priv, i) {
+		if (ring->id == RCS) {
+			for (i = 0; i < NUM_L3_SLICES(dev); i++)
+				i915_gem_l3_remap(ring, i);
+		}
 
-	ret = i915_ppgtt_init_hw(dev);
-	if (ret && ret != -EIO) {
-		DRM_ERROR("PPGTT enable failed %d\n", ret);
-		i915_gem_cleanup_ringbuffer(dev);
+		ret = i915_ppgtt_init_ring(ring);
+		if (ret && ret != -EIO) {
+			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
+			i915_gem_cleanup_ringbuffer(dev);
+			goto out;
+		}
 	}
 
 	ret = i915_gem_context_enable(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 17b7df0..b14ae63 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1543,11 +1543,6 @@ int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 
 int i915_ppgtt_init_hw(struct drm_device *dev)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_engine_cs *ring;
-	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
-	int i, ret = 0;
-
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -1566,16 +1561,23 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 	else
 		MISSING_CASE(INTEL_INFO(dev)->gen);
 
-	if (ppgtt) {
-		for_each_ring(ring, dev_priv, i) {
-			ret = ppgtt->switch_mm(ppgtt, ring);
-			if (ret != 0)
-				return ret;
-		}
-	}
+	return 0;
+}
 
-	return ret;
+int i915_ppgtt_init_ring(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
+
+	if (i915.enable_execlists)
+		return 0;
+
+	if (!ppgtt)
+		return 0;
+
+	return ppgtt->switch_mm(ppgtt, ring);
 }
+
 struct i915_hw_ppgtt *
 i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 0d46dd2..0caa9eb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -475,6 +475,7 @@ void i915_global_gtt_cleanup(struct drm_device *dev);
 
 int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt);
 int i915_ppgtt_init_hw(struct drm_device *dev);
+int i915_ppgtt_init_ring(struct intel_engine_cs *ring);
 void i915_ppgtt_release(struct kref *kref);
 struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_device *dev,
 					struct drm_i915_file_private *fpriv);
-- 
1.7.9.5


* [PATCH 16/55] drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable()
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (14 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 15/55] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 17/55] drm/i915: Don't tag kernel batches as user batches John.C.Harrison
                   ` (40 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The start of day context initialisation code in i915_gem_context_enable() loops
over each ring and calls the legacy switch context or the execlist init context
code as appropriate.

This patch moves the ring loop out of that function into the top level caller,
i915_gem_init_hw(). This means that a single pass can be made over all the
rings, doing the PPGTT, L3 remap and context initialisation of each ring
together.
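
After this change the caller drives the loop and the enable function handles
exactly one ring at a time (sketch from the hunks below):

	for_each_ring(ring, dev_priv, i) {
		ret = i915_gem_context_enable(ring);
		if (ret && ret != -EIO) {
			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
			goto out;
		}
	}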

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    2 +-
 drivers/gpu/drm/i915/i915_gem.c         |   17 +++++++++-------
 drivers/gpu/drm/i915/i915_gem_context.c |   32 +++++++++++--------------------
 3 files changed, 22 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f5a733b..69c8f56 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3012,7 +3012,7 @@ int __must_check i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
 void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
-int i915_gem_context_enable(struct drm_i915_private *dev_priv);
+int i915_gem_context_enable(struct intel_engine_cs *ring);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
 int i915_switch_context(struct intel_engine_cs *ring,
 			struct intel_context *to);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e6eb31d..a2712a6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5056,6 +5056,8 @@ i915_gem_init_hw(struct drm_device *dev)
 	 */
 	init_unused_rings(dev);
 
+	BUG_ON(!dev_priv->ring[RCS].default_context);
+
 	ret = i915_ppgtt_init_hw(dev);
 	if (ret) {
 		DRM_ERROR("PPGTT enable HW failed %d\n", ret);
@@ -5071,6 +5073,8 @@ i915_gem_init_hw(struct drm_device *dev)
 
 	/* Now it is safe to go back round and do everything else: */
 	for_each_ring(ring, dev_priv, i) {
+		WARN_ON(!ring->default_context);
+
 		if (ring->id == RCS) {
 			for (i = 0; i < NUM_L3_SLICES(dev); i++)
 				i915_gem_l3_remap(ring, i);
@@ -5082,14 +5086,13 @@ i915_gem_init_hw(struct drm_device *dev)
 			i915_gem_cleanup_ringbuffer(dev);
 			goto out;
 		}
-	}
 
-	ret = i915_gem_context_enable(dev_priv);
-	if (ret && ret != -EIO) {
-		DRM_ERROR("Context enable failed %d\n", ret);
-		i915_gem_cleanup_ringbuffer(dev);
-
-		goto out;
+		ret = i915_gem_context_enable(ring);
+		if (ret && ret != -EIO) {
+			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
+			i915_gem_cleanup_ringbuffer(dev);
+			goto out;
+		}
 	}
 
 out:
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 8867818..2542bbc 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -409,32 +409,22 @@ void i915_gem_context_fini(struct drm_device *dev)
 	i915_gem_context_unreference(dctx);
 }
 
-int i915_gem_context_enable(struct drm_i915_private *dev_priv)
+int i915_gem_context_enable(struct intel_engine_cs *ring)
 {
-	struct intel_engine_cs *ring;
-	int ret, i;
-
-	BUG_ON(!dev_priv->ring[RCS].default_context);
+	int ret;
 
 	if (i915.enable_execlists) {
-		for_each_ring(ring, dev_priv, i) {
-			if (ring->init_context) {
-				ret = ring->init_context(ring,
-						ring->default_context);
-				if (ret) {
-					DRM_ERROR("ring init context: %d\n",
-							ret);
-					return ret;
-				}
-			}
-		}
+		if (ring->init_context == NULL)
+			return 0;
 
+		ret = ring->init_context(ring, ring->default_context);
 	} else
-		for_each_ring(ring, dev_priv, i) {
-			ret = i915_switch_context(ring, ring->default_context);
-			if (ret)
-				return ret;
-		}
+		ret = i915_switch_context(ring, ring->default_context);
+
+	if (ret) {
+		DRM_ERROR("ring init context: %d\n", ret);
+		return ret;
+	}
 
 	return 0;
 }
-- 
1.7.9.5


* [PATCH 17/55] drm/i915: Don't tag kernel batches as user batches
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (15 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 16/55] drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable() John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 18/55] drm/i915: Add explicit request management to i915_gem_init_hw() John.C.Harrison
                   ` (39 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The render state initialisation code does an explicit i915_add_request() call to
commit the init commands. It was passing in the initialisation batch buffer to
add_request() as the batch object parameter. However, the batch object entry in
the request structure (which is all that parameter is used for) is meant for
keeping track of user-generated batch buffers for blame tagging during GPU
hangs.

This patch clears the batch object parameter so that kernel-generated batch
buffers are not tagged as being user-generated.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 4418616..a32a4b9 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
 
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
 
-	__i915_add_request(ring, NULL, so.obj, true);
+	__i915_add_request(ring, NULL, NULL, true);
 	/* __i915_add_request moves object to inactive if it fails */
 out:
 	i915_gem_render_state_fini(&so);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 00bb335..c744362 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1371,7 +1371,7 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
 
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
 
-	__i915_add_request(ring, file, so.obj, true);
+	__i915_add_request(ring, file, NULL, true);
 	/* intel_logical_ring_add_request moves object to inactive if it
 	 * fails */
 out:
-- 
1.7.9.5


* [PATCH 18/55] drm/i915: Add explicit request management to i915_gem_init_hw()
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (16 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 17/55] drm/i915: Don't tag kernel batches as user batches John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-06-02 18:20   ` Tomas Elf
  2015-05-29 16:43 ` [PATCH 19/55] drm/i915: Update ppgtt_init_ring() & context_enable() to take requests John.C.Harrison
                   ` (38 subsequent siblings)
  56 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Now that a single per-ring loop is being done for all the different
initialisation steps in i915_gem_init_hw(), it is possible to add proper request
management as well. The last remaining issue is that the context enable call
eventually ends up within *_render_state_init() and this does its own private
__i915_add_request() call.

This patch adds explicit request creation and submission to the top level loop
and removes the add_request() from deep within the sub-functions.

v2: Updated for removal of batch_obj from add_request call in previous patch.
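
Each ring's initialisation is now bracketed by an explicit request (condensed
from the hunk below):

	for_each_ring(ring, dev_priv, i) {
		struct drm_i915_gem_request *req;

		ret = i915_gem_request_alloc(ring, ring->default_context, &req);
		if (ret)
			goto out;

		/* ... per-ring PPGTT / L3 / context init, cancelling the
		 * request and bailing on error ... */

		i915_add_request_no_flush(ring);	/* commit the lot */
	}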

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h              |    3 ++-
 drivers/gpu/drm/i915/i915_gem.c              |   12 ++++++++++++
 drivers/gpu/drm/i915/i915_gem_render_state.c |    2 --
 drivers/gpu/drm/i915/intel_lrc.c             |    5 -----
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 69c8f56..21045e7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2154,7 +2154,8 @@ struct drm_i915_gem_request {
 	struct intel_context *ctx;
 	struct intel_ringbuffer *ringbuf;
 
-	/** Batch buffer related to this request if any */
+	/** Batch buffer related to this request if any (used for
+	    error state dump only) */
 	struct drm_i915_gem_object *batch_obj;
 
 	/** Time at which this request was emitted, in jiffies. */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a2712a6..1960e30 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5073,8 +5073,16 @@ i915_gem_init_hw(struct drm_device *dev)
 
 	/* Now it is safe to go back round and do everything else: */
 	for_each_ring(ring, dev_priv, i) {
+		struct drm_i915_gem_request *req;
+
 		WARN_ON(!ring->default_context);
 
+		ret = i915_gem_request_alloc(ring, ring->default_context, &req);
+		if (ret) {
+			i915_gem_cleanup_ringbuffer(dev);
+			goto out;
+		}
+
 		if (ring->id == RCS) {
 			for (i = 0; i < NUM_L3_SLICES(dev); i++)
 				i915_gem_l3_remap(ring, i);
@@ -5083,6 +5091,7 @@ i915_gem_init_hw(struct drm_device *dev)
 		ret = i915_ppgtt_init_ring(ring);
 		if (ret && ret != -EIO) {
 			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
+			i915_gem_request_cancel(req);
 			i915_gem_cleanup_ringbuffer(dev);
 			goto out;
 		}
@@ -5090,9 +5099,12 @@ i915_gem_init_hw(struct drm_device *dev)
 		ret = i915_gem_context_enable(ring);
 		if (ret && ret != -EIO) {
 			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
+			i915_gem_request_cancel(req);
 			i915_gem_cleanup_ringbuffer(dev);
 			goto out;
 		}
+
+		i915_add_request_no_flush(ring);
 	}
 
 out:
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index a32a4b9..a07b4ee 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -173,8 +173,6 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
 
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
 
-	__i915_add_request(ring, NULL, NULL, true);
-	/* __i915_add_request moves object to inactive if it fails */
 out:
 	i915_gem_render_state_fini(&so);
 	return ret;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c744362..37efa93 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1351,8 +1351,6 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
 {
 	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
 	struct render_state so;
-	struct drm_i915_file_private *file_priv = ctx->file_priv;
-	struct drm_file *file = file_priv ? file_priv->file : NULL;
 	int ret;
 
 	ret = i915_gem_render_state_prepare(ring, &so);
@@ -1371,9 +1369,6 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
 
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
 
-	__i915_add_request(ring, file, NULL, true);
-	/* intel_logical_ring_add_request moves object to inactive if it
-	 * fails */
 out:
 	i915_gem_render_state_fini(&so);
 	return ret;
-- 
1.7.9.5


* [PATCH 19/55] drm/i915: Update ppgtt_init_ring() & context_enable() to take requests
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (17 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 18/55] drm/i915: Add explicit request management to i915_gem_init_hw() John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 20/55] drm/i915: Update i915_switch_context() to take a request structure John.C.Harrison
                   ` (37 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The final step in removing the OLR from i915_gem_init_hw() is to pass the newly
allocated request structure in to each step rather than passing a ring
structure. This patch updates both i915_ppgtt_init_ring() and
i915_gem_context_enable() to take request pointers.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    2 +-
 drivers/gpu/drm/i915/i915_gem.c         |    4 ++--
 drivers/gpu/drm/i915/i915_gem_context.c |    3 ++-
 drivers/gpu/drm/i915/i915_gem_gtt.c     |    6 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.h     |    2 +-
 5 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 21045e7..63f580c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3013,7 +3013,7 @@ int __must_check i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
 void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
-int i915_gem_context_enable(struct intel_engine_cs *ring);
+int i915_gem_context_enable(struct drm_i915_gem_request *req);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
 int i915_switch_context(struct intel_engine_cs *ring,
 			struct intel_context *to);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1960e30..77b5eef 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5088,7 +5088,7 @@ i915_gem_init_hw(struct drm_device *dev)
 				i915_gem_l3_remap(ring, i);
 		}
 
-		ret = i915_ppgtt_init_ring(ring);
+		ret = i915_ppgtt_init_ring(req);
 		if (ret && ret != -EIO) {
 			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
 			i915_gem_request_cancel(req);
@@ -5096,7 +5096,7 @@ i915_gem_init_hw(struct drm_device *dev)
 			goto out;
 		}
 
-		ret = i915_gem_context_enable(ring);
+		ret = i915_gem_context_enable(req);
 		if (ret && ret != -EIO) {
 			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
 			i915_gem_request_cancel(req);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2542bbc..e934dc6 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -409,8 +409,9 @@ void i915_gem_context_fini(struct drm_device *dev)
 	i915_gem_context_unreference(dctx);
 }
 
-int i915_gem_context_enable(struct intel_engine_cs *ring)
+int i915_gem_context_enable(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	if (i915.enable_execlists) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index b14ae63..73d6f9e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1564,9 +1564,9 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 	return 0;
 }
 
-int i915_ppgtt_init_ring(struct intel_engine_cs *ring)
+int i915_ppgtt_init_ring(struct drm_i915_gem_request *req)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_private *dev_priv = req->ring->dev->dev_private;
 	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
 
 	if (i915.enable_execlists)
@@ -1575,7 +1575,7 @@ int i915_ppgtt_init_ring(struct intel_engine_cs *ring)
 	if (!ppgtt)
 		return 0;
 
-	return ppgtt->switch_mm(ppgtt, ring);
+	return ppgtt->switch_mm(ppgtt, req->ring);
 }
 
 struct i915_hw_ppgtt *
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 0caa9eb..75dfa05 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -475,7 +475,7 @@ void i915_global_gtt_cleanup(struct drm_device *dev);
 
 int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt);
 int i915_ppgtt_init_hw(struct drm_device *dev);
-int i915_ppgtt_init_ring(struct intel_engine_cs *ring);
+int i915_ppgtt_init_ring(struct drm_i915_gem_request *req);
 void i915_ppgtt_release(struct kref *kref);
 struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_device *dev,
 					struct drm_i915_file_private *fpriv);
-- 
1.7.9.5


* [PATCH 20/55] drm/i915: Update i915_switch_context() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (18 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 19/55] drm/i915: Update ppgtt_init_ring() & context_enable() to take requests John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 21/55] drm/i915: Update do_switch() " John.C.Harrison
                   ` (36 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Now that the request is guaranteed to specify the context, it is possible to
update the context switch code to use requests rather than ring and context
pairs. This patch updates i915_switch_context() accordingly.

Also removed the warning that the request's context must match the last context
switch's context. As the context switch now gets the context object from the
request structure, there is no longer any scope for the two to become out of
step.
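
To illustrate the new calling convention (a sketch, not part of the diff
below - the ring and context are simply recovered from the request):

    struct intel_engine_cs *ring = req->ring;

    /* before: caller had to keep the ring and context pair in step */
    ret = i915_switch_context(ring, req->ctx);

    /* after: the request names both, so they cannot diverge */
    ret = i915_switch_context(req);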

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |    3 +--
 drivers/gpu/drm/i915/i915_gem.c            |    4 +---
 drivers/gpu/drm/i915/i915_gem_context.c    |   19 +++++++++----------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
 4 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 63f580c..64a10fa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3015,8 +3015,7 @@ void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
 int i915_gem_context_enable(struct drm_i915_gem_request *req);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
-int i915_switch_context(struct intel_engine_cs *ring,
-			struct intel_context *to);
+int i915_switch_context(struct drm_i915_gem_request *req);
 struct intel_context *
 i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id);
 void i915_gem_context_free(struct kref *ctx_ref);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 77b5eef..b7d66aa 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2539,8 +2539,6 @@ void __i915_add_request(struct intel_engine_cs *ring,
 	 */
 	request->batch_obj = obj;
 
-	WARN_ON(!i915.enable_execlists && (request->ctx != ring->last_context));
-
 	request->emitted_jiffies = jiffies;
 	list_add_tail(&request->list, &ring->request_list);
 	request->file_priv = NULL;
@@ -3315,7 +3313,7 @@ int i915_gpu_idle(struct drm_device *dev)
 			if (ret)
 				return ret;
 
-			ret = i915_switch_context(req->ring, ring->default_context);
+			ret = i915_switch_context(req);
 			if (ret) {
 				i915_gem_request_cancel(req);
 				return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e934dc6..bae8552 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -420,7 +420,7 @@ int i915_gem_context_enable(struct drm_i915_gem_request *req)
 
 		ret = ring->init_context(ring, ring->default_context);
 	} else
-		ret = i915_switch_context(ring, ring->default_context);
+		ret = i915_switch_context(req);
 
 	if (ret) {
 		DRM_ERROR("ring init context: %d\n", ret);
@@ -775,8 +775,7 @@ unpin_out:
 
 /**
  * i915_switch_context() - perform a GPU context switch.
- * @ring: ring for which we'll execute the context switch
- * @to: the context to switch to
+ * @req: request for which we'll execute the context switch
  *
  * The context life cycle is simple. The context refcount is incremented and
  * decremented by 1 and create and destroy. If the context is in use by the GPU,
@@ -787,25 +786,25 @@ unpin_out:
  * switched by writing to the ELSP and requests keep a reference to their
  * context.
  */
-int i915_switch_context(struct intel_engine_cs *ring,
-			struct intel_context *to)
+int i915_switch_context(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
 	WARN_ON(i915.enable_execlists);
 	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
-	if (to->legacy_hw_ctx.rcs_state == NULL) { /* We have the fake context */
-		if (to != ring->last_context) {
-			i915_gem_context_reference(to);
+	if (req->ctx->legacy_hw_ctx.rcs_state == NULL) { /* We have the fake context */
+		if (req->ctx != ring->last_context) {
+			i915_gem_context_reference(req->ctx);
 			if (ring->last_context)
 				i915_gem_context_unreference(ring->last_context);
-			ring->last_context = to;
+			ring->last_context = req->ctx;
 		}
 		return 0;
 	}
 
-	return do_switch(ring, to);
+	return do_switch(req->ring, req->ctx);
 }
 
 static bool contexts_enabled(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e27f47f..50b1ced 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1242,7 +1242,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 	if (ret)
 		goto error;
 
-	ret = i915_switch_context(ring, params->ctx);
+	ret = i915_switch_context(params->request);
 	if (ret)
 		goto error;
 
-- 
1.7.9.5


* [PATCH 21/55] drm/i915: Update do_switch() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (19 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 20/55] drm/i915: Update i915_switch_context() to take a request structure John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 22/55] drm/i915: Update deferred context creation to do explicit request management John.C.Harrison
                   ` (35 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated do_switch() to take a request pointer instead of a ring/context pair.

v2: Removed some overzealous req-> dereferencing.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index bae8552..99606f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -614,9 +614,10 @@ needs_pd_load_post(struct intel_engine_cs *ring, struct intel_context *to,
 	return false;
 }
 
-static int do_switch(struct intel_engine_cs *ring,
-		     struct intel_context *to)
+static int do_switch(struct drm_i915_gem_request *req)
 {
+	struct intel_context *to = req->ctx;
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_context *from = ring->last_context;
 	u32 hw_flags = 0;
@@ -804,7 +805,7 @@ int i915_switch_context(struct drm_i915_gem_request *req)
 		return 0;
 	}
 
-	return do_switch(req->ring, req->ctx);
+	return do_switch(req);
 }
 
 static bool contexts_enabled(struct drm_device *dev)
-- 
1.7.9.5


* [PATCH 22/55] drm/i915: Update deferred context creation to do explicit request management
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (20 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 21/55] drm/i915: Update do_switch() " John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-06-02 18:22   ` Tomas Elf
  2015-05-29 16:43 ` [PATCH 23/55] drm/i915: Update init_context() to take a request structure John.C.Harrison
                   ` (34 subsequent siblings)
  56 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

In execlist mode, context initialisation is deferred until first use of the
given context. This is because execlist mode has per-ring context state, and thus
many more context storage objects than legacy mode, many of which are never actually
used. Previously, the initialisation commands were written to the ring and
tagged with some random request structure via the OLR. This seemed to be causing
a null pointer dereference bug under certain circumstances (BZ:88865).

This patch adds explicit request creation and submission to the deferred
initialisation code path. Thus removing any reliance on or randomness caused by
the OLR.

Note that it should be possible to move the deferred context creation even later -
to the point when the context is actually switched to rather than when it is merely
validated. This would allow the initialisation to be done within the request of
the work that is wanting to use the context. Hence, the extra request that is
created, used and retired just for the context init could be removed completely.
However, this is left for a follow-up patch.
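
In outline, the deferred initialisation path now follows the explicit request
lifecycle used throughout this series (sketch only - the exact change is in
the diff below):

    struct drm_i915_gem_request *req;
    int ret;

    ret = i915_gem_request_alloc(ring, ctx, &req);
    if (ret)
        return ret;

    ret = ring->init_context(req->ring, ctx);
    if (ret) {
        /* on failure the request must be cancelled, not leaked */
        i915_gem_request_cancel(req);
        return ret;
    }

    /* on success the request is submitted immediately */
    i915_add_request_no_flush(req->ring);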

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 37efa93..2730efd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1971,13 +1971,22 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
 		lrc_setup_hardware_status_page(ring, ctx_obj);
 	else if (ring->id == RCS && !ctx->rcs_initialized) {
 		if (ring->init_context) {
-			ret = ring->init_context(ring, ctx);
+			struct drm_i915_gem_request *req;
+
+			ret = i915_gem_request_alloc(ring, ctx, &req);
+			if (ret)
+				return ret;
+
+			ret = ring->init_context(req->ring, ctx);
 			if (ret) {
 				DRM_ERROR("ring init context: %d\n", ret);
+				i915_gem_request_cancel(req);
 				ctx->engine[ring->id].ringbuf = NULL;
 				ctx->engine[ring->id].state = NULL;
 				goto error;
 			}
+
+			i915_add_request_no_flush(req->ring);
 		}
 
 		ctx->rcs_initialized = true;
-- 
1.7.9.5


* [PATCH 23/55] drm/i915: Update init_context() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (21 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 22/55] drm/i915: Update deferred context creation to do explicit request management John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 24/55] drm/i915: Update render_state_init() " John.C.Harrison
                   ` (33 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Now that everything above has been converted to use requests, it is possible to
update init_context() to take a request pointer instead of a ring/context pair.
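
That is, the init_context() vfunc prototype changes from a ring/context pair
to a single request (a summary of the diff below):

    /* before */
    int (*init_context)(struct intel_engine_cs *ring,
                        struct intel_context *ctx);

    /* after */
    int (*init_context)(struct drm_i915_gem_request *req);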

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c |    4 ++--
 drivers/gpu/drm/i915/intel_lrc.c        |    9 ++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.c |    7 +++----
 drivers/gpu/drm/i915/intel_ringbuffer.h |    3 +--
 4 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 99606f8..c6640d4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -418,7 +418,7 @@ int i915_gem_context_enable(struct drm_i915_gem_request *req)
 		if (ring->init_context == NULL)
 			return 0;
 
-		ret = ring->init_context(ring, ring->default_context);
+		ret = ring->init_context(req);
 	} else
 		ret = i915_switch_context(req);
 
@@ -760,7 +760,7 @@ done:
 
 	if (uninitialized) {
 		if (ring->init_context) {
-			ret = ring->init_context(ring, to);
+			ret = ring->init_context(req);
 			if (ret)
 				DRM_ERROR("ring init context: %d\n", ret);
 		}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 2730efd..cc93e12 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1374,16 +1374,15 @@ out:
 	return ret;
 }
 
-static int gen8_init_rcs_context(struct intel_engine_cs *ring,
-		       struct intel_context *ctx)
+static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 {
 	int ret;
 
-	ret = intel_logical_ring_workarounds_emit(ring, ctx);
+	ret = intel_logical_ring_workarounds_emit(req->ring, req->ctx);
 	if (ret)
 		return ret;
 
-	return intel_lr_context_render_state_init(ring, ctx);
+	return intel_lr_context_render_state_init(req->ring, req->ctx);
 }
 
 /**
@@ -1977,7 +1976,7 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
 			if (ret)
 				return ret;
 
-			ret = ring->init_context(req->ring, ctx);
+			ret = ring->init_context(req);
 			if (ret) {
 				DRM_ERROR("ring init context: %d\n", ret);
 				i915_gem_request_cancel(req);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d766c1d..0d8f89c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -742,16 +742,15 @@ static int intel_ring_workarounds_emit(struct intel_engine_cs *ring,
 	return 0;
 }
 
-static int intel_rcs_ctx_init(struct intel_engine_cs *ring,
-			      struct intel_context *ctx)
+static int intel_rcs_ctx_init(struct drm_i915_gem_request *req)
 {
 	int ret;
 
-	ret = intel_ring_workarounds_emit(ring, ctx);
+	ret = intel_ring_workarounds_emit(req->ring, req->ctx);
 	if (ret != 0)
 		return ret;
 
-	ret = i915_gem_render_state_init(ring);
+	ret = i915_gem_render_state_init(req->ring);
 	if (ret)
 		DRM_ERROR("init render state: %d\n", ret);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 39f795c..8450add 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -154,8 +154,7 @@ struct  intel_engine_cs {
 
 	int		(*init_hw)(struct intel_engine_cs *ring);
 
-	int		(*init_context)(struct intel_engine_cs *ring,
-					struct intel_context *ctx);
+	int		(*init_context)(struct drm_i915_gem_request *req);
 
 	void		(*write_tail)(struct intel_engine_cs *ring,
 				      u32 value);
-- 
1.7.9.5


* [PATCH 24/55] drm/i915: Update render_state_init() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (22 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 23/55] drm/i915: Update init_context() to take a request structure John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 25/55] drm/i915: Update i915_gem_object_sync() " John.C.Harrison
                   ` (32 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the two render_state_init() functions to take a request pointer instead
of a ring. This removes their reliance on the OLR.

v2: Rebased to newer tree.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_render_state.c |   14 +++++++-------
 drivers/gpu/drm/i915/i915_gem_render_state.h |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c             |   18 ++++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.c      |    2 +-
 4 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index a07b4ee..6598f9b 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -152,26 +152,26 @@ int i915_gem_render_state_prepare(struct intel_engine_cs *ring,
 	return 0;
 }
 
-int i915_gem_render_state_init(struct intel_engine_cs *ring)
+int i915_gem_render_state_init(struct drm_i915_gem_request *req)
 {
 	struct render_state so;
 	int ret;
 
-	ret = i915_gem_render_state_prepare(ring, &so);
+	ret = i915_gem_render_state_prepare(req->ring, &so);
 	if (ret)
 		return ret;
 
 	if (so.rodata == NULL)
 		return 0;
 
-	ret = ring->dispatch_execbuffer(ring,
-					so.ggtt_offset,
-					so.rodata->batch_items * 4,
-					I915_DISPATCH_SECURE);
+	ret = req->ring->dispatch_execbuffer(req->ring,
+					     so.ggtt_offset,
+					     so.rodata->batch_items * 4,
+					     I915_DISPATCH_SECURE);
 	if (ret)
 		goto out;
 
-	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
+	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req->ring);
 
 out:
 	i915_gem_render_state_fini(&so);
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.h b/drivers/gpu/drm/i915/i915_gem_render_state.h
index c44961e..7aa7372 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.h
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.h
@@ -39,7 +39,7 @@ struct render_state {
 	int gen;
 };
 
-int i915_gem_render_state_init(struct intel_engine_cs *ring);
+int i915_gem_render_state_init(struct drm_i915_gem_request *req);
 void i915_gem_render_state_fini(struct render_state *so);
 int i915_gem_render_state_prepare(struct intel_engine_cs *ring,
 				  struct render_state *so);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index cc93e12..6d005b1 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1346,28 +1346,26 @@ static int gen8_emit_request(struct intel_ringbuffer *ringbuf,
 	return 0;
 }
 
-static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
-					      struct intel_context *ctx)
+static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req)
 {
-	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
 	struct render_state so;
 	int ret;
 
-	ret = i915_gem_render_state_prepare(ring, &so);
+	ret = i915_gem_render_state_prepare(req->ring, &so);
 	if (ret)
 		return ret;
 
 	if (so.rodata == NULL)
 		return 0;
 
-	ret = ring->emit_bb_start(ringbuf,
-			ctx,
-			so.ggtt_offset,
-			I915_DISPATCH_SECURE);
+	ret = req->ring->emit_bb_start(req->ringbuf,
+				       req->ctx,
+				       so.ggtt_offset,
+				       I915_DISPATCH_SECURE);
 	if (ret)
 		goto out;
 
-	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
+	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req->ring);
 
 out:
 	i915_gem_render_state_fini(&so);
@@ -1382,7 +1380,7 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 	if (ret)
 		return ret;
 
-	return intel_lr_context_render_state_init(req->ring, req->ctx);
+	return intel_lr_context_render_state_init(req);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0d8f89c..72735a0a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -750,7 +750,7 @@ static int intel_rcs_ctx_init(struct drm_i915_gem_request *req)
 	if (ret != 0)
 		return ret;
 
-	ret = i915_gem_render_state_init(req->ring);
+	ret = i915_gem_render_state_init(req);
 	if (ret)
 		DRM_ERROR("init render state: %d\n", ret);
 
-- 
1.7.9.5


* [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (23 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 24/55] drm/i915: Update render_state_init() " John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-06-02 18:26   ` Tomas Elf
  2015-05-29 16:43 ` [PATCH 26/55] drm/i915: Update overlay code to do explicit request management John.C.Harrison
                   ` (31 subsequent siblings)
  56 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the i915_gem_object_sync()
code path.

v2: Much more complex patch to share a single request between the sync and the
page flip. The _sync() function now supports lazy allocation of the request
structure. That is, if one is passed in then that will be used. If one is not,
then a request will be allocated and passed back out. Note that the _sync() code
does not necessarily require a request. Thus one will only be created until
certain situations. The reason the lazy allocation must be done within the
_sync() code itself is because the decision to need one or not is not really
something that code above can second guess (except in the case where one is
definitely not required because no ring is passed in).

The call chains above _sync() now support passing a request through, with most
callers passing in NULL and assuming that no request will be required (because
they also pass in NULL for the ring and therefore can't be generating any ring
code).

The exception is intel_crtc_page_flip(), which now supports having a request
returned from _sync(). If one is, then that request is shared by the page flip
(if the page flip is of a type to need a request). If _sync() does not generate
a request but the page flip does need one, then the page flip path will create
its own request.

v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
Elf review request). Rebased onto newer tree that significantly changed the
synchronisation code.
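
For illustration, the intended caller pattern looks roughly like this (a
sketch, assuming the usual struct_mutex locking):

    struct drm_i915_gem_request *request = NULL;

    /* may allocate 'request' if semaphore commands need to be emitted */
    ret = i915_gem_object_sync(obj, ring, &request);
    if (ret)
        return ret;

    /* ... optionally add more work (e.g. the page flip) to 'request' ... */

    /* if _sync() allocated a request, the caller owns its submission */
    if (request)
        i915_add_request_no_flush(request->ring);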

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
 drivers/gpu/drm/i915/i915_gem.c            |   48 +++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
 drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
 drivers/gpu/drm/i915/intel_drv.h           |    3 +-
 drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
 drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
 8 files changed, 58 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 64a10fa..f69e9cb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
-			 struct intel_engine_cs *to);
+			 struct intel_engine_cs *to,
+			 struct drm_i915_gem_request **to_req);
 void i915_vma_move_to_active(struct i915_vma *vma,
 			     struct intel_engine_cs *ring);
 int i915_gem_dumb_create(struct drm_file *file_priv,
@@ -2889,6 +2890,7 @@ int __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
 				     struct intel_engine_cs *pipelined,
+				     struct drm_i915_gem_request **pipelined_request,
 				     const struct i915_ggtt_view *view);
 void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj,
 					      const struct i915_ggtt_view *view);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b7d66aa..db90043 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3098,25 +3098,26 @@ out:
 static int
 __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 		       struct intel_engine_cs *to,
-		       struct drm_i915_gem_request *req)
+		       struct drm_i915_gem_request *from_req,
+		       struct drm_i915_gem_request **to_req)
 {
 	struct intel_engine_cs *from;
 	int ret;
 
-	from = i915_gem_request_get_ring(req);
+	from = i915_gem_request_get_ring(from_req);
 	if (to == from)
 		return 0;
 
-	if (i915_gem_request_completed(req, true))
+	if (i915_gem_request_completed(from_req, true))
 		return 0;
 
-	ret = i915_gem_check_olr(req);
+	ret = i915_gem_check_olr(from_req);
 	if (ret)
 		return ret;
 
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
 		struct drm_i915_private *i915 = to_i915(obj->base.dev);
-		ret = __i915_wait_request(req,
+		ret = __i915_wait_request(from_req,
 					  atomic_read(&i915->gpu_error.reset_counter),
 					  i915->mm.interruptible,
 					  NULL,
@@ -3124,15 +3125,25 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 		if (ret)
 			return ret;
 
-		i915_gem_object_retire_request(obj, req);
+		i915_gem_object_retire_request(obj, from_req);
 	} else {
 		int idx = intel_ring_sync_index(from, to);
-		u32 seqno = i915_gem_request_get_seqno(req);
+		u32 seqno = i915_gem_request_get_seqno(from_req);
 
+		WARN_ON(!to_req);
+
+		/* Optimization: Avoid semaphore sync when we are sure we already
+		 * waited for an object with higher seqno */
 		if (seqno <= from->semaphore.sync_seqno[idx])
 			return 0;
 
-		trace_i915_gem_ring_sync_to(from, to, req);
+		if (*to_req == NULL) {
+			ret = i915_gem_request_alloc(to, to->default_context, to_req);
+			if (ret)
+				return ret;
+		}
+
+		trace_i915_gem_ring_sync_to(from, to, from_req);
 		ret = to->semaphore.sync_to(to, from, seqno);
 		if (ret)
 			return ret;
@@ -3153,6 +3164,9 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
  *
  * @obj: object which may be in use on another ring.
  * @to: ring we wish to use the object on. May be NULL.
+ * @to_req: request we wish to use the object for. See below.
+ *          This will be allocated and returned if a request is
+ *          required but not passed in.
  *
  * This code is meant to abstract object synchronization with the GPU.
  * Calling with NULL implies synchronizing the object with the CPU
@@ -3168,11 +3182,22 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
  * - If we are a write request (pending_write_domain is set), the new
  *   request must wait for outstanding read requests to complete.
  *
+ * For CPU synchronisation (NULL to) no request is required. For syncing with
+ * rings to_req must be non-NULL. However, a request does not have to be
+ * pre-allocated. If *to_req is null and sync commands will be emitted then a
+ * request will be allocated automatically and returned through *to_req. Note
+ * that it is not guaranteed that commands will be emitted (because the
+ * ring might already be idle). Hence there is no need to create a request that
+ * might never have any work submitted. Note further that if a request is
+ * returned in *to_req, it is the responsibility of the caller to submit
+ * that request (after potentially adding more work to it).
+ *
  * Returns 0 if successful, else propagates up the lower layer error.
  */
 int
 i915_gem_object_sync(struct drm_i915_gem_object *obj,
-		     struct intel_engine_cs *to)
+		     struct intel_engine_cs *to,
+		     struct drm_i915_gem_request **to_req)
 {
 	const bool readonly = obj->base.pending_write_domain == 0;
 	struct drm_i915_gem_request *req[I915_NUM_RINGS];
@@ -3194,7 +3219,7 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
 				req[n++] = obj->last_read_req[i];
 	}
 	for (i = 0; i < n; i++) {
-		ret = __i915_gem_object_sync(obj, to, req[i]);
+		ret = __i915_gem_object_sync(obj, to, req[i], to_req);
 		if (ret)
 			return ret;
 	}
@@ -4144,12 +4169,13 @@ int
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
 				     struct intel_engine_cs *pipelined,
+				     struct drm_i915_gem_request **pipelined_request,
 				     const struct i915_ggtt_view *view)
 {
 	u32 old_read_domains, old_write_domain;
 	int ret;
 
-	ret = i915_gem_object_sync(obj, pipelined);
+	ret = i915_gem_object_sync(obj, pipelined, pipelined_request);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 50b1ced..bea92ad 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -899,7 +899,7 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req,
 		struct drm_i915_gem_object *obj = vma->obj;
 
 		if (obj->active & other_rings) {
-			ret = i915_gem_object_sync(obj, req->ring);
+			ret = i915_gem_object_sync(obj, req->ring, &req);
 			if (ret)
 				return ret;
 		}
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 657a333..6528ada 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2338,7 +2338,8 @@ int
 intel_pin_and_fence_fb_obj(struct drm_plane *plane,
 			   struct drm_framebuffer *fb,
 			   const struct drm_plane_state *plane_state,
-			   struct intel_engine_cs *pipelined)
+			   struct intel_engine_cs *pipelined,
+			   struct drm_i915_gem_request **pipelined_request)
 {
 	struct drm_device *dev = fb->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -2403,7 +2404,7 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
 
 	dev_priv->mm.interruptible = false;
 	ret = i915_gem_object_pin_to_display_plane(obj, alignment, pipelined,
-						   &view);
+						   pipelined_request, &view);
 	if (ret)
 		goto err_interruptible;
 
@@ -11119,6 +11120,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	struct intel_unpin_work *work;
 	struct intel_engine_cs *ring;
 	bool mmio_flip;
+	struct drm_i915_gem_request *request = NULL;
 	int ret;
 
 	/*
@@ -11225,7 +11227,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	 */
 	ret = intel_pin_and_fence_fb_obj(crtc->primary, fb,
 					 crtc->primary->state,
-					 mmio_flip ? i915_gem_request_get_ring(obj->last_write_req) : ring);
+					 mmio_flip ? i915_gem_request_get_ring(obj->last_write_req) : ring, &request);
 	if (ret)
 		goto cleanup_pending;
 
@@ -11256,6 +11258,9 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 					intel_ring_get_request(ring));
 	}
 
+	if (request)
+		i915_add_request_no_flush(request->ring);
+
 	work->flip_queued_vblank = drm_crtc_vblank_count(crtc);
 	work->enable_stall_check = true;
 
@@ -11273,6 +11278,8 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 cleanup_unpin:
 	intel_unpin_fb_obj(fb, crtc->primary->state);
 cleanup_pending:
+	if (request)
+		i915_gem_request_cancel(request);
 	atomic_dec(&intel_crtc->unpin_work_count);
 	mutex_unlock(&dev->struct_mutex);
 cleanup:
@@ -13171,7 +13178,7 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 		if (ret)
 			DRM_DEBUG_KMS("failed to attach phys object\n");
 	} else {
-		ret = intel_pin_and_fence_fb_obj(plane, fb, new_state, NULL);
+		ret = intel_pin_and_fence_fb_obj(plane, fb, new_state, NULL, NULL);
 	}
 
 	if (ret == 0)
@@ -15218,7 +15225,7 @@ void intel_modeset_gem_init(struct drm_device *dev)
 		ret = intel_pin_and_fence_fb_obj(c->primary,
 						 c->primary->fb,
 						 c->primary->state,
-						 NULL);
+						 NULL, NULL);
 		mutex_unlock(&dev->struct_mutex);
 		if (ret) {
 			DRM_ERROR("failed to pin boot fb on pipe %d\n",
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 02d8317..73650ae 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1034,7 +1034,8 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
 int intel_pin_and_fence_fb_obj(struct drm_plane *plane,
 			       struct drm_framebuffer *fb,
 			       const struct drm_plane_state *plane_state,
-			       struct intel_engine_cs *pipelined);
+			       struct intel_engine_cs *pipelined,
+			       struct drm_i915_gem_request **pipelined_request);
 struct drm_framebuffer *
 __intel_framebuffer_create(struct drm_device *dev,
 			   struct drm_mode_fb_cmd2 *mode_cmd,
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 4e7e7da..dd9f3b2 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -151,7 +151,7 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 	}
 
 	/* Flush everything out, we'll be doing GTT only from now on */
-	ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL, NULL);
+	ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL, NULL, NULL);
 	if (ret) {
 		DRM_ERROR("failed to pin obj: %d\n", ret);
 		goto out_fb;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6d005b1..f8e8fdb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -638,7 +638,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
 		struct drm_i915_gem_object *obj = vma->obj;
 
 		if (obj->active & other_rings) {
-			ret = i915_gem_object_sync(obj, req->ring);
+			ret = i915_gem_object_sync(obj, req->ring, &req);
 			if (ret)
 				return ret;
 		}
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index e7534b9..0f8187a 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -724,7 +724,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 	if (ret != 0)
 		return ret;
 
-	ret = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL,
+	ret = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL, NULL,
 						   &i915_ggtt_view_normal);
 	if (ret != 0)
 		return ret;
-- 
1.7.9.5


* [PATCH 26/55] drm/i915: Update overlay code to do explicit request management
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (24 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 25/55] drm/i915: Update i915_gem_object_sync() " John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 27/55] drm/i915: Update queue_flip() to take a request structure John.C.Harrison
                   ` (30 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the overlay code path to do explicit request creation and submission
rather than relying on the OLR to do the right thing.
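
Each overlay operation now follows the same shape (sketch):

    struct drm_i915_gem_request *req;

    ret = i915_gem_request_alloc(ring, ring->default_context, &req);
    if (ret)
        return ret;

    ret = intel_ring_begin(ring, 4);
    if (ret) {
        /* a request that will never be submitted must be cancelled */
        i915_gem_request_cancel(req);
        return ret;
    }

    /* ... intel_ring_emit() the overlay commands ... */
    intel_ring_advance(ring);

    return intel_overlay_do_wait_request(overlay, req, NULL);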

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_overlay.c |   57 ++++++++++++++++++++++++----------
 1 file changed, 41 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 0f8187a..3adb63e 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -210,17 +210,14 @@ static void intel_overlay_unmap_regs(struct intel_overlay *overlay,
 }
 
 static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
+					 struct drm_i915_gem_request *req,
 					 void (*tail)(struct intel_overlay *))
 {
-	struct drm_device *dev = overlay->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_engine_cs *ring = &dev_priv->ring[RCS];
 	int ret;
 
 	WARN_ON(overlay->last_flip_req);
-	i915_gem_request_assign(&overlay->last_flip_req,
-					     ring->outstanding_lazy_request);
-	i915_add_request(ring);
+	i915_gem_request_assign(&overlay->last_flip_req, req);
+	i915_add_request(req->ring);
 
 	overlay->flip_tail = tail;
 	ret = i915_wait_request(overlay->last_flip_req);
@@ -237,15 +234,22 @@ static int intel_overlay_on(struct intel_overlay *overlay)
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring = &dev_priv->ring[RCS];
+	struct drm_i915_gem_request *req;
 	int ret;
 
 	WARN_ON(overlay->active);
 	WARN_ON(IS_I830(dev) && !(dev_priv->quirks & QUIRK_PIPEA_FORCE));
 
-	ret = intel_ring_begin(ring, 4);
+	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
 	if (ret)
 		return ret;
 
+	ret = intel_ring_begin(ring, 4);
+	if (ret) {
+		i915_gem_request_cancel(req);
+		return ret;
+	}
+
 	overlay->active = true;
 
 	intel_ring_emit(ring, MI_OVERLAY_FLIP | MI_OVERLAY_ON);
@@ -254,7 +258,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
 
-	return intel_overlay_do_wait_request(overlay, NULL);
+	return intel_overlay_do_wait_request(overlay, req, NULL);
 }
 
 /* overlay needs to be enabled in OCMD reg */
@@ -264,6 +268,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring = &dev_priv->ring[RCS];
+	struct drm_i915_gem_request *req;
 	u32 flip_addr = overlay->flip_addr;
 	u32 tmp;
 	int ret;
@@ -278,18 +283,23 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
 	if (tmp & (1 << 17))
 		DRM_DEBUG("overlay underrun, DOVSTA: %x\n", tmp);
 
-	ret = intel_ring_begin(ring, 2);
+	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
 	if (ret)
 		return ret;
 
+	ret = intel_ring_begin(ring, 2);
+	if (ret) {
+		i915_gem_request_cancel(req);
+		return ret;
+	}
+
 	intel_ring_emit(ring, MI_OVERLAY_FLIP | MI_OVERLAY_CONTINUE);
 	intel_ring_emit(ring, flip_addr);
 	intel_ring_advance(ring);
 
 	WARN_ON(overlay->last_flip_req);
-	i915_gem_request_assign(&overlay->last_flip_req,
-					     ring->outstanding_lazy_request);
-	i915_add_request(ring);
+	i915_gem_request_assign(&overlay->last_flip_req, req);
+	i915_add_request(req->ring);
 
 	return 0;
 }
@@ -327,6 +337,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
 	struct drm_device *dev = overlay->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring = &dev_priv->ring[RCS];
+	struct drm_i915_gem_request *req;
 	u32 flip_addr = overlay->flip_addr;
 	int ret;
 
@@ -338,10 +349,16 @@ static int intel_overlay_off(struct intel_overlay *overlay)
 	 * of the hw. Do it in both cases */
 	flip_addr |= OFC_UPDATE;
 
-	ret = intel_ring_begin(ring, 6);
+	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
 	if (ret)
 		return ret;
 
+	ret = intel_ring_begin(ring, 6);
+	if (ret) {
+		i915_gem_request_cancel(req);
+		return ret;
+	}
+
 	/* wait for overlay to go idle */
 	intel_ring_emit(ring, MI_OVERLAY_FLIP | MI_OVERLAY_CONTINUE);
 	intel_ring_emit(ring, flip_addr);
@@ -360,7 +377,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
 	}
 	intel_ring_advance(ring);
 
-	return intel_overlay_do_wait_request(overlay, intel_overlay_off_tail);
+	return intel_overlay_do_wait_request(overlay, req, intel_overlay_off_tail);
 }
 
 /* recover from an interruption due to a signal
@@ -404,15 +421,23 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
 
 	if (I915_READ(ISR) & I915_OVERLAY_PLANE_FLIP_PENDING_INTERRUPT) {
 		/* synchronous slowpath */
-		ret = intel_ring_begin(ring, 2);
+		struct drm_i915_gem_request *req;
+
+		ret = i915_gem_request_alloc(ring, ring->default_context, &req);
 		if (ret)
 			return ret;
 
+		ret = intel_ring_begin(ring, 2);
+		if (ret) {
+			i915_gem_request_cancel(req);
+			return ret;
+		}
+
 		intel_ring_emit(ring, MI_WAIT_FOR_EVENT | MI_WAIT_FOR_OVERLAY_FLIP);
 		intel_ring_emit(ring, MI_NOOP);
 		intel_ring_advance(ring);
 
-		ret = intel_overlay_do_wait_request(overlay,
+		ret = intel_overlay_do_wait_request(overlay, req,
 						    intel_overlay_release_old_vid_tail);
 		if (ret)
 			return ret;
-- 
1.7.9.5


* [PATCH 27/55] drm/i915: Update queue_flip() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (25 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 26/55] drm/i915: Update overlay code to do explicit request management John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 28/55] drm/i915: Update add_request() " John.C.Harrison
                   ` (29 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the display page flip code to do explicit request creation and
submission rather than relying on the OLR and just hoping that the request
actually gets submitted at some random point.

The sequence is now to create a request, queue the work to the ring, assign the
known request to the flip queue work item, then actually submit the work and post
the request.
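
Roughly (a sketch of the new sequence - 'request' may already exist if the
earlier _sync() call created one):

    if (!request) {
        ret = i915_gem_request_alloc(ring, ring->default_context, &request);
        if (ret)
            goto cleanup_unpin;
    }

    ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, request,
                                       page_flip_flags);
    if (ret)
        goto cleanup_unpin;

    i915_gem_request_assign(&work->flip_queued_req, request);
    i915_add_request_no_flush(request->ring);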

Note that every single flip function used to finish with
'__intel_ring_advance(ring);'. However, immediately after they return there is
now an add request call which will do the advance anyway. Thus the many
duplicate advance calls have been removed.

v2: Updated commit message with comment about advance removal.

v3: The request can now be allocated by the _sync() code earlier on. Thus the
page flip path does not necessarily need to allocate a new request; it may be
able to re-use one.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    2 +-
 drivers/gpu/drm/i915/intel_display.c    |   33 ++++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.c |    2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |    1 -
 4 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f69e9cb..650e5c7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -608,7 +608,7 @@ struct drm_i915_display_funcs {
 	int (*queue_flip)(struct drm_device *dev, struct drm_crtc *crtc,
 			  struct drm_framebuffer *fb,
 			  struct drm_i915_gem_object *obj,
-			  struct intel_engine_cs *ring,
+			  struct drm_i915_gem_request *req,
 			  uint32_t flags);
 	void (*update_primary_plane)(struct drm_crtc *crtc,
 				     struct drm_framebuffer *fb,
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 6528ada..a1ac557 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10635,9 +10635,10 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
 				 struct drm_i915_gem_object *obj,
-				 struct intel_engine_cs *ring,
+				 struct drm_i915_gem_request *req,
 				 uint32_t flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	u32 flip_mask;
 	int ret;
@@ -10662,7 +10663,6 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
 	return 0;
 }
 
@@ -10670,9 +10670,10 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
 				 struct drm_i915_gem_object *obj,
-				 struct intel_engine_cs *ring,
+				 struct drm_i915_gem_request *req,
 				 uint32_t flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	u32 flip_mask;
 	int ret;
@@ -10694,7 +10695,6 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
 	return 0;
 }
 
@@ -10702,9 +10702,10 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
 				 struct drm_i915_gem_object *obj,
-				 struct intel_engine_cs *ring,
+				 struct drm_i915_gem_request *req,
 				 uint32_t flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	uint32_t pf, pipesrc;
@@ -10733,7 +10734,6 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
 	return 0;
 }
 
@@ -10741,9 +10741,10 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
 				 struct drm_i915_gem_object *obj,
-				 struct intel_engine_cs *ring,
+				 struct drm_i915_gem_request *req,
 				 uint32_t flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	uint32_t pf, pipesrc;
@@ -10769,7 +10770,6 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
 	return 0;
 }
 
@@ -10777,9 +10777,10 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
 				 struct drm_i915_gem_object *obj,
-				 struct intel_engine_cs *ring,
+				 struct drm_i915_gem_request *req,
 				 uint32_t flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	uint32_t plane_bit = 0;
 	int len, ret;
@@ -10864,7 +10865,6 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
 	return 0;
 }
 
@@ -11034,7 +11034,7 @@ static int intel_default_queue_flip(struct drm_device *dev,
 				    struct drm_crtc *crtc,
 				    struct drm_framebuffer *fb,
 				    struct drm_i915_gem_object *obj,
-				    struct intel_engine_cs *ring,
+				    struct drm_i915_gem_request *req,
 				    uint32_t flags)
 {
 	return -ENODEV;
@@ -11249,13 +11249,18 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 				goto cleanup_unpin;
 		}
 
-		ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring,
+		if (!request) {
+			ret = i915_gem_request_alloc(ring, ring->default_context, &request);
+			if (ret)
+				goto cleanup_unpin;
+		}
+
+		ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, request,
 						   page_flip_flags);
 		if (ret)
 			goto cleanup_unpin;
 
-		i915_gem_request_assign(&work->flip_queued_req,
-					intel_ring_get_request(ring));
+		i915_gem_request_assign(&work->flip_queued_req, request);
 	}
 
 	if (request)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 72735a0a..ea91a33 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -81,7 +81,7 @@ bool intel_ring_stopped(struct intel_engine_cs *ring)
 	return dev_priv->gpu_error.stop_rings & intel_ring_flag(ring);
 }
 
-void __intel_ring_advance(struct intel_engine_cs *ring)
+static void __intel_ring_advance(struct intel_engine_cs *ring)
 {
 	struct intel_ringbuffer *ringbuf = ring->buffer;
 	ringbuf->tail &= ringbuf->size - 1;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 8450add..15c5d83 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -420,7 +420,6 @@ int __intel_ring_space(int head, int tail, int size);
 void intel_ring_update_space(struct intel_ringbuffer *ringbuf);
 int intel_ring_space(struct intel_ringbuffer *ringbuf);
 bool intel_ring_stopped(struct intel_engine_cs *ring);
-void __intel_ring_advance(struct intel_engine_cs *ring);
 
 int __must_check intel_ring_idle(struct intel_engine_cs *ring);
 void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno);
-- 
1.7.9.5


* [PATCH 28/55] drm/i915: Update add_request() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (26 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 27/55] drm/i915: Update queue_flip() to take a request structure John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 29/55] drm/i915: Update [vma|object]_move_to_active() to take request structures John.C.Harrison
                   ` (28 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Now that all callers of i915_add_request() have a request pointer to hand, it is
possible to update the add request function to take a request pointer rather
than pulling it out of the OLR.
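
That is (sketch):

    /* before: the request was fished out of the ring's OLR */
    i915_add_request(ring);

    /* after: the caller names the request explicitly */
    i915_add_request(req);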

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |   10 +++++-----
 drivers/gpu/drm/i915/i915_gem.c            |   22 +++++++++++-----------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
 drivers/gpu/drm/i915/intel_display.c       |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
 drivers/gpu/drm/i915/intel_overlay.c       |    4 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c    |    3 ++-
 7 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 650e5c7..f517fcc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2863,14 +2863,14 @@ void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
-void __i915_add_request(struct intel_engine_cs *ring,
+void __i915_add_request(struct drm_i915_gem_request *req,
 			struct drm_file *file,
 			struct drm_i915_gem_object *batch_obj,
 			bool flush_caches);
-#define i915_add_request(ring) \
-	__i915_add_request(ring, NULL, NULL, true)
-#define i915_add_request_no_flush(ring) \
-	__i915_add_request(ring, NULL, NULL, false)
+#define i915_add_request(req) \
+	__i915_add_request(req, NULL, NULL, true)
+#define i915_add_request_no_flush(req) \
+	__i915_add_request(req, NULL, NULL, false)
 int __i915_wait_request(struct drm_i915_gem_request *req,
 			unsigned reset_counter,
 			bool interruptible,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index db90043..12078bd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1158,7 +1158,7 @@ i915_gem_check_olr(struct drm_i915_gem_request *req)
 	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
 
 	if (req == req->ring->outstanding_lazy_request)
-		i915_add_request(req->ring);
+		i915_add_request(req);
 
 	return 0;
 }
@@ -2468,25 +2468,25 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
  * request is not being tracked for completion but the work itself is
  * going to happen on the hardware. This would be a Bad Thing(tm).
  */
-void __i915_add_request(struct intel_engine_cs *ring,
+void __i915_add_request(struct drm_i915_gem_request *request,
 			struct drm_file *file,
 			struct drm_i915_gem_object *obj,
 			bool flush_caches)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct drm_i915_gem_request *request;
+	struct intel_engine_cs *ring;
+	struct drm_i915_private *dev_priv;
 	struct intel_ringbuffer *ringbuf;
 	u32 request_start;
 	int ret;
 
-	request = ring->outstanding_lazy_request;
 	if (WARN_ON(request == NULL))
 		return;
 
-	if (i915.enable_execlists) {
-		ringbuf = request->ctx->engine[ring->id].ringbuf;
-	} else
-		ringbuf = ring->buffer;
+	ring = request->ring;
+	dev_priv = ring->dev->dev_private;
+	ringbuf = request->ringbuf;
+
+	WARN_ON(request != ring->outstanding_lazy_request);
 
 	/*
 	 * To ensure that this call will not fail, space for its emissions
@@ -3344,7 +3344,7 @@ int i915_gpu_idle(struct drm_device *dev)
 				return ret;
 			}
 
-			i915_add_request_no_flush(req->ring);
+			i915_add_request_no_flush(req);
 		}
 
 		WARN_ON(ring->outstanding_lazy_request);
@@ -5128,7 +5128,7 @@ i915_gem_init_hw(struct drm_device *dev)
 			goto out;
 		}
 
-		i915_add_request_no_flush(ring);
+		i915_add_request_no_flush(req);
 	}
 
 out:
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index bea92ad..7533fb3 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1058,7 +1058,7 @@ i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
 	params->ring->gpu_caches_dirty = true;
 
 	/* Add a breadcrumb for the completion of the batch buffer */
-	__i915_add_request(params->ring, params->file, params->batch_obj, true);
+	__i915_add_request(params->request, params->file, params->batch_obj, true);
 }
 
 static int
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index a1ac557..811ff0a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11264,7 +11264,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	}
 
 	if (request)
-		i915_add_request_no_flush(request->ring);
+		i915_add_request_no_flush(request);
 
 	work->flip_queued_vblank = drm_crtc_vblank_count(crtc);
 	work->enable_stall_check = true;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f8e8fdb..521ecfd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1983,7 +1983,7 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
 				goto error;
 			}
 
-			i915_add_request_no_flush(req->ring);
+			i915_add_request_no_flush(req);
 		}
 
 		ctx->rcs_initialized = true;
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 3adb63e..3f70904 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -217,7 +217,7 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
 
 	WARN_ON(overlay->last_flip_req);
 	i915_gem_request_assign(&overlay->last_flip_req, req);
-	i915_add_request(req->ring);
+	i915_add_request(req);
 
 	overlay->flip_tail = tail;
 	ret = i915_wait_request(overlay->last_flip_req);
@@ -299,7 +299,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
 
 	WARN_ON(overlay->last_flip_req);
 	i915_gem_request_assign(&overlay->last_flip_req, req);
-	i915_add_request(req->ring);
+	i915_add_request(req);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ea91a33..978984e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2157,8 +2157,9 @@ int intel_ring_idle(struct intel_engine_cs *ring)
 	struct drm_i915_gem_request *req;
 
 	/* We need to add any requests required to flush the objects and ring */
+	WARN_ON(ring->outstanding_lazy_request);
 	if (ring->outstanding_lazy_request)
-		i915_add_request(ring);
+		i915_add_request(ring->outstanding_lazy_request);
 
 	/* Wait upon the last request to be completed */
 	if (list_empty(&ring->request_list))
-- 
1.7.9.5


* [PATCH 29/55] drm/i915: Update [vma|object]_move_to_active() to take request structures
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (27 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 28/55] drm/i915: Update add_request() " John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 30/55] drm/i915: Update l3_remap to take a request structure John.C.Harrison
                   ` (27 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Now that everything above has been converted to use request structures, it is
possible to update the lower-level move_to_active() functions to be
request-based as well.
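
The change follows the same shape as the rest of the series; a minimal sketch
of the pattern (illustrative only, the real hunks are below):

	/* Callers now hand over the request; the engine is derived from it
	 * and the request itself becomes the object's last_read_req. */
	void example_move_to_active(struct i915_vma *vma,
				    struct drm_i915_gem_request *req)
	{
		struct intel_engine_cs *ring = i915_gem_request_get_ring(req);

		/* ... existing active-list and flag handling unchanged ... */
	}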

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h              |    2 +-
 drivers/gpu/drm/i915/i915_gem.c              |    8 +++++---
 drivers/gpu/drm/i915/i915_gem_context.c      |    2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
 6 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f517fcc..adbd3dd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2781,7 +2781,7 @@ int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_engine_cs *to,
 			 struct drm_i915_gem_request **to_req);
 void i915_vma_move_to_active(struct i915_vma *vma,
-			     struct intel_engine_cs *ring);
+			     struct drm_i915_gem_request *req);
 int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_device *dev,
 			 struct drm_mode_create_dumb *args);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 12078bd..87702fc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2340,9 +2340,12 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 }
 
 void i915_vma_move_to_active(struct i915_vma *vma,
-			     struct intel_engine_cs *ring)
+			     struct drm_i915_gem_request *req)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
+	struct intel_engine_cs *ring;
+
+	ring = i915_gem_request_get_ring(req);
 
 	/* Add a reference if we're newly entering the active list. */
 	if (obj->active == 0)
@@ -2350,8 +2353,7 @@ void i915_vma_move_to_active(struct i915_vma *vma,
 	obj->active |= intel_ring_flag(ring);
 
 	list_move_tail(&obj->ring_list[ring->id], &ring->active_list);
-	i915_gem_request_assign(&obj->last_read_req[ring->id],
-				intel_ring_get_request(ring));
+	i915_gem_request_assign(&obj->last_read_req[ring->id], req);
 
 	list_move_tail(&vma->mm_list, &vma->vm->active_list);
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c6640d4..9a8f4a18 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -736,7 +736,7 @@ static int do_switch(struct drm_i915_gem_request *req)
 	 */
 	if (from != NULL) {
 		from->legacy_hw_ctx.rcs_state->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-		i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->legacy_hw_ctx.rcs_state), ring);
+		i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->legacy_hw_ctx.rcs_state), req);
 		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
 		 * whole damn pipeline, we don't need to explicitly mark the
 		 * object dirty. The only exception is that the context must be
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 7533fb3..a24de9c 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1028,7 +1028,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 			obj->base.pending_read_domains |= obj->base.read_domains;
 		obj->base.read_domains = obj->base.pending_read_domains;
 
-		i915_vma_move_to_active(vma, ring);
+		i915_vma_move_to_active(vma, req);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
 			i915_gem_request_assign(&obj->last_write_req, req);
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 6598f9b..e04cda4 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -171,7 +171,7 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req)
 	if (ret)
 		goto out;
 
-	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req->ring);
+	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req);
 
 out:
 	i915_gem_render_state_fini(&so);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 521ecfd..4811826 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1365,7 +1365,7 @@ static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req)
 	if (ret)
 		goto out;
 
-	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req->ring);
+	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req);
 
 out:
 	i915_gem_render_state_fini(&so);
-- 
1.7.9.5

* [PATCH 30/55] drm/i915: Update l3_remap to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (28 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 29/55] drm/i915: Update [vma|object]_move_to_active() to take request structures John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 31/55] drm/i915: Update mi_set_context() " John.C.Harrison
                   ` (26 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Converted i915_gem_l3_remap() to take a request structure instead of a ring.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    2 +-
 drivers/gpu/drm/i915/i915_gem.c         |    5 +++--
 drivers/gpu/drm/i915/i915_gem_context.c |    2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index adbd3dd..f9b6517 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2858,7 +2858,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_init(struct drm_device *dev);
 int i915_gem_init_rings(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
-int i915_gem_l3_remap(struct intel_engine_cs *ring, int slice);
+int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 87702fc..39f9891 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4888,8 +4888,9 @@ err:
 	return ret;
 }
 
-int i915_gem_l3_remap(struct intel_engine_cs *ring, int slice)
+int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200);
@@ -5111,7 +5112,7 @@ i915_gem_init_hw(struct drm_device *dev)
 
 		if (ring->id == RCS) {
 			for (i = 0; i < NUM_L3_SLICES(dev); i++)
-				i915_gem_l3_remap(ring, i);
+				i915_gem_l3_remap(req, i);
 		}
 
 		ret = i915_ppgtt_init_ring(req);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 9a8f4a18..fe2e9b0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -720,7 +720,7 @@ static int do_switch(struct drm_i915_gem_request *req)
 		if (!(to->remap_slice & (1<<i)))
 			continue;
 
-		ret = i915_gem_l3_remap(ring, i);
+		ret = i915_gem_l3_remap(req, i);
 		/* If it failed, try again next round */
 		if (ret)
 			DRM_DEBUG_DRIVER("L3 remapping failed\n");
-- 
1.7.9.5

* [PATCH 31/55] drm/i915: Update mi_set_context() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (29 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 30/55] drm/i915: Update l3_remap to take a request structure John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 32/55] drm/i915: Update a bunch of execbuffer helpers to take request structures John.C.Harrison
                   ` (25 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated mi_set_context() to take a request structure instead of a ring/context
pair.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index fe2e9b0..5f31584 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -478,10 +478,9 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 }
 
 static inline int
-mi_set_context(struct intel_engine_cs *ring,
-	       struct intel_context *new_context,
-	       u32 hw_flags)
+mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	u32 flags = hw_flags | MI_MM_SPACE_GTT;
 	const int num_rings =
 		/* Use an extended w/a on ivb+ if signalling from other rings */
@@ -533,7 +532,7 @@ mi_set_context(struct intel_engine_cs *ring,
 
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_emit(ring, MI_SET_CONTEXT);
-	intel_ring_emit(ring, i915_gem_obj_ggtt_offset(new_context->legacy_hw_ctx.rcs_state) |
+	intel_ring_emit(ring, i915_gem_obj_ggtt_offset(req->ctx->legacy_hw_ctx.rcs_state) |
 			flags);
 	/*
 	 * w/a: MI_SET_CONTEXT must always be followed by MI_NOOP
@@ -695,7 +694,7 @@ static int do_switch(struct drm_i915_gem_request *req)
 	WARN_ON(needs_pd_load_pre(ring, to) &&
 		needs_pd_load_post(ring, to, hw_flags));
 
-	ret = mi_set_context(ring, to, hw_flags);
+	ret = mi_set_context(req, hw_flags);
 	if (ret)
 		goto unpin_out;
 
-- 
1.7.9.5

* [PATCH 32/55] drm/i915: Update a bunch of execbuffer helpers to take request structures
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (30 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 31/55] drm/i915: Update mi_set_context() " John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 33/55] drm/i915: Update workarounds_emit() " John.C.Harrison
                   ` (24 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated *_ring_invalidate_all_caches(), i915_reset_gen7_sol_offsets(), and
i915_emit_box() to take request structures instead of rings or ringbuf/context
pairs.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   12 +++++++-----
 drivers/gpu/drm/i915/intel_lrc.c           |    9 ++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.c    |    3 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h    |    2 +-
 4 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a24de9c..f2f3f99 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -919,7 +919,7 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req,
 	/* Unconditionally invalidate gpu caches and ensure that we do flush
 	 * any residual writes from the previous batch.
 	 */
-	return intel_ring_invalidate_all_caches(req->ring);
+	return intel_ring_invalidate_all_caches(req);
 }
 
 static bool
@@ -1063,8 +1063,9 @@ i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
 
 static int
 i915_reset_gen7_sol_offsets(struct drm_device *dev,
-			    struct intel_engine_cs *ring)
+			    struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret, i;
 
@@ -1089,10 +1090,11 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 }
 
 static int
-i915_emit_box(struct intel_engine_cs *ring,
+i915_emit_box(struct drm_i915_gem_request *req,
 	      struct drm_clip_rect *box,
 	      int DR1, int DR4)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	if (box->y2 <= box->y1 || box->x2 <= box->x1 ||
@@ -1302,7 +1304,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 	}
 
 	if (args->flags & I915_EXEC_GEN7_SOL_RESET) {
-		ret = i915_reset_gen7_sol_offsets(dev, ring);
+		ret = i915_reset_gen7_sol_offsets(dev, params->request);
 		if (ret)
 			goto error;
 	}
@@ -1313,7 +1315,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
-			ret = i915_emit_box(ring, &cliprects[i],
+			ret = i915_emit_box(params->request, &cliprects[i],
 					    args->DR1, args->DR4);
 			if (ret)
 				goto error;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4811826..9df9a0b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -605,10 +605,9 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
 	return 0;
 }
 
-static int logical_ring_invalidate_all_caches(struct intel_ringbuffer *ringbuf,
-					      struct intel_context *ctx)
+static int logical_ring_invalidate_all_caches(struct drm_i915_gem_request *req)
 {
-	struct intel_engine_cs *ring = ringbuf->ring;
+	struct intel_engine_cs *ring = req->ring;
 	uint32_t flush_domains;
 	int ret;
 
@@ -616,7 +615,7 @@ static int logical_ring_invalidate_all_caches(struct intel_ringbuffer *ringbuf,
 	if (ring->gpu_caches_dirty)
 		flush_domains = I915_GEM_GPU_DOMAINS;
 
-	ret = ring->emit_flush(ringbuf, ctx,
+	ret = ring->emit_flush(req->ringbuf, req->ctx,
 			       I915_GEM_GPU_DOMAINS, flush_domains);
 	if (ret)
 		return ret;
@@ -655,7 +654,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
 	/* Unconditionally invalidate gpu caches and ensure that we do flush
 	 * any residual writes from the previous batch.
 	 */
-	return logical_ring_invalidate_all_caches(req->ringbuf, req->ctx);
+	return logical_ring_invalidate_all_caches(req);
 }
 
 int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 978984e..2ecdc70 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2897,8 +2897,9 @@ intel_ring_flush_all_caches(struct intel_engine_cs *ring)
 }
 
 int
-intel_ring_invalidate_all_caches(struct intel_engine_cs *ring)
+intel_ring_invalidate_all_caches(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	uint32_t flush_domains;
 	int ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 15c5d83..ef5ec04 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -424,7 +424,7 @@ bool intel_ring_stopped(struct intel_engine_cs *ring);
 int __must_check intel_ring_idle(struct intel_engine_cs *ring);
 void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno);
 int intel_ring_flush_all_caches(struct intel_engine_cs *ring);
-int intel_ring_invalidate_all_caches(struct intel_engine_cs *ring);
+int intel_ring_invalidate_all_caches(struct drm_i915_gem_request *req);
 
 void intel_fini_pipe_control(struct intel_engine_cs *ring);
 int intel_init_pipe_control(struct intel_engine_cs *ring);
-- 
1.7.9.5

* [PATCH 33/55] drm/i915: Update workarounds_emit() to take request structures
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (31 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 32/55] drm/i915: Update a bunch of execbuffer helpers to take request structures John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 34/55] drm/i915: Update flush_all_caches() " John.C.Harrison
                   ` (23 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the *_ring_workarounds_emit() functions to take requests instead of
ring/context pairs.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        |   14 +++++++-------
 drivers/gpu/drm/i915/intel_ringbuffer.c |    6 +++---
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9df9a0b..1a59a1f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1047,11 +1047,11 @@ void intel_lr_context_unpin(struct intel_engine_cs *ring,
 	}
 }
 
-static int intel_logical_ring_workarounds_emit(struct intel_engine_cs *ring,
-					       struct intel_context *ctx)
+static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
 {
 	int ret, i;
-	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
+	struct intel_engine_cs *ring = req->ring;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_workarounds *w = &dev_priv->workarounds;
@@ -1060,11 +1060,11 @@ static int intel_logical_ring_workarounds_emit(struct intel_engine_cs *ring,
 		return 0;
 
 	ring->gpu_caches_dirty = true;
-	ret = logical_ring_flush_all_caches(ringbuf, ctx);
+	ret = logical_ring_flush_all_caches(ringbuf, req->ctx);
 	if (ret)
 		return ret;
 
-	ret = intel_logical_ring_begin(ringbuf, ctx, w->count * 2 + 2);
+	ret = intel_logical_ring_begin(ringbuf, req->ctx, w->count * 2 + 2);
 	if (ret)
 		return ret;
 
@@ -1078,7 +1078,7 @@ static int intel_logical_ring_workarounds_emit(struct intel_engine_cs *ring,
 	intel_logical_ring_advance(ringbuf);
 
 	ring->gpu_caches_dirty = true;
-	ret = logical_ring_flush_all_caches(ringbuf, ctx);
+	ret = logical_ring_flush_all_caches(ringbuf, req->ctx);
 	if (ret)
 		return ret;
 
@@ -1375,7 +1375,7 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 {
 	int ret;
 
-	ret = intel_logical_ring_workarounds_emit(req->ring, req->ctx);
+	ret = intel_logical_ring_workarounds_emit(req);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 2ecdc70..bfb2d52 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -703,10 +703,10 @@ err:
 	return ret;
 }
 
-static int intel_ring_workarounds_emit(struct intel_engine_cs *ring,
-				       struct intel_context *ctx)
+static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
 {
 	int ret, i;
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_workarounds *w = &dev_priv->workarounds;
@@ -746,7 +746,7 @@ static int intel_rcs_ctx_init(struct drm_i915_gem_request *req)
 {
 	int ret;
 
-	ret = intel_ring_workarounds_emit(req->ring, req->ctx);
+	ret = intel_ring_workarounds_emit(req);
 	if (ret != 0)
 		return ret;
 
-- 
1.7.9.5

* [PATCH 34/55] drm/i915: Update flush_all_caches() to take request structures
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (32 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 33/55] drm/i915: Update workarounds_emit() " John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 35/55] drm/i915: Update switch_mm() to take a request structure John.C.Harrison
                   ` (22 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the *_ring_flush_all_caches() functions to take requests instead of
rings or ringbuf/context pairs.
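
With both variants taking the request, the call site in __i915_add_request()
becomes symmetric between the execlist and legacy paths (sketch of the
resulting shape, matching the i915_gem.c hunk below):

	if (i915.enable_execlists)
		ret = logical_ring_flush_all_caches(request);
	else
		ret = intel_ring_flush_all_caches(request);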

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |    4 ++--
 drivers/gpu/drm/i915/intel_lrc.c        |   11 +++++------
 drivers/gpu/drm/i915/intel_lrc.h        |    3 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c |    7 ++++---
 drivers/gpu/drm/i915/intel_ringbuffer.h |    2 +-
 5 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 39f9891..27836bc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2507,9 +2507,9 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 	 */
 	if (flush_caches) {
 		if (i915.enable_execlists)
-			ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
+			ret = logical_ring_flush_all_caches(request);
 		else
-			ret = intel_ring_flush_all_caches(ring);
+			ret = intel_ring_flush_all_caches(request);
 		/* Not allowed to fail! */
 		WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
 	}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1a59a1f..c067cbb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -986,16 +986,15 @@ void intel_logical_ring_stop(struct intel_engine_cs *ring)
 	I915_WRITE_MODE(ring, _MASKED_BIT_DISABLE(STOP_RING));
 }
 
-int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf,
-				  struct intel_context *ctx)
+int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
 {
-	struct intel_engine_cs *ring = ringbuf->ring;
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	if (!ring->gpu_caches_dirty)
 		return 0;
 
-	ret = ring->emit_flush(ringbuf, ctx, 0, I915_GEM_GPU_DOMAINS);
+	ret = ring->emit_flush(req->ringbuf, req->ctx, 0, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -1060,7 +1059,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
 		return 0;
 
 	ring->gpu_caches_dirty = true;
-	ret = logical_ring_flush_all_caches(ringbuf, req->ctx);
+	ret = logical_ring_flush_all_caches(req);
 	if (ret)
 		return ret;
 
@@ -1078,7 +1077,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
 	intel_logical_ring_advance(ringbuf);
 
 	ring->gpu_caches_dirty = true;
-	ret = logical_ring_flush_all_caches(ringbuf, req->ctx);
+	ret = logical_ring_flush_all_caches(req);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index bf137c4..044c0e5 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -41,8 +41,7 @@ void intel_logical_ring_stop(struct intel_engine_cs *ring);
 void intel_logical_ring_cleanup(struct intel_engine_cs *ring);
 int intel_logical_rings_init(struct drm_device *dev);
 
-int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf,
-				  struct intel_context *ctx);
+int logical_ring_flush_all_caches(struct drm_i915_gem_request *req);
 /**
  * intel_logical_ring_advance() - advance the ringbuffer tail
  * @ringbuf: Ringbuffer to advance.
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index bfb2d52..8ca95c6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -715,7 +715,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
 		return 0;
 
 	ring->gpu_caches_dirty = true;
-	ret = intel_ring_flush_all_caches(ring);
+	ret = intel_ring_flush_all_caches(req);
 	if (ret)
 		return ret;
 
@@ -733,7 +733,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
 	intel_ring_advance(ring);
 
 	ring->gpu_caches_dirty = true;
-	ret = intel_ring_flush_all_caches(ring);
+	ret = intel_ring_flush_all_caches(req);
 	if (ret)
 		return ret;
 
@@ -2879,8 +2879,9 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 }
 
 int
-intel_ring_flush_all_caches(struct intel_engine_cs *ring)
+intel_ring_flush_all_caches(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	if (!ring->gpu_caches_dirty)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index ef5ec04..95a2d19 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -423,7 +423,7 @@ bool intel_ring_stopped(struct intel_engine_cs *ring);
 
 int __must_check intel_ring_idle(struct intel_engine_cs *ring);
 void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno);
-int intel_ring_flush_all_caches(struct intel_engine_cs *ring);
+int intel_ring_flush_all_caches(struct drm_i915_gem_request *req);
 int intel_ring_invalidate_all_caches(struct drm_i915_gem_request *req);
 
 void intel_fini_pipe_control(struct intel_engine_cs *ring);
-- 
1.7.9.5

* [PATCH 35/55] drm/i915: Update switch_mm() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (33 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 34/55] drm/i915: Update flush_all_caches() " John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 36/55] drm/i915: Update ring->flush() to take a request structure John.C.Harrison
                   ` (21 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the switch_mm() code paths to take a request instead of a ring. This
includes the myriad *_mm_switch functions themselves and a bunch of
PDP-related helper functions.
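
The net effect on the ppgtt vfunc and its callers, in sketch form (the exact
hunks are below):

	/* the vfunc takes the request rather than the engine */
	int (*switch_mm)(struct i915_hw_ppgtt *ppgtt,
			 struct drm_i915_gem_request *req);

	/* callers simply forward the request */
	ret = to->ppgtt->switch_mm(to->ppgtt, req);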

v2: Rebased to newer tree.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c |    4 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c     |   21 +++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_gtt.h     |    2 +-
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5f31584..7573f4a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -652,7 +652,7 @@ static int do_switch(struct drm_i915_gem_request *req)
 		 * Register Immediate commands in Ring Buffer before submitting
 		 * a context."*/
 		trace_switch_mm(ring, to);
-		ret = to->ppgtt->switch_mm(to->ppgtt, ring);
+		ret = to->ppgtt->switch_mm(to->ppgtt, req);
 		if (ret)
 			goto unpin_out;
 
@@ -703,7 +703,7 @@ static int do_switch(struct drm_i915_gem_request *req)
 	 */
 	if (needs_pd_load_post(ring, to, hw_flags)) {
 		trace_switch_mm(ring, to);
-		ret = to->ppgtt->switch_mm(to->ppgtt, ring);
+		ret = to->ppgtt->switch_mm(to->ppgtt, req);
 		/* The hardware context switch is emitted, but we haven't
 		 * actually changed the state - so it's probably safe to bail
 		 * here. Still, let the user know something dangerous has
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 73d6f9e..6a50c13 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -452,10 +452,11 @@ free_pd:
 }
 
 /* Broadwell Page Directory Pointer Descriptors */
-static int gen8_write_pdp(struct intel_engine_cs *ring,
+static int gen8_write_pdp(struct drm_i915_gem_request *req,
 			  unsigned entry,
 			  dma_addr_t addr)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	BUG_ON(entry >= 4);
@@ -476,7 +477,7 @@ static int gen8_write_pdp(struct intel_engine_cs *ring,
 }
 
 static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_engine_cs *ring)
+			  struct drm_i915_gem_request *req)
 {
 	int i, ret;
 
@@ -485,7 +486,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 		dma_addr_t pd_daddr = pd ? pd->daddr : ppgtt->scratch_pd->daddr;
 		/* The page directory might be NULL, but we need to clear out
 		 * whatever the previous context might have used. */
-		ret = gen8_write_pdp(ring, i, pd_daddr);
+		ret = gen8_write_pdp(req, i, pd_daddr);
 		if (ret)
 			return ret;
 	}
@@ -1055,8 +1056,9 @@ static uint32_t get_pd_offset(struct i915_hw_ppgtt *ppgtt)
 }
 
 static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			 struct intel_engine_cs *ring)
+			 struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
@@ -1080,8 +1082,9 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static int vgpu_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_engine_cs *ring)
+			  struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_i915_private *dev_priv = to_i915(ppgtt->base.dev);
 
 	I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
@@ -1090,8 +1093,9 @@ static int vgpu_mm_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_engine_cs *ring)
+			  struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
@@ -1122,8 +1126,9 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static int gen6_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_engine_cs *ring)
+			  struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
@@ -1575,7 +1580,7 @@ int i915_ppgtt_init_ring(struct drm_i915_gem_request *req)
 	if (!ppgtt)
 		return 0;
 
-	return ppgtt->switch_mm(ppgtt, req->ring);
+	return ppgtt->switch_mm(ppgtt, req);
 }
 
 struct i915_hw_ppgtt *
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 75dfa05..735f119 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -338,7 +338,7 @@ struct i915_hw_ppgtt {
 
 	int (*enable)(struct i915_hw_ppgtt *ppgtt);
 	int (*switch_mm)(struct i915_hw_ppgtt *ppgtt,
-			 struct intel_engine_cs *ring);
+			 struct drm_i915_gem_request *req);
 	void (*debug_dump)(struct i915_hw_ppgtt *ppgtt, struct seq_file *m);
 };
 
-- 
1.7.9.5

* [PATCH 36/55] drm/i915: Update ring->flush() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (34 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 35/55] drm/i915: Update switch_mm() to take a request structure John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 37/55] drm/i915: Update some flush helpers to take request structures John.C.Harrison
                   ` (20 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the various ring->flush() functions to take a request instead of a
ring. Also updated the i915_gem_ring_flush tracepoint to include the request's
unique id.
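
With the extra field, a flush event in the trace log would look roughly like
this (values are made up; the format string is in the i915_trace.h hunk below):

	i915_gem_ring_flush: dev=0, ring=1, request=42, invalidate=0020, flush=003f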

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c |    2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c     |    6 +++---
 drivers/gpu/drm/i915/i915_trace.h       |   14 +++++++------
 drivers/gpu/drm/i915/intel_ringbuffer.c |   34 +++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    2 +-
 5 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 7573f4a..b90e4c0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -495,7 +495,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 	 * itlb_before_ctx_switch.
 	 */
 	if (IS_GEN6(ring->dev)) {
-		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, 0);
+		ret = ring->flush(req, I915_GEM_GPU_DOMAINS, 0);
 		if (ret)
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6a50c13..ea522fa 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1062,7 +1062,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	int ret;
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(req, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -1099,7 +1099,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	int ret;
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(req, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -1117,7 +1117,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 
 	/* XXX: RCS is the only one to auto invalidate the TLBs? */
 	if (ring->id != RCS) {
-		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+		ret = ring->flush(req, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 		if (ret)
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 6cbc280..8e2d97f 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -475,25 +475,27 @@ TRACE_EVENT(i915_gem_ring_dispatch,
 );
 
 TRACE_EVENT(i915_gem_ring_flush,
-	    TP_PROTO(struct intel_engine_cs *ring, u32 invalidate, u32 flush),
-	    TP_ARGS(ring, invalidate, flush),
+	    TP_PROTO(struct drm_i915_gem_request *req, u32 invalidate, u32 flush),
+	    TP_ARGS(req, invalidate, flush),
 
 	    TP_STRUCT__entry(
 			     __field(u32, dev)
 			     __field(u32, ring)
+			     __field(u32, uniq)
 			     __field(u32, invalidate)
 			     __field(u32, flush)
 			     ),
 
 	    TP_fast_assign(
-			   __entry->dev = ring->dev->primary->index;
-			   __entry->ring = ring->id;
+			   __entry->dev = req->ring->dev->primary->index;
+			   __entry->ring = req->ring->id;
+			   __entry->uniq = req->uniq;
 			   __entry->invalidate = invalidate;
 			   __entry->flush = flush;
 			   ),
 
-	    TP_printk("dev=%u, ring=%x, invalidate=%04x, flush=%04x",
-		      __entry->dev, __entry->ring,
+	    TP_printk("dev=%u, ring=%x, request=%u, invalidate=%04x, flush=%04x",
+		      __entry->dev, __entry->ring, __entry->uniq,
 		      __entry->invalidate, __entry->flush)
 );
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8ca95c6..b90f2d8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -91,10 +91,11 @@ static void __intel_ring_advance(struct intel_engine_cs *ring)
 }
 
 static int
-gen2_render_ring_flush(struct intel_engine_cs *ring,
+gen2_render_ring_flush(struct drm_i915_gem_request *req,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
+	struct intel_engine_cs *ring = req->ring;
 	u32 cmd;
 	int ret;
 
@@ -117,10 +118,11 @@ gen2_render_ring_flush(struct intel_engine_cs *ring,
 }
 
 static int
-gen4_render_ring_flush(struct intel_engine_cs *ring,
+gen4_render_ring_flush(struct drm_i915_gem_request *req,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_device *dev = ring->dev;
 	u32 cmd;
 	int ret;
@@ -247,9 +249,10 @@ intel_emit_post_sync_nonzero_flush(struct intel_engine_cs *ring)
 }
 
 static int
-gen6_render_ring_flush(struct intel_engine_cs *ring,
-                         u32 invalidate_domains, u32 flush_domains)
+gen6_render_ring_flush(struct drm_i915_gem_request *req,
+		       u32 invalidate_domains, u32 flush_domains)
 {
+	struct intel_engine_cs *ring = req->ring;
 	u32 flags = 0;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
@@ -318,9 +321,10 @@ gen7_render_ring_cs_stall_wa(struct intel_engine_cs *ring)
 }
 
 static int
-gen7_render_ring_flush(struct intel_engine_cs *ring,
+gen7_render_ring_flush(struct drm_i915_gem_request *req,
 		       u32 invalidate_domains, u32 flush_domains)
 {
+	struct intel_engine_cs *ring = req->ring;
 	u32 flags = 0;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
@@ -400,9 +404,10 @@ gen8_emit_pipe_control(struct intel_engine_cs *ring,
 }
 
 static int
-gen8_render_ring_flush(struct intel_engine_cs *ring,
+gen8_render_ring_flush(struct drm_i915_gem_request *req,
 		       u32 invalidate_domains, u32 flush_domains)
 {
+	struct intel_engine_cs *ring = req->ring;
 	u32 flags = 0;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
@@ -1584,10 +1589,11 @@ i8xx_ring_put_irq(struct intel_engine_cs *ring)
 }
 
 static int
-bsd_ring_flush(struct intel_engine_cs *ring,
+bsd_ring_flush(struct drm_i915_gem_request *req,
 	       u32     invalidate_domains,
 	       u32     flush_domains)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	ret = intel_ring_begin(ring, 2);
@@ -2359,9 +2365,10 @@ static void gen6_bsd_ring_write_tail(struct intel_engine_cs *ring,
 		   _MASKED_BIT_DISABLE(GEN6_BSD_SLEEP_MSG_DISABLE));
 }
 
-static int gen6_bsd_ring_flush(struct intel_engine_cs *ring,
+static int gen6_bsd_ring_flush(struct drm_i915_gem_request *req,
 			       u32 invalidate, u32 flush)
 {
+	struct intel_engine_cs *ring = req->ring;
 	uint32_t cmd;
 	int ret;
 
@@ -2471,9 +2478,10 @@ gen6_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
 
 /* Blitter support (SandyBridge+) */
 
-static int gen6_ring_flush(struct intel_engine_cs *ring,
+static int gen6_ring_flush(struct drm_i915_gem_request *req,
 			   u32 invalidate, u32 flush)
 {
+	struct intel_engine_cs *ring = req->ring;
 	struct drm_device *dev = ring->dev;
 	uint32_t cmd;
 	int ret;
@@ -2887,11 +2895,11 @@ intel_ring_flush_all_caches(struct drm_i915_gem_request *req)
 	if (!ring->gpu_caches_dirty)
 		return 0;
 
-	ret = ring->flush(ring, 0, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(req, 0, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
-	trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
+	trace_i915_gem_ring_flush(req, 0, I915_GEM_GPU_DOMAINS);
 
 	ring->gpu_caches_dirty = false;
 	return 0;
@@ -2908,11 +2916,11 @@ intel_ring_invalidate_all_caches(struct drm_i915_gem_request *req)
 	if (ring->gpu_caches_dirty)
 		flush_domains = I915_GEM_GPU_DOMAINS;
 
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, flush_domains);
+	ret = ring->flush(req, I915_GEM_GPU_DOMAINS, flush_domains);
 	if (ret)
 		return ret;
 
-	trace_i915_gem_ring_flush(ring, I915_GEM_GPU_DOMAINS, flush_domains);
+	trace_i915_gem_ring_flush(req, I915_GEM_GPU_DOMAINS, flush_domains);
 
 	ring->gpu_caches_dirty = false;
 	return 0;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 95a2d19..87113e3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -158,7 +158,7 @@ struct  intel_engine_cs {
 
 	void		(*write_tail)(struct intel_engine_cs *ring,
 				      u32 value);
-	int __must_check (*flush)(struct intel_engine_cs *ring,
+	int __must_check (*flush)(struct drm_i915_gem_request *req,
 				  u32	invalidate_domains,
 				  u32	flush_domains);
 	int		(*add_request)(struct intel_engine_cs *ring);
-- 
1.7.9.5

* [PATCH 37/55] drm/i915: Update some flush helpers to take request structures
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (35 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 36/55] drm/i915: Update ring->flush() to take a request structure John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:43 ` [PATCH 38/55] drm/i915: Update ring->emit_flush() to take a request structure John.C.Harrison
                   ` (19 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated intel_emit_post_sync_nonzero_flush(), gen7_render_ring_cs_stall_wa(),
and gen8_emit_pipe_control() to take requests instead of rings.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c |   20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index b90f2d8..323b295 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -214,8 +214,9 @@ gen4_render_ring_flush(struct drm_i915_gem_request *req,
  * really our business.  That leaves only stall at scoreboard.
  */
 static int
-intel_emit_post_sync_nonzero_flush(struct intel_engine_cs *ring)
+intel_emit_post_sync_nonzero_flush(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
 
@@ -258,7 +259,7 @@ gen6_render_ring_flush(struct drm_i915_gem_request *req,
 	int ret;
 
 	/* Force SNB workarounds for PIPE_CONTROL flushes */
-	ret = intel_emit_post_sync_nonzero_flush(ring);
+	ret = intel_emit_post_sync_nonzero_flush(req);
 	if (ret)
 		return ret;
 
@@ -302,8 +303,9 @@ gen6_render_ring_flush(struct drm_i915_gem_request *req,
 }
 
 static int
-gen7_render_ring_cs_stall_wa(struct intel_engine_cs *ring)
+gen7_render_ring_cs_stall_wa(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	ret = intel_ring_begin(ring, 4);
@@ -366,7 +368,7 @@ gen7_render_ring_flush(struct drm_i915_gem_request *req,
 		/* Workaround: we must issue a pipe_control with CS-stall bit
 		 * set before a pipe_control command that has the state cache
 		 * invalidate bit set. */
-		gen7_render_ring_cs_stall_wa(ring);
+		gen7_render_ring_cs_stall_wa(req);
 	}
 
 	ret = intel_ring_begin(ring, 4);
@@ -383,9 +385,10 @@ gen7_render_ring_flush(struct drm_i915_gem_request *req,
 }
 
 static int
-gen8_emit_pipe_control(struct intel_engine_cs *ring,
+gen8_emit_pipe_control(struct drm_i915_gem_request *req,
 		       u32 flags, u32 scratch_addr)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	ret = intel_ring_begin(ring, 6);
@@ -407,9 +410,8 @@ static int
 gen8_render_ring_flush(struct drm_i915_gem_request *req,
 		       u32 invalidate_domains, u32 flush_domains)
 {
-	struct intel_engine_cs *ring = req->ring;
 	u32 flags = 0;
-	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
+	u32 scratch_addr = req->ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
 
 	flags |= PIPE_CONTROL_CS_STALL;
@@ -429,7 +431,7 @@ gen8_render_ring_flush(struct drm_i915_gem_request *req,
 		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
 
 		/* WaCsStallBeforeStateCacheInvalidate:bdw,chv */
-		ret = gen8_emit_pipe_control(ring,
+		ret = gen8_emit_pipe_control(req,
 					     PIPE_CONTROL_CS_STALL |
 					     PIPE_CONTROL_STALL_AT_SCOREBOARD,
 					     0);
@@ -437,7 +439,7 @@ gen8_render_ring_flush(struct drm_i915_gem_request *req,
 			return ret;
 	}
 
-	return gen8_emit_pipe_control(ring, flags, scratch_addr);
+	return gen8_emit_pipe_control(req, flags, scratch_addr);
 }
 
 static void ring_write_tail(struct intel_engine_cs *ring,
-- 
1.7.9.5

* [PATCH 38/55] drm/i915: Update ring->emit_flush() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (36 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 37/55] drm/i915: Update some flush helpers to take request structures John.C.Harrison
@ 2015-05-29 16:43 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 39/55] drm/i915: Update ring->add_request() " John.C.Harrison
                   ` (18 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:43 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the various ring->emit_flush() implementations to take a request instead
of a ringbuf/context pair.
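
Together with patch 36, this leaves the legacy and execlist flush vfuncs with
matching shapes (sketch of the resulting intel_ringbuffer.h declarations):

	int __must_check (*flush)(struct drm_i915_gem_request *req,
				  u32 invalidate_domains,
				  u32 flush_domains);
	int (*emit_flush)(struct drm_i915_gem_request *request,
			  u32 invalidate_domains,
			  u32 flush_domains);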

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        |   17 ++++++++---------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    3 +--
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c067cbb..e46d6be 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -615,8 +615,7 @@ static int logical_ring_invalidate_all_caches(struct drm_i915_gem_request *req)
 	if (ring->gpu_caches_dirty)
 		flush_domains = I915_GEM_GPU_DOMAINS;
 
-	ret = ring->emit_flush(req->ringbuf, req->ctx,
-			       I915_GEM_GPU_DOMAINS, flush_domains);
+	ret = ring->emit_flush(req, I915_GEM_GPU_DOMAINS, flush_domains);
 	if (ret)
 		return ret;
 
@@ -994,7 +993,7 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
 	if (!ring->gpu_caches_dirty)
 		return 0;
 
-	ret = ring->emit_flush(req->ringbuf, req->ctx, 0, I915_GEM_GPU_DOMAINS);
+	ret = ring->emit_flush(req, 0, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -1192,18 +1191,18 @@ static void gen8_logical_ring_put_irq(struct intel_engine_cs *ring)
 	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
 }
 
-static int gen8_emit_flush(struct intel_ringbuffer *ringbuf,
-			   struct intel_context *ctx,
+static int gen8_emit_flush(struct drm_i915_gem_request *request,
 			   u32 invalidate_domains,
 			   u32 unused)
 {
+	struct intel_ringbuffer *ringbuf = request->ringbuf;
 	struct intel_engine_cs *ring = ringbuf->ring;
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint32_t cmd;
 	int ret;
 
-	ret = intel_logical_ring_begin(ringbuf, ctx, 4);
+	ret = intel_logical_ring_begin(ringbuf, request->ctx, 4);
 	if (ret)
 		return ret;
 
@@ -1233,11 +1232,11 @@ static int gen8_emit_flush(struct intel_ringbuffer *ringbuf,
 	return 0;
 }
 
-static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
-				  struct intel_context *ctx,
+static int gen8_emit_flush_render(struct drm_i915_gem_request *request,
 				  u32 invalidate_domains,
 				  u32 flush_domains)
 {
+	struct intel_ringbuffer *ringbuf = request->ringbuf;
 	struct intel_engine_cs *ring = ringbuf->ring;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	bool vf_flush_wa;
@@ -1269,7 +1268,7 @@ static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
 	vf_flush_wa = INTEL_INFO(ring->dev)->gen >= 9 &&
 		      flags & PIPE_CONTROL_VF_CACHE_INVALIDATE;
 
-	ret = intel_logical_ring_begin(ringbuf, ctx, vf_flush_wa ? 12 : 6);
+	ret = intel_logical_ring_begin(ringbuf, request->ctx, vf_flush_wa ? 12 : 6);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 87113e3..0b9fecd 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -246,8 +246,7 @@ struct  intel_engine_cs {
 	u32             irq_keep_mask; /* bitmask for interrupts that should not be masked */
 	int		(*emit_request)(struct intel_ringbuffer *ringbuf,
 					struct drm_i915_gem_request *request);
-	int		(*emit_flush)(struct intel_ringbuffer *ringbuf,
-				      struct intel_context *ctx,
+	int		(*emit_flush)(struct drm_i915_gem_request *request,
 				      u32 invalidate_domains,
 				      u32 flush_domains);
 	int		(*emit_bb_start)(struct intel_ringbuffer *ringbuf,
-- 
1.7.9.5

* [PATCH 39/55] drm/i915: Update ring->add_request() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (37 preceding siblings ...)
  2015-05-29 16:43 ` [PATCH 38/55] drm/i915: Update ring->emit_flush() to take a request structure John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 40/55] drm/i915: Update ring->emit_request() " John.C.Harrison
                   ` (17 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the various ring->add_request() implementations to take a request
instead of a ring. This removes their reliance on the OLR to obtain the seqno
value that the request should be tagged with.
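
The essential change, repeated in each implementation below, is where the
seqno comes from:

	/* before: fetched via the ring's floating OLR */
	intel_ring_emit(ring,
		    i915_gem_request_get_seqno(ring->outstanding_lazy_request));

	/* after: taken from the explicit request being emitted */
	intel_ring_emit(ring, i915_gem_request_get_seqno(req));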

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |    2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |   26 ++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    2 +-
 3 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 27836bc..c035b9a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2524,7 +2524,7 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 	if (i915.enable_execlists)
 		ret = ring->emit_request(ringbuf, request);
 	else {
-		ret = ring->add_request(ring);
+		ret = ring->add_request(request);
 
 		request->tail = intel_ring_get_tail(ringbuf);
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 323b295..e2ad3b8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1278,16 +1278,16 @@ static int gen6_signal(struct intel_engine_cs *signaller,
 
 /**
  * gen6_add_request - Update the semaphore mailbox registers
- * 
- * @ring - ring that is adding a request
- * @seqno - return seqno stuck into the ring
+ *
+ * @request - request to write to the ring
  *
  * Update the mailbox registers in the *other* rings with the current seqno.
  * This acts like a signal in the canonical semaphore.
  */
 static int
-gen6_add_request(struct intel_engine_cs *ring)
+gen6_add_request(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	if (ring->semaphore.signal)
@@ -1300,8 +1300,7 @@ gen6_add_request(struct intel_engine_cs *ring)
 
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
-	intel_ring_emit(ring,
-		    i915_gem_request_get_seqno(ring->outstanding_lazy_request));
+	intel_ring_emit(ring, i915_gem_request_get_seqno(req));
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	__intel_ring_advance(ring);
 
@@ -1398,8 +1397,9 @@ do {									\
 } while (0)
 
 static int
-pc_render_add_request(struct intel_engine_cs *ring)
+pc_render_add_request(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
 
@@ -1419,8 +1419,7 @@ pc_render_add_request(struct intel_engine_cs *ring)
 			PIPE_CONTROL_WRITE_FLUSH |
 			PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE);
 	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
-	intel_ring_emit(ring,
-		    i915_gem_request_get_seqno(ring->outstanding_lazy_request));
+	intel_ring_emit(ring, i915_gem_request_get_seqno(req));
 	intel_ring_emit(ring, 0);
 	PIPE_CONTROL_FLUSH(ring, scratch_addr);
 	scratch_addr += 2 * CACHELINE_BYTES; /* write to separate cachelines */
@@ -1439,8 +1438,7 @@ pc_render_add_request(struct intel_engine_cs *ring)
 			PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
 			PIPE_CONTROL_NOTIFY);
 	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
-	intel_ring_emit(ring,
-		    i915_gem_request_get_seqno(ring->outstanding_lazy_request));
+	intel_ring_emit(ring, i915_gem_request_get_seqno(req));
 	intel_ring_emit(ring, 0);
 	__intel_ring_advance(ring);
 
@@ -1609,8 +1607,9 @@ bsd_ring_flush(struct drm_i915_gem_request *req,
 }
 
 static int
-i9xx_add_request(struct intel_engine_cs *ring)
+i9xx_add_request(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	ret = intel_ring_begin(ring, 4);
@@ -1619,8 +1618,7 @@ i9xx_add_request(struct intel_engine_cs *ring)
 
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
-	intel_ring_emit(ring,
-		    i915_gem_request_get_seqno(ring->outstanding_lazy_request));
+	intel_ring_emit(ring, i915_gem_request_get_seqno(req));
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	__intel_ring_advance(ring);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 0b9fecd..91d379b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -161,7 +161,7 @@ struct  intel_engine_cs {
 	int __must_check (*flush)(struct drm_i915_gem_request *req,
 				  u32	invalidate_domains,
 				  u32	flush_domains);
-	int		(*add_request)(struct intel_engine_cs *ring);
+	int		(*add_request)(struct drm_i915_gem_request *req);
 	/* Some chipsets are not quite as coherent as advertised and need
 	 * an expensive kick to force a true read of the up-to-date seqno.
 	 * However, the up-to-date seqno is not always required and the last
-- 
1.7.9.5

* [PATCH 40/55] drm/i915: Update ring->emit_request() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (38 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 39/55] drm/i915: Update ring->add_request() " John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 41/55] drm/i915: Update ring->dispatch_execbuffer() " John.C.Harrison
                   ` (16 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the ring->emit_request() implementation to take a request instead of a
ringbuf/request pair. Also removed its use of the OLR for obtaining the
request's seqno.
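
The conversion follows the same shape as the rest of the series: the callee
derives the engine/ringbuf state it needs from the request itself, instead of
the caller keeping a ringbuf/request pair in sync. A minimal stand-alone
sketch of the resulting pattern (the types below are simplified stand-ins,
not the real driver definitions):

#include <stdint.h>

struct intel_engine_cs { int id; };

struct intel_ringbuffer {
	struct intel_engine_cs *ring;
	uint32_t dwords[16];
	int tail;
};

struct drm_i915_gem_request {
	struct intel_engine_cs *ring;
	struct intel_ringbuffer *ringbuf;
	uint32_t seqno;
};

/* Was: emit_request(struct intel_ringbuffer *ringbuf,
 *                   struct drm_i915_gem_request *request) */
static int emit_request(struct drm_i915_gem_request *request)
{
	struct intel_ringbuffer *ringbuf = request->ringbuf;

	/* The seqno comes from the request, not from any per-ring OLR. */
	ringbuf->dwords[ringbuf->tail++] = request->seqno;
	return 0;
}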

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c        |    7 +++----
 drivers/gpu/drm/i915/intel_ringbuffer.h |    3 +--
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c035b9a..f98374a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2522,7 +2522,7 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 	request->postfix = intel_ring_get_tail(ringbuf);
 
 	if (i915.enable_execlists)
-		ret = ring->emit_request(ringbuf, request);
+		ret = ring->emit_request(request);
 	else {
 		ret = ring->add_request(request);
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e46d6be..86b021b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1302,9 +1302,9 @@ static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno)
 	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
 }
 
-static int gen8_emit_request(struct intel_ringbuffer *ringbuf,
-			     struct drm_i915_gem_request *request)
+static int gen8_emit_request(struct drm_i915_gem_request *request)
 {
+	struct intel_ringbuffer *ringbuf = request->ringbuf;
 	struct intel_engine_cs *ring = ringbuf->ring;
 	u32 cmd;
 	int ret;
@@ -1326,8 +1326,7 @@ static int gen8_emit_request(struct intel_ringbuffer *ringbuf,
 				(ring->status_page.gfx_addr +
 				(I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)));
 	intel_logical_ring_emit(ringbuf, 0);
-	intel_logical_ring_emit(ringbuf,
-		i915_gem_request_get_seqno(ring->outstanding_lazy_request));
+	intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request));
 	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
 	intel_logical_ring_emit(ringbuf, MI_NOOP);
 	intel_logical_ring_advance_and_submit(ringbuf, request->ctx, request);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 91d379b..ed81b28 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -244,8 +244,7 @@ struct  intel_engine_cs {
 	struct list_head execlist_retired_req_list;
 	u8 next_context_status_buffer;
 	u32             irq_keep_mask; /* bitmask for interrupts that should not be masked */
-	int		(*emit_request)(struct intel_ringbuffer *ringbuf,
-					struct drm_i915_gem_request *request);
+	int		(*emit_request)(struct drm_i915_gem_request *request);
 	int		(*emit_flush)(struct drm_i915_gem_request *request,
 				      u32 invalidate_domains,
 				      u32 flush_domains);
-- 
1.7.9.5


* [PATCH 41/55] drm/i915: Update ring->dispatch_execbuffer() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (39 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 40/55] drm/i915: Update ring->emit_request() " John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 42/55] drm/i915: Update ring->emit_bb_start() " John.C.Harrison
                   ` (15 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the various ring->dispatch_execbuffer() implementations to take a
request instead of a ring.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    4 ++--
 drivers/gpu/drm/i915/i915_gem_render_state.c |    3 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c      |   18 ++++++++++++------
 drivers/gpu/drm/i915/intel_ringbuffer.h      |    2 +-
 4 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f2f3f99..0b24d8f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1320,14 +1320,14 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 			if (ret)
 				goto error;
 
-			ret = ring->dispatch_execbuffer(ring,
+			ret = ring->dispatch_execbuffer(params->request,
 							exec_start, exec_len,
 							params->dispatch_flags);
 			if (ret)
 				goto error;
 		}
 	} else {
-		ret = ring->dispatch_execbuffer(ring,
+		ret = ring->dispatch_execbuffer(params->request,
 						exec_start, exec_len,
 						params->dispatch_flags);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index e04cda4..a0201fc 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -164,8 +164,7 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req)
 	if (so.rodata == NULL)
 		return 0;
 
-	ret = req->ring->dispatch_execbuffer(req->ring,
-					     so.ggtt_offset,
+	ret = req->ring->dispatch_execbuffer(req, so.ggtt_offset,
 					     so.rodata->batch_items * 4,
 					     I915_DISPATCH_SECURE);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e2ad3b8..88791b3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1750,10 +1750,11 @@ gen8_ring_put_irq(struct intel_engine_cs *ring)
 }
 
 static int
-i965_dispatch_execbuffer(struct intel_engine_cs *ring,
+i965_dispatch_execbuffer(struct drm_i915_gem_request *req,
 			 u64 offset, u32 length,
 			 unsigned dispatch_flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	ret = intel_ring_begin(ring, 2);
@@ -1776,10 +1777,11 @@ i965_dispatch_execbuffer(struct intel_engine_cs *ring,
 #define I830_TLB_ENTRIES (2)
 #define I830_WA_SIZE max(I830_TLB_ENTRIES*4096, I830_BATCH_LIMIT)
 static int
-i830_dispatch_execbuffer(struct intel_engine_cs *ring,
+i830_dispatch_execbuffer(struct drm_i915_gem_request *req,
 			 u64 offset, u32 len,
 			 unsigned dispatch_flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	u32 cs_offset = ring->scratch.gtt_offset;
 	int ret;
 
@@ -1838,10 +1840,11 @@ i830_dispatch_execbuffer(struct intel_engine_cs *ring,
 }
 
 static int
-i915_dispatch_execbuffer(struct intel_engine_cs *ring,
+i915_dispatch_execbuffer(struct drm_i915_gem_request *req,
 			 u64 offset, u32 len,
 			 unsigned dispatch_flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	ret = intel_ring_begin(ring, 2);
@@ -2410,10 +2413,11 @@ static int gen6_bsd_ring_flush(struct drm_i915_gem_request *req,
 }
 
 static int
-gen8_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
+gen8_ring_dispatch_execbuffer(struct drm_i915_gem_request *req,
 			      u64 offset, u32 len,
 			      unsigned dispatch_flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	bool ppgtt = USES_PPGTT(ring->dev) &&
 			!(dispatch_flags & I915_DISPATCH_SECURE);
 	int ret;
@@ -2433,10 +2437,11 @@ gen8_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
 }
 
 static int
-hsw_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
+hsw_ring_dispatch_execbuffer(struct drm_i915_gem_request *req,
 			     u64 offset, u32 len,
 			     unsigned dispatch_flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	ret = intel_ring_begin(ring, 2);
@@ -2455,10 +2460,11 @@ hsw_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
 }
 
 static int
-gen6_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
+gen6_ring_dispatch_execbuffer(struct drm_i915_gem_request *req,
 			      u64 offset, u32 len,
 			      unsigned dispatch_flags)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
 	ret = intel_ring_begin(ring, 2);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index ed81b28..1016b8d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -172,7 +172,7 @@ struct  intel_engine_cs {
 				     bool lazy_coherency);
 	void		(*set_seqno)(struct intel_engine_cs *ring,
 				     u32 seqno);
-	int		(*dispatch_execbuffer)(struct intel_engine_cs *ring,
+	int		(*dispatch_execbuffer)(struct drm_i915_gem_request *req,
 					       u64 offset, u32 length,
 					       unsigned dispatch_flags);
 #define I915_DISPATCH_SECURE 0x1
-- 
1.7.9.5


* [PATCH 42/55] drm/i915: Update ring->emit_bb_start() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (40 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 41/55] drm/i915: Update ring->dispatch_execbuffer() " John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 43/55] drm/i915: Update ring->sync_to() " John.C.Harrison
                   ` (14 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the ring->emit_bb_start() implementation to take a request instead of a
ringbuf/context pair.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        |   12 +++++-------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    3 +--
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 86b021b..648aca7 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -925,7 +925,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 	exec_start = params->batch_obj_vm_offset +
 		     args->batch_start_offset;
 
-	ret = ring->emit_bb_start(ringbuf, params->ctx, exec_start, params->dispatch_flags);
+	ret = ring->emit_bb_start(params->request, exec_start, params->dispatch_flags);
 	if (ret)
 		return ret;
 
@@ -1137,14 +1137,14 @@ static int gen9_init_render_ring(struct intel_engine_cs *ring)
 	return init_workarounds_ring(ring);
 }
 
-static int gen8_emit_bb_start(struct intel_ringbuffer *ringbuf,
-			      struct intel_context *ctx,
+static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
 			      u64 offset, unsigned dispatch_flags)
 {
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
 	bool ppgtt = !(dispatch_flags & I915_DISPATCH_SECURE);
 	int ret;
 
-	ret = intel_logical_ring_begin(ringbuf, ctx, 4);
+	ret = intel_logical_ring_begin(ringbuf, req->ctx, 4);
 	if (ret)
 		return ret;
 
@@ -1354,9 +1354,7 @@ static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req)
 	if (so.rodata == NULL)
 		return 0;
 
-	ret = req->ring->emit_bb_start(req->ringbuf,
-				       req->ctx,
-				       so.ggtt_offset,
+	ret = req->ring->emit_bb_start(req, so.ggtt_offset,
 				       I915_DISPATCH_SECURE);
 	if (ret)
 		goto out;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 1016b8d..ab50970 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -248,8 +248,7 @@ struct  intel_engine_cs {
 	int		(*emit_flush)(struct drm_i915_gem_request *request,
 				      u32 invalidate_domains,
 				      u32 flush_domains);
-	int		(*emit_bb_start)(struct intel_ringbuffer *ringbuf,
-					 struct intel_context *ctx,
+	int		(*emit_bb_start)(struct drm_i915_gem_request *req,
 					 u64 offset, unsigned dispatch_flags);
 
 	/**
-- 
1.7.9.5


* [PATCH 43/55] drm/i915: Update ring->sync_to() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (41 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 42/55] drm/i915: Update ring->emit_bb_start() " John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 44/55] drm/i915: Update ring->signal() " John.C.Harrison
                   ` (13 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the ring->sync_to() implementations to take a request instead of a ring.
Also updated the tracer to include the request id.
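
The waiter now identifies itself by request rather than by engine, which is
also what lets the tracepoint report the waiting request's uniq id. Note that
the call site dereferences a request pointer-to-pointer, going from
sync_to(to, from, seqno) to sync_to(*to_req, from, seqno). A reduced sketch
of the new trace arguments (simplified types, with an illustrative printf
standing in for the real tracepoint):

#include <stdio.h>

struct intel_engine_cs { unsigned int id; };

struct drm_i915_gem_request {
	struct intel_engine_cs *ring;
	unsigned int uniq;
	unsigned int seqno;
};

static void trace_ring_sync_to(struct drm_i915_gem_request *to_req,
			       struct intel_engine_cs *from,
			       struct drm_i915_gem_request *req)
{
	/* Mirrors the updated TP_printk format, including to_uniq. */
	printf("sync-from=%u, sync-to=%u, seqno=%u, to_uniq=%u\n",
	       from->id, to_req->ring->id, req->seqno, to_req->uniq);
}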

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |    4 ++--
 drivers/gpu/drm/i915/i915_trace.h       |   14 ++++++++------
 drivers/gpu/drm/i915/intel_ringbuffer.c |    6 ++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |    4 ++--
 4 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f98374a..d34d5ac 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3145,8 +3145,8 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		trace_i915_gem_ring_sync_to(from, to, from_req);
-		ret = to->semaphore.sync_to(to, from, seqno);
+		trace_i915_gem_ring_sync_to(*to_req, from, from_req);
+		ret = to->semaphore.sync_to(*to_req, from, seqno);
 		if (ret)
 			return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 8e2d97f..bc25458 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -424,29 +424,31 @@ TRACE_EVENT(i915_gem_evict_vm,
 );
 
 TRACE_EVENT(i915_gem_ring_sync_to,
-	    TP_PROTO(struct intel_engine_cs *from,
-		     struct intel_engine_cs *to,
+	    TP_PROTO(struct drm_i915_gem_request *to_req,
+		     struct intel_engine_cs *from,
 		     struct drm_i915_gem_request *req),
-	    TP_ARGS(from, to, req),
+	    TP_ARGS(to_req, from, req),
 
 	    TP_STRUCT__entry(
 			     __field(u32, dev)
 			     __field(u32, sync_from)
 			     __field(u32, sync_to)
+			     __field(u32, uniq_to)
 			     __field(u32, seqno)
 			     ),
 
 	    TP_fast_assign(
 			   __entry->dev = from->dev->primary->index;
 			   __entry->sync_from = from->id;
-			   __entry->sync_to = to->id;
+			   __entry->sync_to = to_req->ring->id;
+			   __entry->uniq_to = to_req->uniq;
 			   __entry->seqno = i915_gem_request_get_seqno(req);
 			   ),
 
-	    TP_printk("dev=%u, sync-from=%u, sync-to=%u, seqno=%u",
+	    TP_printk("dev=%u, sync-from=%u, sync-to=%u, seqno=%u, to_uniq=%u",
 		      __entry->dev,
 		      __entry->sync_from, __entry->sync_to,
-		      __entry->seqno)
+		      __entry->seqno, __entry->uniq_to)
 );
 
 TRACE_EVENT(i915_gem_ring_dispatch,
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 88791b3..a79a676 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1323,10 +1323,11 @@ static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
  */
 
 static int
-gen8_ring_sync(struct intel_engine_cs *waiter,
+gen8_ring_sync(struct drm_i915_gem_request *waiter_req,
 	       struct intel_engine_cs *signaller,
 	       u32 seqno)
 {
+	struct intel_engine_cs *waiter = waiter_req->ring;
 	struct drm_i915_private *dev_priv = waiter->dev->dev_private;
 	int ret;
 
@@ -1348,10 +1349,11 @@ gen8_ring_sync(struct intel_engine_cs *waiter,
 }
 
 static int
-gen6_ring_sync(struct intel_engine_cs *waiter,
+gen6_ring_sync(struct drm_i915_gem_request *waiter_req,
 	       struct intel_engine_cs *signaller,
 	       u32 seqno)
 {
+	struct intel_engine_cs *waiter = waiter_req->ring;
 	u32 dw1 = MI_SEMAPHORE_MBOX |
 		  MI_SEMAPHORE_COMPARE |
 		  MI_SEMAPHORE_REGISTER;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index ab50970..6b8a2de 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -230,8 +230,8 @@ struct  intel_engine_cs {
 		};
 
 		/* AKA wait() */
-		int	(*sync_to)(struct intel_engine_cs *ring,
-				   struct intel_engine_cs *to,
+		int	(*sync_to)(struct drm_i915_gem_request *to_req,
+				   struct intel_engine_cs *from,
 				   u32 seqno);
 		int	(*signal)(struct intel_engine_cs *signaller,
 				  /* num_dwords needed by caller */
-- 
1.7.9.5


* [PATCH 44/55] drm/i915: Update ring->signal() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (42 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 43/55] drm/i915: Update ring->sync_to() " John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 45/55] drm/i915: Update cacheline_align() " John.C.Harrison
                   ` (12 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated the various ring->signal() implementations to take a request instead of
a ring. This removes their reliance on the OLR to obtain the seqno value that
should be used for the signal.
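
The substantive change is where the seqno comes from. A reduced before/after
sketch (simplified types; the olr field is shown only to illustrate what is
being removed):

struct drm_i915_gem_request { unsigned int seqno; };

struct intel_engine_cs {
	/* Gone by the end of this series. */
	struct drm_i915_gem_request *outstanding_lazy_request;
};

/* Before: implicit -- whatever request happens to be floating around. */
static unsigned int signal_seqno_old(struct intel_engine_cs *signaller)
{
	return signaller->outstanding_lazy_request->seqno;
}

/* After: explicit -- the request this signal actually belongs to. */
static unsigned int signal_seqno_new(struct drm_i915_gem_request *signaller_req)
{
	return signaller_req->seqno;
}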

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c |   20 ++++++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    2 +-
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index a79a676..d3d384a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1161,10 +1161,11 @@ static void render_ring_cleanup(struct intel_engine_cs *ring)
 	intel_fini_pipe_control(ring);
 }
 
-static int gen8_rcs_signal(struct intel_engine_cs *signaller,
+static int gen8_rcs_signal(struct drm_i915_gem_request *signaller_req,
 			   unsigned int num_dwords)
 {
 #define MBOX_UPDATE_DWORDS 8
+	struct intel_engine_cs *signaller = signaller_req->ring;
 	struct drm_device *dev = signaller->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *waiter;
@@ -1184,8 +1185,7 @@ static int gen8_rcs_signal(struct intel_engine_cs *signaller,
 		if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
 			continue;
 
-		seqno = i915_gem_request_get_seqno(
-					   signaller->outstanding_lazy_request);
+		seqno = i915_gem_request_get_seqno(signaller_req);
 		intel_ring_emit(signaller, GFX_OP_PIPE_CONTROL(6));
 		intel_ring_emit(signaller, PIPE_CONTROL_GLOBAL_GTT_IVB |
 					   PIPE_CONTROL_QW_WRITE |
@@ -1202,10 +1202,11 @@ static int gen8_rcs_signal(struct intel_engine_cs *signaller,
 	return 0;
 }
 
-static int gen8_xcs_signal(struct intel_engine_cs *signaller,
+static int gen8_xcs_signal(struct drm_i915_gem_request *signaller_req,
 			   unsigned int num_dwords)
 {
 #define MBOX_UPDATE_DWORDS 6
+	struct intel_engine_cs *signaller = signaller_req->ring;
 	struct drm_device *dev = signaller->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *waiter;
@@ -1225,8 +1226,7 @@ static int gen8_xcs_signal(struct intel_engine_cs *signaller,
 		if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
 			continue;
 
-		seqno = i915_gem_request_get_seqno(
-					   signaller->outstanding_lazy_request);
+		seqno = i915_gem_request_get_seqno(signaller_req);
 		intel_ring_emit(signaller, (MI_FLUSH_DW + 1) |
 					   MI_FLUSH_DW_OP_STOREDW);
 		intel_ring_emit(signaller, lower_32_bits(gtt_offset) |
@@ -1241,9 +1241,10 @@ static int gen8_xcs_signal(struct intel_engine_cs *signaller,
 	return 0;
 }
 
-static int gen6_signal(struct intel_engine_cs *signaller,
+static int gen6_signal(struct drm_i915_gem_request *signaller_req,
 		       unsigned int num_dwords)
 {
+	struct intel_engine_cs *signaller = signaller_req->ring;
 	struct drm_device *dev = signaller->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *useless;
@@ -1261,8 +1262,7 @@ static int gen6_signal(struct intel_engine_cs *signaller,
 	for_each_ring(useless, dev_priv, i) {
 		u32 mbox_reg = signaller->semaphore.mbox.signal[i];
 		if (mbox_reg != GEN6_NOSYNC) {
-			u32 seqno = i915_gem_request_get_seqno(
-					   signaller->outstanding_lazy_request);
+			u32 seqno = i915_gem_request_get_seqno(signaller_req);
 			intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
 			intel_ring_emit(signaller, mbox_reg);
 			intel_ring_emit(signaller, seqno);
@@ -1291,7 +1291,7 @@ gen6_add_request(struct drm_i915_gem_request *req)
 	int ret;
 
 	if (ring->semaphore.signal)
-		ret = ring->semaphore.signal(ring, 4);
+		ret = ring->semaphore.signal(req, 4);
 	else
 		ret = intel_ring_begin(ring, 4);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 6b8a2de..f78ef49 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -233,7 +233,7 @@ struct  intel_engine_cs {
 		int	(*sync_to)(struct drm_i915_gem_request *to_req,
 				   struct intel_engine_cs *from,
 				   u32 seqno);
-		int	(*signal)(struct intel_engine_cs *signaller,
+		int	(*signal)(struct drm_i915_gem_request *signaller_req,
 				  /* num_dwords needed by caller */
 				  unsigned int num_dwords);
 	} semaphore;
-- 
1.7.9.5


* [PATCH 45/55] drm/i915: Update cacheline_align() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (43 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 44/55] drm/i915: Update ring->signal() " John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 46/55] drm/i915: Update intel_ring_begin() " John.C.Harrison
                   ` (11 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Updated intel_ring_cacheline_align() to take a request instead of a ring.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c    |    2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |    3 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |    2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 811ff0a..0327628 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10822,7 +10822,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	 * then do the cacheline alignment, and finally emit the
 	 * MI_DISPLAY_FLIP.
 	 */
-	ret = intel_ring_cacheline_align(ring);
+	ret = intel_ring_cacheline_align(req);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d3d384a..8ebc0d8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2298,8 +2298,9 @@ int intel_ring_begin(struct intel_engine_cs *ring,
 }
 
 /* Align the ring tail to a cacheline boundary */
-int intel_ring_cacheline_align(struct intel_engine_cs *ring)
+int intel_ring_cacheline_align(struct drm_i915_gem_request *req)
 {
+	struct intel_engine_cs *ring = req->ring;
 	int num_dwords = (ring->buffer->tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t);
 	int ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f78ef49..bfeca53 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -400,7 +400,7 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring);
 int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request);
 
 int __must_check intel_ring_begin(struct intel_engine_cs *ring, int n);
-int __must_check intel_ring_cacheline_align(struct intel_engine_cs *ring);
+int __must_check intel_ring_cacheline_align(struct drm_i915_gem_request *req);
 static inline void intel_ring_emit(struct intel_engine_cs *ring,
 				   u32 data)
 {
-- 
1.7.9.5


* [PATCH 46/55] drm/i915: Update intel_ring_begin() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (44 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 45/55] drm/i915: Update cacheline_align() " John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-06-23 10:24   ` Chris Wilson
  2015-05-29 16:44 ` [PATCH 47/55] drm/i915: Update intel_logical_ring_begin() " John.C.Harrison
                   ` (10 subsequent siblings)
  56 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Now that everything above has been converted to use requests, intel_ring_begin()
can be updated to take a request instead of a ring. This also means that it no
longer needs to lazily allocate a request if no one happens to have done it
earlier.
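
The lazy allocation being removed here is also what previously made it unsafe
for the request allocation code to call _begin() itself (fixed by a later
patch in this series). A schematic of the cycle this patch breaks (call graph
only, not driver code):

/* Old call graph:
 *
 *   intel_ring_begin(ring, n)
 *     -> i915_gem_request_alloc()      // lazily creates the OLR
 *        -> wants intel_ring_begin()   // to guarantee add_request space
 *           -> infinite recursion
 *
 * New call graph:
 *
 *   i915_gem_request_alloc()
 *     -> intel_ring_begin(req, 0)      // safe: begin no longer allocates
 *   intel_ring_begin(req, n)           // only checks/consumes ring space
 */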

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c            |    2 +-
 drivers/gpu/drm/i915/i915_gem_context.c    |    2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    8 +--
 drivers/gpu/drm/i915/i915_gem_gtt.c        |    6 +--
 drivers/gpu/drm/i915/intel_display.c       |   10 ++--
 drivers/gpu/drm/i915/intel_overlay.c       |    8 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c    |   74 ++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    |    2 +-
 8 files changed, 55 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d34d5ac..9f3e0717 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4900,7 +4900,7 @@ int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
 	if (!HAS_L3_DPF(dev) || !remap_info)
 		return 0;
 
-	ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
+	ret = intel_ring_begin(req, GEN7_L3LOG_SIZE / 4 * 3);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index b90e4c0..5161747 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -509,7 +509,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 	if (INTEL_INFO(ring->dev)->gen >= 7)
 		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
 
-	ret = intel_ring_begin(ring, len);
+	ret = intel_ring_begin(req, len);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0b24d8f..805d288 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1074,7 +1074,7 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 		return -EINVAL;
 	}
 
-	ret = intel_ring_begin(ring, 4 * 3);
+	ret = intel_ring_begin(req, 4 * 3);
 	if (ret)
 		return ret;
 
@@ -1105,7 +1105,7 @@ i915_emit_box(struct drm_i915_gem_request *req,
 	}
 
 	if (INTEL_INFO(ring->dev)->gen >= 4) {
-		ret = intel_ring_begin(ring, 4);
+		ret = intel_ring_begin(req, 4);
 		if (ret)
 			return ret;
 
@@ -1114,7 +1114,7 @@ i915_emit_box(struct drm_i915_gem_request *req,
 		intel_ring_emit(ring, ((box->x2 - 1) & 0xffff) | (box->y2 - 1) << 16);
 		intel_ring_emit(ring, DR4);
 	} else {
-		ret = intel_ring_begin(ring, 6);
+		ret = intel_ring_begin(req, 6);
 		if (ret)
 			return ret;
 
@@ -1290,7 +1290,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 
 	if (ring == &dev_priv->ring[RCS] &&
 			instp_mode != dev_priv->relative_constants_mode) {
-		ret = intel_ring_begin(ring, 4);
+		ret = intel_ring_begin(params->request, 4);
 		if (ret)
 			goto error;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ea522fa..acde25a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -461,7 +461,7 @@ static int gen8_write_pdp(struct drm_i915_gem_request *req,
 
 	BUG_ON(entry >= 4);
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret)
 		return ret;
 
@@ -1066,7 +1066,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	if (ret)
 		return ret;
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret)
 		return ret;
 
@@ -1103,7 +1103,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	if (ret)
 		return ret;
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0327628..91e19d0 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10643,7 +10643,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	u32 flip_mask;
 	int ret;
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret)
 		return ret;
 
@@ -10678,7 +10678,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	u32 flip_mask;
 	int ret;
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret)
 		return ret;
 
@@ -10711,7 +10711,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 	uint32_t pf, pipesrc;
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -10750,7 +10750,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 	uint32_t pf, pipesrc;
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -10826,7 +10826,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	if (ret)
 		return ret;
 
-	ret = intel_ring_begin(ring, len);
+	ret = intel_ring_begin(req, len);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 3f70904..4445426 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -244,7 +244,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
 	if (ret)
 		return ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret) {
 		i915_gem_request_cancel(req);
 		return ret;
@@ -287,7 +287,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
 	if (ret)
 		return ret;
 
-	ret = intel_ring_begin(ring, 2);
+	ret = intel_ring_begin(req, 2);
 	if (ret) {
 		i915_gem_request_cancel(req);
 		return ret;
@@ -353,7 +353,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
 	if (ret)
 		return ret;
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret) {
 		i915_gem_request_cancel(req);
 		return ret;
@@ -427,7 +427,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
 		if (ret)
 			return ret;
 
-		ret = intel_ring_begin(ring, 2);
+		ret = intel_ring_begin(req, 2);
 		if (ret) {
 			i915_gem_request_cancel(req);
 			return ret;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8ebc0d8..bb10fc2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -106,7 +106,7 @@ gen2_render_ring_flush(struct drm_i915_gem_request *req,
 	if (invalidate_domains & I915_GEM_DOMAIN_SAMPLER)
 		cmd |= MI_READ_FLUSH;
 
-	ret = intel_ring_begin(ring, 2);
+	ret = intel_ring_begin(req, 2);
 	if (ret)
 		return ret;
 
@@ -165,7 +165,7 @@ gen4_render_ring_flush(struct drm_i915_gem_request *req,
 	    (IS_G4X(dev) || IS_GEN5(dev)))
 		cmd |= MI_INVALIDATE_ISP;
 
-	ret = intel_ring_begin(ring, 2);
+	ret = intel_ring_begin(req, 2);
 	if (ret)
 		return ret;
 
@@ -220,8 +220,7 @@ intel_emit_post_sync_nonzero_flush(struct drm_i915_gem_request *req)
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
 
-
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret)
 		return ret;
 
@@ -234,7 +233,7 @@ intel_emit_post_sync_nonzero_flush(struct drm_i915_gem_request *req)
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_advance(ring);
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret)
 		return ret;
 
@@ -289,7 +288,7 @@ gen6_render_ring_flush(struct drm_i915_gem_request *req,
 		flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
 	}
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -308,7 +307,7 @@ gen7_render_ring_cs_stall_wa(struct drm_i915_gem_request *req)
 	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -371,7 +370,7 @@ gen7_render_ring_flush(struct drm_i915_gem_request *req,
 		gen7_render_ring_cs_stall_wa(req);
 	}
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -391,7 +390,7 @@ gen8_emit_pipe_control(struct drm_i915_gem_request *req,
 	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret)
 		return ret;
 
@@ -726,7 +725,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
 	if (ret)
 		return ret;
 
-	ret = intel_ring_begin(ring, (w->count * 2 + 2));
+	ret = intel_ring_begin(req, (w->count * 2 + 2));
 	if (ret)
 		return ret;
 
@@ -1175,7 +1174,7 @@ static int gen8_rcs_signal(struct drm_i915_gem_request *signaller_req,
 	num_dwords += (num_rings-1) * MBOX_UPDATE_DWORDS;
 #undef MBOX_UPDATE_DWORDS
 
-	ret = intel_ring_begin(signaller, num_dwords);
+	ret = intel_ring_begin(signaller_req, num_dwords);
 	if (ret)
 		return ret;
 
@@ -1216,7 +1215,7 @@ static int gen8_xcs_signal(struct drm_i915_gem_request *signaller_req,
 	num_dwords += (num_rings-1) * MBOX_UPDATE_DWORDS;
 #undef MBOX_UPDATE_DWORDS
 
-	ret = intel_ring_begin(signaller, num_dwords);
+	ret = intel_ring_begin(signaller_req, num_dwords);
 	if (ret)
 		return ret;
 
@@ -1255,7 +1254,7 @@ static int gen6_signal(struct drm_i915_gem_request *signaller_req,
 	num_dwords += round_up((num_rings-1) * MBOX_UPDATE_DWORDS, 2);
 #undef MBOX_UPDATE_DWORDS
 
-	ret = intel_ring_begin(signaller, num_dwords);
+	ret = intel_ring_begin(signaller_req, num_dwords);
 	if (ret)
 		return ret;
 
@@ -1293,7 +1292,7 @@ gen6_add_request(struct drm_i915_gem_request *req)
 	if (ring->semaphore.signal)
 		ret = ring->semaphore.signal(req, 4);
 	else
-		ret = intel_ring_begin(ring, 4);
+		ret = intel_ring_begin(req, 4);
 
 	if (ret)
 		return ret;
@@ -1331,7 +1330,7 @@ gen8_ring_sync(struct drm_i915_gem_request *waiter_req,
 	struct drm_i915_private *dev_priv = waiter->dev->dev_private;
 	int ret;
 
-	ret = intel_ring_begin(waiter, 4);
+	ret = intel_ring_begin(waiter_req, 4);
 	if (ret)
 		return ret;
 
@@ -1368,7 +1367,7 @@ gen6_ring_sync(struct drm_i915_gem_request *waiter_req,
 
 	WARN_ON(wait_mbox == MI_SEMAPHORE_SYNC_INVALID);
 
-	ret = intel_ring_begin(waiter, 4);
+	ret = intel_ring_begin(waiter_req, 4);
 	if (ret)
 		return ret;
 
@@ -1413,7 +1412,7 @@ pc_render_add_request(struct drm_i915_gem_request *req)
 	 * incoherence by flushing the 6 PIPE_NOTIFY buffers out to
 	 * memory before requesting an interrupt.
 	 */
-	ret = intel_ring_begin(ring, 32);
+	ret = intel_ring_begin(req, 32);
 	if (ret)
 		return ret;
 
@@ -1598,7 +1597,7 @@ bsd_ring_flush(struct drm_i915_gem_request *req,
 	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
-	ret = intel_ring_begin(ring, 2);
+	ret = intel_ring_begin(req, 2);
 	if (ret)
 		return ret;
 
@@ -1614,7 +1613,7 @@ i9xx_add_request(struct drm_i915_gem_request *req)
 	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -1759,7 +1758,7 @@ i965_dispatch_execbuffer(struct drm_i915_gem_request *req,
 	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
-	ret = intel_ring_begin(ring, 2);
+	ret = intel_ring_begin(req, 2);
 	if (ret)
 		return ret;
 
@@ -1787,7 +1786,7 @@ i830_dispatch_execbuffer(struct drm_i915_gem_request *req,
 	u32 cs_offset = ring->scratch.gtt_offset;
 	int ret;
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ring_begin(req, 6);
 	if (ret)
 		return ret;
 
@@ -1804,7 +1803,7 @@ i830_dispatch_execbuffer(struct drm_i915_gem_request *req,
 		if (len > I830_BATCH_LIMIT)
 			return -ENOSPC;
 
-		ret = intel_ring_begin(ring, 6 + 2);
+		ret = intel_ring_begin(req, 6 + 2);
 		if (ret)
 			return ret;
 
@@ -1827,7 +1826,7 @@ i830_dispatch_execbuffer(struct drm_i915_gem_request *req,
 		offset = cs_offset;
 	}
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -1849,7 +1848,7 @@ i915_dispatch_execbuffer(struct drm_i915_gem_request *req,
 	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
-	ret = intel_ring_begin(ring, 2);
+	ret = intel_ring_begin(req, 2);
 	if (ret)
 		return ret;
 
@@ -2272,13 +2271,17 @@ static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
 	return 0;
 }
 
-int intel_ring_begin(struct intel_engine_cs *ring,
+int intel_ring_begin(struct drm_i915_gem_request *req,
 		     int num_dwords)
 {
-	struct drm_i915_gem_request *req;
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_engine_cs *ring;
+	struct drm_i915_private *dev_priv;
 	int ret;
 
+	WARN_ON(req == NULL);
+	ring = req->ring;
+	dev_priv = ring->dev->dev_private;
+
 	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
 				   dev_priv->mm.interruptible);
 	if (ret)
@@ -2288,11 +2291,6 @@ int intel_ring_begin(struct intel_engine_cs *ring,
 	if (ret)
 		return ret;
 
-	/* Preallocate the olr before touching the ring */
-	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
-	if (ret)
-		return ret;
-
 	ring->buffer->space -= num_dwords * sizeof(uint32_t);
 	return 0;
 }
@@ -2308,7 +2306,7 @@ int intel_ring_cacheline_align(struct drm_i915_gem_request *req)
 		return 0;
 
 	num_dwords = CACHELINE_BYTES / sizeof(uint32_t) - num_dwords;
-	ret = intel_ring_begin(ring, num_dwords);
+	ret = intel_ring_begin(req, num_dwords);
 	if (ret)
 		return ret;
 
@@ -2378,7 +2376,7 @@ static int gen6_bsd_ring_flush(struct drm_i915_gem_request *req,
 	uint32_t cmd;
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -2425,7 +2423,7 @@ gen8_ring_dispatch_execbuffer(struct drm_i915_gem_request *req,
 			!(dispatch_flags & I915_DISPATCH_SECURE);
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -2447,7 +2445,7 @@ hsw_ring_dispatch_execbuffer(struct drm_i915_gem_request *req,
 	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
-	ret = intel_ring_begin(ring, 2);
+	ret = intel_ring_begin(req, 2);
 	if (ret)
 		return ret;
 
@@ -2470,7 +2468,7 @@ gen6_ring_dispatch_execbuffer(struct drm_i915_gem_request *req,
 	struct intel_engine_cs *ring = req->ring;
 	int ret;
 
-	ret = intel_ring_begin(ring, 2);
+	ret = intel_ring_begin(req, 2);
 	if (ret)
 		return ret;
 
@@ -2495,7 +2493,7 @@ static int gen6_ring_flush(struct drm_i915_gem_request *req,
 	uint32_t cmd;
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index bfeca53..16fd9ba 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -399,7 +399,7 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring);
 
 int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request);
 
-int __must_check intel_ring_begin(struct intel_engine_cs *ring, int n);
+int __must_check intel_ring_begin(struct drm_i915_gem_request *req, int n);
 int __must_check intel_ring_cacheline_align(struct drm_i915_gem_request *req);
 static inline void intel_ring_emit(struct intel_engine_cs *ring,
 				   u32 data)
-- 
1.7.9.5


* [PATCH 47/55] drm/i915: Update intel_logical_ring_begin() to take a request structure
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (45 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 46/55] drm/i915: Update intel_ring_begin() " John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation John.C.Harrison
                   ` (9 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Now that everything above has been converted to use requests,
intel_logical_ring_begin() can be updated to take a request instead of a
ringbuf/context pair. This also means that it no longer needs to lazily allocate
a request if no one happens to have done it earlier.

Note that this change makes the execlist signature the same as the legacy
version. Thus the two functions could be merged into a ring->begin() wrapper if
required.
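
Purely as an illustration of that possible merge: with both signatures now
identical, a single engine vfunc would suffice. This is a hypothetical
sketch, no 'begin' member exists in this series:

struct drm_i915_gem_request;

struct intel_engine_cs {
	/* Hypothetical: would replace the if(execlists) selection. */
	int (*begin)(struct drm_i915_gem_request *req, int num_dwords);
};

static int ring_begin(struct intel_engine_cs *ring,
		      struct drm_i915_gem_request *req, int num_dwords)
{
	return ring->begin(req, num_dwords);
}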

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c |   36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 648aca7..548c53d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -790,7 +790,7 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
 /**
  * intel_logical_ring_begin() - prepare the logical ringbuffer to accept some commands
  *
- * @ringbuf: Logical ringbuffer.
+ * @request: The request to start some new work for
  * @num_dwords: number of DWORDs that we plan to write to the ringbuffer.
  *
  * The ringbuffer might not be ready to accept the commands right away (maybe it needs to
@@ -800,30 +800,26 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
  *
  * Return: non-zero if the ringbuffer is not ready to be written to.
  */
-static int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
-				    struct intel_context *ctx, int num_dwords)
+static int intel_logical_ring_begin(struct drm_i915_gem_request *req,
+				    int num_dwords)
 {
-	struct drm_i915_gem_request *req;
-	struct intel_engine_cs *ring = ringbuf->ring;
-	struct drm_device *dev = ring->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv;
 	int ret;
 
+	WARN_ON(req == NULL);
+	dev_priv = req->ring->dev->dev_private;
+
 	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
 				   dev_priv->mm.interruptible);
 	if (ret)
 		return ret;
 
-	ret = logical_ring_prepare(ringbuf, ctx, num_dwords * sizeof(uint32_t));
-	if (ret)
-		return ret;
-
-	/* Preallocate the olr before touching the ring */
-	ret = i915_gem_request_alloc(ring, ctx, &req);
+	ret = logical_ring_prepare(req->ringbuf, req->ctx,
+				   num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
 
-	ringbuf->space -= num_dwords * sizeof(uint32_t);
+	req->ringbuf->space -= num_dwords * sizeof(uint32_t);
 	return 0;
 }
 
@@ -909,7 +905,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 
 	if (ring == &dev_priv->ring[RCS] &&
 	    instp_mode != dev_priv->relative_constants_mode) {
-		ret = intel_logical_ring_begin(ringbuf, params->ctx, 4);
+		ret = intel_logical_ring_begin(params->request, 4);
 		if (ret)
 			return ret;
 
@@ -1062,7 +1058,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
 	if (ret)
 		return ret;
 
-	ret = intel_logical_ring_begin(ringbuf, req->ctx, w->count * 2 + 2);
+	ret = intel_logical_ring_begin(req, w->count * 2 + 2);
 	if (ret)
 		return ret;
 
@@ -1144,7 +1140,7 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
 	bool ppgtt = !(dispatch_flags & I915_DISPATCH_SECURE);
 	int ret;
 
-	ret = intel_logical_ring_begin(ringbuf, req->ctx, 4);
+	ret = intel_logical_ring_begin(req, 4);
 	if (ret)
 		return ret;
 
@@ -1202,7 +1198,7 @@ static int gen8_emit_flush(struct drm_i915_gem_request *request,
 	uint32_t cmd;
 	int ret;
 
-	ret = intel_logical_ring_begin(ringbuf, request->ctx, 4);
+	ret = intel_logical_ring_begin(request, 4);
 	if (ret)
 		return ret;
 
@@ -1268,7 +1264,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request,
 	vf_flush_wa = INTEL_INFO(ring->dev)->gen >= 9 &&
 		      flags & PIPE_CONTROL_VF_CACHE_INVALIDATE;
 
-	ret = intel_logical_ring_begin(ringbuf, request->ctx, vf_flush_wa ? 12 : 6);
+	ret = intel_logical_ring_begin(request, vf_flush_wa ? 12 : 6);
 	if (ret)
 		return ret;
 
@@ -1314,7 +1310,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
 	 * used as a workaround for not being allowed to do lite
 	 * restore with HEAD==TAIL (WaIdleLiteRestore).
 	 */
-	ret = intel_logical_ring_begin(ringbuf, request->ctx, 8);
+	ret = intel_logical_ring_begin(request, 8);
 	if (ret)
 		return ret;
 
-- 
1.7.9.5


* [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (46 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 47/55] drm/i915: Update intel_logical_ring_begin() " John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-06-17 13:31   ` Daniel Vetter
  2015-05-29 16:44 ` [PATCH 49/55] drm/i915: Remove the now obsolete intel_ring_get_request() John.C.Harrison
                   ` (8 subsequent siblings)
  56 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Now that the *_ring_begin() functions no longer call the request allocation
code, it is finally safe for the request allocation code to call *_ring_begin().
This is important to guarantee that the space reserved for the subsequent
i915_add_request() call does actually get reserved.

v2: Renamed functions according to review feedback (Tomas Elf).
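
The mechanism is simply "note the reserve size, then do a zero-dword
begin()": _begin() already knows how to check (and, in the real code, wait
for) ring space, and the space check accounts for reserved_size, so
begin(req, 0) emits nothing yet proves the reservation fits. A simplified
model of the idea (illustrative only; the real code must also handle
wrapping and waiting):

#include <errno.h>

struct intel_ringbuffer {
	int space;
	int reserved_size;
};

#define MIN_SPACE_FOR_ADD_REQUEST 128

static int ring_prepare(struct intel_ringbuffer *rb, int bytes)
{
	/* Free space must cover the caller's dwords plus the reserve. */
	if (rb->space < bytes + rb->reserved_size)
		return -ENOSPC;	/* the real code waits for space instead */
	return 0;
}

static int ring_begin(struct intel_ringbuffer *rb, int num_dwords)
{
	int ret = ring_prepare(rb, num_dwords * 4);

	if (ret)
		return ret;
	rb->space -= num_dwords * 4;
	return 0;
}

static int ring_reserve_space(struct intel_ringbuffer *rb)
{
	rb->reserved_size = MIN_SPACE_FOR_ADD_REQUEST;
	/* A zero-dword begin emits nothing but guarantees the reserve. */
	return ring_begin(rb, 0);
}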

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |   25 +++++++++++++------------
 drivers/gpu/drm/i915/intel_lrc.c        |   15 +++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.h        |    1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |   29 ++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    1 +
 5 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9f3e0717..1261792 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2680,19 +2680,20 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	 * i915_add_request() call can't fail. Note that the reserve may need
 	 * to be redone if the request is not actually submitted straight
 	 * away, e.g. because a GPU scheduler has deferred it.
-	 *
-	 * Note further that this call merely notes the reserve request. A
-	 * subsequent call to *_ring_begin() is required to actually ensure
-	 * that the reservation is available. Without the begin, if the
-	 * request creator immediately submitted the request without adding
-	 * any commands to it then there might not actually be sufficient
-	 * room for the submission commands. Unfortunately, the current
-	 * *_ring_begin() implementations potentially call back here to
-	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
-	 * infinite recursion! Until that back call path is removed, it is
-	 * necessary to do a manual _begin() outside.
 	 */
-	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
+	if (i915.enable_execlists)
+		ret = intel_logical_ring_reserve_space(req);
+	else
+		ret = intel_ring_reserve_space(req);
+	if (ret) {
+		/*
+		 * At this point, the request is fully allocated even if not
+		 * fully prepared. Thus it can be cleaned up using the proper
+		 * free code.
+		 */
+		i915_gem_request_cancel(req);
+		return ret;
+	}
 
 	*req_out = ring->outstanding_lazy_request = req;
 	return 0;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 548c53d..e164ac0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -823,6 +823,21 @@ static int intel_logical_ring_begin(struct drm_i915_gem_request *req,
 	return 0;
 }
 
+int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
+{
+	/*
+	 * The first call merely notes the reserve request and is common for
+	 * all back ends. The subsequent localised _begin() call actually
+	 * ensures that the reservation is available. Without the begin, if
+	 * the request creator immediately submitted the request without
+	 * adding any commands to it then there might not actually be
+	 * sufficient room for the submission commands.
+	 */
+	intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
+
+	return intel_logical_ring_begin(request, 0);
+}
+
 /**
  * execlists_submission() - submit a batchbuffer for execution, Execlists style
  * @dev: DRM device.
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 044c0e5..f59940a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -37,6 +37,7 @@
 
 /* Logical Rings */
 int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request);
+int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request);
 void intel_logical_ring_stop(struct intel_engine_cs *ring);
 void intel_logical_ring_cleanup(struct intel_engine_cs *ring);
 int intel_logical_rings_init(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index bb10fc2..0ba5787 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2192,24 +2192,27 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
 	return 0;
 }
 
+int intel_ring_reserve_space(struct drm_i915_gem_request *request)
+{
+	/*
+	 * The first call merely notes the reserve request and is common for
+	 * all back ends. The subsequent localised _begin() call actually
+	 * ensures that the reservation is available. Without the begin, if
+	 * the request creator immediately submitted the request without
+	 * adding any commands to it then there might not actually be
+	 * sufficient room for the submission commands.
+	 */
+	intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
+
+	return intel_ring_begin(request, 0);
+}
+
 void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
 {
-	/* NB: Until request management is fully tidied up and the OLR is
-	 * removed, there are too many ways for get false hits on this
-	 * anti-recursion check! */
-	/*WARN_ON(ringbuf->reserved_size);*/
+	WARN_ON(ringbuf->reserved_size);
 	WARN_ON(ringbuf->reserved_in_use);
 
 	ringbuf->reserved_size = size;
-
-	/*
-	 * Really need to call _begin() here but that currently leads to
-	 * recursion problems! This will be fixed later but for now just
-	 * return and hope for the best. Note that there is only a real
-	 * problem if the create of the request never actually calls _begin()
-	 * but if they are not submitting any work then why did they create
-	 * the request in the first place?
-	 */
 }
 
 void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 16fd9ba..f4633ca 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -450,6 +450,7 @@ intel_ring_get_request(struct intel_engine_cs *ring)
 
 #define MIN_SPACE_FOR_ADD_REQUEST	128
 
+int intel_ring_reserve_space(struct drm_i915_gem_request *request);
 void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
 void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
 void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
-- 
1.7.9.5


* [PATCH 49/55] drm/i915: Remove the now obsolete intel_ring_get_request()
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (47 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 50/55] drm/i915: Remove the now obsolete 'outstanding_lazy_request' John.C.Harrison
                   ` (7 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Much of the driver has now been converted to passing requests around instead of
rings/ringbufs/contexts. Thus the function for retrieving the request from a
ring (i.e. the OLR) is no longer used and can be removed.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.h |    7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f4633ca..9b1dab4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -441,13 +441,6 @@ static inline u32 intel_ring_get_tail(struct intel_ringbuffer *ringbuf)
 	return ringbuf->tail;
 }
 
-static inline struct drm_i915_gem_request *
-intel_ring_get_request(struct intel_engine_cs *ring)
-{
-	BUG_ON(ring->outstanding_lazy_request == NULL);
-	return ring->outstanding_lazy_request;
-}
-
 #define MIN_SPACE_FOR_ADD_REQUEST	128
 
 int intel_ring_reserve_space(struct drm_i915_gem_request *request);
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH 50/55] drm/i915: Remove the now obsolete 'outstanding_lazy_request'
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (48 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 49/55] drm/i915: Remove the now obsolete intel_ring_get_request() John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 51/55] drm/i915: Move the request/file and request/pid association to creation time John.C.Harrison
                   ` (6 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The outstanding_lazy_request is no longer used anywhere in the driver.
Everything that was looking at it now has a request explicitly passed in from on
high. Everything that was relying upon it behind the scenes is now explicitly
creating/passing/submitting its own private request. Thus the OLR can be
removed.
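
In sketch form, the explicit lifecycle that every piece of work now follows
looks roughly like this ('emit_some_work' is an illustrative stand-in, not a
real driver function):

	struct drm_i915_gem_request *req;
	int ret;

	ret = i915_gem_request_alloc(ring, ctx, &req);
	if (ret)
		return ret;

	ret = emit_some_work(req);	/* all commands tagged with req */
	if (ret) {
		/* abandoned before submission: release the reservation */
		i915_gem_request_cancel(req);
		return ret;
	}

	i915_add_request(req);		/* guaranteed not to fail */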

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c            |   16 ++--------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    4 +---
 drivers/gpu/drm/i915/intel_lrc.c           |    1 -
 drivers/gpu/drm/i915/intel_ringbuffer.c    |    8 --------
 drivers/gpu/drm/i915/intel_ringbuffer.h    |    4 ----
 5 files changed, 3 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1261792..5aa0ad0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1157,9 +1157,6 @@ i915_gem_check_olr(struct drm_i915_gem_request *req)
 {
 	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
 
-	if (req == req->ring->outstanding_lazy_request)
-		i915_add_request(req);
-
 	return 0;
 }
 
@@ -2488,8 +2485,6 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 	dev_priv = ring->dev->dev_private;
 	ringbuf = request->ringbuf;
 
-	WARN_ON(request != ring->outstanding_lazy_request);
-
 	/*
 	 * To ensure that this call will not fail, space for its emissions
 	 * should already have been reserved in the ring buffer. Let the ring
@@ -2558,7 +2553,6 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 	}
 
 	trace_i915_gem_request_add(request);
-	ring->outstanding_lazy_request = NULL;
 
 	i915_queue_hangcheck(ring->dev);
 
@@ -2647,8 +2641,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	if (!req_out)
 		return -EINVAL;
 
-	if ((*req_out = ring->outstanding_lazy_request) != NULL)
-		return 0;
+	*req_out = NULL;
 
 	req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
 	if (req == NULL)
@@ -2695,7 +2688,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 		return ret;
 	}
 
-	*req_out = ring->outstanding_lazy_request = req;
+	*req_out = req;
 	return 0;
 
 err:
@@ -2792,9 +2785,6 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
 
 		i915_gem_request_retire(request);
 	}
-
-	/* This may not have been flushed before the reset, so clean it now */
-	i915_gem_request_assign(&ring->outstanding_lazy_request, NULL);
 }
 
 void i915_gem_restore_fences(struct drm_device *dev)
@@ -3350,8 +3340,6 @@ int i915_gpu_idle(struct drm_device *dev)
 			i915_add_request_no_flush(req);
 		}
 
-		WARN_ON(ring->outstanding_lazy_request);
-
 		ret = intel_ring_idle(ring);
 		if (ret)
 			return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 805d288..e868ac1 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1647,10 +1647,8 @@ err:
 	 * must be freed again. If it was submitted then it is being tracked
 	 * on the active request list and no clean up is required here.
 	 */
-	if (ret && params->request) {
+	if (ret && params->request)
 		i915_gem_request_cancel(params->request);
-		ring->outstanding_lazy_request = NULL;
-	}
 
 	mutex_unlock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e164ac0..0421480 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1405,7 +1405,6 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *ring)
 
 	intel_logical_ring_stop(ring);
 	WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0);
-	i915_gem_request_assign(&ring->outstanding_lazy_request, NULL);
 
 	if (ring->cleanup)
 		ring->cleanup(ring);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0ba5787..5858ade 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2091,7 +2091,6 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring)
 
 	intel_unpin_ringbuffer_obj(ringbuf);
 	intel_destroy_ringbuffer_obj(ringbuf);
-	i915_gem_request_assign(&ring->outstanding_lazy_request, NULL);
 
 	if (ring->cleanup)
 		ring->cleanup(ring);
@@ -2166,11 +2165,6 @@ int intel_ring_idle(struct intel_engine_cs *ring)
 {
 	struct drm_i915_gem_request *req;
 
-	/* We need to add any requests required to flush the objects and ring */
-	WARN_ON(ring->outstanding_lazy_request);
-	if (ring->outstanding_lazy_request)
-		i915_add_request(ring->outstanding_lazy_request);
-
 	/* Wait upon the last request to be completed */
 	if (list_empty(&ring->request_list))
 		return 0;
@@ -2326,8 +2320,6 @@ void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno)
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	BUG_ON(ring->outstanding_lazy_request);
-
 	if (INTEL_INFO(dev)->gen == 6 || INTEL_INFO(dev)->gen == 7) {
 		I915_WRITE(RING_SYNC_0(ring->mmio_base), 0);
 		I915_WRITE(RING_SYNC_1(ring->mmio_base), 0);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9b1dab4..4f573b1 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -269,10 +269,6 @@ struct  intel_engine_cs {
 	 */
 	struct list_head request_list;
 
-	/**
-	 * Do we have some not yet emitted requests outstanding?
-	 */
-	struct drm_i915_gem_request *outstanding_lazy_request;
 	bool gpu_caches_dirty;
 
 	wait_queue_head_t irq_queue;
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH 51/55] drm/i915: Move the request/file and request/pid association to creation time
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (49 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 50/55] drm/i915: Remove the now obsolete 'outstanding_lazy_request' John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-06-03 11:15   ` Tomas Elf
  2015-05-29 16:44 ` [PATCH 52/55] drm/i915: Remove 'faked' request from LRC submission John.C.Harrison
                   ` (5 subsequent siblings)
  56 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

In _i915_add_request(), the request is associated with a userland client.
Specifically, it is linked to the 'file' structure and the current user process
is recorded. One problem here is that the current user process is not
necessarily the same as when the request was submitted to the driver. This is
especially true when the GPU scheduler arrives and decouples driver submission
from hardware submission. Note also that only an add request that comes from an
execbuff call has a client to associate; any other add request call is kernel
only and so does not need to do it.

This patch moves the client association into a separate function. This is then
called from the execbuffer code path itself at a sensible time. It also removes
the now redundant 'file' pointer from the add request parameter list.

An extra cleanup of the client association is also added to the request clean up
code for the eventuality where the request is killed after association but
before being submitted (e.g. due to an out of memory error somewhere). Once the
submission has happened, the request is on the request list and the regular
request list removal will clear the association. Note that the association still
needs to be cleared at that point because the request might be kept floating
around much longer (due to someone holding a reference count) and the client
should not be worrying about this request after it has been retired.
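
In sketch form, the execbuff side now does the following (hedged; the error
label is illustrative, the real code uses err_batch_unpin):

	ret = i915_gem_request_alloc(ring, ctx, &params->request);
	if (ret)
		goto err;

	/* associate with the client straight away, while 'current' is
	 * still the submitting process */
	ret = i915_gem_request_add_to_client(params->request, file);
	if (ret)
		goto err;

with i915_gem_request_free() undoing the association if the request is killed
before ever being submitted.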

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |    7 ++--
 drivers/gpu/drm/i915/i915_gem.c            |   56 ++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    6 ++-
 3 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f9b6517..18bfc84 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2199,6 +2199,8 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct drm_i915_gem_request **req_out);
 void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
+int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
+				   struct drm_file *file);
 
 static inline uint32_t
 i915_gem_request_get_seqno(struct drm_i915_gem_request *req)
@@ -2864,13 +2866,12 @@ void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 void __i915_add_request(struct drm_i915_gem_request *req,
-			struct drm_file *file,
 			struct drm_i915_gem_object *batch_obj,
 			bool flush_caches);
 #define i915_add_request(req) \
-	__i915_add_request(req, NULL, NULL, true)
+	__i915_add_request(req, NULL, true)
 #define i915_add_request_no_flush(req) \
-	__i915_add_request(req, NULL, NULL, false)
+	__i915_add_request(req, NULL, false)
 int __i915_wait_request(struct drm_i915_gem_request *req,
 			unsigned reset_counter,
 			bool interruptible,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5aa0ad0..b8fe931 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1331,6 +1331,33 @@ out:
 	return ret;
 }
 
+int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
+				   struct drm_file *file)
+{
+	struct drm_i915_private *dev_private;
+	struct drm_i915_file_private *file_priv;
+
+	WARN_ON(!req || !file || req->file_priv);
+
+	if (!req || !file)
+		return -EINVAL;
+
+	if (req->file_priv)
+		return -EINVAL;
+
+	dev_private = req->ring->dev->dev_private;
+	file_priv = file->driver_priv;
+
+	spin_lock(&file_priv->mm.lock);
+	req->file_priv = file_priv;
+	list_add_tail(&req->client_list, &file_priv->mm.request_list);
+	spin_unlock(&file_priv->mm.lock);
+
+	req->pid = get_pid(task_pid(current));
+
+	return 0;
+}
+
 static inline void
 i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 {
@@ -1343,6 +1370,9 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 	list_del(&request->client_list);
 	request->file_priv = NULL;
 	spin_unlock(&file_priv->mm.lock);
+
+	put_pid(request->pid);
+	request->pid = NULL;
 }
 
 static void i915_gem_request_retire(struct drm_i915_gem_request *request)
@@ -1362,8 +1392,6 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 	list_del_init(&request->list);
 	i915_gem_request_remove_from_client(request);
 
-	put_pid(request->pid);
-
 	i915_gem_request_unreference(request);
 }
 
@@ -2468,7 +2496,6 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
  * going to happen on the hardware. This would be a Bad Thing(tm).
  */
 void __i915_add_request(struct drm_i915_gem_request *request,
-			struct drm_file *file,
 			struct drm_i915_gem_object *obj,
 			bool flush_caches)
 {
@@ -2538,19 +2565,6 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 
 	request->emitted_jiffies = jiffies;
 	list_add_tail(&request->list, &ring->request_list);
-	request->file_priv = NULL;
-
-	if (file) {
-		struct drm_i915_file_private *file_priv = file->driver_priv;
-
-		spin_lock(&file_priv->mm.lock);
-		request->file_priv = file_priv;
-		list_add_tail(&request->client_list,
-			      &file_priv->mm.request_list);
-		spin_unlock(&file_priv->mm.lock);
-
-		request->pid = get_pid(task_pid(current));
-	}
 
 	trace_i915_gem_request_add(request);
 
@@ -2616,6 +2630,9 @@ void i915_gem_request_free(struct kref *req_ref)
 						 typeof(*req), ref);
 	struct intel_context *ctx = req->ctx;
 
+	if (req->file_priv)
+		i915_gem_request_remove_from_client(req);
+
 	if (ctx) {
 		if (i915.enable_execlists) {
 			struct intel_engine_cs *ring = req->ring;
@@ -4320,6 +4337,13 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 		if (time_after_eq(request->emitted_jiffies, recent_enough))
 			break;
 
+		/*
+		 * Note that the request might not have been submitted yet.
+		 * In which case emitted_jiffies will be zero.
+		 */
+		if (!request->emitted_jiffies)
+			continue;
+
 		target = request;
 	}
 	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e868ac1..52139c6 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1058,7 +1058,7 @@ i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
 	params->ring->gpu_caches_dirty = true;
 
 	/* Add a breadcrumb for the completion of the batch buffer */
-	__i915_add_request(params->request, params->file, params->batch_obj, true);
+	__i915_add_request(params->request, params->batch_obj, true);
 }
 
 static int
@@ -1612,6 +1612,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	if (ret)
 		goto err_batch_unpin;
 
+	ret = i915_gem_request_add_to_client(params->request, file);
+	if (ret)
+		goto err_batch_unpin;
+
 	/*
 	 * Save assorted stuff away to pass through to *_submission().
 	 * NB: This data should be 'persistent' and not local as it will
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH 52/55] drm/i915: Remove 'faked' request from LRC submission
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (50 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 51/55] drm/i915: Move the request/file and request/pid association to creation time John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 53/55] drm/i915: Update a bunch of LRC functions to take requests John.C.Harrison
                   ` (4 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The LRC submission code requires a request for tracking purposes. It does not
actually require that request to 'complete'; it simply uses it for keeping hold
of reference counts on contexts and such like.

Previously, the fallback path of polling for space in the ring would start by
submitting any outstanding work that was sitting in the buffer. This submission
was not done as part of the request that owned that work, because that would
lead to complications with the request being submitted twice. Instead, a null
request structure was passed in to the submit call and a fake one was created.

That fallback path has long since been obsoleted and has now been removed. Thus
there is never any need to fake up a request structure. This patch removes that
code. A couple of sanity check warnings are added as well, just in case.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c |   23 +++++------------------
 1 file changed, 5 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0421480..323a4a1 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -549,29 +549,16 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
 				   struct drm_i915_gem_request *request)
 {
 	struct drm_i915_gem_request *cursor;
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	int num_elements = 0;
 
 	if (to != ring->default_context)
 		intel_lr_context_pin(ring, to);
 
-	if (!request) {
-		/*
-		 * If there isn't a request associated with this submission,
-		 * create one as a temporary holder.
-		 */
-		request = kzalloc(sizeof(*request), GFP_KERNEL);
-		if (request == NULL)
-			return -ENOMEM;
-		request->ring = ring;
-		request->ctx = to;
-		kref_init(&request->ref);
-		request->uniq = dev_priv->request_uniq++;
-		i915_gem_context_reference(request->ctx);
-	} else {
-		i915_gem_request_reference(request);
-		WARN_ON(to != request->ctx);
-	}
+	WARN_ON(!request);
+	WARN_ON(to != request->ctx);
+
+	i915_gem_request_reference(request);
+
 	request->tail = tail;
 
 	spin_lock_irq(&ring->execlist_lock);
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH 53/55] drm/i915: Update a bunch of LRC functions to take requests
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (51 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 52/55] drm/i915: Remove 'faked' request from LRC submission John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-05-29 16:44 ` [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()' John.C.Harrison
                   ` (3 subsequent siblings)
  56 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

A bunch of the low level LRC functions were passing around ringbuf and ctx
pairs. In a few cases, they took the r/c pair and a request as well. This is all
quite messy and unnecessary. The context_queue() call is especially bad since the
fake request code got removed - it takes a request and three extra things that
must be extracted from the request and then it checks them against what it finds
in the request. Removing all the derivable data makes the code much simpler all
round.

This patch updates those functions to just take the request structure.

Note that logical_ring_wait_for_space() now takes a request structure but already
had a local request pointer that it used to scan for something to wait on. To
avoid confusion, the local variable has been renamed 'target' (it is searching
for a target request to do something with) and the parameter has been called 'req'
(to guarantee that anything accidentally missed gets a compiler error).

v2: Updated commit message re wait_for_space (Tomas Elf review comment).
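
The flavour of the change, using context_queue() as the example (signatures
taken from the diff below):

	/* before: a request plus three values that are derivable from it */
	static int execlists_context_queue(struct intel_engine_cs *ring,
					   struct intel_context *to,
					   u32 tail,
					   struct drm_i915_gem_request *request);

	/* after: the request alone is sufficient */
	static int execlists_context_queue(struct drm_i915_gem_request *request);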

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c |   66 +++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 323a4a1..4ce809d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -543,23 +543,18 @@ void intel_lrc_irq_handler(struct intel_engine_cs *ring)
 		   ((u32)ring->next_context_status_buffer & 0x07) << 8);
 }
 
-static int execlists_context_queue(struct intel_engine_cs *ring,
-				   struct intel_context *to,
-				   u32 tail,
-				   struct drm_i915_gem_request *request)
+static int execlists_context_queue(struct drm_i915_gem_request *request)
 {
+	struct intel_engine_cs *ring = request->ring;
 	struct drm_i915_gem_request *cursor;
 	int num_elements = 0;
 
-	if (to != ring->default_context)
-		intel_lr_context_pin(ring, to);
-
-	WARN_ON(!request);
-	WARN_ON(to != request->ctx);
+	if (request->ctx != ring->default_context)
+		intel_lr_context_pin(ring, request->ctx);
 
 	i915_gem_request_reference(request);
 
-	request->tail = tail;
+	request->tail = request->ringbuf->tail;
 
 	spin_lock_irq(&ring->execlist_lock);
 
@@ -574,7 +569,7 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
 					   struct drm_i915_gem_request,
 					   execlist_link);
 
-		if (to == tail_req->ctx) {
+		if (request->ctx == tail_req->ctx) {
 			WARN(tail_req->elsp_submitted != 0,
 				"More than 2 already-submitted reqs queued\n");
 			list_del(&tail_req->execlist_link);
@@ -658,12 +653,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
 	return 0;
 }
 
-static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
-				       struct intel_context *ctx,
+static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
 				       int bytes)
 {
-	struct intel_engine_cs *ring = ringbuf->ring;
-	struct drm_i915_gem_request *request;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	struct intel_engine_cs *ring = req->ring;
+	struct drm_i915_gem_request *target;
 	unsigned space;
 	int ret;
 
@@ -673,26 +668,26 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
 	if (intel_ring_space(ringbuf) >= bytes)
 		return 0;
 
-	list_for_each_entry(request, &ring->request_list, list) {
+	list_for_each_entry(target, &ring->request_list, list) {
 		/*
 		 * The request queue is per-engine, so can contain requests
 		 * from multiple ringbuffers. Here, we must ignore any that
 		 * aren't from the ringbuffer we're considering.
 		 */
-		if (request->ringbuf != ringbuf)
+		if (target->ringbuf != ringbuf)
 			continue;
 
 		/* Would completion of this request free enough space? */
-		space = __intel_ring_space(request->postfix, ringbuf->tail,
+		space = __intel_ring_space(target->postfix, ringbuf->tail,
 					   ringbuf->size);
 		if (space >= bytes)
 			break;
 	}
 
-	if (WARN_ON(&request->list == &ring->request_list))
+	if (WARN_ON(&target->list == &ring->request_list))
 		return -ENOSPC;
 
-	ret = i915_wait_request(request);
+	ret = i915_wait_request(target);
 	if (ret)
 		return ret;
 
@@ -702,7 +697,7 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
 
 /*
  * intel_logical_ring_advance_and_submit() - advance the tail and submit the workload
- * @ringbuf: Logical Ringbuffer to advance.
+ * @request: Request to advance the logical ringbuffer of.
  *
  * The tail is updated in our logical ringbuffer struct, not in the actual context. What
  * really happens during submission is that the context and current tail will be placed
@@ -710,23 +705,21 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
  * point, the tail *inside* the context is updated and the ELSP written to.
  */
 static void
-intel_logical_ring_advance_and_submit(struct intel_ringbuffer *ringbuf,
-				      struct intel_context *ctx,
-				      struct drm_i915_gem_request *request)
+intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
 {
-	struct intel_engine_cs *ring = ringbuf->ring;
+	struct intel_engine_cs *ring = request->ring;
 
-	intel_logical_ring_advance(ringbuf);
+	intel_logical_ring_advance(request->ringbuf);
 
 	if (intel_ring_stopped(ring))
 		return;
 
-	execlists_context_queue(ring, ctx, ringbuf->tail, request);
+	execlists_context_queue(request);
 }
 
-static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
-				    struct intel_context *ctx)
+static int logical_ring_wrap_buffer(struct drm_i915_gem_request *req)
 {
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
 	uint32_t __iomem *virt;
 	int rem = ringbuf->size - ringbuf->tail;
 
@@ -734,7 +727,7 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
 	WARN_ON(ringbuf->reserved_in_use);
 
 	if (ringbuf->space < rem) {
-		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
+		int ret = logical_ring_wait_for_space(req, rem);
 
 		if (ret)
 			return ret;
@@ -751,22 +744,22 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
 	return 0;
 }
 
-static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
-				struct intel_context *ctx, int bytes)
+static int logical_ring_prepare(struct drm_i915_gem_request *req, int bytes)
 {
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
 	int ret;
 
 	if (!ringbuf->reserved_in_use)
 		bytes += ringbuf->reserved_size;
 
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
-		ret = logical_ring_wrap_buffer(ringbuf, ctx);
+		ret = logical_ring_wrap_buffer(req);
 		if (unlikely(ret))
 			return ret;
 	}
 
 	if (unlikely(ringbuf->space < bytes)) {
-		ret = logical_ring_wait_for_space(ringbuf, ctx, bytes);
+		ret = logical_ring_wait_for_space(req, bytes);
 		if (unlikely(ret))
 			return ret;
 	}
@@ -801,8 +794,7 @@ static int intel_logical_ring_begin(struct drm_i915_gem_request *req,
 	if (ret)
 		return ret;
 
-	ret = logical_ring_prepare(req->ringbuf, req->ctx,
-				   num_dwords * sizeof(uint32_t));
+	ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
 
@@ -1327,7 +1319,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
 	intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request));
 	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
 	intel_logical_ring_emit(ringbuf, MI_NOOP);
-	intel_logical_ring_advance_and_submit(ringbuf, request->ctx, request);
+	intel_logical_ring_advance_and_submit(request);
 
 	/*
 	 * Here we add two extra NOOPs as padding to avoid
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()'
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (52 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 53/55] drm/i915: Update a bunch of LRC functions to take requests John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-06-02 18:27   ` Tomas Elf
  2015-06-23 10:23   ` Chris Wilson
  2015-05-29 16:44 ` [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active() John.C.Harrison
                   ` (2 subsequent siblings)
  56 siblings, 2 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

As there is no OLR to check, the check_olr() function is now a no-op and can be
removed.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |    1 -
 drivers/gpu/drm/i915/i915_gem.c      |   34 +---------------------------------
 drivers/gpu/drm/i915/intel_display.c |    6 ------
 3 files changed, 1 insertion(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 18bfc84..cb5bb4a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2825,7 +2825,6 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
-int __must_check i915_gem_check_olr(struct drm_i915_gem_request *req);
 
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b8fe931..f825942 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1149,17 +1149,6 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
 	return 0;
 }
 
-/*
- * Compare arbitrary request against outstanding lazy request. Emit on match.
- */
-int
-i915_gem_check_olr(struct drm_i915_gem_request *req)
-{
-	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
-
-	return 0;
-}
-
 static void fake_irq(unsigned long data)
 {
 	wake_up_process((struct task_struct *)data);
@@ -1440,10 +1429,6 @@ i915_wait_request(struct drm_i915_gem_request *req)
 	if (ret)
 		return ret;
 
-	ret = i915_gem_check_olr(req);
-	if (ret)
-		return ret;
-
 	ret = __i915_wait_request(req,
 				  atomic_read(&dev_priv->gpu_error.reset_counter),
 				  interruptible, NULL, NULL);
@@ -1543,10 +1528,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 		if (req == NULL)
 			return 0;
 
-		ret = i915_gem_check_olr(req);
-		if (ret)
-			goto err;
-
 		requests[n++] = i915_gem_request_reference(req);
 	} else {
 		for (i = 0; i < I915_NUM_RINGS; i++) {
@@ -1556,10 +1537,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 			if (req == NULL)
 				continue;
 
-			ret = i915_gem_check_olr(req);
-			if (ret)
-				goto err;
-
 			requests[n++] = i915_gem_request_reference(req);
 		}
 	}
@@ -1570,7 +1547,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 					  NULL, rps);
 	mutex_lock(&dev->struct_mutex);
 
-err:
 	for (i = 0; i < n; i++) {
 		if (ret == 0)
 			i915_gem_object_retire_request(obj, requests[i]);
@@ -2987,7 +2963,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
 static int
 i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
 {
-	int ret, i;
+	int i;
 
 	if (!obj->active)
 		return 0;
@@ -3002,10 +2978,6 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
 		if (list_empty(&req->list))
 			goto retire;
 
-		ret = i915_gem_check_olr(req);
-		if (ret)
-			return ret;
-
 		if (i915_gem_request_completed(req, true)) {
 			__i915_gem_request_retire__upto(req);
 retire:
@@ -3121,10 +3093,6 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	if (i915_gem_request_completed(from_req, true))
 		return 0;
 
-	ret = i915_gem_check_olr(from_req);
-	if (ret)
-		return ret;
-
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
 		struct drm_i915_private *i915 = to_i915(obj->base.dev);
 		ret = __i915_wait_request(from_req,
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 91e19d0..aca2215 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11243,12 +11243,6 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 		i915_gem_request_assign(&work->flip_queued_req,
 					obj->last_write_req);
 	} else {
-		if (obj->last_write_req) {
-			ret = i915_gem_check_olr(obj->last_write_req);
-			if (ret)
-				goto cleanup_unpin;
-		}
-
 		if (!request) {
 			ret = i915_gem_request_alloc(ring, ring->default_context, &request);
 			if (ret)
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (53 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()' John.C.Harrison
@ 2015-05-29 16:44 ` John.C.Harrison
  2015-06-02 18:27   ` Tomas Elf
  2015-06-17 14:06   ` Daniel Vetter
  2015-06-04 18:23 ` [PATCH 14/56] drm/i915: Make retire condition check for requests not objects John.C.Harrison
  2015-06-22 21:04 ` [PATCH 00/55] Remove the outstanding_lazy_request Daniel Vetter
  56 siblings, 2 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-05-29 16:44 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The i915_gem_object_flush_active() call used to do lots. Over time it has done
less and less. Now all it does is check the various associated requests to see if
they can be retired. Hence this patch renames the function and updates the
comments around it to match the current operation.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c |   18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f825942..081cbbf 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2956,12 +2956,10 @@ i915_gem_idle_work_handler(struct work_struct *work)
 }
 
 /**
- * Ensures that an object will eventually get non-busy by flushing any required
- * write domains, emitting any outstanding lazy request and retiring and
- * completed requests.
+ * Check an object to see if any of its associated requests can be retired.
  */
 static int
-i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
+i915_gem_object_retire(struct drm_i915_gem_object *obj)
 {
 	int i;
 
@@ -3034,8 +3032,8 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		return -ENOENT;
 	}
 
-	/* Need to make sure the object gets inactive eventually. */
-	ret = i915_gem_object_flush_active(obj);
+	/* Check if the object is pending clean up. */
+	ret = i915_gem_object_retire(obj);
 	if (ret)
 		goto out;
 
@@ -4526,12 +4524,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 		goto unlock;
 	}
 
-	/* Count all active objects as busy, even if they are currently not used
-	 * by the gpu. Users of this interface expect objects to eventually
-	 * become non-busy without any further actions, therefore emit any
-	 * necessary flushes here.
-	 */
-	ret = i915_gem_object_flush_active(obj);
+	/* Check if the object is pending clean up. */
+	ret = i915_gem_object_retire(obj);
 	if (ret)
 		goto unref;
 
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-05-29 16:43 ` [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands John.C.Harrison
@ 2015-06-02 18:14   ` Tomas Elf
  2015-06-04 12:06   ` John.C.Harrison
  2015-06-19 16:34   ` John.C.Harrison
  2 siblings, 0 replies; 120+ messages in thread
From: Tomas Elf @ 2015-06-02 18:14 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> It is a bad idea for i915_add_request() to fail. The work will already have been
> sent to the ring and will be processed, but there will not be any tracking or
> management of that work.
>
> The only way the add request call can fail is if it can't write its epilogue
> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
> reasons for that are mostly down to running out of ring buffer space and the
> problems associated with trying to get some more. This patch prevents that
> situation from happening in the first place.
>
> When a request is created, it marks sufficient space as reserved for the
> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
> there will be plenty of space for it. Note that a ring_begin() call is required
> to actually reserve the space (and do any potential waiting). However, that is
> not currently done at request creation time. This is because the ring_begin()
> code can allocate a request. Hence calling begin() from the request allocation
> code would lead to infinite recursion! Later patches in this series remove the
> need for begin() to do the allocate. At that point, it becomes safe for the
> allocate to call begin() and really reserve the space.
>
> Until then, there is a potential for insufficient space to be available at the
> point of calling i915_add_request(). However, that would only be in the case
> where the request was created and immediately submitted without ever calling
> ring_begin() and adding any work to that request. Which should never happen. And
> even if it does, and if that request happens to fall down the tiny window of
> opportunity for failing due to being out of ring space then does it really
> matter because the request wasn't doing anything in the first place?
>
> v2: Updated the 'reserved space too small' warning to include the offending
> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
> re-initialisation of tracking state after a buffer wrap to keep the sanity
> checks accurate.
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         |    1 +
>   drivers/gpu/drm/i915/i915_gem.c         |   37 +++++++++++++++++
>   drivers/gpu/drm/i915/intel_lrc.c        |    9 ++++
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   68 ++++++++++++++++++++++++++++++-
>   drivers/gpu/drm/i915/intel_ringbuffer.h |   10 +++++
>   5 files changed, 123 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 0347eb9..eba1857 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
>
>   int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   			   struct intel_context *ctx);
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>   void i915_gem_request_free(struct kref *req_ref);
>
>   static inline uint32_t
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 68f1d1e..6f51416 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
>   	} else
>   		ringbuf = ring->buffer;
>
> +	/*
> +	 * To ensure that this call will not fail, space for its emissions
> +	 * should already have been reserved in the ring buffer. Let the ring
> +	 * know that it is time to use that space up.
> +	 */
> +	intel_ring_reserved_space_use(ringbuf);
> +
>   	request_start = intel_ring_get_tail(ringbuf);
>   	/*
>   	 * Emit any outstanding flushes - execbuf can fail to emit the flush
> @@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
>   			   round_jiffies_up_relative(HZ));
>   	intel_mark_busy(dev_priv->dev);
>
> +	/* Sanity check that the reserved size was large enough. */
> +	intel_ring_reserved_space_end(ringbuf);
> +
>   	return 0;
>   }
>
> @@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   	if (ret)
>   		goto err;
>
> +	/*
> +	 * Reserve space in the ring buffer for all the commands required to
> +	 * eventually emit this request. This is to guarantee that the
> +	 * i915_add_request() call can't fail. Note that the reserve may need
> +	 * to be redone if the request is not actually submitted straight
> +	 * away, e.g. because a GPU scheduler has deferred it.
> +	 *
> +	 * Note further that this call merely notes the reserve request. A
> +	 * subsequent call to *_ring_begin() is required to actually ensure
> +	 * that the reservation is available. Without the begin, if the
> +	 * request creator immediately submitted the request without adding
> +	 * any commands to it then there might not actually be sufficient
> +	 * room for the submission commands. Unfortunately, the current
> +	 * *_ring_begin() implementations potentially call back here to
> +	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
> +	 * infinite recursion! Until that back call path is removed, it is
> +	 * necessary to do a manual _begin() outside.
> +	 */
> +	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> +
>   	ring->outstanding_lazy_request = req;
>   	return 0;
>
> @@ -2674,6 +2704,13 @@ err:
>   	return ret;
>   }
>
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req)
> +{
> +	intel_ring_reserved_space_cancel(req->ringbuf);
> +
> +	i915_gem_request_unreference(req);
> +}
> +
>   struct drm_i915_gem_request *
>   i915_gem_find_active_request(struct intel_engine_cs *ring)
>   {
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6a5ed07..e62d396 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -687,6 +687,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
>   	unsigned space;
>   	int ret;
>
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>   	if (intel_ring_space(ringbuf) >= bytes)
>   		return 0;
>
> @@ -747,6 +750,9 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>   	uint32_t __iomem *virt;
>   	int rem = ringbuf->size - ringbuf->tail;
>
> +	/* Can't wrap if space has already been reserved! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>   	if (ringbuf->space < rem) {
>   		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
>
> @@ -770,6 +776,9 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>   {
>   	int ret;
>
> +	if (!ringbuf->reserved_in_use)
> +		bytes += ringbuf->reserved_size;
> +
>   	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>   		ret = logical_ring_wrap_buffer(ringbuf, ctx);
>   		if (unlikely(ret))
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index d934f85..74c2222 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2103,6 +2103,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>   	unsigned space;
>   	int ret;
>
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>   	if (intel_ring_space(ringbuf) >= n)
>   		return 0;
>
> @@ -2130,6 +2133,9 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>   	struct intel_ringbuffer *ringbuf = ring->buffer;
>   	int rem = ringbuf->size - ringbuf->tail;
>
> +	/* Can't wrap if space has already been reserved! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>   	if (ringbuf->space < rem) {
>   		int ret = ring_wait_for_space(ring, rem);
>   		if (ret)
> @@ -2180,16 +2186,74 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>   	return 0;
>   }
>
> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
> -				int bytes)
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
> +{
> +	/* NB: Until request management is fully tidied up and the OLR is
> +	 * removed, there are too many ways for get false hits on this
> +	 * anti-recursion check! */
> +	/*WARN_ON(ringbuf->reserved_size);*/
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size = size;
> +
> +	/*
> +	 * Really need to call _begin() here but that currently leads to
> +	 * recursion problems! This will be fixed later but for now just
> +	 * return and hope for the best. Note that there is only a real
> +	 * problem if the create of the request never actually calls _begin()
> +	 * but if they are not submitting any work then why did they create
> +	 * the request in the first place?
> +	 */
> +}
> +
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size   = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_in_use = true;
> +	ringbuf->reserved_tail   = ringbuf->tail;
> +}
> +
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(!ringbuf->reserved_in_use);
> +	WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
> +	     "request reserved size too small: %d vs %d!\n",
> +	     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
> +
> +	ringbuf->reserved_size   = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
>   {
>   	struct intel_ringbuffer *ringbuf = ring->buffer;
>   	int ret;
>
> +	if (!ringbuf->reserved_in_use)
> +		bytes += ringbuf->reserved_size;
> +
>   	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
> +		WARN_ON(ringbuf->reserved_in_use);
> +
>   		ret = intel_wrap_ring_buffer(ring);
>   		if (unlikely(ret))
>   			return ret;
> +
> +		if(ringbuf->reserved_size) {
> +			uint32_t size = ringbuf->reserved_size;
> +
> +			intel_ring_reserved_space_cancel(ringbuf);
> +			intel_ring_reserved_space_reserve(ringbuf, size);
> +		}
>   	}
>
>   	if (unlikely(ringbuf->space < bytes)) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 39f6dfc..39f795c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -105,6 +105,9 @@ struct intel_ringbuffer {
>   	int space;
>   	int size;
>   	int effective_size;
> +	int reserved_size;
> +	int reserved_tail;
> +	bool reserved_in_use;
>
>   	/** We track the position of the requests in the ring buffer, and
>   	 * when each is retired we increment last_retired_head as the GPU
> @@ -450,4 +453,11 @@ intel_ring_get_request(struct intel_engine_cs *ring)
>   	return ring->outstanding_lazy_request;
>   }
>
> +#define MIN_SPACE_FOR_ADD_REQUEST	128

1. As far as I know ILK is still broken when using this size.

2. Once we get a size that works we need a comment here saying that this 
is an empirically established constant but that it gets the job done.
	

> +
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
> +
>   #endif /* _INTEL_RINGBUFFER_H_ */
>

3. We still need a decent comment outlining the relationship between
these functions, when they are used and what their respective purposes
are, etc. See the comment in the last patch series for more details.
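
For what it's worth, the relationship as far as it can be pieced together
from the hunks above is roughly the following (a sketch, not the kernel-doc
being asked for):

	/* request creation: note how much space the epilogue will need */
	intel_ring_reserved_space_reserve(ringbuf, MIN_SPACE_FOR_ADD_REQUEST);

	/* ... caller emits its commands; *_ring_prepare() adds
	 * reserved_size to each space check so the tail cannot eat
	 * into the reservation ... */

	/* __i915_add_request(): the epilogue may now use the reservation */
	intel_ring_reserved_space_use(ringbuf);
	/* ... emit flush, seqno write, interrupt ... */
	intel_ring_reserved_space_end(ringbuf);	/* sanity check actual usage */

	/* or, if the request is abandoned without being submitted */
	intel_ring_reserved_space_cancel(ringbuf);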

Thanks,
Tomas

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 03/55] drm/i915: i915_add_request must not fail
  2015-05-29 16:43 ` [PATCH 03/55] drm/i915: i915_add_request must not fail John.C.Harrison
@ 2015-06-02 18:16   ` Tomas Elf
  2015-06-04 14:07     ` John Harrison
  2015-06-23 10:16   ` Chris Wilson
  1 sibling, 1 reply; 120+ messages in thread
From: Tomas Elf @ 2015-06-02 18:16 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The i915_add_request() function is called to keep track of work that has been
> written to the ring buffer. It adds epilogue commands to track progress (seqno
> updates and such), moves the request structure onto the right list and other
> such house keeping tasks. However, the work itself has already been written to
> the ring and will get executed whether or not the add request call succeeds. So
> no matter what goes wrong, there isn't a whole lot of point in failing the call.
>
> At the moment, this is fine(ish). If the add request does bail early on and not
> do the housekeeping, the request will still float around in the
> ring->outstanding_lazy_request field and be picked up next time. It means
> multiple pieces of work will be tagged as the same request and the driver can't
> actually wait for the first piece of work until something else has been
> submitted. But it all sort of hangs together.
>
> This patch series is all about removing the OLR and guaranteeing that each piece
> of work gets its own personal request. That means that there is no more
> 'hoovering up of forgotten requests'. If the request does not get tracked then
> it will be leaked. Thus the add request call _must_ not fail. The previous patch
> should have already ensured that it _will_ not fail by removing the potential
> for running out of ring space. This patch enforces the rule by actually removing
> the early exit paths and the return code.
>
> Note that if something does manage to fail and the epilogue commands don't get
> written to the ring, the driver will still hang together. The request will be
> added to the tracking lists. And as in the old case, any subsequent work will
> generate a new seqno which will suffice for marking the old one as complete.
>
> v2: Improved WARNings (Tomas Elf review request).
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h              |    6 ++--
>   drivers/gpu/drm/i915/i915_gem.c              |   43 ++++++++++++--------------
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
>   drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
>   drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
>   drivers/gpu/drm/i915/intel_overlay.c         |    8 ++---
>   drivers/gpu/drm/i915/intel_ringbuffer.c      |    8 ++---
>   7 files changed, 31 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index eba1857..1be4a52 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2860,9 +2860,9 @@ void i915_gem_init_swizzling(struct drm_device *dev);
>   void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>   int __must_check i915_gpu_idle(struct drm_device *dev);
>   int __must_check i915_gem_suspend(struct drm_device *dev);
> -int __i915_add_request(struct intel_engine_cs *ring,
> -		       struct drm_file *file,
> -		       struct drm_i915_gem_object *batch_obj);
> +void __i915_add_request(struct intel_engine_cs *ring,
> +			struct drm_file *file,
> +			struct drm_i915_gem_object *batch_obj);
>   #define i915_add_request(ring) \
>   	__i915_add_request(ring, NULL, NULL)
>   int __i915_wait_request(struct drm_i915_gem_request *req,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6f51416..dd39aa5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1155,15 +1155,12 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
>   int
>   i915_gem_check_olr(struct drm_i915_gem_request *req)
>   {
> -	int ret;
> -
>   	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
>
> -	ret = 0;
>   	if (req == req->ring->outstanding_lazy_request)
> -		ret = i915_add_request(req->ring);
> +		i915_add_request(req->ring);
>
> -	return ret;
> +	return 0;
>   }

i915_gem_check_olr never returns anything but 0. How about making it void?
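
i.e. something along these lines (sketch only):

	void
	i915_gem_check_olr(struct drm_i915_gem_request *req)
	{
		WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));

		if (req == req->ring->outstanding_lazy_request)
			i915_add_request(req->ring);
	}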

Thanks,
Tomas

>
>   static void fake_irq(unsigned long data)
> @@ -2466,9 +2463,14 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
>   	return 0;
>   }
>
> -int __i915_add_request(struct intel_engine_cs *ring,
> -		       struct drm_file *file,
> -		       struct drm_i915_gem_object *obj)
> +/*
> + * NB: This function is not allowed to fail. Doing so would mean that the
> + * request is not being tracked for completion but the work itself is
> + * going to happen on the hardware. This would be a Bad Thing(tm).
> + */
> +void __i915_add_request(struct intel_engine_cs *ring,
> +			struct drm_file *file,
> +			struct drm_i915_gem_object *obj)
>   {
>   	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>   	struct drm_i915_gem_request *request;
> @@ -2478,7 +2480,7 @@ int __i915_add_request(struct intel_engine_cs *ring,
>
>   	request = ring->outstanding_lazy_request;
>   	if (WARN_ON(request == NULL))
> -		return -ENOMEM;
> +		return;

You have a WARN for the other points of failure in this function, why 
not here?

>
>   	if (i915.enable_execlists) {
>   		ringbuf = request->ctx->engine[ring->id].ringbuf;
> @@ -2500,15 +2502,12 @@ int __i915_add_request(struct intel_engine_cs *ring,
>   	 * is that the flush _must_ happen before the next request, no matter
>   	 * what.
>   	 */
> -	if (i915.enable_execlists) {
> +	if (i915.enable_execlists)
>   		ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
> -		if (ret)
> -			return ret;
> -	} else {
> +	else
>   		ret = intel_ring_flush_all_caches(ring);
> -		if (ret)
> -			return ret;
> -	}
> +	/* Not allowed to fail! */
> +	WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
>
>   	/* Record the position of the start of the request so that
>   	 * should we detect the updated seqno part-way through the
> @@ -2517,17 +2516,15 @@ int __i915_add_request(struct intel_engine_cs *ring,
>   	 */
>   	request->postfix = intel_ring_get_tail(ringbuf);
>
> -	if (i915.enable_execlists) {
> +	if (i915.enable_execlists)
>   		ret = ring->emit_request(ringbuf, request);
> -		if (ret)
> -			return ret;
> -	} else {
> +	else {
>   		ret = ring->add_request(ring);
> -		if (ret)
> -			return ret;
>
>   		request->tail = intel_ring_get_tail(ringbuf);
>   	}
> +	/* Not allowed to fail! */
> +	WARN(ret, "emit|add_request failed: %d!\n", ret);
>
>   	request->head = request_start;
>
> @@ -2576,8 +2573,6 @@ int __i915_add_request(struct intel_engine_cs *ring,
>
>   	/* Sanity check that the reserved size was large enough. */
>   	intel_ring_reserved_space_end(ringbuf);
> -
> -	return 0;
>   }
>
>   static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index bd0e4bd..2b48a31 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1061,7 +1061,7 @@ i915_gem_execbuffer_retire_commands(struct drm_device *dev,
>   	ring->gpu_caches_dirty = true;
>
>   	/* Add a breadcrumb for the completion of the batch buffer */
> -	(void)__i915_add_request(ring, file, obj);
> +	__i915_add_request(ring, file, obj);
>   }
>
>   static int
> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
> index 521548a..ce4788f 100644
> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
> @@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
>
>   	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>
> -	ret = __i915_add_request(ring, NULL, so.obj);
> +	__i915_add_request(ring, NULL, so.obj);
>   	/* __i915_add_request moves object to inactive if it fails */
>   out:
>   	i915_gem_render_state_fini(&so);
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index e62d396..7a75fc8 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1373,7 +1373,7 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
>
>   	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>
> -	ret = __i915_add_request(ring, file, so.obj);
> +	__i915_add_request(ring, file, so.obj);
>   	/* intel_logical_ring_add_request moves object to inactive if it
>   	 * fails */
>   out:
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 25c8ec6..e7534b9 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -220,9 +220,7 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
>   	WARN_ON(overlay->last_flip_req);
>   	i915_gem_request_assign(&overlay->last_flip_req,
>   					     ring->outstanding_lazy_request);
> -	ret = i915_add_request(ring);
> -	if (ret)
> -		return ret;
> +	i915_add_request(ring);
>
>   	overlay->flip_tail = tail;
>   	ret = i915_wait_request(overlay->last_flip_req);
> @@ -291,7 +289,9 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
>   	WARN_ON(overlay->last_flip_req);
>   	i915_gem_request_assign(&overlay->last_flip_req,
>   					     ring->outstanding_lazy_request);
> -	return i915_add_request(ring);
> +	i915_add_request(ring);
> +
> +	return 0;
>   }
>
>   static void intel_overlay_release_old_vid_tail(struct intel_overlay *overlay)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 74c2222..7061b07 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2156,14 +2156,10 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>   int intel_ring_idle(struct intel_engine_cs *ring)
>   {
>   	struct drm_i915_gem_request *req;
> -	int ret;
>
>   	/* We need to add any requests required to flush the objects and ring */
> -	if (ring->outstanding_lazy_request) {
> -		ret = i915_add_request(ring);
> -		if (ret)
> -			return ret;
> -	}
> +	if (ring->outstanding_lazy_request)
> +		i915_add_request(ring);
>
>   	/* Wait upon the last request to be completed */
>   	if (list_empty(&ring->request_list))
>


* Re: [PATCH 13/55] drm/i915: Add flag to i915_add_request() to skip the cache flush
  2015-05-29 16:43 ` [PATCH 13/55] drm/i915: Add flag to i915_add_request() to skip the cache flush John.C.Harrison
@ 2015-06-02 18:19   ` Tomas Elf
  0 siblings, 0 replies; 120+ messages in thread
From: Tomas Elf @ 2015-06-02 18:19 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> In order to explicitly track all GPU work (and completely remove the outstanding
> lazy request), it is necessary to add extra i915_add_request() calls to various
> places. Some of these do not need the implicit cache flush done as part of the
> standard batch buffer submission process.
>
> This patch adds a flag to _add_request() to specify whether the flush is
> required or not.
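
As a caller-side illustration of the two resulting entry points (the
'needs_flush' condition here is hypothetical; the macros are the ones added
in the diff below):

	if (needs_flush)
		i915_add_request(ring);			/* flush + breadcrumb */
	else
		i915_add_request_no_flush(ring);	/* breadcrumb only */
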
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h              |    7 +++++--
>   drivers/gpu/drm/i915/i915_gem.c              |   17 ++++++++++-------
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
>   drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
>   drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
>   5 files changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index cc2c45c..f5a733b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2863,9 +2863,12 @@ int __must_check i915_gpu_idle(struct drm_device *dev);
>   int __must_check i915_gem_suspend(struct drm_device *dev);
>   void __i915_add_request(struct intel_engine_cs *ring,
>   			struct drm_file *file,
> -			struct drm_i915_gem_object *batch_obj);
> +			struct drm_i915_gem_object *batch_obj,
> +			bool flush_caches);
>   #define i915_add_request(ring) \
> -	__i915_add_request(ring, NULL, NULL)
> +	__i915_add_request(ring, NULL, NULL, true)
> +#define i915_add_request_no_flush(ring) \
> +	__i915_add_request(ring, NULL, NULL, false)
>   int __i915_wait_request(struct drm_i915_gem_request *req,
>   			unsigned reset_counter,
>   			bool interruptible,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ba2e7f7..458b54e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2470,7 +2470,8 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
>    */
>   void __i915_add_request(struct intel_engine_cs *ring,
>   			struct drm_file *file,
> -			struct drm_i915_gem_object *obj)
> +			struct drm_i915_gem_object *obj,
> +			bool flush_caches)
>   {
>   	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>   	struct drm_i915_gem_request *request;
> @@ -2502,12 +2503,14 @@ void __i915_add_request(struct intel_engine_cs *ring,
>   	 * is that the flush _must_ happen before the next request, no matter
>   	 * what.
>   	 */
> -	if (i915.enable_execlists)
> -		ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
> -	else
> -		ret = intel_ring_flush_all_caches(ring);
> -	/* Not allowed to fail! */
> -	WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
> +	if (flush_caches) {
> +		if (i915.enable_execlists)
> +			ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
> +		else
> +			ret = intel_ring_flush_all_caches(ring);
> +		/* Not allowed to fail! */
> +		WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
> +	}
>
>   	/* Record the position of the start of the request so that
>   	 * should we detect the updated seqno part-way through the
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index a6532db..e27f47f 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1058,7 +1058,7 @@ i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
>   	params->ring->gpu_caches_dirty = true;
>
>   	/* Add a breadcrumb for the completion of the batch buffer */
> -	__i915_add_request(params->ring, params->file, params->batch_obj);
> +	__i915_add_request(params->ring, params->file, params->batch_obj, true);
>   }
>
>   static int
> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
> index ce4788f..4418616 100644
> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
> @@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
>
>   	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>
> -	__i915_add_request(ring, NULL, so.obj);
> +	__i915_add_request(ring, NULL, so.obj, true);
>   	/* __i915_add_request moves object to inactive if it fails */
>   out:
>   	i915_gem_render_state_fini(&so);
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6c0b16f..00bb335 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1371,7 +1371,7 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
>
>   	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>
> -	__i915_add_request(ring, file, so.obj);
> +	__i915_add_request(ring, file, so.obj, true);
>   	/* intel_logical_ring_add_request moves object to inactive if it
>   	 * fails */
>   out:
>


Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas


* Re: [PATCH 18/55] drm/i915: Add explicit request management to i915_gem_init_hw()
  2015-05-29 16:43 ` [PATCH 18/55] drm/i915: Add explicit request management to i915_gem_init_hw() John.C.Harrison
@ 2015-06-02 18:20   ` Tomas Elf
  0 siblings, 0 replies; 120+ messages in thread
From: Tomas Elf @ 2015-06-02 18:20 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> Now that a single per-ring loop is being done for all the different
> initialisation steps in i915_gem_init_hw(), it is possible to add proper request
> management as well. The last remaining issue is that the context enable call
> eventually ends up within *_render_state_init() and this does its own private
> _i915_add_request() call.
>
> This patch adds explicit request creation and submission to the top level loop
> and removes the add_request() from deep within the sub-functions.
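
The per-ring loop then follows the alloc/cancel/submit lifecycle used
throughout this series; roughly, with do_ring_init_work() as a hypothetical
stand-in for the PPGTT and context enable steps:

	struct drm_i915_gem_request *req;
	int ret;

	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
	if (ret)
		return ret;

	ret = do_ring_init_work(req);
	if (ret) {
		i915_gem_request_cancel(req);	/* nothing was submitted */
		return ret;
	}

	i915_add_request_no_flush(ring);	/* cannot fail at this point */
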
>
> v2: Updated for removal of batch_obj from add_request call in previous patch.
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h              |    3 ++-
>   drivers/gpu/drm/i915/i915_gem.c              |   12 ++++++++++++
>   drivers/gpu/drm/i915/i915_gem_render_state.c |    2 --
>   drivers/gpu/drm/i915/intel_lrc.c             |    5 -----
>   4 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 69c8f56..21045e7 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2154,7 +2154,8 @@ struct drm_i915_gem_request {
>   	struct intel_context *ctx;
>   	struct intel_ringbuffer *ringbuf;
>
> -	/** Batch buffer related to this request if any */
> +	/** Batch buffer related to this request if any (used for
> +	    error state dump only) */
>   	struct drm_i915_gem_object *batch_obj;
>
>   	/** Time at which this request was emitted, in jiffies. */
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a2712a6..1960e30 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5073,8 +5073,16 @@ i915_gem_init_hw(struct drm_device *dev)
>
>   	/* Now it is safe to go back round and do everything else: */
>   	for_each_ring(ring, dev_priv, i) {
> +		struct drm_i915_gem_request *req;
> +
>   		WARN_ON(!ring->default_context);
>
> +		ret = i915_gem_request_alloc(ring, ring->default_context, &req);
> +		if (ret) {
> +			i915_gem_cleanup_ringbuffer(dev);
> +			goto out;
> +		}
> +
>   		if (ring->id == RCS) {
>   			for (i = 0; i < NUM_L3_SLICES(dev); i++)
>   				i915_gem_l3_remap(ring, i);
> @@ -5083,6 +5091,7 @@ i915_gem_init_hw(struct drm_device *dev)
>   		ret = i915_ppgtt_init_ring(ring);
>   		if (ret && ret != -EIO) {
>   			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
> +			i915_gem_request_cancel(req);
>   			i915_gem_cleanup_ringbuffer(dev);
>   			goto out;
>   		}
> @@ -5090,9 +5099,12 @@ i915_gem_init_hw(struct drm_device *dev)
>   		ret = i915_gem_context_enable(ring);
>   		if (ret && ret != -EIO) {
>   			DRM_ERROR("Context enable ring #%d failed %d\n", i, ret);
> +			i915_gem_request_cancel(req);
>   			i915_gem_cleanup_ringbuffer(dev);
>   			goto out;
>   		}
> +
> +		i915_add_request_no_flush(ring);
>   	}
>
>   out:
> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
> index a32a4b9..a07b4ee 100644
> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
> @@ -173,8 +173,6 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
>
>   	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>
> -	__i915_add_request(ring, NULL, NULL, true);
> -	/* __i915_add_request moves object to inactive if it fails */
>   out:
>   	i915_gem_render_state_fini(&so);
>   	return ret;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index c744362..37efa93 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1351,8 +1351,6 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
>   {
>   	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
>   	struct render_state so;
> -	struct drm_i915_file_private *file_priv = ctx->file_priv;
> -	struct drm_file *file = file_priv ? file_priv->file : NULL;
>   	int ret;
>
>   	ret = i915_gem_render_state_prepare(ring, &so);
> @@ -1371,9 +1369,6 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
>
>   	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>
> -	__i915_add_request(ring, file, NULL, true);
> -	/* intel_logical_ring_add_request moves object to inactive if it
> -	 * fails */
>   out:
>   	i915_gem_render_state_fini(&so);
>   	return ret;
>


Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas


* Re: [PATCH 22/55] drm/i915: Update deferred context creation to do explicit request management
  2015-05-29 16:43 ` [PATCH 22/55] drm/i915: Update deferred context creation to do explicit request management John.C.Harrison
@ 2015-06-02 18:22   ` Tomas Elf
  0 siblings, 0 replies; 120+ messages in thread
From: Tomas Elf @ 2015-06-02 18:22 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> In execlist mode, context initialisation is deferred until first use of the
> given context. This is because execlist mode has per-ring context state and thus
> many more context storage objects than legacy mode, many of which are never
> actually used. Previously, the initialisation commands were written to the ring
> and tagged with some random request structure via the OLR. This seemed to be
> causing a null pointer dereference bug under certain circumstances (BZ:88865).
>
> This patch adds explicit request creation and submission to the deferred
> initialisation code path, thus removing any reliance on or randomness caused by
> the OLR.
>
> Note that it should be possible to move the deferred context creation even
> later - to when the context is actually switched to rather than when it is merely
> validated. This would allow the initialisation to be done within the request of
> the work that is wanting to use the context. Hence, the extra request that is
> created, used and retired just for the context init could be removed completely.
> However, this is left for a follow up patch.
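
Purely as an illustration of that follow-up idea (not what this patch
does), the deferred initialisation could then ride on the request of the
work that triggers the context switch:

	/* In the context-switch path, with 'req' being the caller's
	 * request rather than a privately allocated one: */
	if (ring->init_context && !ctx->rcs_initialized) {
		ret = ring->init_context(req->ring, ctx);
		if (ret)
			return ret;

		ctx->rcs_initialized = true;
	}
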
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c |   11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 37efa93..2730efd 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1971,13 +1971,22 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
>   		lrc_setup_hardware_status_page(ring, ctx_obj);
>   	else if (ring->id == RCS && !ctx->rcs_initialized) {
>   		if (ring->init_context) {
> -			ret = ring->init_context(ring, ctx);
> +			struct drm_i915_gem_request *req;
> +
> +			ret = i915_gem_request_alloc(ring, ctx, &req);
> +			if (ret)
> +				return ret;
> +
> +			ret = ring->init_context(req->ring, ctx);
>   			if (ret) {
>   				DRM_ERROR("ring init context: %d\n", ret);
> +				i915_gem_request_cancel(req);
>   				ctx->engine[ring->id].ringbuf = NULL;
>   				ctx->engine[ring->id].state = NULL;
>   				goto error;
>   			}
> +
> +			i915_add_request_no_flush(req->ring);
>   		}
>
>   		ctx->rcs_initialized = true;
>

Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas


* Re: [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-05-29 16:43 ` [PATCH 25/55] drm/i915: Update i915_gem_object_sync() " John.C.Harrison
@ 2015-06-02 18:26   ` Tomas Elf
  2015-06-04 12:57     ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Tomas Elf @ 2015-06-02 18:26 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The plan is to pass requests around as the basic submission tracking structure
> rather than rings and contexts. This patch updates the i915_gem_object_sync()
> code path.
>
> v2: Much more complex patch to share a single request between the sync and the
> page flip. The _sync() function now supports lazy allocation of the request
> structure. That is, if one is passed in then that will be used. If one is not,
> then a request will be allocated and passed back out. Note that the _sync() code
> does not necessarily require a request. Thus one will only be created until
> certain situations. The reason the lazy allocation must be done within the
> _sync() code itself is because the decision to need one or not is not really
> something that code above can second guess (except in the case where one is
> definitely not required because no ring is passed in).
>
> The call chains above _sync() now support passing a request through, with most
> callers passing in NULL and assuming that no request will be required (because
> they also pass in NULL for the ring and therefore can't be generating any ring
> code).
>
> The exeception is intel_crtc_page_flip() which now supports having a request

1. "The exeception" -> "The exception"

> returned from _sync(). If one is, then that request is shared by the page flip
> (if the page flip is of a type to need a request). If _sync() does not generate
> a request but the page flip does need one, then the page flip path will create
> its own request.
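
Condensing that contract into a caller-side sketch (based on the page flip
usage in the diff below; error handling trimmed):

	struct drm_i915_gem_request *request = NULL;
	int ret;

	ret = i915_gem_object_sync(obj, ring, &request);
	if (ret)
		return ret;

	/* ... more work may be added to 'request' here ... */

	if (request)	/* _sync() allocated it, so the caller must submit it */
		i915_add_request_no_flush(request->ring);
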
>
> v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
> Elf review request). Rebased onto newer tree that significantly changed the
> synchronisation code.
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
>   drivers/gpu/drm/i915/i915_gem.c            |   48 +++++++++++++++++++++-------
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
>   drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
>   drivers/gpu/drm/i915/intel_drv.h           |    3 +-
>   drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
>   drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
>   drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
>   8 files changed, 58 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 64a10fa..f69e9cb 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>
>   int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>   int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> -			 struct intel_engine_cs *to);
> +			 struct intel_engine_cs *to,
> +			 struct drm_i915_gem_request **to_req);
>   void i915_vma_move_to_active(struct i915_vma *vma,
>   			     struct intel_engine_cs *ring);
>   int i915_gem_dumb_create(struct drm_file *file_priv,
> @@ -2889,6 +2890,7 @@ int __must_check
>   i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>   				     u32 alignment,
>   				     struct intel_engine_cs *pipelined,
> +				     struct drm_i915_gem_request **pipelined_request,
>   				     const struct i915_ggtt_view *view);
>   void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj,
>   					      const struct i915_ggtt_view *view);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index b7d66aa..db90043 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3098,25 +3098,26 @@ out:
>   static int
>   __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>   		       struct intel_engine_cs *to,
> -		       struct drm_i915_gem_request *req)
> +		       struct drm_i915_gem_request *from_req,
> +		       struct drm_i915_gem_request **to_req)
>   {
>   	struct intel_engine_cs *from;
>   	int ret;
>
> -	from = i915_gem_request_get_ring(req);
> +	from = i915_gem_request_get_ring(from_req);
>   	if (to == from)
>   		return 0;
>
> -	if (i915_gem_request_completed(req, true))
> +	if (i915_gem_request_completed(from_req, true))
>   		return 0;
>
> -	ret = i915_gem_check_olr(req);
> +	ret = i915_gem_check_olr(from_req);
>   	if (ret)
>   		return ret;
>
>   	if (!i915_semaphore_is_enabled(obj->base.dev)) {
>   		struct drm_i915_private *i915 = to_i915(obj->base.dev);
> -		ret = __i915_wait_request(req,
> +		ret = __i915_wait_request(from_req,
>   					  atomic_read(&i915->gpu_error.reset_counter),
>   					  i915->mm.interruptible,
>   					  NULL,
> @@ -3124,15 +3125,25 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>   		if (ret)
>   			return ret;
>
> -		i915_gem_object_retire_request(obj, req);
> +		i915_gem_object_retire_request(obj, from_req);
>   	} else {
>   		int idx = intel_ring_sync_index(from, to);
> -		u32 seqno = i915_gem_request_get_seqno(req);
> +		u32 seqno = i915_gem_request_get_seqno(from_req);
>
> +		WARN_ON(!to_req);
> +
> +		/* Optimization: Avoid semaphore sync when we are sure we already
> +		 * waited for an object with higher seqno */

2. How about using the standard multi-line comment format?

/* (empty line)
  * (first line)
  * (second line)
  */

>   		if (seqno <= from->semaphore.sync_seqno[idx])
>   			return 0;
>
> -		trace_i915_gem_ring_sync_to(from, to, req);
> +		if (*to_req == NULL) {
> +			ret = i915_gem_request_alloc(to, to->default_context, to_req);
> +			if (ret)
> +				return ret;
> +		}
> +
> +		trace_i915_gem_ring_sync_to(from, to, from_req);
>   		ret = to->semaphore.sync_to(to, from, seqno);
>   		if (ret)
>   			return ret;
> @@ -3153,6 +3164,9 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>    *
>    * @obj: object which may be in use on another ring.
>    * @to: ring we wish to use the object on. May be NULL.
> + * @to_req: request we wish to use the object for. See below.
> + *          This will be allocated and returned if a request is
> + *          required but not passed in.
>    *
>    * This code is meant to abstract object synchronization with the GPU.
>    * Calling with NULL implies synchronizing the object with the CPU
> @@ -3168,11 +3182,22 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>    * - If we are a write request (pending_write_domain is set), the new
>    *   request must wait for outstanding read requests to complete.
>    *
> + * For CPU synchronisation (NULL to) no request is required. For syncing with
> + * rings to_req must be non-NULL. However, a request does not have to be
> + * pre-allocated. If *to_req is null and sync commands will be emitted then a
> + * request will be allocated automatically and returned through *to_req. Note
> + * that it is not guaranteed that commands will be emitted (because the
> + * might already be idle). Hence there is no need to create a request that
> + * might never have any work submitted. Note further that if a request is
> + * returned in *to_req, it is the responsibility of the caller to submit
> + * that request (after potentially adding more work to it).
> + *

3. "(because the might already be idle)" : The what? The engine?
4. "NULL" and "null" mixed. Please be consistent.

Overall, the explanation is better than in the last patch version

With those minor changes:

Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas

>    * Returns 0 if successful, else propagates up the lower layer error.
>    */
>   int
>   i915_gem_object_sync(struct drm_i915_gem_object *obj,
> -		     struct intel_engine_cs *to)
> +		     struct intel_engine_cs *to,
> +		     struct drm_i915_gem_request **to_req)
>   {
>   	const bool readonly = obj->base.pending_write_domain == 0;
>   	struct drm_i915_gem_request *req[I915_NUM_RINGS];
> @@ -3194,7 +3219,7 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
>   				req[n++] = obj->last_read_req[i];
>   	}
>   	for (i = 0; i < n; i++) {
> -		ret = __i915_gem_object_sync(obj, to, req[i]);
> +		ret = __i915_gem_object_sync(obj, to, req[i], to_req);
>   		if (ret)
>   			return ret;
>   	}
> @@ -4144,12 +4169,13 @@ int
>   i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>   				     u32 alignment,
>   				     struct intel_engine_cs *pipelined,
> +				     struct drm_i915_gem_request **pipelined_request,
>   				     const struct i915_ggtt_view *view)
>   {
>   	u32 old_read_domains, old_write_domain;
>   	int ret;
>
> -	ret = i915_gem_object_sync(obj, pipelined);
> +	ret = i915_gem_object_sync(obj, pipelined, pipelined_request);
>   	if (ret)
>   		return ret;
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 50b1ced..bea92ad 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -899,7 +899,7 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req,
>   		struct drm_i915_gem_object *obj = vma->obj;
>
>   		if (obj->active & other_rings) {
> -			ret = i915_gem_object_sync(obj, req->ring);
> +			ret = i915_gem_object_sync(obj, req->ring, &req);
>   			if (ret)
>   				return ret;
>   		}
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 657a333..6528ada 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2338,7 +2338,8 @@ int
>   intel_pin_and_fence_fb_obj(struct drm_plane *plane,
>   			   struct drm_framebuffer *fb,
>   			   const struct drm_plane_state *plane_state,
> -			   struct intel_engine_cs *pipelined)
> +			   struct intel_engine_cs *pipelined,
> +			   struct drm_i915_gem_request **pipelined_request)
>   {
>   	struct drm_device *dev = fb->dev;
>   	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -2403,7 +2404,7 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
>
>   	dev_priv->mm.interruptible = false;
>   	ret = i915_gem_object_pin_to_display_plane(obj, alignment, pipelined,
> -						   &view);
> +						   pipelined_request, &view);
>   	if (ret)
>   		goto err_interruptible;
>
> @@ -11119,6 +11120,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
>   	struct intel_unpin_work *work;
>   	struct intel_engine_cs *ring;
>   	bool mmio_flip;
> +	struct drm_i915_gem_request *request = NULL;
>   	int ret;
>
>   	/*
> @@ -11225,7 +11227,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
>   	 */
>   	ret = intel_pin_and_fence_fb_obj(crtc->primary, fb,
>   					 crtc->primary->state,
> -					 mmio_flip ? i915_gem_request_get_ring(obj->last_write_req) : ring);
> +					 mmio_flip ? i915_gem_request_get_ring(obj->last_write_req) : ring, &request);
>   	if (ret)
>   		goto cleanup_pending;
>
> @@ -11256,6 +11258,9 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
>   					intel_ring_get_request(ring));
>   	}
>
> +	if (request)
> +		i915_add_request_no_flush(request->ring);
> +
>   	work->flip_queued_vblank = drm_crtc_vblank_count(crtc);
>   	work->enable_stall_check = true;
>
> @@ -11273,6 +11278,8 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
>   cleanup_unpin:
>   	intel_unpin_fb_obj(fb, crtc->primary->state);
>   cleanup_pending:
> +	if (request)
> +		i915_gem_request_cancel(request);
>   	atomic_dec(&intel_crtc->unpin_work_count);
>   	mutex_unlock(&dev->struct_mutex);
>   cleanup:
> @@ -13171,7 +13178,7 @@ intel_prepare_plane_fb(struct drm_plane *plane,
>   		if (ret)
>   			DRM_DEBUG_KMS("failed to attach phys object\n");
>   	} else {
> -		ret = intel_pin_and_fence_fb_obj(plane, fb, new_state, NULL);
> +		ret = intel_pin_and_fence_fb_obj(plane, fb, new_state, NULL, NULL);
>   	}
>
>   	if (ret == 0)
> @@ -15218,7 +15225,7 @@ void intel_modeset_gem_init(struct drm_device *dev)
>   		ret = intel_pin_and_fence_fb_obj(c->primary,
>   						 c->primary->fb,
>   						 c->primary->state,
> -						 NULL);
> +						 NULL, NULL);
>   		mutex_unlock(&dev->struct_mutex);
>   		if (ret) {
>   			DRM_ERROR("failed to pin boot fb on pipe %d\n",
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 02d8317..73650ae 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1034,7 +1034,8 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
>   int intel_pin_and_fence_fb_obj(struct drm_plane *plane,
>   			       struct drm_framebuffer *fb,
>   			       const struct drm_plane_state *plane_state,
> -			       struct intel_engine_cs *pipelined);
> +			       struct intel_engine_cs *pipelined,
> +			       struct drm_i915_gem_request **pipelined_request);
>   struct drm_framebuffer *
>   __intel_framebuffer_create(struct drm_device *dev,
>   			   struct drm_mode_fb_cmd2 *mode_cmd,
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index 4e7e7da..dd9f3b2 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -151,7 +151,7 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
>   	}
>
>   	/* Flush everything out, we'll be doing GTT only from now on */
> -	ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL, NULL);
> +	ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL, NULL, NULL);
>   	if (ret) {
>   		DRM_ERROR("failed to pin obj: %d\n", ret);
>   		goto out_fb;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6d005b1..f8e8fdb 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -638,7 +638,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
>   		struct drm_i915_gem_object *obj = vma->obj;
>
>   		if (obj->active & other_rings) {
> -			ret = i915_gem_object_sync(obj, req->ring);
> +			ret = i915_gem_object_sync(obj, req->ring, &req);
>   			if (ret)
>   				return ret;
>   		}
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index e7534b9..0f8187a 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -724,7 +724,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
>   	if (ret != 0)
>   		return ret;
>
> -	ret = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL,
> +	ret = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL, NULL,
>   						   &i915_ggtt_view_normal);
>   	if (ret != 0)
>   		return ret;
>


* Re: [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()'
  2015-05-29 16:44 ` [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()' John.C.Harrison
@ 2015-06-02 18:27   ` Tomas Elf
  2015-06-23 10:23   ` Chris Wilson
  1 sibling, 0 replies; 120+ messages in thread
From: Tomas Elf @ 2015-06-02 18:27 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:44, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> As there is no OLR to check, the check_olr() function is now a no-op and can be
> removed.
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h      |    1 -
>   drivers/gpu/drm/i915/i915_gem.c      |   34 +---------------------------------
>   drivers/gpu/drm/i915/intel_display.c |    6 ------
>   3 files changed, 1 insertion(+), 40 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 18bfc84..cb5bb4a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2825,7 +2825,6 @@ bool i915_gem_retire_requests(struct drm_device *dev);
>   void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
>   int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
>   				      bool interruptible);
> -int __must_check i915_gem_check_olr(struct drm_i915_gem_request *req);
>
>   static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
>   {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index b8fe931..f825942 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1149,17 +1149,6 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
>   	return 0;
>   }
>
> -/*
> - * Compare arbitrary request against outstanding lazy request. Emit on match.
> - */
> -int
> -i915_gem_check_olr(struct drm_i915_gem_request *req)
> -{
> -	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
> -
> -	return 0;
> -}
> -
>   static void fake_irq(unsigned long data)
>   {
>   	wake_up_process((struct task_struct *)data);
> @@ -1440,10 +1429,6 @@ i915_wait_request(struct drm_i915_gem_request *req)
>   	if (ret)
>   		return ret;
>
> -	ret = i915_gem_check_olr(req);
> -	if (ret)
> -		return ret;
> -
>   	ret = __i915_wait_request(req,
>   				  atomic_read(&dev_priv->gpu_error.reset_counter),
>   				  interruptible, NULL, NULL);
> @@ -1543,10 +1528,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>   		if (req == NULL)
>   			return 0;
>
> -		ret = i915_gem_check_olr(req);
> -		if (ret)
> -			goto err;
> -
>   		requests[n++] = i915_gem_request_reference(req);
>   	} else {
>   		for (i = 0; i < I915_NUM_RINGS; i++) {
> @@ -1556,10 +1537,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>   			if (req == NULL)
>   				continue;
>
> -			ret = i915_gem_check_olr(req);
> -			if (ret)
> -				goto err;
> -
>   			requests[n++] = i915_gem_request_reference(req);
>   		}
>   	}
> @@ -1570,7 +1547,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>   					  NULL, rps);
>   	mutex_lock(&dev->struct_mutex);
>
> -err:
>   	for (i = 0; i < n; i++) {
>   		if (ret == 0)
>   			i915_gem_object_retire_request(obj, requests[i]);
> @@ -2987,7 +2963,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
>   static int
>   i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
>   {
> -	int ret, i;
> +	int i;
>
>   	if (!obj->active)
>   		return 0;
> @@ -3002,10 +2978,6 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
>   		if (list_empty(&req->list))
>   			goto retire;
>
> -		ret = i915_gem_check_olr(req);
> -		if (ret)
> -			return ret;
> -
>   		if (i915_gem_request_completed(req, true)) {
>   			__i915_gem_request_retire__upto(req);
>   retire:
> @@ -3121,10 +3093,6 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>   	if (i915_gem_request_completed(from_req, true))
>   		return 0;
>
> -	ret = i915_gem_check_olr(from_req);
> -	if (ret)
> -		return ret;
> -
>   	if (!i915_semaphore_is_enabled(obj->base.dev)) {
>   		struct drm_i915_private *i915 = to_i915(obj->base.dev);
>   		ret = __i915_wait_request(from_req,
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 91e19d0..aca2215 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11243,12 +11243,6 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
>   		i915_gem_request_assign(&work->flip_queued_req,
>   					obj->last_write_req);
>   	} else {
> -		if (obj->last_write_req) {
> -			ret = i915_gem_check_olr(obj->last_write_req);
> -			if (ret)
> -				goto cleanup_unpin;
> -		}
> -
>   		if (!request) {
>   			ret = i915_gem_request_alloc(ring, ring->default_context, &request);
>   			if (ret)
>


Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas


* Re: [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()
  2015-05-29 16:44 ` [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active() John.C.Harrison
@ 2015-06-02 18:27   ` Tomas Elf
  2015-06-17 14:06   ` Daniel Vetter
  1 sibling, 0 replies; 120+ messages in thread
From: Tomas Elf @ 2015-06-02 18:27 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:44, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The i915_gem_object_flush_active() call used to do lots. Over time it has done
> less and less. Now all it does is check the various associated requests to see if
> they can be retired. Hence this patch renames the function and updates the
> comments around it to match the current operation.
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem.c |   18 ++++++------------
>   1 file changed, 6 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f825942..081cbbf 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2956,12 +2956,10 @@ i915_gem_idle_work_handler(struct work_struct *work)
>   }
>
>   /**
> - * Ensures that an object will eventually get non-busy by flushing any required
> - * write domains, emitting any outstanding lazy request and retiring and
> - * completed requests.
> + * Check an object to see if any of its associated requests can be retired.
>    */
>   static int
> -i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
> +i915_gem_object_retire(struct drm_i915_gem_object *obj)
>   {
>   	int i;
>
> @@ -3034,8 +3032,8 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>   		return -ENOENT;
>   	}
>
> -	/* Need to make sure the object gets inactive eventually. */
> -	ret = i915_gem_object_flush_active(obj);
> +	/* Check if the object is pending clean up. */
> +	ret = i915_gem_object_retire(obj);
>   	if (ret)
>   		goto out;
>
> @@ -4526,12 +4524,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>   		goto unlock;
>   	}
>
> -	/* Count all active objects as busy, even if they are currently not used
> -	 * by the gpu. Users of this interface expect objects to eventually
> -	 * become non-busy without any further actions, therefore emit any
> -	 * necessary flushes here.
> -	 */
> -	ret = i915_gem_object_flush_active(obj);
> +	/* Check if the object is pending clean up. */
> +	ret = i915_gem_object_retire(obj);
>   	if (ret)
>   		goto unref;
>
>

Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas


* Re: [PATCH 01/55] drm/i915: Re-instate request->uniq becuase it is extremely useful
  2015-05-29 16:43 ` [PATCH 01/55] drm/i915: Re-instate request->uniq becuase it is extremely useful John.C.Harrison
@ 2015-06-03 11:14   ` Tomas Elf
  0 siblings, 0 replies; 120+ messages in thread
From: Tomas Elf @ 2015-06-03 11:14 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The seqno value cannot always be used when debugging issues via trace
> points. This is because it can be reset back to the start, especially
> during TDR-type tests. Also, when the scheduler arrives, the seqno is
> only valid while a given request is executing on the hardware. While
> the request is simply queued waiting for submission, its seqno value
> will be zero (meaning invalid).
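
With the extra field plumbed into the tracepoints below, a queued request
whose seqno is still zero (the scheduler case described above) remains
identifiable; an illustrative trace line with hypothetical values,
following the TP_printk format in the diff:

	i915_gem_request_add: dev=0, ring=0, uniq=42, seqno=0
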
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h   |    4 ++++
>   drivers/gpu/drm/i915/i915_gem.c   |    1 +
>   drivers/gpu/drm/i915/i915_trace.h |   13 +++++++++----
>   drivers/gpu/drm/i915/intel_lrc.c  |    2 ++
>   4 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1038f5c..0347eb9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1882,6 +1882,8 @@ struct drm_i915_private {
>
>   	bool edp_low_vswing;
>
> +	uint32_t request_uniq;
> +
>   	/*
>   	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
>   	 * will be rejected. Instead look for a better place.
> @@ -2160,6 +2162,8 @@ struct drm_i915_gem_request {
>   	/** process identifier submitting this request */
>   	struct pid *pid;
>
> +	uint32_t uniq;
> +
>   	/**
>   	 * The ELSP only accepts two elements at a time, so we queue
>   	 * context/tail pairs on a given queue (ring->execlist_queue) until the
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index cc206f1..68f1d1e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2657,6 +2657,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   		goto err;
>
>   	req->ring = ring;
> +	req->uniq = dev_priv->request_uniq++;
>
>   	if (i915.enable_execlists)
>   		ret = intel_logical_ring_alloc_request_extras(req, ctx);
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 497cba5..6cbc280 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -504,6 +504,7 @@ DECLARE_EVENT_CLASS(i915_gem_request,
>   	    TP_STRUCT__entry(
>   			     __field(u32, dev)
>   			     __field(u32, ring)
> +			     __field(u32, uniq)
>   			     __field(u32, seqno)
>   			     ),
>
> @@ -512,11 +513,13 @@ DECLARE_EVENT_CLASS(i915_gem_request,
>   						i915_gem_request_get_ring(req);
>   			   __entry->dev = ring->dev->primary->index;
>   			   __entry->ring = ring->id;
> +			   __entry->uniq = req ? req->uniq : 0;
>   			   __entry->seqno = i915_gem_request_get_seqno(req);
>   			   ),
>
> -	    TP_printk("dev=%u, ring=%u, seqno=%u",
> -		      __entry->dev, __entry->ring, __entry->seqno)
> +	    TP_printk("dev=%u, ring=%u, uniq=%u, seqno=%u",
> +		      __entry->dev, __entry->ring, __entry->uniq,
> +		      __entry->seqno)
>   );
>
>   DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
> @@ -561,6 +564,7 @@ TRACE_EVENT(i915_gem_request_wait_begin,
>   	    TP_STRUCT__entry(
>   			     __field(u32, dev)
>   			     __field(u32, ring)
> +			     __field(u32, uniq)
>   			     __field(u32, seqno)
>   			     __field(bool, blocking)
>   			     ),
> @@ -576,13 +580,14 @@ TRACE_EVENT(i915_gem_request_wait_begin,
>   						i915_gem_request_get_ring(req);
>   			   __entry->dev = ring->dev->primary->index;
>   			   __entry->ring = ring->id;
> +			   __entry->uniq = req ? req->uniq : 0;
>   			   __entry->seqno = i915_gem_request_get_seqno(req);
>   			   __entry->blocking =
>   				     mutex_is_locked(&ring->dev->struct_mutex);
>   			   ),
>
> -	    TP_printk("dev=%u, ring=%u, seqno=%u, blocking=%s",
> -		      __entry->dev, __entry->ring,
> +	    TP_printk("dev=%u, ring=%u, uniq=%u, seqno=%u, blocking=%s",
> +		      __entry->dev, __entry->ring, __entry->uniq,
>   		      __entry->seqno, __entry->blocking ?  "yes (NB)" : "no")
>   );
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 96ae90a..6a5ed07 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -549,6 +549,7 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
>   				   struct drm_i915_gem_request *request)
>   {
>   	struct drm_i915_gem_request *cursor;
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>   	int num_elements = 0;
>
>   	if (to != ring->default_context)
> @@ -565,6 +566,7 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
>   		request->ring = ring;
>   		request->ctx = to;
>   		kref_init(&request->ref);
> +		request->uniq = dev_priv->request_uniq++;
>   		i915_gem_context_reference(request->ctx);
>   	} else {
>   		i915_gem_request_reference(request);
>


Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas


* Re: [PATCH 51/55] drm/i915: Move the request/file and request/pid association to creation time
  2015-05-29 16:44 ` [PATCH 51/55] drm/i915: Move the request/file and request/pid association to creation time John.C.Harrison
@ 2015-06-03 11:15   ` Tomas Elf
  0 siblings, 0 replies; 120+ messages in thread
From: Tomas Elf @ 2015-06-03 11:15 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 29/05/2015 17:44, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> In _i915_add_request(), the request is associated with a userland client.
> Specifically it is linked to the 'file' structure and the current user process
> is recorded. One problem here is that the current user process is not
> necessarily the same as when the request was submitted to the driver. This is
> especially true when the GPU scheduler arrives and decouples driver submission
> from hardware submission. Note also that it is only in the case where the add
> request comes from an execbuff call that there is a client to associate. Any
> other add request call is kernel only so does not need to do it.
>
> This patch moves the client association into a separate function. This is then
> called from the execbuffer code path itself at a sensible time. It also removes
> the now redundant 'file' pointer from the add request parameter list.
>
> An extra cleanup of the client association is also added to the request cleanup
> code for the eventuality where the request is killed after association but
> before being submitted (e.g. due to an out-of-memory error somewhere). Once the
> submission has happened, the request is on the request list and the regular
> request list removal will clear the association. Note that this still needs to
> happen at this point in time because the request might be kept floating around
> much longer (due to someone holding a reference count) and the client should not
> be worrying about this request after it has been retired.
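
The resulting association lifecycle, sketched against the functions touched
below:

	/*
	 * i915_gem_request_alloc(ring, ctx, &req);
	 * i915_gem_request_add_to_client(req, file);   <- execbuff path only
	 *
	 * error before submission:
	 *    i915_gem_request_cancel(req) drops the reference; the free
	 *    path now also calls remove_from_client().
	 *
	 * normal path:
	 *    __i915_add_request(req, batch_obj, true) submits the request;
	 *    i915_gem_request_retire() later calls remove_from_client().
	 */
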
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h            |    7 ++--
>   drivers/gpu/drm/i915/i915_gem.c            |   56 ++++++++++++++++++++--------
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |    6 ++-
>   3 files changed, 49 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f9b6517..18bfc84 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2199,6 +2199,8 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   			   struct drm_i915_gem_request **req_out);
>   void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>   void i915_gem_request_free(struct kref *req_ref);
> +int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
> +				   struct drm_file *file);
>
>   static inline uint32_t
>   i915_gem_request_get_seqno(struct drm_i915_gem_request *req)
> @@ -2864,13 +2866,12 @@ void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>   int __must_check i915_gpu_idle(struct drm_device *dev);
>   int __must_check i915_gem_suspend(struct drm_device *dev);
>   void __i915_add_request(struct drm_i915_gem_request *req,
> -			struct drm_file *file,
>   			struct drm_i915_gem_object *batch_obj,
>   			bool flush_caches);
>   #define i915_add_request(req) \
> -	__i915_add_request(req, NULL, NULL, true)
> +	__i915_add_request(req, NULL, true)
>   #define i915_add_request_no_flush(req) \
> -	__i915_add_request(req, NULL, NULL, false)
> +	__i915_add_request(req, NULL, false)
>   int __i915_wait_request(struct drm_i915_gem_request *req,
>   			unsigned reset_counter,
>   			bool interruptible,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 5aa0ad0..b8fe931 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1331,6 +1331,33 @@ out:
>   	return ret;
>   }
>
> +int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
> +				   struct drm_file *file)
> +{
> +	struct drm_i915_private *dev_private;
> +	struct drm_i915_file_private *file_priv;
> +
> +	WARN_ON(!req || !file || req->file_priv);
> +
> +	if (!req || !file)
> +		return -EINVAL;
> +
> +	if (req->file_priv)
> +		return -EINVAL;
> +
> +	dev_private = req->ring->dev->dev_private;
> +	file_priv = file->driver_priv;
> +
> +	spin_lock(&file_priv->mm.lock);
> +	req->file_priv = file_priv;
> +	list_add_tail(&req->client_list, &file_priv->mm.request_list);
> +	spin_unlock(&file_priv->mm.lock);
> +
> +	req->pid = get_pid(task_pid(current));
> +
> +	return 0;
> +}
> +
>   static inline void
>   i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
>   {
> @@ -1343,6 +1370,9 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
>   	list_del(&request->client_list);
>   	request->file_priv = NULL;
>   	spin_unlock(&file_priv->mm.lock);
> +
> +	put_pid(request->pid);
> +	request->pid = NULL;
>   }
>
>   static void i915_gem_request_retire(struct drm_i915_gem_request *request)
> @@ -1362,8 +1392,6 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>   	list_del_init(&request->list);
>   	i915_gem_request_remove_from_client(request);
>
> -	put_pid(request->pid);
> -
>   	i915_gem_request_unreference(request);
>   }
>
> @@ -2468,7 +2496,6 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
>    * going to happen on the hardware. This would be a Bad Thing(tm).
>    */
>   void __i915_add_request(struct drm_i915_gem_request *request,
> -			struct drm_file *file,
>   			struct drm_i915_gem_object *obj,
>   			bool flush_caches)
>   {
> @@ -2538,19 +2565,6 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>
>   	request->emitted_jiffies = jiffies;
>   	list_add_tail(&request->list, &ring->request_list);
> -	request->file_priv = NULL;
> -
> -	if (file) {
> -		struct drm_i915_file_private *file_priv = file->driver_priv;
> -
> -		spin_lock(&file_priv->mm.lock);
> -		request->file_priv = file_priv;
> -		list_add_tail(&request->client_list,
> -			      &file_priv->mm.request_list);
> -		spin_unlock(&file_priv->mm.lock);
> -
> -		request->pid = get_pid(task_pid(current));
> -	}
>
>   	trace_i915_gem_request_add(request);
>
> @@ -2616,6 +2630,9 @@ void i915_gem_request_free(struct kref *req_ref)
>   						 typeof(*req), ref);
>   	struct intel_context *ctx = req->ctx;
>
> +	if (req->file_priv)
> +		i915_gem_request_remove_from_client(req);
> +
>   	if (ctx) {
>   		if (i915.enable_execlists) {
>   			struct intel_engine_cs *ring = req->ring;
> @@ -4320,6 +4337,13 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>   		if (time_after_eq(request->emitted_jiffies, recent_enough))
>   			break;
>
> +		/*
> +		 * Note that the request might not have been submitted yet.
> +		 * In which case emitted_jiffies will be zero.
> +		 */
> +		if (!request->emitted_jiffies)
> +			continue;
> +
>   		target = request;
>   	}
>   	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index e868ac1..52139c6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1058,7 +1058,7 @@ i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
>   	params->ring->gpu_caches_dirty = true;
>
>   	/* Add a breadcrumb for the completion of the batch buffer */
> -	__i915_add_request(params->request, params->file, params->batch_obj, true);
> +	__i915_add_request(params->request, params->batch_obj, true);
>   }
>
>   static int
> @@ -1612,6 +1612,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>   	if (ret)
>   		goto err_batch_unpin;
>
> +	ret = i915_gem_request_add_to_client(params->request, file);
> +	if (ret)
> +		goto err_batch_unpin;
> +
>   	/*
>   	 * Save assorted stuff away to pass through to *_submission().
>   	 * NB: This data should be 'persistent' and not local as it will
>


Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas


* [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-05-29 16:43 ` [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands John.C.Harrison
  2015-06-02 18:14   ` Tomas Elf
@ 2015-06-04 12:06   ` John.C.Harrison
  2015-06-09 16:00     ` Tomas Elf
  2015-06-17 14:04     ` Daniel Vetter
  2015-06-19 16:34   ` John.C.Harrison
  2 siblings, 2 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-06-04 12:06 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

It is a bad idea for i915_add_request() to fail. The work will already have been
sent to the ring and will be processed, but there will not be any tracking or
management of that work.

The only way the add request call can fail is if it can't write its epilogue
commands to the ring (cache flushing, seqno updates, interrupt signalling). The
reasons for that are mostly down to running out of ring buffer space and the
problems associated with trying to get some more. This patch prevents that
situation from happening in the first place.

When a request is created, it marks sufficient space as reserved for the
epilogue commands, thus guaranteeing that by the time the epilogue is written,
there will be plenty of space for it. Note that a ring_begin() call is required
to actually reserve the space (and do any potential waiting). However, that is
not currently done at request creation time. This is because the ring_begin()
code can allocate a request. Hence calling begin() from the request allocation
code would lead to infinite recursion! Later patches in this series remove the
need for begin() to do the allocate. At that point, it becomes safe for the
allocate to call begin() and really reserve the space.

Until then, there is a potential for insufficient space to be available at the
point of calling i915_add_request(). However, that would only be in the case
where the request was created and immediately submitted without ever calling
ring_begin() and adding any work to that request - which should never happen.
And even if it does, and that request happens to fall into the tiny window of
opportunity for failing due to being out of ring space, does it really matter?
The request wasn't doing anything in the first place.
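
The flow of the reservation helpers introduced below can be summarised
roughly as:

	/* At request creation: */
	intel_ring_reserved_space_reserve(ringbuf, MIN_SPACE_FOR_ADD_REQUEST);

	/* If the request is abandoned rather than submitted: */
	intel_ring_reserved_space_cancel(ringbuf);

	/* In __i915_add_request(), when emitting the epilogue: */
	intel_ring_reserved_space_use(ringbuf);
	/* ... flush, seqno write, interrupt ... */
	intel_ring_reserved_space_end(ringbuf);	/* sanity check the usage */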

v2: Updated the 'reserved space too small' warning to include the offending
sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
re-initialisation of tracking state after a buffer wrap to keep the sanity
checks accurate.

v3: Incremented the reserved size to accommodate Ironlake (after finally
managing to run on an ILK system). Also fixed missing wrap code in LRC mode.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    1 +
 drivers/gpu/drm/i915/i915_gem.c         |   37 +++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        |   18 ++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |   68 ++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |   25 ++++++++++++
 5 files changed, 147 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e9d76f3..44dee31 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
 
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct intel_context *ctx);
+void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 
 static inline uint32_t
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 78f6a89..516e9b7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2495,6 +2495,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	} else
 		ringbuf = ring->buffer;
 
+	/*
+	 * To ensure that this call will not fail, space for its emissions
+	 * should already have been reserved in the ring buffer. Let the ring
+	 * know that it is time to use that space up.
+	 */
+	intel_ring_reserved_space_use(ringbuf);
+
 	request_start = intel_ring_get_tail(ringbuf);
 	/*
 	 * Emit any outstanding flushes - execbuf can fail to emit the flush
@@ -2577,6 +2584,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
 			   round_jiffies_up_relative(HZ));
 	intel_mark_busy(dev_priv->dev);
 
+	/* Sanity check that the reserved size was large enough. */
+	intel_ring_reserved_space_end(ringbuf);
+
 	return 0;
 }
 
@@ -2676,6 +2686,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	if (ret)
 		goto err;
 
+	/*
+	 * Reserve space in the ring buffer for all the commands required to
+	 * eventually emit this request. This is to guarantee that the
+	 * i915_add_request() call can't fail. Note that the reserve may need
+	 * to be redone if the request is not actually submitted straight
+	 * away, e.g. because a GPU scheduler has deferred it.
+	 *
+	 * Note further that this call merely notes the reserve request. A
+	 * subsequent call to *_ring_begin() is required to actually ensure
+	 * that the reservation is available. Without the begin, if the
+	 * request creator immediately submitted the request without adding
+	 * any commands to it then there might not actually be sufficient
+	 * room for the submission commands. Unfortunately, the current
+	 * *_ring_begin() implementations potentially call back here to
+	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
+	 * infinite recursion! Until that back call path is removed, it is
+	 * necessary to do a manual _begin() outside.
+	 */
+	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
+
 	ring->outstanding_lazy_request = req;
 	return 0;
 
@@ -2684,6 +2714,13 @@ err:
 	return ret;
 }
 
+void i915_gem_request_cancel(struct drm_i915_gem_request *req)
+{
+	intel_ring_reserved_space_cancel(req->ringbuf);
+
+	i915_gem_request_unreference(req);
+}
+
 struct drm_i915_gem_request *
 i915_gem_find_active_request(struct intel_engine_cs *ring)
 {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6a5ed07..42a756d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -687,6 +687,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
 	unsigned space;
 	int ret;
 
+	/* The whole point of reserving space is to not wait! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (intel_ring_space(ringbuf) >= bytes)
 		return 0;
 
@@ -747,6 +750,9 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
 	uint32_t __iomem *virt;
 	int rem = ringbuf->size - ringbuf->tail;
 
+	/* Can't wrap if space has already been reserved! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (ringbuf->space < rem) {
 		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
 
@@ -770,10 +776,22 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
 {
 	int ret;
 
+	if (!ringbuf->reserved_in_use)
+		bytes += ringbuf->reserved_size;
+
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
+		WARN_ON(ringbuf->reserved_in_use);
+
 		ret = logical_ring_wrap_buffer(ringbuf, ctx);
 		if (unlikely(ret))
 			return ret;
+
+		if (ringbuf->reserved_size) {
+			uint32_t size = ringbuf->reserved_size;
+
+			intel_ring_reserved_space_cancel(ringbuf);
+			intel_ring_reserved_space_reserve(ringbuf, size);
+		}
 	}
 
 	if (unlikely(ringbuf->space < bytes)) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d934f85..74c2222 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2103,6 +2103,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
 	unsigned space;
 	int ret;
 
+	/* The whole point of reserving space is to not wait! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (intel_ring_space(ringbuf) >= n)
 		return 0;
 
@@ -2130,6 +2133,9 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
 	struct intel_ringbuffer *ringbuf = ring->buffer;
 	int rem = ringbuf->size - ringbuf->tail;
 
+	/* Can't wrap if space has already been reserved! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (ringbuf->space < rem) {
 		int ret = ring_wait_for_space(ring, rem);
 		if (ret)
@@ -2180,16 +2186,74 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
 	return 0;
 }
 
-static int __intel_ring_prepare(struct intel_engine_cs *ring,
-				int bytes)
+void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
+{
+	/* NB: Until request management is fully tidied up and the OLR is
+ * removed, there are too many ways to get false hits on this
+	 * anti-recursion check! */
+	/*WARN_ON(ringbuf->reserved_size);*/
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_size = size;
+
+	/*
+	 * Really need to call _begin() here but that currently leads to
+	 * recursion problems! This will be fixed later but for now just
+	 * return and hope for the best. Note that there is only a real
+	 * problem if the creator of the request never actually calls _begin()
+	 * but if they are not submitting any work then why did they create
+	 * the request in the first place?
+	 */
+}
+
+void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_size   = 0;
+	ringbuf->reserved_in_use = false;
+}
+
+void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_in_use = true;
+	ringbuf->reserved_tail   = ringbuf->tail;
+}
+
+void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(!ringbuf->reserved_in_use);
+	WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
+	     "request reserved size too small: %d vs %d!\n",
+	     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
+
+	ringbuf->reserved_size   = 0;
+	ringbuf->reserved_in_use = false;
+}
+
+static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
 {
 	struct intel_ringbuffer *ringbuf = ring->buffer;
 	int ret;
 
+	if (!ringbuf->reserved_in_use)
+		bytes += ringbuf->reserved_size;
+
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
+		WARN_ON(ringbuf->reserved_in_use);
+
 		ret = intel_wrap_ring_buffer(ring);
 		if (unlikely(ret))
 			return ret;
+
+		if (ringbuf->reserved_size) {
+			uint32_t size = ringbuf->reserved_size;
+
+			intel_ring_reserved_space_cancel(ringbuf);
+			intel_ring_reserved_space_reserve(ringbuf, size);
+		}
 	}
 
 	if (unlikely(ringbuf->space < bytes)) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 39f6dfc..bf2ac28 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -105,6 +105,9 @@ struct intel_ringbuffer {
 	int space;
 	int size;
 	int effective_size;
+	int reserved_size;
+	int reserved_tail;
+	bool reserved_in_use;
 
 	/** We track the position of the requests in the ring buffer, and
 	 * when each is retired we increment last_retired_head as the GPU
@@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
 	return ring->outstanding_lazy_request;
 }
 
+/*
+ * Arbitrary size for largest possible 'add request' sequence. The code paths
+ * are complex and variable. Empirical measurement shows that the worst case
+ * is ILK at 136 words. Reserving too much is better than reserving too little
+ * as that allows for corner cases that might have been missed. So the figure
+ * has been rounded up to 160 words.
+ */
+#define MIN_SPACE_FOR_ADD_REQUEST	160
+
+/*
+ * Reserve space in the ring to guarantee that the i915_add_request() call
+ * will always have sufficient room to do its stuff. The request creation
+ * code calls this automatically.
+ */
+void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
+/* Cancel the reservation, e.g. because the request is being discarded. */
+void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
+/* Use the reserved space - for use by i915_add_request() only. */
+void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
+/* Finish with the reserved space - for use by i915_add_request() only. */
+void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* Re: [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-06-02 18:26   ` Tomas Elf
@ 2015-06-04 12:57     ` John Harrison
  2015-06-18 12:14       ` John.C.Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-04 12:57 UTC (permalink / raw)
  To: Tomas Elf, Intel-GFX

On 02/06/2015 19:26, Tomas Elf wrote:
> On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The plan is to pass requests around as the basic submission tracking 
>> structure
>> rather than rings and contexts. This patch updates the 
>> i915_gem_object_sync()
>> code path.
>>
>> v2: Much more complex patch to share a single request between the 
>> sync and the
>> page flip. The _sync() function now supports lazy allocation of the 
>> request
>> structure. That is, if one is passed in then that will be used. If 
>> one is not,
>> then a request will be allocated and passed back out. Note that the 
>> _sync() code
>> does not necessarily require a request. Thus one will only be created 
>> in certain situations. The reason the lazy allocation must be done 
>> within the
>> _sync() code itself is because the decision to need one or not is not 
>> really
>> something that code above can second guess (except in the case where 
>> one is
>> definitely not required because no ring is passed in).
>>
>> The call chains above _sync() now support passing a request through 
>> which most
>> callers passing in NULL and assuming that no request will be required 
>> (because
>> they also pass in NULL for the ring and therefore can't be generating 
>> any ring
>> code).
>>
>> The exeception is intel_crtc_page_flip() which now supports having a 
>> request
>
> 1. "The exeception" -> "The exception"
>
>> returned from _sync(). If one is, then that request is shared by the 
>> page flip
>> (if the page flip is of a type to need a request). If _sync() does 
>> not generate
>> a request but the page flip does need one, then the page flip path 
>> will create
>> its own request.
>>
>> v3: Updated comment description to be clearer about 'to_req' 
>> parameter (Tomas
>> Elf review request). Rebased onto newer tree that significantly 
>> changed the
>> synchronisation code.
>>
>> For: VIZ-5115
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
>>   drivers/gpu/drm/i915/i915_gem.c            |   48 
>> +++++++++++++++++++++-------
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
>>   drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
>>   drivers/gpu/drm/i915/intel_drv.h           |    3 +-
>>   drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
>>   drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
>>   drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
>>   8 files changed, 58 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 64a10fa..f69e9cb 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2778,7 +2778,8 @@ static inline void 
>> i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>>
>>   int __must_check i915_mutex_lock_interruptible(struct drm_device 
>> *dev);
>>   int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>> -             struct intel_engine_cs *to);
>> +             struct intel_engine_cs *to,
>> +             struct drm_i915_gem_request **to_req);
>>   void i915_vma_move_to_active(struct i915_vma *vma,
>>                    struct intel_engine_cs *ring);
>>   int i915_gem_dumb_create(struct drm_file *file_priv,
>> @@ -2889,6 +2890,7 @@ int __must_check
>>   i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>>                        u32 alignment,
>>                        struct intel_engine_cs *pipelined,
>> +                     struct drm_i915_gem_request **pipelined_request,
>>                        const struct i915_ggtt_view *view);
>>   void i915_gem_object_unpin_from_display_plane(struct 
>> drm_i915_gem_object *obj,
>>                             const struct i915_ggtt_view *view);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c 
>> b/drivers/gpu/drm/i915/i915_gem.c
>> index b7d66aa..db90043 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3098,25 +3098,26 @@ out:
>>   static int
>>   __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>>                  struct intel_engine_cs *to,
>> -               struct drm_i915_gem_request *req)
>> +               struct drm_i915_gem_request *from_req,
>> +               struct drm_i915_gem_request **to_req)
>>   {
>>       struct intel_engine_cs *from;
>>       int ret;
>>
>> -    from = i915_gem_request_get_ring(req);
>> +    from = i915_gem_request_get_ring(from_req);
>>       if (to == from)
>>           return 0;
>>
>> -    if (i915_gem_request_completed(req, true))
>> +    if (i915_gem_request_completed(from_req, true))
>>           return 0;
>>
>> -    ret = i915_gem_check_olr(req);
>> +    ret = i915_gem_check_olr(from_req);
>>       if (ret)
>>           return ret;
>>
>>       if (!i915_semaphore_is_enabled(obj->base.dev)) {
>>           struct drm_i915_private *i915 = to_i915(obj->base.dev);
>> -        ret = __i915_wait_request(req,
>> +        ret = __i915_wait_request(from_req,
>> atomic_read(&i915->gpu_error.reset_counter),
>>                         i915->mm.interruptible,
>>                         NULL,
>> @@ -3124,15 +3125,25 @@ __i915_gem_object_sync(struct 
>> drm_i915_gem_object *obj,
>>           if (ret)
>>               return ret;
>>
>> -        i915_gem_object_retire_request(obj, req);
>> +        i915_gem_object_retire_request(obj, from_req);
>>       } else {
>>           int idx = intel_ring_sync_index(from, to);
>> -        u32 seqno = i915_gem_request_get_seqno(req);
>> +        u32 seqno = i915_gem_request_get_seqno(from_req);
>>
>> +        WARN_ON(!to_req);
>> +
>> +        /* Optimization: Avoid semaphore sync when we are sure we 
>> already
>> +         * waited for an object with higher seqno */
>
> 2. How about using the standard multi-line comment format?
>

Not my comment. It looks like Chris removed it in his re-write of the 
sync code and I accidentally put it back in when resolving the merge 
conflicts. I'll drop it again.

> /* (empty line)
>  * (first line)
>  * (second line)
>  */
>
>>           if (seqno <= from->semaphore.sync_seqno[idx])
>>               return 0;
>>
>> -        trace_i915_gem_ring_sync_to(from, to, req);
>> +        if (*to_req == NULL) {
>> +            ret = i915_gem_request_alloc(to, to->default_context, 
>> to_req);
>> +            if (ret)
>> +                return ret;
>> +        }
>> +
>> +        trace_i915_gem_ring_sync_to(from, to, from_req);
>>           ret = to->semaphore.sync_to(to, from, seqno);
>>           if (ret)
>>               return ret;
>> @@ -3153,6 +3164,9 @@ __i915_gem_object_sync(struct 
>> drm_i915_gem_object *obj,
>>    *
>>    * @obj: object which may be in use on another ring.
>>    * @to: ring we wish to use the object on. May be NULL.
>> + * @to_req: request we wish to use the object for. See below.
>> + *          This will be allocated and returned if a request is
>> + *          required but not passed in.
>>    *
>>    * This code is meant to abstract object synchronization with the GPU.
>>    * Calling with NULL implies synchronizing the object with the CPU
>> @@ -3168,11 +3182,22 @@ __i915_gem_object_sync(struct 
>> drm_i915_gem_object *obj,
>>    * - If we are a write request (pending_write_domain is set), the new
>>    *   request must wait for outstanding read requests to complete.
>>    *
>> + * For CPU synchronisation (NULL to) no request is required. For 
>> syncing with
>> + * rings to_req must be non-NULL. However, a request does not have 
>> to be
>> + * pre-allocated. If *to_req is null and sync commands will be 
>> emitted then a
>> + * request will be allocated automatically and returned through 
>> *to_req. Note
>> + * that it is not guaranteed that commands will be emitted (because the
>> + * might already be idle). Hence there is no need to create a 
>> request that
>> + * might never have any work submitted. Note further that if a 
>> request is
>> + * returned in *to_req, it is the responsibility of the caller to 
>> submit
>> + * that request (after potentially adding more work to it).
>> + *
>
> 3. "(because the might already be idle)" : The what? The engine?
> 4. "NULL" and "null" mixed. Please be consistent.
>
> Overall, the explanation is better than in the last patch version
>
> With those minor changes:
>
> Reviewed-by: Tomas Elf <tomas.elf@intel.com>
>
> Thanks,
> Tomas
>
>>    * Returns 0 if successful, else propagates up the lower layer error.
>>    */
>>   int
>>   i915_gem_object_sync(struct drm_i915_gem_object *obj,
>> -             struct intel_engine_cs *to)
>> +             struct intel_engine_cs *to,
>> +             struct drm_i915_gem_request **to_req)
>>   {
>>       const bool readonly = obj->base.pending_write_domain == 0;
>>       struct drm_i915_gem_request *req[I915_NUM_RINGS];
>> @@ -3194,7 +3219,7 @@ i915_gem_object_sync(struct drm_i915_gem_object 
>> *obj,
>>                   req[n++] = obj->last_read_req[i];
>>       }
>>       for (i = 0; i < n; i++) {
>> -        ret = __i915_gem_object_sync(obj, to, req[i]);
>> +        ret = __i915_gem_object_sync(obj, to, req[i], to_req);
>>           if (ret)
>>               return ret;
>>       }
>> @@ -4144,12 +4169,13 @@ int
>>   i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>>                        u32 alignment,
>>                        struct intel_engine_cs *pipelined,
>> +                     struct drm_i915_gem_request **pipelined_request,
>>                        const struct i915_ggtt_view *view)
>>   {
>>       u32 old_read_domains, old_write_domain;
>>       int ret;
>>
>> -    ret = i915_gem_object_sync(obj, pipelined);
>> +    ret = i915_gem_object_sync(obj, pipelined, pipelined_request);
>>       if (ret)
>>           return ret;
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index 50b1ced..bea92ad 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -899,7 +899,7 @@ i915_gem_execbuffer_move_to_gpu(struct 
>> drm_i915_gem_request *req,
>>           struct drm_i915_gem_object *obj = vma->obj;
>>
>>           if (obj->active & other_rings) {
>> -            ret = i915_gem_object_sync(obj, req->ring);
>> +            ret = i915_gem_object_sync(obj, req->ring, &req);
>>               if (ret)
>>                   return ret;
>>           }
>> diff --git a/drivers/gpu/drm/i915/intel_display.c 
>> b/drivers/gpu/drm/i915/intel_display.c
>> index 657a333..6528ada 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -2338,7 +2338,8 @@ int
>>   intel_pin_and_fence_fb_obj(struct drm_plane *plane,
>>                  struct drm_framebuffer *fb,
>>                  const struct drm_plane_state *plane_state,
>> -               struct intel_engine_cs *pipelined)
>> +               struct intel_engine_cs *pipelined,
>> +               struct drm_i915_gem_request **pipelined_request)
>>   {
>>       struct drm_device *dev = fb->dev;
>>       struct drm_i915_private *dev_priv = dev->dev_private;
>> @@ -2403,7 +2404,7 @@ intel_pin_and_fence_fb_obj(struct drm_plane 
>> *plane,
>>
>>       dev_priv->mm.interruptible = false;
>>       ret = i915_gem_object_pin_to_display_plane(obj, alignment, 
>> pipelined,
>> -                           &view);
>> +                           pipelined_request, &view);
>>       if (ret)
>>           goto err_interruptible;
>>
>> @@ -11119,6 +11120,7 @@ static int intel_crtc_page_flip(struct 
>> drm_crtc *crtc,
>>       struct intel_unpin_work *work;
>>       struct intel_engine_cs *ring;
>>       bool mmio_flip;
>> +    struct drm_i915_gem_request *request = NULL;
>>       int ret;
>>
>>       /*
>> @@ -11225,7 +11227,7 @@ static int intel_crtc_page_flip(struct 
>> drm_crtc *crtc,
>>        */
>>       ret = intel_pin_and_fence_fb_obj(crtc->primary, fb,
>>                        crtc->primary->state,
>> -                     mmio_flip ? 
>> i915_gem_request_get_ring(obj->last_write_req) : ring);
>> +                     mmio_flip ? 
>> i915_gem_request_get_ring(obj->last_write_req) : ring, &request);
>>       if (ret)
>>           goto cleanup_pending;
>>
>> @@ -11256,6 +11258,9 @@ static int intel_crtc_page_flip(struct 
>> drm_crtc *crtc,
>>                       intel_ring_get_request(ring));
>>       }
>>
>> +    if (request)
>> +        i915_add_request_no_flush(request->ring);
>> +
>>       work->flip_queued_vblank = drm_crtc_vblank_count(crtc);
>>       work->enable_stall_check = true;
>>
>> @@ -11273,6 +11278,8 @@ static int intel_crtc_page_flip(struct 
>> drm_crtc *crtc,
>>   cleanup_unpin:
>>       intel_unpin_fb_obj(fb, crtc->primary->state);
>>   cleanup_pending:
>> +    if (request)
>> +        i915_gem_request_cancel(request);
>>       atomic_dec(&intel_crtc->unpin_work_count);
>>       mutex_unlock(&dev->struct_mutex);
>>   cleanup:
>> @@ -13171,7 +13178,7 @@ intel_prepare_plane_fb(struct drm_plane *plane,
>>           if (ret)
>>               DRM_DEBUG_KMS("failed to attach phys object\n");
>>       } else {
>> -        ret = intel_pin_and_fence_fb_obj(plane, fb, new_state, NULL);
>> +        ret = intel_pin_and_fence_fb_obj(plane, fb, new_state, NULL, 
>> NULL);
>>       }
>>
>>       if (ret == 0)
>> @@ -15218,7 +15225,7 @@ void intel_modeset_gem_init(struct drm_device 
>> *dev)
>>           ret = intel_pin_and_fence_fb_obj(c->primary,
>>                            c->primary->fb,
>>                            c->primary->state,
>> -                         NULL);
>> +                         NULL, NULL);
>>           mutex_unlock(&dev->struct_mutex);
>>           if (ret) {
>>               DRM_ERROR("failed to pin boot fb on pipe %d\n",
>> diff --git a/drivers/gpu/drm/i915/intel_drv.h 
>> b/drivers/gpu/drm/i915/intel_drv.h
>> index 02d8317..73650ae 100644
>> --- a/drivers/gpu/drm/i915/intel_drv.h
>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>> @@ -1034,7 +1034,8 @@ void intel_release_load_detect_pipe(struct 
>> drm_connector *connector,
>>   int intel_pin_and_fence_fb_obj(struct drm_plane *plane,
>>                      struct drm_framebuffer *fb,
>>                      const struct drm_plane_state *plane_state,
>> -                   struct intel_engine_cs *pipelined);
>> +                   struct intel_engine_cs *pipelined,
>> +                   struct drm_i915_gem_request **pipelined_request);
>>   struct drm_framebuffer *
>>   __intel_framebuffer_create(struct drm_device *dev,
>>                  struct drm_mode_fb_cmd2 *mode_cmd,
>> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
>> b/drivers/gpu/drm/i915/intel_fbdev.c
>> index 4e7e7da..dd9f3b2 100644
>> --- a/drivers/gpu/drm/i915/intel_fbdev.c
>> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
>> @@ -151,7 +151,7 @@ static int intelfb_alloc(struct drm_fb_helper 
>> *helper,
>>       }
>>
>>       /* Flush everything out, we'll be doing GTT only from now on */
>> -    ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL, NULL);
>> +    ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL, NULL, NULL);
>>       if (ret) {
>>           DRM_ERROR("failed to pin obj: %d\n", ret);
>>           goto out_fb;
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index 6d005b1..f8e8fdb 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -638,7 +638,7 @@ static int execlists_move_to_gpu(struct 
>> drm_i915_gem_request *req,
>>           struct drm_i915_gem_object *obj = vma->obj;
>>
>>           if (obj->active & other_rings) {
>> -            ret = i915_gem_object_sync(obj, req->ring);
>> +            ret = i915_gem_object_sync(obj, req->ring, &req);
>>               if (ret)
>>                   return ret;
>>           }
>> diff --git a/drivers/gpu/drm/i915/intel_overlay.c 
>> b/drivers/gpu/drm/i915/intel_overlay.c
>> index e7534b9..0f8187a 100644
>> --- a/drivers/gpu/drm/i915/intel_overlay.c
>> +++ b/drivers/gpu/drm/i915/intel_overlay.c
>> @@ -724,7 +724,7 @@ static int intel_overlay_do_put_image(struct 
>> intel_overlay *overlay,
>>       if (ret != 0)
>>           return ret;
>>
>> -    ret = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL,
>> +    ret = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL, NULL,
>>                              &i915_ggtt_view_normal);
>>       if (ret != 0)
>>           return ret;
>>
>
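
As an aside, a minimal sketch of the caller-side pattern described above
(illustrative only — 'obj' and 'ring' are placeholders; the real flow is the
intel_crtc_page_flip() hunk in the diff):

	struct drm_i915_gem_request *request = NULL;
	int ret;

	/* _sync() allocates a request into 'request' only if it actually
	 * needs to emit synchronisation commands. */
	ret = i915_gem_object_sync(obj, ring, &request);
	if (ret)
		return ret;

	/* The caller owns any request handed back and must submit it. */
	if (request)
		i915_add_request_no_flush(request->ring);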

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 03/55] drm/i915: i915_add_request must not fail
  2015-06-02 18:16   ` Tomas Elf
@ 2015-06-04 14:07     ` John Harrison
  2015-06-05 10:55       ` Tomas Elf
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-04 14:07 UTC (permalink / raw)
  To: Tomas Elf, Intel-GFX

On 02/06/2015 19:16, Tomas Elf wrote:
> On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The i915_add_request() function is called to keep track of work that 
>> has been
>> written to the ring buffer. It adds epilogue commands to track 
>> progress (seqno
>> updates and such), moves the request structure onto the right list 
>> and other
>> such house keeping tasks. However, the work itself has already been 
>> written to
>> the ring and will get executed whether or not the add request call 
>> succeeds. So
>> no matter what goes wrong, there isn't a whole lot of point in 
>> failing the call.
>>
>> At the moment, this is fine(ish). If the add request does bail early 
>> on and not
>> do the housekeeping, the request will still float around in the
>> ring->outstanding_lazy_request field and be picked up next time. It 
>> means
>> multiple pieces of work will be tagged as the same request and the driver 
>> can't
>> actually wait for the first piece of work until something else has been
>> submitted. But it all sort of hangs together.
>>
>> This patch series is all about removing the OLR and guaranteeing that 
>> each piece
>> of work gets its own personal request. That means that there is no more
>> 'hoovering up of forgotten requests'. If the request does not get 
>> tracked then
>> it will be leaked. Thus the add request call _must_ not fail. The 
>> previous patch
>> should have already ensured that it _will_ not fail by removing the 
>> potential
>> for running out of ring space. This patch enforces the rule by 
>> actually removing
>> the early exit paths and the return code.
>>
>> Note that if something does manage to fail and the epilogue commands 
>> don't get
>> written to the ring, the driver will still hang together. The request 
>> will be
>> added to the tracking lists. And as in the old case, any subsequent 
>> work will
>> generate a new seqno which will suffice for marking the old one as 
>> complete.
>>
>> v2: Improved WARNings (Tomas Elf review request).
>>
>> For: VIZ-5115
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h              |    6 ++--
>>   drivers/gpu/drm/i915/i915_gem.c              |   43 
>> ++++++++++++--------------
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
>>   drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
>>   drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
>>   drivers/gpu/drm/i915/intel_overlay.c         |    8 ++---
>>   drivers/gpu/drm/i915/intel_ringbuffer.c      |    8 ++---
>>   7 files changed, 31 insertions(+), 40 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index eba1857..1be4a52 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2860,9 +2860,9 @@ void i915_gem_init_swizzling(struct drm_device 
>> *dev);
>>   void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>>   int __must_check i915_gpu_idle(struct drm_device *dev);
>>   int __must_check i915_gem_suspend(struct drm_device *dev);
>> -int __i915_add_request(struct intel_engine_cs *ring,
>> -               struct drm_file *file,
>> -               struct drm_i915_gem_object *batch_obj);
>> +void __i915_add_request(struct intel_engine_cs *ring,
>> +            struct drm_file *file,
>> +            struct drm_i915_gem_object *batch_obj);
>>   #define i915_add_request(ring) \
>>       __i915_add_request(ring, NULL, NULL)
>>   int __i915_wait_request(struct drm_i915_gem_request *req,
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c 
>> b/drivers/gpu/drm/i915/i915_gem.c
>> index 6f51416..dd39aa5 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1155,15 +1155,12 @@ i915_gem_check_wedge(struct i915_gpu_error 
>> *error,
>>   int
>>   i915_gem_check_olr(struct drm_i915_gem_request *req)
>>   {
>> -    int ret;
>> -
>> WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
>>
>> -    ret = 0;
>>       if (req == req->ring->outstanding_lazy_request)
>> -        ret = i915_add_request(req->ring);
>> +        i915_add_request(req->ring);
>>
>> -    return ret;
>> +    return 0;
>>   }
>
> i915_gem_check_olr never returns anything but 0. How about making it 
> void?
That seems like a redundant/unnecessary change given that the entire 
function is removed later in the series. Changing it from '__must_check' 
to 'void' now would mean a lot of extra churn.

> Thanks,
> Tomas
>
>>
>>   static void fake_irq(unsigned long data)
>> @@ -2466,9 +2463,14 @@ i915_gem_get_seqno(struct drm_device *dev, u32 
>> *seqno)
>>       return 0;
>>   }
>>
>> -int __i915_add_request(struct intel_engine_cs *ring,
>> -               struct drm_file *file,
>> -               struct drm_i915_gem_object *obj)
>> +/*
>> + * NB: This function is not allowed to fail. Doing so would mean that the
>> + * request is not being tracked for completion but the work itself is
>> + * going to happen on the hardware. This would be a Bad Thing(tm).
>> + */
>> +void __i915_add_request(struct intel_engine_cs *ring,
>> +            struct drm_file *file,
>> +            struct drm_i915_gem_object *obj)
>>   {
>>       struct drm_i915_private *dev_priv = ring->dev->dev_private;
>>       struct drm_i915_gem_request *request;
>> @@ -2478,7 +2480,7 @@ int __i915_add_request(struct intel_engine_cs 
>> *ring,
>>
>>       request = ring->outstanding_lazy_request;
>>       if (WARN_ON(request == NULL))
>> -        return -ENOMEM;
>> +        return;
>
> You have a WARN for the other points of failure in this function, why 
> not here?
Erm, you mean like the one in the 'if' itself?

>
>>
>>       if (i915.enable_execlists) {
>>           ringbuf = request->ctx->engine[ring->id].ringbuf;
>> @@ -2500,15 +2502,12 @@ int __i915_add_request(struct intel_engine_cs 
>> *ring,
>>        * is that the flush _must_ happen before the next request, no 
>> matter
>>        * what.
>>        */
>> -    if (i915.enable_execlists) {
>> +    if (i915.enable_execlists)
>>           ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
>> -        if (ret)
>> -            return ret;
>> -    } else {
>> +    else
>>           ret = intel_ring_flush_all_caches(ring);
>> -        if (ret)
>> -            return ret;
>> -    }
>> +    /* Not allowed to fail! */
>> +    WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
>>
>>       /* Record the position of the start of the request so that
>>        * should we detect the updated seqno part-way through the
>> @@ -2517,17 +2516,15 @@ int __i915_add_request(struct intel_engine_cs 
>> *ring,
>>        */
>>       request->postfix = intel_ring_get_tail(ringbuf);
>>
>> -    if (i915.enable_execlists) {
>> +    if (i915.enable_execlists)
>>           ret = ring->emit_request(ringbuf, request);
>> -        if (ret)
>> -            return ret;
>> -    } else {
>> +    else {
>>           ret = ring->add_request(ring);
>> -        if (ret)
>> -            return ret;
>>
>>           request->tail = intel_ring_get_tail(ringbuf);
>>       }
>> +    /* Not allowed to fail! */
>> +    WARN(ret, "emit|add_request failed: %d!\n", ret);
>>
>>       request->head = request_start;
>>
>> @@ -2576,8 +2573,6 @@ int __i915_add_request(struct intel_engine_cs 
>> *ring,
>>
>>       /* Sanity check that the reserved size was large enough. */
>>       intel_ring_reserved_space_end(ringbuf);
>> -
>> -    return 0;
>>   }
>>
>>   static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index bd0e4bd..2b48a31 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -1061,7 +1061,7 @@ i915_gem_execbuffer_retire_commands(struct 
>> drm_device *dev,
>>       ring->gpu_caches_dirty = true;
>>
>>       /* Add a breadcrumb for the completion of the batch buffer */
>> -    (void)__i915_add_request(ring, file, obj);
>> +    __i915_add_request(ring, file, obj);
>>   }
>>
>>   static int
>> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c 
>> b/drivers/gpu/drm/i915/i915_gem_render_state.c
>> index 521548a..ce4788f 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
>> @@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct 
>> intel_engine_cs *ring)
>>
>>       i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>>
>> -    ret = __i915_add_request(ring, NULL, so.obj);
>> +    __i915_add_request(ring, NULL, so.obj);
>>       /* __i915_add_request moves object to inactive if it fails */
>>   out:
>>       i915_gem_render_state_fini(&so);
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index e62d396..7a75fc8 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1373,7 +1373,7 @@ static int 
>> intel_lr_context_render_state_init(struct intel_engine_cs *ring,
>>
>>       i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>>
>> -    ret = __i915_add_request(ring, file, so.obj);
>> +    __i915_add_request(ring, file, so.obj);
>>       /* intel_logical_ring_add_request moves object to inactive if it
>>        * fails */
>>   out:
>> diff --git a/drivers/gpu/drm/i915/intel_overlay.c 
>> b/drivers/gpu/drm/i915/intel_overlay.c
>> index 25c8ec6..e7534b9 100644
>> --- a/drivers/gpu/drm/i915/intel_overlay.c
>> +++ b/drivers/gpu/drm/i915/intel_overlay.c
>> @@ -220,9 +220,7 @@ static int intel_overlay_do_wait_request(struct 
>> intel_overlay *overlay,
>>       WARN_ON(overlay->last_flip_req);
>>       i915_gem_request_assign(&overlay->last_flip_req,
>>                            ring->outstanding_lazy_request);
>> -    ret = i915_add_request(ring);
>> -    if (ret)
>> -        return ret;
>> +    i915_add_request(ring);
>>
>>       overlay->flip_tail = tail;
>>       ret = i915_wait_request(overlay->last_flip_req);
>> @@ -291,7 +289,9 @@ static int intel_overlay_continue(struct 
>> intel_overlay *overlay,
>>       WARN_ON(overlay->last_flip_req);
>>       i915_gem_request_assign(&overlay->last_flip_req,
>>                            ring->outstanding_lazy_request);
>> -    return i915_add_request(ring);
>> +    i915_add_request(ring);
>> +
>> +    return 0;
>>   }
>>
>>   static void intel_overlay_release_old_vid_tail(struct intel_overlay 
>> *overlay)
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 74c2222..7061b07 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2156,14 +2156,10 @@ static int intel_wrap_ring_buffer(struct 
>> intel_engine_cs *ring)
>>   int intel_ring_idle(struct intel_engine_cs *ring)
>>   {
>>       struct drm_i915_gem_request *req;
>> -    int ret;
>>
>>       /* We need to add any requests required to flush the objects 
>> and ring */
>> -    if (ring->outstanding_lazy_request) {
>> -        ret = i915_add_request(ring);
>> -        if (ret)
>> -            return ret;
>> -    }
>> +    if (ring->outstanding_lazy_request)
>> +        i915_add_request(ring);
>>
>>       /* Wait upon the last request to be completed */
>>       if (list_empty(&ring->request_list))
>>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* [PATCH 14/56] drm/i915: Make retire condition check for requests not objects
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (54 preceding siblings ...)
  2015-05-29 16:44 ` [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active() John.C.Harrison
@ 2015-06-04 18:23 ` John.C.Harrison
  2015-06-04 18:24   ` John Harrison
  2015-06-09 15:56   ` Tomas Elf
  2015-06-22 21:04 ` [PATCH 00/55] Remove the outstanding_lazy_request Daniel Vetter
  56 siblings, 2 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-06-04 18:23 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

A previous patch (read-read optimisation) changed the early exit
condition in i915_gem_retire_requests_ring() from checking the request
list to checking the active list. This assumes that all requests have
objects associated with them which are placed on the active list. The
removal of the OLR means that non-batch buffer work is no longer
tagged onto the nearest batch buffer submission and thus there are
requests going through the system which do not have objects associated
with them. This can therefore lead to the situation where an
outstanding request never gets retired.

This change reverts the early exit condition to check for requests.
Given that the purpose of the function is to retire requests, this
does seem to make much more sense.
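
As an illustration of the failure mode (the field names are real driver
fields, the control flow is simplified): a request that carries no objects
lands on ring->request_list but never places anything on ring->active_list,
so the old early exit skips it forever:

	/* Old early exit in i915_gem_retire_requests_ring(): */
	if (list_empty(&ring->active_list))
		return;		/* an object-less request sitting on
				 * ring->request_list is never retired */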

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7117659..4c5a6cd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2859,7 +2859,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 {
 	WARN_ON(i915_verify_lists(ring->dev));
 
-	if (list_empty(&ring->active_list))
+	if (list_empty(&ring->request_list))
 		return;
 
 	/* Retire requests first as we use it above for the early return.
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* Re: [PATCH 14/56] drm/i915: Make retire condition check for requests not objects
  2015-06-04 18:23 ` [PATCH 14/56] drm/i915: Make retire condition check for requests not objects John.C.Harrison
@ 2015-06-04 18:24   ` John Harrison
  2015-06-09 15:56   ` Tomas Elf
  1 sibling, 0 replies; 120+ messages in thread
From: John Harrison @ 2015-06-04 18:24 UTC (permalink / raw)
  To: Intel-GFX

Note that this is a new patch in the series. The issue was found when 
debugging a problem with the conversion to struct fence that is still in 
progress. Basically, it was possible to get continuous TDR timeouts on 
the ECS ring because the start-of-day initialisation request never got 
retired and the TDR got confused by this.

This patch should therefore be included in the anti-OLR series as a new 
patch #14 (immediately before 'update i915_gpu_idle() to ...'), as that 
is the point at which non-batch buffer requests start to appear in the 
system.


On 04/06/2015 19:23, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> A previous patch (read-read optimisation) changed the early exit
> condition in i915_gem_retire_requests_ring() from checking the request
> list to checking the active list. This assumes that all requests have
> objects associated with them which are placed on the active list. The
> removal of the OLR means that non-batch buffer work is no longer
> tagged onto the nearest batch buffer submission and thus there are
> requests going through the system which do not have objects associated
> with them. This can therefore lead to the situation where an
> outstanding request never gets retired.
>
> This change reverts the early exit condition to check for requests.
> Given that the purpose of the function is to retire requests, this
> does seem to make much more sense.
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 7117659..4c5a6cd 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2859,7 +2859,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
>   {
>   	WARN_ON(i915_verify_lists(ring->dev));
>   
> -	if (list_empty(&ring->active_list))
> +	if (list_empty(&ring->request_list))
>   		return;
>   
>   	/* Retire requests first as we use it above for the early return.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 03/55] drm/i915: i915_add_request must not fail
  2015-06-04 14:07     ` John Harrison
@ 2015-06-05 10:55       ` Tomas Elf
  0 siblings, 0 replies; 120+ messages in thread
From: Tomas Elf @ 2015-06-05 10:55 UTC (permalink / raw)
  To: John Harrison, Intel-GFX

On 04/06/2015 15:07, John Harrison wrote:
> On 02/06/2015 19:16, Tomas Elf wrote:
>> On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>
>>> The i915_add_request() function is called to keep track of work that
>>> has been
>>> written to the ring buffer. It adds epilogue commands to track
>>> progress (seqno
>>> updates and such), moves the request structure onto the right list
>>> and other
>>> such house keeping tasks. However, the work itself has already been
>>> written to
>>> the ring and will get executed whether or not the add request call
>>> succeeds. So
>>> no matter what goes wrong, there isn't a whole lot of point in
>>> failing the call.
>>>
>>> At the moment, this is fine(ish). If the add request does bail early
>>> on and not
>>> do the housekeeping, the request will still float around in the
>>> ring->outstanding_lazy_request field and be picked up next time. It
>>> means
>>> multiple pieces of work will be tagged as the same request and the driver
>>> can't
>>> actually wait for the first piece of work until something else has been
>>> submitted. But it all sort of hangs together.
>>>
>>> This patch series is all about removing the OLR and guaranteeing that
>>> each piece
>>> of work gets its own personal request. That means that there is no more
>>> 'hoovering up of forgotten requests'. If the request does not get
>>> tracked then
>>> it will be leaked. Thus the add request call _must_ not fail. The
>>> previous patch
>>> should have already ensured that it _will_ not fail by removing the
>>> potential
>>> for running out of ring space. This patch enforces the rule by
>>> actually removing
>>> the early exit paths and the return code.
>>>
>>> Note that if something does manage to fail and the epilogue commands
>>> don't get
>>> written to the ring, the driver will still hang together. The request
>>> will be
>>> added to the tracking lists. And as in the old case, any subsequent
>>> work will
>>> generate a new seqno which will suffice for marking the old one as
>>> complete.
>>>
>>> v2: Improved WARNings (Tomas Elf review request).
>>>
>>> For: VIZ-5115
>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_drv.h              |    6 ++--
>>>   drivers/gpu/drm/i915/i915_gem.c              |   43
>>> ++++++++++++--------------
>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
>>>   drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
>>>   drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
>>>   drivers/gpu/drm/i915/intel_overlay.c         |    8 ++---
>>>   drivers/gpu/drm/i915/intel_ringbuffer.c      |    8 ++---
>>>   7 files changed, 31 insertions(+), 40 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index eba1857..1be4a52 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -2860,9 +2860,9 @@ void i915_gem_init_swizzling(struct drm_device
>>> *dev);
>>>   void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>>>   int __must_check i915_gpu_idle(struct drm_device *dev);
>>>   int __must_check i915_gem_suspend(struct drm_device *dev);
>>> -int __i915_add_request(struct intel_engine_cs *ring,
>>> -               struct drm_file *file,
>>> -               struct drm_i915_gem_object *batch_obj);
>>> +void __i915_add_request(struct intel_engine_cs *ring,
>>> +            struct drm_file *file,
>>> +            struct drm_i915_gem_object *batch_obj);
>>>   #define i915_add_request(ring) \
>>>       __i915_add_request(ring, NULL, NULL)
>>>   int __i915_wait_request(struct drm_i915_gem_request *req,
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>>> b/drivers/gpu/drm/i915/i915_gem.c
>>> index 6f51416..dd39aa5 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -1155,15 +1155,12 @@ i915_gem_check_wedge(struct i915_gpu_error
>>> *error,
>>>   int
>>>   i915_gem_check_olr(struct drm_i915_gem_request *req)
>>>   {
>>> -    int ret;
>>> -
>>> WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
>>>
>>> -    ret = 0;
>>>       if (req == req->ring->outstanding_lazy_request)
>>> -        ret = i915_add_request(req->ring);
>>> +        i915_add_request(req->ring);
>>>
>>> -    return ret;
>>> +    return 0;
>>>   }
>>
>> i915_gem_check_olr never returns anything but 0. How about making it
>> void?
> Seems like redundant/unnecessary changes given that the entire function
> is removed later in the series. Changing it from a '__must_check' to a
> 'void' now would be a lot of extra changes.
>
>> Thanks,
>> Tomas
>>
>>>
>>>   static void fake_irq(unsigned long data)
>>> @@ -2466,9 +2463,14 @@ i915_gem_get_seqno(struct drm_device *dev, u32
>>> *seqno)
>>>       return 0;
>>>   }
>>>
>>> -int __i915_add_request(struct intel_engine_cs *ring,
>>> -               struct drm_file *file,
>>> -               struct drm_i915_gem_object *obj)
>>> +/*
>>> + * NB: This function is not allowed to fail. Doing so would mean that the
>>> + * request is not being tracked for completion but the work itself is
>>> + * going to happen on the hardware. This would be a Bad Thing(tm).
>>> + */
>>> +void __i915_add_request(struct intel_engine_cs *ring,
>>> +            struct drm_file *file,
>>> +            struct drm_i915_gem_object *obj)
>>>   {
>>>       struct drm_i915_private *dev_priv = ring->dev->dev_private;
>>>       struct drm_i915_gem_request *request;
>>> @@ -2478,7 +2480,7 @@ int __i915_add_request(struct intel_engine_cs
>>> *ring,
>>>
>>>       request = ring->outstanding_lazy_request;
>>>       if (WARN_ON(request == NULL))
>>> -        return -ENOMEM;
>>> +        return;
>>
>> You have a WARN for the other points of failure in this function, why
>> not here?
> Erm, you mean like the one in the 'if' itself?
>

Exactly! :)

Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas

>>
>>>
>>>       if (i915.enable_execlists) {
>>>           ringbuf = request->ctx->engine[ring->id].ringbuf;
>>> @@ -2500,15 +2502,12 @@ int __i915_add_request(struct intel_engine_cs
>>> *ring,
>>>        * is that the flush _must_ happen before the next request, no
>>> matter
>>>        * what.
>>>        */
>>> -    if (i915.enable_execlists) {
>>> +    if (i915.enable_execlists)
>>>           ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
>>> -        if (ret)
>>> -            return ret;
>>> -    } else {
>>> +    else
>>>           ret = intel_ring_flush_all_caches(ring);
>>> -        if (ret)
>>> -            return ret;
>>> -    }
>>> +    /* Not allowed to fail! */
>>> +    WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
>>>
>>>       /* Record the position of the start of the request so that
>>>        * should we detect the updated seqno part-way through the
>>> @@ -2517,17 +2516,15 @@ int __i915_add_request(struct intel_engine_cs
>>> *ring,
>>>        */
>>>       request->postfix = intel_ring_get_tail(ringbuf);
>>>
>>> -    if (i915.enable_execlists) {
>>> +    if (i915.enable_execlists)
>>>           ret = ring->emit_request(ringbuf, request);
>>> -        if (ret)
>>> -            return ret;
>>> -    } else {
>>> +    else {
>>>           ret = ring->add_request(ring);
>>> -        if (ret)
>>> -            return ret;
>>>
>>>           request->tail = intel_ring_get_tail(ringbuf);
>>>       }
>>> +    /* Not allowed to fail! */
>>> +    WARN(ret, "emit|add_request failed: %d!\n", ret);
>>>
>>>       request->head = request_start;
>>>
>>> @@ -2576,8 +2573,6 @@ int __i915_add_request(struct intel_engine_cs
>>> *ring,
>>>
>>>       /* Sanity check that the reserved size was large enough. */
>>>       intel_ring_reserved_space_end(ringbuf);
>>> -
>>> -    return 0;
>>>   }
>>>
>>>   static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> index bd0e4bd..2b48a31 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> @@ -1061,7 +1061,7 @@ i915_gem_execbuffer_retire_commands(struct
>>> drm_device *dev,
>>>       ring->gpu_caches_dirty = true;
>>>
>>>       /* Add a breadcrumb for the completion of the batch buffer */
>>> -    (void)__i915_add_request(ring, file, obj);
>>> +    __i915_add_request(ring, file, obj);
>>>   }
>>>
>>>   static int
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> b/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> index 521548a..ce4788f 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> @@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct
>>> intel_engine_cs *ring)
>>>
>>>       i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>>>
>>> -    ret = __i915_add_request(ring, NULL, so.obj);
>>> +    __i915_add_request(ring, NULL, so.obj);
>>>       /* __i915_add_request moves object to inactive if it fails */
>>>   out:
>>>       i915_gem_render_state_fini(&so);
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>> index e62d396..7a75fc8 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -1373,7 +1373,7 @@ static int
>>> intel_lr_context_render_state_init(struct intel_engine_cs *ring,
>>>
>>>       i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>>>
>>> -    ret = __i915_add_request(ring, file, so.obj);
>>> +    __i915_add_request(ring, file, so.obj);
>>>       /* intel_logical_ring_add_request moves object to inactive if it
>>>        * fails */
>>>   out:
>>> diff --git a/drivers/gpu/drm/i915/intel_overlay.c
>>> b/drivers/gpu/drm/i915/intel_overlay.c
>>> index 25c8ec6..e7534b9 100644
>>> --- a/drivers/gpu/drm/i915/intel_overlay.c
>>> +++ b/drivers/gpu/drm/i915/intel_overlay.c
>>> @@ -220,9 +220,7 @@ static int intel_overlay_do_wait_request(struct
>>> intel_overlay *overlay,
>>>       WARN_ON(overlay->last_flip_req);
>>>       i915_gem_request_assign(&overlay->last_flip_req,
>>>                            ring->outstanding_lazy_request);
>>> -    ret = i915_add_request(ring);
>>> -    if (ret)
>>> -        return ret;
>>> +    i915_add_request(ring);
>>>
>>>       overlay->flip_tail = tail;
>>>       ret = i915_wait_request(overlay->last_flip_req);
>>> @@ -291,7 +289,9 @@ static int intel_overlay_continue(struct
>>> intel_overlay *overlay,
>>>       WARN_ON(overlay->last_flip_req);
>>>       i915_gem_request_assign(&overlay->last_flip_req,
>>>                            ring->outstanding_lazy_request);
>>> -    return i915_add_request(ring);
>>> +    i915_add_request(ring);
>>> +
>>> +    return 0;
>>>   }
>>>
>>>   static void intel_overlay_release_old_vid_tail(struct intel_overlay
>>> *overlay)
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> index 74c2222..7061b07 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> @@ -2156,14 +2156,10 @@ static int intel_wrap_ring_buffer(struct
>>> intel_engine_cs *ring)
>>>   int intel_ring_idle(struct intel_engine_cs *ring)
>>>   {
>>>       struct drm_i915_gem_request *req;
>>> -    int ret;
>>>
>>>       /* We need to add any requests required to flush the objects
>>> and ring */
>>> -    if (ring->outstanding_lazy_request) {
>>> -        ret = i915_add_request(ring);
>>> -        if (ret)
>>> -            return ret;
>>> -    }
>>> +    if (ring->outstanding_lazy_request)
>>> +        i915_add_request(ring);
>>>
>>>       /* Wait upon the last request to be completed */
>>>       if (list_empty(&ring->request_list))
>>>
>>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 14/56] drm/i915: Make retire condition check for requests not objects
  2015-06-04 18:23 ` [PATCH 14/56] drm/i915: Make retire condition check for requests not objects John.C.Harrison
  2015-06-04 18:24   ` John Harrison
@ 2015-06-09 15:56   ` Tomas Elf
  2015-06-17 15:01     ` Daniel Vetter
  1 sibling, 1 reply; 120+ messages in thread
From: Tomas Elf @ 2015-06-09 15:56 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 04/06/2015 19:23, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> A previous patch (read-read optimisation) changed the early exit
> condition in i915_gem_retire_requests_ring() from checking the request
> list to checking the active list. This assumes that all requests have
> objects associated with them which are placed on the active list. The
> removal of the OLR means that non-batch buffer work is no longer
> tagged onto the nearest batch buffer submission and thus there are
> requests going through the system which do not have objects associated
> with them. This can therefore lead to the situation where an
> outstanding request never gets retired.
>
> This change reverts the early exit condition to check for requests.
> Given that the purpose of the function is to retire requests, this
> does seem to make much more sense.
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 7117659..4c5a6cd 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2859,7 +2859,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
>   {
>   	WARN_ON(i915_verify_lists(ring->dev));
>
> -	if (list_empty(&ring->active_list))
> +	if (list_empty(&ring->request_list))
>   		return;
>
>   	/* Retire requests first as we use it above for the early return.
>

Note to whoever is integrating this patch: This patch can either be 
applied or we could drop the request_list check entirely. This is
according to Chris Wilson in the following conversation:

3:26:09 PM - ickle: we just kill the check
3:26:25 PM - ickle: the final function is just request_list + trace_request
3:26:37 PM - ickle: adding a test to save one isn't a great tradeoff
3:28:04 PM - tomas_elf: fine
3:28:20 PM - tomas_elf: anyway, good to know
3:29:32 PM - ickle: there's actually one more, it's also where the 
execlists_retire should be
3:30:48 PM - tomas_elf: maybe you can just submit a patch (unless you've 
already done so) that removes all of the references
3:31:23 PM - ickle: there's like a backlog of 50 patches before we even 
get to that point
3:31:29 PM - tomas_elf: ok, cool
3:31:51 PM - tomas_elf: at any rate, JohnHarrison's patch can be 
accepted either with the request_list check or no check at all
3:33:14 PM - ickle: I thought vsyrjala sent the patch to completely kill 
it along with a bug citation
3:35:25 PM - ickle: 0aedb1626566efd72b369c01992ee7413c82a0c5
3:39:05 PM - tomas_elf: has it been merged?
3:39:43 PM - ickle: it is in drm-intel-fixes
3:40:02 PM - tomas_elf: ah, ok

As long as the active_list check is removed since it breaks things.
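For illustration, the two acceptable shapes of the early exit would be
something like this (a sketch against the function quoted above, not a
tested patch):

	/* Either: keep an early exit, but test the list the function
	 * actually retires from. */
	if (list_empty(&ring->request_list))
		return;

	/* Or: drop the early exit entirely - the retire loop below
	 * simply iterates zero times when request_list is empty, so
	 * the check only saves the loop setup cost. */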

Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-04 12:06   ` John.C.Harrison
@ 2015-06-09 16:00     ` Tomas Elf
  2015-06-18 12:10       ` John.C.Harrison
  2015-06-17 14:04     ` Daniel Vetter
  1 sibling, 1 reply; 120+ messages in thread
From: Tomas Elf @ 2015-06-09 16:00 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX

On 04/06/2015 13:06, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> It is a bad idea for i915_add_request() to fail. The work will already have been
> sent to the ring and will be processed, but there will not be any tracking or
> management of that work.
>
> The only way the add request call can fail is if it can't write its epilogue
> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
> reasons for that are mostly down to running out of ring buffer space and the
> problems associated with trying to get some more. This patch prevents that
> situation from happening in the first place.
>
> When a request is created, it marks sufficient space as reserved for the
> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
> there will be plenty of space for it. Note that a ring_begin() call is required
> to actually reserve the space (and do any potential waiting). However, that is
> not currently done at request creation time. This is because the ring_begin()
> code can allocate a request. Hence calling begin() from the request allocation
> code would lead to infinite recursion! Later patches in this series remove the
> need for begin() to do the allocate. At that point, it becomes safe for the
> allocate to call begin() and really reserve the space.
>
> Until then, there is a potential for insufficient space to be available at the
> point of calling i915_add_request(). However, that would only be in the case
> where the request was created and immediately submitted without ever calling
> ring_begin() and adding any work to that request. Which should never happen. And
> even if it does, and if that request happens to fall down the tiny window of
> opportunity for failing due to being out of ring space then does it really
> matter because the request wasn't doing anything in the first place?
>
> v2: Updated the 'reserved space too small' warning to include the offending
> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
> re-initialisation of tracking state after a buffer wrap to keep the sanity
> checks accurate.
>
> v3: Incremented the reserved size to accommodate Ironlake (after finally
> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         |    1 +
>   drivers/gpu/drm/i915/i915_gem.c         |   37 +++++++++++++++++
>   drivers/gpu/drm/i915/intel_lrc.c        |   18 ++++++++
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   68 ++++++++++++++++++++++++++++++-
>   drivers/gpu/drm/i915/intel_ringbuffer.h |   25 ++++++++++++
>   5 files changed, 147 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e9d76f3..44dee31 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
>
>   int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   			   struct intel_context *ctx);
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>   void i915_gem_request_free(struct kref *req_ref);
>
>   static inline uint32_t
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 78f6a89..516e9b7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2495,6 +2495,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
>   	} else
>   		ringbuf = ring->buffer;
>
> +	/*
> +	 * To ensure that this call will not fail, space for its emissions
> +	 * should already have been reserved in the ring buffer. Let the ring
> +	 * know that it is time to use that space up.
> +	 */
> +	intel_ring_reserved_space_use(ringbuf);
> +
>   	request_start = intel_ring_get_tail(ringbuf);
>   	/*
>   	 * Emit any outstanding flushes - execbuf can fail to emit the flush
> @@ -2577,6 +2584,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
>   			   round_jiffies_up_relative(HZ));
>   	intel_mark_busy(dev_priv->dev);
>
> +	/* Sanity check that the reserved size was large enough. */
> +	intel_ring_reserved_space_end(ringbuf);
> +
>   	return 0;
>   }
>
> @@ -2676,6 +2686,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   	if (ret)
>   		goto err;
>
> +	/*
> +	 * Reserve space in the ring buffer for all the commands required to
> +	 * eventually emit this request. This is to guarantee that the
> +	 * i915_add_request() call can't fail. Note that the reserve may need
> +	 * to be redone if the request is not actually submitted straight
> +	 * away, e.g. because a GPU scheduler has deferred it.
> +	 *
> +	 * Note further that this call merely notes the reserve request. A
> +	 * subsequent call to *_ring_begin() is required to actually ensure
> +	 * that the reservation is available. Without the begin, if the
> +	 * request creator immediately submitted the request without adding
> +	 * any commands to it then there might not actually be sufficient
> +	 * room for the submission commands. Unfortunately, the current
> +	 * *_ring_begin() implementations potentially call back here to
> +	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
> +	 * infinite recursion! Until that back call path is removed, it is
> +	 * necessary to do a manual _begin() outside.
> +	 */
> +	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> +
>   	ring->outstanding_lazy_request = req;
>   	return 0;
>
> @@ -2684,6 +2714,13 @@ err:
>   	return ret;
>   }
>
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req)
> +{
> +	intel_ring_reserved_space_cancel(req->ringbuf);
> +
> +	i915_gem_request_unreference(req);
> +}
> +
>   struct drm_i915_gem_request *
>   i915_gem_find_active_request(struct intel_engine_cs *ring)
>   {
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6a5ed07..42a756d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -687,6 +687,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
>   	unsigned space;
>   	int ret;
>
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>   	if (intel_ring_space(ringbuf) >= bytes)
>   		return 0;
>
> @@ -747,6 +750,9 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>   	uint32_t __iomem *virt;
>   	int rem = ringbuf->size - ringbuf->tail;
>
> +	/* Can't wrap if space has already been reserved! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>   	if (ringbuf->space < rem) {
>   		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
>
> @@ -770,10 +776,22 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>   {
>   	int ret;
>
> +	if (!ringbuf->reserved_in_use)
> +		bytes += ringbuf->reserved_size;

This line right here is the main integration point between the buffer
reservation scheme and the existing infrastructure. Please point this
out in a comment, either here or down by the space reservation
prototypes (describing _where_ this reserved ring space actually comes
to matter in the end), or both. It's important to know where the
reserved space ends up and where it integrates.
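
Something along these lines, say (a hypothetical comment, purely to
illustrate what is being asked for):

	/*
	 * Fold in the space reserved at request creation time so that
	 * ordinary command emission can never eat into it. Only after
	 * i915_add_request() has called intel_ring_reserved_space_use()
	 * (setting reserved_in_use) may the epilogue commands consume
	 * the reserved words.
	 */
	if (!ringbuf->reserved_in_use)
		bytes += ringbuf->reserved_size;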

> +
>   	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
> +		WARN_ON(ringbuf->reserved_in_use);

This WARN_ON is already done in logical_ring_wrap_buffer. Unless there 
is a reason for a second warning please remove this one.

> +
>   		ret = logical_ring_wrap_buffer(ringbuf, ctx);
>   		if (unlikely(ret))
>   			return ret;
> +
> +		if(ringbuf->reserved_size) {
> +			uint32_t size = ringbuf->reserved_size;
> +
> +			intel_ring_reserved_space_cancel(ringbuf);
> +			intel_ring_reserved_space_reserve(ringbuf, size);
> +		}
>   	}
>
>   	if (unlikely(ringbuf->space < bytes)) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index d934f85..74c2222 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2103,6 +2103,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>   	unsigned space;
>   	int ret;
>
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>   	if (intel_ring_space(ringbuf) >= n)
>   		return 0;
>
> @@ -2130,6 +2133,9 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>   	struct intel_ringbuffer *ringbuf = ring->buffer;
>   	int rem = ringbuf->size - ringbuf->tail;
>
> +	/* Can't wrap if space has already been reserved! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>   	if (ringbuf->space < rem) {
>   		int ret = ring_wait_for_space(ring, rem);
>   		if (ret)
> @@ -2180,16 +2186,74 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>   	return 0;
>   }
>
> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
> -				int bytes)
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
> +{
> +	/* NB: Until request management is fully tidied up and the OLR is
> +	 * removed, there are too many ways to get false hits on this
> +	 * anti-recursion check! */
> +	/*WARN_ON(ringbuf->reserved_size);*/
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size = size;
> +
> +	/*
> +	 * Really need to call _begin() here but that currently leads to
> +	 * recursion problems! This will be fixed later but for now just
> +	 * return and hope for the best. Note that there is only a real
> +	 * problem if the create of the request never actually calls _begin()
> +	 * but if they are not submitting any work then why did they create
> +	 * the request in the first place?
> +	 */
> +}
> +
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size   = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_in_use = true;
> +	ringbuf->reserved_tail   = ringbuf->tail;
> +}
> +
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(!ringbuf->reserved_in_use);
> +	WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
> +	     "request reserved size too small: %d vs %d!\n",
> +	     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
> +
> +	ringbuf->reserved_size   = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
>   {
>   	struct intel_ringbuffer *ringbuf = ring->buffer;
>   	int ret;
>
> +	if (!ringbuf->reserved_in_use)
> +		bytes += ringbuf->reserved_size;
> +
>   	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
> +		WARN_ON(ringbuf->reserved_in_use);
> +

This WARN_ON is already done in intel_wrap_ring_buffer. Unless there is 
a reason for a second warning please remove this one.

Thanks,
Tomas

>   		ret = intel_wrap_ring_buffer(ring);
>   		if (unlikely(ret))
>   			return ret;
> +
> +		if(ringbuf->reserved_size) {
> +			uint32_t size = ringbuf->reserved_size;
> +
> +			intel_ring_reserved_space_cancel(ringbuf);
> +			intel_ring_reserved_space_reserve(ringbuf, size);
> +		}
>   	}
>
>   	if (unlikely(ringbuf->space < bytes)) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 39f6dfc..bf2ac28 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -105,6 +105,9 @@ struct intel_ringbuffer {
>   	int space;
>   	int size;
>   	int effective_size;
> +	int reserved_size;
> +	int reserved_tail;
> +	bool reserved_in_use;
>
>   	/** We track the position of the requests in the ring buffer, and
>   	 * when each is retired we increment last_retired_head as the GPU
> @@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
>   	return ring->outstanding_lazy_request;
>   }
>
> +/*
> + * Arbitrary size for largest possible 'add request' sequence. The code paths
> + * are complex and variable. Empirical measurement shows that the worst case
> + * is ILK at 136 words. Reserving too much is better than reserving too little
> + * as that allows for corner cases that might have been missed. So the figure
> + * has been rounded up to 160 words.
> + */
> +#define MIN_SPACE_FOR_ADD_REQUEST	160
> +
> +/*
> + * Reserve space in the ring to guarantee that the i915_add_request() call
> + * will always have sufficient room to do its stuff. The request creation
> + * code calls this automatically.
> + */
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
> +/* Cancel the reservation, e.g. because the request is being discarded. */
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
> +/* Use the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
> +/* Finish with the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
> +
>   #endif /* _INTEL_RINGBUFFER_H_ */
>
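For orientation, the intended call sequence of the four hooks above, as
used across the patch (a condensed sketch of the flow, not new code):

	/* at request creation (i915_gem_request_alloc) */
	intel_ring_reserved_space_reserve(ringbuf, MIN_SPACE_FOR_ADD_REQUEST);

	/* ... the caller emits its own commands via *_ring_begin() ... */

	/* in __i915_add_request(): the epilogue may now use the reserve */
	intel_ring_reserved_space_use(ringbuf);
	/* ... emit flushes, seqno write, user interrupt ... */
	intel_ring_reserved_space_end(ringbuf);	/* sanity-check the size used */

	/* or, if the request is abandoned before submission: */
	intel_ring_reserved_space_cancel(ringbuf);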

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation
  2015-05-29 16:44 ` [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation John.C.Harrison
@ 2015-06-17 13:31   ` Daniel Vetter
  2015-06-17 14:27     ` Chris Wilson
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-17 13:31 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX

On Fri, May 29, 2015 at 05:44:09PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> Now that the *_ring_begin() functions no longer call the request allocation
> code, it is finally safe for the request allocation code to call *_ring_begin().
> This is important to guarantee that the space reserved for the subsequent
> i915_add_request() call does actually get reserved.
> 
> v2: Renamed functions according to review feedback (Tomas Elf).
> 
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>

Still has my question open from the previous round:

http://mid.gmane.org/20150323091030.GL1349@phenom.ffwll.local

Note that this isn't all that unlikely with GuC mode since there the
ringbuffer is substantially smaller (due to firmware limitations) than
what we allocate ourselves right now.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem.c         |   25 +++++++++++++------------
>  drivers/gpu/drm/i915/intel_lrc.c        |   15 +++++++++++++++
>  drivers/gpu/drm/i915/intel_lrc.h        |    1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c |   29 ++++++++++++++++-------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h |    1 +
>  5 files changed, 46 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9f3e0717..1261792 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2680,19 +2680,20 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  	 * i915_add_request() call can't fail. Note that the reserve may need
>  	 * to be redone if the request is not actually submitted straight
>  	 * away, e.g. because a GPU scheduler has deferred it.
> -	 *
> -	 * Note further that this call merely notes the reserve request. A
> -	 * subsequent call to *_ring_begin() is required to actually ensure
> -	 * that the reservation is available. Without the begin, if the
> -	 * request creator immediately submitted the request without adding
> -	 * any commands to it then there might not actually be sufficient
> -	 * room for the submission commands. Unfortunately, the current
> -	 * *_ring_begin() implementations potentially call back here to
> -	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
> -	 * infinite recursion! Until that back call path is removed, it is
> -	 * necessary to do a manual _begin() outside.
>  	 */
> -	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> +	if (i915.enable_execlists)
> +		ret = intel_logical_ring_reserve_space(req);
> +	else
> +		ret = intel_ring_reserve_space(req);
> +	if (ret) {
> +		/*
> +		 * At this point, the request is fully allocated even if not
> +		 * fully prepared. Thus it can be cleaned up using the proper
> +		 * free code.
> +		 */
> +		i915_gem_request_cancel(req);
> +		return ret;
> +	}
>  
>  	*req_out = ring->outstanding_lazy_request = req;
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 548c53d..e164ac0 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -823,6 +823,21 @@ static int intel_logical_ring_begin(struct drm_i915_gem_request *req,
>  	return 0;
>  }
>  
> +int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
> +{
> +	/*
> +	 * The first call merely notes the reserve request and is common for
> +	 * all back ends. The subsequent localised _begin() call actually
> +	 * ensures that the reservation is available. Without the begin, if
> +	 * the request creator immediately submitted the request without
> +	 * adding any commands to it then there might not actually be
> +	 * sufficient room for the submission commands.
> +	 */
> +	intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> +
> +	return intel_logical_ring_begin(request, 0);
> +}
> +
>  /**
>   * execlists_submission() - submit a batchbuffer for execution, Execlists style
>   * @dev: DRM device.
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index 044c0e5..f59940a 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -37,6 +37,7 @@
>  
>  /* Logical Rings */
>  int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request);
> +int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request);
>  void intel_logical_ring_stop(struct intel_engine_cs *ring);
>  void intel_logical_ring_cleanup(struct intel_engine_cs *ring);
>  int intel_logical_rings_init(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index bb10fc2..0ba5787 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2192,24 +2192,27 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>  	return 0;
>  }
>  
> +int intel_ring_reserve_space(struct drm_i915_gem_request *request)
> +{
> +	/*
> +	 * The first call merely notes the reserve request and is common for
> +	 * all back ends. The subsequent localised _begin() call actually
> +	 * ensures that the reservation is available. Without the begin, if
> +	 * the request creator immediately submitted the request without
> +	 * adding any commands to it then there might not actually be
> +	 * sufficient room for the submission commands.
> +	 */
> +	intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> +
> +	return intel_ring_begin(request, 0);
> +}
> +
>  void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
>  {
> -	/* NB: Until request management is fully tidied up and the OLR is
> -	 * removed, there are too many ways to get false hits on this
> -	 * anti-recursion check! */
> -	/*WARN_ON(ringbuf->reserved_size);*/
> +	WARN_ON(ringbuf->reserved_size);
>  	WARN_ON(ringbuf->reserved_in_use);
>  
>  	ringbuf->reserved_size = size;
> -
> -	/*
> -	 * Really need to call _begin() here but that currently leads to
> -	 * recursion problems! This will be fixed later but for now just
> -	 * return and hope for the best. Note that there is only a real
> -	 * problem if the create of the request never actually calls _begin()
> -	 * but if they are not submitting any work then why did they create
> -	 * the request in the first place?
> -	 */
>  }
>  
>  void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 16fd9ba..f4633ca 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -450,6 +450,7 @@ intel_ring_get_request(struct intel_engine_cs *ring)
>  
>  #define MIN_SPACE_FOR_ADD_REQUEST	128
>  
> +int intel_ring_reserve_space(struct drm_i915_gem_request *request);
>  void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
>  void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
>  void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
> -- 
> 1.7.9.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-04 12:06   ` John.C.Harrison
  2015-06-09 16:00     ` Tomas Elf
@ 2015-06-17 14:04     ` Daniel Vetter
  2015-06-18 10:43       ` John Harrison
  1 sibling, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-17 14:04 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX

On Thu, Jun 04, 2015 at 01:06:34PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> It is a bad idea for i915_add_request() to fail. The work will already have been
> sent to the ring and will be processed, but there will not be any tracking or
> management of that work.
> 
> The only way the add request call can fail is if it can't write its epilogue
> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
> reasons for that are mostly down to running out of ring buffer space and the
> problems associated with trying to get some more. This patch prevents that
> situation from happening in the first place.
> 
> When a request is created, it marks sufficient space as reserved for the
> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
> there will be plenty of space for it. Note that a ring_begin() call is required
> to actually reserve the space (and do any potential waiting). However, that is
> not currently done at request creation time. This is because the ring_begin()
> code can allocate a request. Hence calling begin() from the request allocation
> code would lead to infinite recursion! Later patches in this series remove the
> need for begin() to do the allocate. At that point, it becomes safe for the
> allocate to call begin() and really reserve the space.
> 
> Until then, there is a potential for insufficient space to be available at the
> point of calling i915_add_request(). However, that would only be in the case
> where the request was created and immediately submitted without ever calling
> ring_begin() and adding any work to that request. Which should never happen. And
> even if it does, and if that request happens to fall down the tiny window of
> opportunity for failing due to being out of ring space then does it really
> matter because the request wasn't doing anything in the first place?
> 
> v2: Updated the 'reserved space too small' warning to include the offending
> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
> re-initialisation of tracking state after a buffer wrap to keep the sanity
> checks accurate.
> 
> v3: Incremented the reserved size to accommodate Ironlake (after finally
> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
> 
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>

My question from the last review round about the correctness of the
reservation overflow vs. wrapping is still outstanding:

http://article.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/56575
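
(To make the wrap interaction concrete, with purely illustrative
numbers: effective_size = 4096 words, tail = 3900, reserved_size = 160,
and a caller beginning a 100-word emission. With the reservation folded
in, prepare checks 3900 + 100 + 160 > 4096 and wraps early, then
cancels and re-reserves so the sanity-check state is re-initialised
after the wrap. Without it, the 100 words would fit, leaving only 96
words before the end of the buffer, and the add_request() epilogue
would be forced to wrap mid-request - exactly the case the reservation
is meant to exclude.)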

Also, when resending patches, especially after such a long delay, please
leave some indication of what you've decided to do wrt review comments.
Either as a reply in the review discussion (preferred) or at least as an
update in the cover letter or per-patch changelog. Otherwise reviewers
need to reverse-engineer what you have or haven't done by diffing patches,
which is just not that efficient.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h         |    1 +
>  drivers/gpu/drm/i915/i915_gem.c         |   37 +++++++++++++++++
>  drivers/gpu/drm/i915/intel_lrc.c        |   18 ++++++++
>  drivers/gpu/drm/i915/intel_ringbuffer.c |   68 ++++++++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/intel_ringbuffer.h |   25 ++++++++++++
>  5 files changed, 147 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e9d76f3..44dee31 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
>  
>  int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  			   struct intel_context *ctx);
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>  void i915_gem_request_free(struct kref *req_ref);
>  
>  static inline uint32_t
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 78f6a89..516e9b7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2495,6 +2495,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
>  	} else
>  		ringbuf = ring->buffer;
>  
> +	/*
> +	 * To ensure that this call will not fail, space for its emissions
> +	 * should already have been reserved in the ring buffer. Let the ring
> +	 * know that it is time to use that space up.
> +	 */
> +	intel_ring_reserved_space_use(ringbuf);
> +
>  	request_start = intel_ring_get_tail(ringbuf);
>  	/*
>  	 * Emit any outstanding flushes - execbuf can fail to emit the flush
> @@ -2577,6 +2584,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
>  			   round_jiffies_up_relative(HZ));
>  	intel_mark_busy(dev_priv->dev);
>  
> +	/* Sanity check that the reserved size was large enough. */
> +	intel_ring_reserved_space_end(ringbuf);
> +
>  	return 0;
>  }
>  
> @@ -2676,6 +2686,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  	if (ret)
>  		goto err;
>  
> +	/*
> +	 * Reserve space in the ring buffer for all the commands required to
> +	 * eventually emit this request. This is to guarantee that the
> +	 * i915_add_request() call can't fail. Note that the reserve may need
> +	 * to be redone if the request is not actually submitted straight
> +	 * away, e.g. because a GPU scheduler has deferred it.
> +	 *
> +	 * Note further that this call merely notes the reserve request. A
> +	 * subsequent call to *_ring_begin() is required to actually ensure
> +	 * that the reservation is available. Without the begin, if the
> +	 * request creator immediately submitted the request without adding
> +	 * any commands to it then there might not actually be sufficient
> +	 * room for the submission commands. Unfortunately, the current
> +	 * *_ring_begin() implementations potentially call back here to
> +	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
> +	 * infinite recursion! Until that back call path is removed, it is
> +	 * necessary to do a manual _begin() outside.
> +	 */
> +	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> +
>  	ring->outstanding_lazy_request = req;
>  	return 0;
>  
> @@ -2684,6 +2714,13 @@ err:
>  	return ret;
>  }
>  
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req)
> +{
> +	intel_ring_reserved_space_cancel(req->ringbuf);
> +
> +	i915_gem_request_unreference(req);
> +}
> +
>  struct drm_i915_gem_request *
>  i915_gem_find_active_request(struct intel_engine_cs *ring)
>  {
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6a5ed07..42a756d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -687,6 +687,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
>  	unsigned space;
>  	int ret;
>  
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>  	if (intel_ring_space(ringbuf) >= bytes)
>  		return 0;
>  
> @@ -747,6 +750,9 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>  	uint32_t __iomem *virt;
>  	int rem = ringbuf->size - ringbuf->tail;
>  
> +	/* Can't wrap if space has already been reserved! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>  	if (ringbuf->space < rem) {
>  		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
>  
> @@ -770,10 +776,22 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>  {
>  	int ret;
>  
> +	if (!ringbuf->reserved_in_use)
> +		bytes += ringbuf->reserved_size;
> +
>  	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
> +		WARN_ON(ringbuf->reserved_in_use);
> +
>  		ret = logical_ring_wrap_buffer(ringbuf, ctx);
>  		if (unlikely(ret))
>  			return ret;
> +
> +		if(ringbuf->reserved_size) {
> +			uint32_t size = ringbuf->reserved_size;
> +
> +			intel_ring_reserved_space_cancel(ringbuf);
> +			intel_ring_reserved_space_reserve(ringbuf, size);
> +		}
>  	}
>  
>  	if (unlikely(ringbuf->space < bytes)) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index d934f85..74c2222 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2103,6 +2103,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>  	unsigned space;
>  	int ret;
>  
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>  	if (intel_ring_space(ringbuf) >= n)
>  		return 0;
>  
> @@ -2130,6 +2133,9 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>  	struct intel_ringbuffer *ringbuf = ring->buffer;
>  	int rem = ringbuf->size - ringbuf->tail;
>  
> +	/* Can't wrap if space has already been reserved! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>  	if (ringbuf->space < rem) {
>  		int ret = ring_wait_for_space(ring, rem);
>  		if (ret)
> @@ -2180,16 +2186,74 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>  	return 0;
>  }
>  
> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
> -				int bytes)
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
> +{
> +	/* NB: Until request management is fully tidied up and the OLR is
> +	 * removed, there are too many ways to get false hits on this
> +	 * anti-recursion check! */
> +	/*WARN_ON(ringbuf->reserved_size);*/
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size = size;
> +
> +	/*
> +	 * Really need to call _begin() here but that currently leads to
> +	 * recursion problems! This will be fixed later but for now just
> +	 * return and hope for the best. Note that there is only a real
> +	 * problem if the create of the request never actually calls _begin()
> +	 * but if they are not submitting any work then why did they create
> +	 * the request in the first place?
> +	 */
> +}
> +
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size   = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_in_use = true;
> +	ringbuf->reserved_tail   = ringbuf->tail;
> +}
> +
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(!ringbuf->reserved_in_use);
> +	WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
> +	     "request reserved size too small: %d vs %d!\n",
> +	     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
> +
> +	ringbuf->reserved_size   = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
>  {
>  	struct intel_ringbuffer *ringbuf = ring->buffer;
>  	int ret;
>  
> +	if (!ringbuf->reserved_in_use)
> +		bytes += ringbuf->reserved_size;
> +
>  	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
> +		WARN_ON(ringbuf->reserved_in_use);
> +
>  		ret = intel_wrap_ring_buffer(ring);
>  		if (unlikely(ret))
>  			return ret;
> +
> +		if(ringbuf->reserved_size) {
> +			uint32_t size = ringbuf->reserved_size;
> +
> +			intel_ring_reserved_space_cancel(ringbuf);
> +			intel_ring_reserved_space_reserve(ringbuf, size);
> +		}
>  	}
>  
>  	if (unlikely(ringbuf->space < bytes)) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 39f6dfc..bf2ac28 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -105,6 +105,9 @@ struct intel_ringbuffer {
>  	int space;
>  	int size;
>  	int effective_size;
> +	int reserved_size;
> +	int reserved_tail;
> +	bool reserved_in_use;
>  
>  	/** We track the position of the requests in the ring buffer, and
>  	 * when each is retired we increment last_retired_head as the GPU
> @@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
>  	return ring->outstanding_lazy_request;
>  }
>  
> +/*
> + * Arbitrary size for largest possible 'add request' sequence. The code paths
> + * are complex and variable. Empirical measurement shows that the worst case
> + * is ILK at 136 words. Reserving too much is better than reserving too little
> + * as that allows for corner cases that might have been missed. So the figure
> + * has been rounded up to 160 words.
> + */
> +#define MIN_SPACE_FOR_ADD_REQUEST	160
> +
> +/*
> + * Reserve space in the ring to guarantee that the i915_add_request() call
> + * will always have sufficient room to do its stuff. The request creation
> + * code calls this automatically.
> + */
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
> +/* Cancel the reservation, e.g. because the request is being discarded. */
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
> +/* Use the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
> +/* Finish with the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
> +
>  #endif /* _INTEL_RINGBUFFER_H_ */
> -- 
> 1.7.9.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()
  2015-05-29 16:44 ` [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active() John.C.Harrison
  2015-06-02 18:27   ` Tomas Elf
@ 2015-06-17 14:06   ` Daniel Vetter
  2015-06-17 14:21     ` Chris Wilson
  2015-06-18 10:57     ` John Harrison
  1 sibling, 2 replies; 120+ messages in thread
From: Daniel Vetter @ 2015-06-17 14:06 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX

On Fri, May 29, 2015 at 05:44:16PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The i915_gem_object_flush_active() call used to do lots. Over time it has done
> less and less. Now all it does check the various associated requests to see if
> they can be retired. Hence this patch renames the function and updates the
> comments around it to match the current operation.
> 
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>

When rebasing patches, and especially when also renaming them a bit like
here, please leave some indication of what you've changed. It took me a
while to figure out where one of my pending comments from the previous
round went to.

And please don't just say "v2: rebase"; add some indication of what it
conflicted with, if that's obvious.

Thanks, Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem.c |   18 ++++++------------
>  1 file changed, 6 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f825942..081cbbf 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2956,12 +2956,10 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  }
>  
>  /**
> - * Ensures that an object will eventually get non-busy by flushing any required
> - * write domains, emitting any outstanding lazy request and retiring and
> - * completed requests.
> + * Check an object to see if any of its associated requests can be retired.
>   */
>  static int
> -i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
> +i915_gem_object_retire(struct drm_i915_gem_object *obj)
>  {
>  	int i;
>  
> @@ -3034,8 +3032,8 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  		return -ENOENT;
>  	}
>  
> -	/* Need to make sure the object gets inactive eventually. */
> -	ret = i915_gem_object_flush_active(obj);
> +	/* Check if the object is pending clean up. */
> +	ret = i915_gem_object_retire(obj);
>  	if (ret)
>  		goto out;
>  
> @@ -4526,12 +4524,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>  		goto unlock;
>  	}
>  
> -	/* Count all active objects as busy, even if they are currently not used
> -	 * by the gpu. Users of this interface expect objects to eventually
> -	 * become non-busy without any further actions, therefore emit any
> -	 * necessary flushes here.
> -	 */
> -	ret = i915_gem_object_flush_active(obj);
> +	/* Check if the object is pending clean up. */
> +	ret = i915_gem_object_retire(obj);
>  	if (ret)
>  		goto unref;
>  
> -- 
> 1.7.9.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()
  2015-06-17 14:06   ` Daniel Vetter
@ 2015-06-17 14:21     ` Chris Wilson
  2015-06-18 11:03       ` John Harrison
  2015-06-18 10:57     ` John Harrison
  1 sibling, 1 reply; 120+ messages in thread
From: Chris Wilson @ 2015-06-17 14:21 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On Wed, Jun 17, 2015 at 04:06:05PM +0200, Daniel Vetter wrote:
> On Fri, May 29, 2015 at 05:44:16PM +0100, John.C.Harrison@Intel.com wrote:
> > From: John Harrison <John.C.Harrison@Intel.com>
> > 
> > The i915_gem_object_flush_active() call used to do lots. Over time it has done
> > less and less. Now all it does check the various associated requests to see if
> > they can be retired. Hence this patch renames the function and updates the
> > comments around it to match the current operation.
> > 
> > For: VIZ-5115
> > Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> 
> When rebasing patches, and especially when also renaming them a bit like
> here, please leave some indication of what you've changed. It took me a
> while to figure out where one of my pending comments from the previous
> round went to.
> 
> And please don't just say "v2: rebase"; add some indication of what it
> conflicted with, if that's obvious.

This function doesn't do an unconditional retire - the new name is much
worse since it is inconsistent with how requests retire. In my 'make GEM
umpteen times faster' patches, I repurposed this function for reporting
the object's current activeness and called it bool i915_gem_object_active()
 - though that is probably better as i915_gem_object_is_active().
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation
  2015-06-17 13:31   ` Daniel Vetter
@ 2015-06-17 14:27     ` Chris Wilson
  2015-06-17 14:54       ` Daniel Vetter
  0 siblings, 1 reply; 120+ messages in thread
From: Chris Wilson @ 2015-06-17 14:27 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On Wed, Jun 17, 2015 at 03:31:59PM +0200, Daniel Vetter wrote:
> On Fri, May 29, 2015 at 05:44:09PM +0100, John.C.Harrison@Intel.com wrote:
> > From: John Harrison <John.C.Harrison@Intel.com>
> > 
> > Now that the *_ring_begin() functions no longer call the request allocation
> > code, it is finally safe for the request allocation code to call *_ring_begin().
> > This is important to guarantee that the space reserved for the subsequent
> > i915_add_request() call does actually get reserved.
> > 
> > v2: Renamed functions according to review feedback (Tomas Elf).
> > 
> > For: VIZ-5115
> > Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> 
> Still has my question open from the previos round:
> 
> http://mid.gmane.org/20150323091030.GL1349@phenom.ffwll.local
> 
> Note that this isn't all that unlikely with GuC mode since there the
> ringbuffer is substantially smaller (due to firmware limitations) than
> what we allocate ourselves right now.

Looking at this patch, I am still fundamentally opposed to reserving
space for the request. Detecting a request that wraps and cancelling
that request (after the appropriate WARN for the overflow) is trivial and
such a rare case (as it is a programming error) that it should only be
handled in the slow path.
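
One possible shape of that alternative, using names from the patch
under review (a hypothetical sketch, not code from the series):

	/* Slow path: not enough room left for the epilogue commands.
	 * This is a programming error - the request was built far too
	 * big - so warn and discard the request rather than emit a
	 * half-tracked one. */
	if (WARN_ON(intel_ring_space(ringbuf) < MIN_SPACE_FOR_ADD_REQUEST)) {
		i915_gem_request_cancel(req);
		return -ENOSPC;
	}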
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation
  2015-06-17 14:27     ` Chris Wilson
@ 2015-06-17 14:54       ` Daniel Vetter
  2015-06-17 15:52         ` Chris Wilson
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-17 14:54 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, John.C.Harrison, Intel-GFX

On Wed, Jun 17, 2015 at 03:27:08PM +0100, Chris Wilson wrote:
> On Wed, Jun 17, 2015 at 03:31:59PM +0200, Daniel Vetter wrote:
> > On Fri, May 29, 2015 at 05:44:09PM +0100, John.C.Harrison@Intel.com wrote:
> > > From: John Harrison <John.C.Harrison@Intel.com>
> > > 
> > > Now that the *_ring_begin() functions no longer call the request allocation
> > > code, it is finally safe for the request allocation code to call *_ring_begin().
> > > This is important to guarantee that the space reserved for the subsequent
> > > i915_add_request() call does actually get reserved.
> > > 
> > > v2: Renamed functions according to review feedback (Tomas Elf).
> > > 
> > > For: VIZ-5115
> > > Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> > 
> > Still has my question open from the previos round:
> > 
> > http://mid.gmane.org/20150323091030.GL1349@phenom.ffwll.local
> > 
> > Note that this isn't all that unlikely with GuC mode since there the
> > ringbuffer is substantially smaller (due to firmware limitations) than
> > what we allocate ourselves right now.
> 
> Looking at this patch, I am still fundamentally opposed to reserving
> space for the request. Detecting a request that wraps and cancelling
> that request (after the appropriate WARN for the overlow) is trivial and
> such a rare case (as it is a programming error) that it should only be
> handled in the slow path.

I thought the entire point here was that we don't have requests half-committed
because the final request ringcmds didn't fit in. And that does require
that we reserve a bit of space for that postamble.

I guess if it's too much (atm it's super-pessimistic due to ilk) we can
make per-platform reservation limits to be really minimal.

Maybe we could go towards a long-term rollback model of rewinding the
ringbuffer. But if there's no clear need I'd like to avoid that
complexity.
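
A rollback model would look roughly like this (a sketch only; nothing
like this exists in the series, and emit_request_commands() is a
made-up placeholder for the epilogue emission):

	uint32_t saved_tail = ringbuf->tail;	/* checkpoint before emitting */

	ret = emit_request_commands(req);
	if (ret) {
		/* Nothing has reached the GPU yet, so the partially
		 * written commands can be discarded simply by rewinding
		 * the software tail to the checkpoint. */
		ringbuf->tail = saved_tail;
		return ret;
	}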
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 14/56] drm/i915: Make retire condition check for requests not objects
  2015-06-09 15:56   ` Tomas Elf
@ 2015-06-17 15:01     ` Daniel Vetter
  0 siblings, 0 replies; 120+ messages in thread
From: Daniel Vetter @ 2015-06-17 15:01 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Tue, Jun 09, 2015 at 04:56:01PM +0100, Tomas Elf wrote:
> On 04/06/2015 19:23, John.C.Harrison@Intel.com wrote:
> >From: John Harrison <John.C.Harrison@Intel.com>
> >
> >A previous patch (read-read optimisation) changed the early exit
> >condition in i915_gem_retire_requests_ring() from checking the request
> >list to checking the active list. This assumes that all requests have
> >objects associated with them which are placed on the active list. The
> >removal of the OLR means that non-batch buffer work is no longer
> >tagged onto the nearest batch buffer submission and thus there are
> >requests going through the system which do not have objects associated
> >with them. This can therefore lead to the situation where an
> >outstanding request never gets retired.
> >
> >This change reverts the early exit condition to check for requests.
> >Given that the purpose of the function is to retire requests, this
> >does seem to make much more sense.
> >
> >For: VIZ-5190
> >Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> >---
> >  drivers/gpu/drm/i915/i915_gem.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 7117659..4c5a6cd 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -2859,7 +2859,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
> >  {
> >  	WARN_ON(i915_verify_lists(ring->dev));
> >
> >-	if (list_empty(&ring->active_list))
> >+	if (list_empty(&ring->request_list))
> >  		return;
> >
> >  	/* Retire requests first as we use it above for the early return.
> >
> 
> Note to whoever is integrating this patch: This patch can either be applied
> or we could drop the request_list check entirely. This is according to Chris
> Wilson in the following conversation:
> 
> 3:26:09 PM - ickle: we just kill the check
> 3:26:25 PM - ickle: the final function is just request_list + trace_request
> 3:26:37 PM - ickle: adding a test to save one isn't a great tradeoff
> 3:28:04 PM - tomas_elf: fine
> 3:28:20 PM - tomas_elf: anyway, good to know
> 3:29:32 PM - ickle: there's actually one more, it's also where the
> execlists_retire should be
> 3:30:48 PM - tomas_elf: maybe you can just submit a patch (unless you've
> already done so) that removes all of the references
> 3:31:23 PM - ickle: there's like a backlog of 50 patches before we even get
> to that point
> 3:31:29 PM - tomas_elf: ok, cool
> 3:31:51 PM - tomas_elf: at any rate, JohnHarrison's patch can be accepted
> either with the request_list check or no check at all
> 3:33:14 PM - ickle: I thought vsyrjala sent the patch to completely kill it
> along with a bug citation
> 3:35:25 PM - ickle: 0aedb1626566efd72b369c01992ee7413c82a0c5
> 3:39:05 PM - tomas_elf: has it been merged?
> 3:39:43 PM - ickle: it is in drm-intel-fixes
> 3:40:02 PM - tomas_elf: ah, ok
> 
> As long as the active_list check is removed since it breaks things.
> 
> Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Jani already picked up Ville's version of this for dinf:

commit 11ee9615f9bbc9c0c2dbd9f5eb275459b76f032a
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Thu May 28 18:32:36 2015 +0300

    drm/i915: Don't skip request retirement if the active list is empty

We should be covered here I think.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation
  2015-06-17 14:54       ` Daniel Vetter
@ 2015-06-17 15:52         ` Chris Wilson
  2015-06-18 11:21           ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Chris Wilson @ 2015-06-17 15:52 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On Wed, Jun 17, 2015 at 04:54:42PM +0200, Daniel Vetter wrote:
> On Wed, Jun 17, 2015 at 03:27:08PM +0100, Chris Wilson wrote:
> > On Wed, Jun 17, 2015 at 03:31:59PM +0200, Daniel Vetter wrote:
> > > On Fri, May 29, 2015 at 05:44:09PM +0100, John.C.Harrison@Intel.com wrote:
> > > > From: John Harrison <John.C.Harrison@Intel.com>
> > > > 
> > > > Now that the *_ring_begin() functions no longer call the request allocation
> > > > code, it is finally safe for the request allocation code to call *_ring_begin().
> > > > This is important to guarantee that the space reserved for the subsequent
> > > > i915_add_request() call does actually get reserved.
> > > > 
> > > > v2: Renamed functions according to review feedback (Tomas Elf).
> > > > 
> > > > For: VIZ-5115
> > > > Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> > > 
> > > Still has my question open from the previos round:
> > > 
> > > http://mid.gmane.org/20150323091030.GL1349@phenom.ffwll.local
> > > 
> > > Note that this isn't all that unlikely with GuC mode since there the
> > > ringbuffer is substantially smaller (due to firmware limitations) than
> > > what we allocate ourselves right now.
> > 
> > Looking at this patch, I am still fundamentally opposed to reserving
> > space for the request. Detecting a request that wraps and cancelling
> > that request (after the appropriate WARN for the overflow) is trivial and
> > such a rare case (as it is a programming error) that it should only be
> > handled in the slow path.
> 
> I thought the entire point here was that we don't have requests half-committed
> because the final request ringcmds didn't fit in. And that does require
> that we reserve a bit of space for that postamble.
> 
> I guess if it's too much (atm it's super-pessimistic due to ilk) we can
> make per-platform reservation limits to be really minimal.
> 
> > Maybe we could go towards a long-term rollback model of rewinding the
> ringbuffer. But if there's no clear need I'd like to avoid that
> complexity.

Even if you didn't like the rollback model, which helps with handling the
partial state from context switches and whatnot, if you run out of
ringspace you can set the GPU as wedged. Issuing a request that fills
the entire ringbuffer is a programming bug that needs to be caught very
early in development.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-17 14:04     ` Daniel Vetter
@ 2015-06-18 10:43       ` John Harrison
  0 siblings, 0 replies; 120+ messages in thread
From: John Harrison @ 2015-06-18 10:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 17/06/2015 15:04, Daniel Vetter wrote:
> On Thu, Jun 04, 2015 at 01:06:34PM +0100, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> It is a bad idea for i915_add_request() to fail. The work will already have been
>> sent to the ring and will be processed, but there will not be any tracking or
>> management of that work.
>>
>> The only way the add request call can fail is if it can't write its epilogue
>> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
>> reasons for that are mostly down to running out of ring buffer space and the
>> problems associated with trying to get some more. This patch prevents that
>> situation from happening in the first place.
>>
>> When a request is created, it marks sufficient space as reserved for the
>> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
>> there will be plenty of space for it. Note that a ring_begin() call is required
>> to actually reserve the space (and do any potential waiting). However, that is
>> not currently done at request creation time. This is because the ring_begin()
>> code can allocate a request. Hence calling begin() from the request allocation
>> code would lead to infinite recursion! Later patches in this series remove the
>> need for begin() to do the allocate. At that point, it becomes safe for the
>> allocate to call begin() and really reserve the space.
>>
>> Until then, there is a potential for insufficient space to be available at the
>> point of calling i915_add_request(). However, that would only be in the case
>> where the request was created and immediately submitted without ever calling
>> ring_begin() and adding any work to that request. Which should never happen. And
>> even if it does, and if that request happens to fall into the tiny window of
>> opportunity for failing due to being out of ring space then does it really
>> matter because the request wasn't doing anything in the first place?
>>
>> v2: Updated the 'reserved space too small' warning to include the offending
>> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
>> re-initialisation of tracking state after a buffer wrap to keep the sanity
>> checks accurate.
>>
>> v3: Incremented the reserved size to accommodate Ironlake (after finally
>> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>>
>> For: VIZ-5115
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>  From the last review round there's still my question wrt the correctness
> of the reservation overflow vs. wrapping outstanding:
>
> http://article.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/56575

v2 - 'added re-initialisation of tracking state after a buffer wrap to 
keep the sanity checks accurate'. Does that not address your issue with 
wrapping?


> Also when resending patches, especially after such a long delay please
> leave some indication of what you've decided to do wrt review comments.
> Either as a reply in the review discussion (preferred) or at least as an
> update in the cover letter or per-patch changelog. Otherwise reviewers
> need to reverse-engineer what you have or haven't done by diffing patches,
> which is just not that efficient.
> -Daniel
>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h         |    1 +
>>   drivers/gpu/drm/i915/i915_gem.c         |   37 +++++++++++++++++
>>   drivers/gpu/drm/i915/intel_lrc.c        |   18 ++++++++
>>   drivers/gpu/drm/i915/intel_ringbuffer.c |   68 ++++++++++++++++++++++++++++++-
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |   25 ++++++++++++
>>   5 files changed, 147 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index e9d76f3..44dee31 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
>>   
>>   int i915_gem_request_alloc(struct intel_engine_cs *ring,
>>   			   struct intel_context *ctx);
>> +void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>>   void i915_gem_request_free(struct kref *req_ref);
>>   
>>   static inline uint32_t
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 78f6a89..516e9b7 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2495,6 +2495,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
>>   	} else
>>   		ringbuf = ring->buffer;
>>   
>> +	/*
>> +	 * To ensure that this call will not fail, space for its emissions
>> +	 * should already have been reserved in the ring buffer. Let the ring
>> +	 * know that it is time to use that space up.
>> +	 */
>> +	intel_ring_reserved_space_use(ringbuf);
>> +
>>   	request_start = intel_ring_get_tail(ringbuf);
>>   	/*
>>   	 * Emit any outstanding flushes - execbuf can fail to emit the flush
>> @@ -2577,6 +2584,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
>>   			   round_jiffies_up_relative(HZ));
>>   	intel_mark_busy(dev_priv->dev);
>>   
>> +	/* Sanity check that the reserved size was large enough. */
>> +	intel_ring_reserved_space_end(ringbuf);
>> +
>>   	return 0;
>>   }
>>   
>> @@ -2676,6 +2686,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>>   	if (ret)
>>   		goto err;
>>   
>> +	/*
>> +	 * Reserve space in the ring buffer for all the commands required to
>> +	 * eventually emit this request. This is to guarantee that the
>> +	 * i915_add_request() call can't fail. Note that the reserve may need
>> +	 * to be redone if the request is not actually submitted straight
>> +	 * away, e.g. because a GPU scheduler has deferred it.
>> +	 *
>> +	 * Note further that this call merely notes the reserve request. A
>> +	 * subsequent call to *_ring_begin() is required to actually ensure
>> +	 * that the reservation is available. Without the begin, if the
>> +	 * request creator immediately submitted the request without adding
>> +	 * any commands to it then there might not actually be sufficient
>> +	 * room for the submission commands. Unfortunately, the current
>> +	 * *_ring_begin() implementations potentially call back here to
>> +	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
>> +	 * infinite recursion! Until that back call path is removed, it is
>> +	 * necessary to do a manual _begin() outside.
>> +	 */
>> +	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
>> +
>>   	ring->outstanding_lazy_request = req;
>>   	return 0;
>>   
>> @@ -2684,6 +2714,13 @@ err:
>>   	return ret;
>>   }
>>   
>> +void i915_gem_request_cancel(struct drm_i915_gem_request *req)
>> +{
>> +	intel_ring_reserved_space_cancel(req->ringbuf);
>> +
>> +	i915_gem_request_unreference(req);
>> +}
>> +
>>   struct drm_i915_gem_request *
>>   i915_gem_find_active_request(struct intel_engine_cs *ring)
>>   {
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 6a5ed07..42a756d 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -687,6 +687,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
>>   	unsigned space;
>>   	int ret;
>>   
>> +	/* The whole point of reserving space is to not wait! */
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>>   	if (intel_ring_space(ringbuf) >= bytes)
>>   		return 0;
>>   
>> @@ -747,6 +750,9 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>>   	uint32_t __iomem *virt;
>>   	int rem = ringbuf->size - ringbuf->tail;
>>   
>> +	/* Can't wrap if space has already been reserved! */
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>>   	if (ringbuf->space < rem) {
>>   		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
>>   
>> @@ -770,10 +776,22 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>>   {
>>   	int ret;
>>   
>> +	if (!ringbuf->reserved_in_use)
>> +		bytes += ringbuf->reserved_size;
>> +
>>   	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>> +		WARN_ON(ringbuf->reserved_in_use);
>> +
>>   		ret = logical_ring_wrap_buffer(ringbuf, ctx);
>>   		if (unlikely(ret))
>>   			return ret;
>> +
>> +		if (ringbuf->reserved_size) {
>> +			uint32_t size = ringbuf->reserved_size;
>> +
>> +			intel_ring_reserved_space_cancel(ringbuf);
>> +			intel_ring_reserved_space_reserve(ringbuf, size);
>> +		}
>>   	}
>>   
>>   	if (unlikely(ringbuf->space < bytes)) {
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index d934f85..74c2222 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2103,6 +2103,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>>   	unsigned space;
>>   	int ret;
>>   
>> +	/* The whole point of reserving space is to not wait! */
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>>   	if (intel_ring_space(ringbuf) >= n)
>>   		return 0;
>>   
>> @@ -2130,6 +2133,9 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>>   	struct intel_ringbuffer *ringbuf = ring->buffer;
>>   	int rem = ringbuf->size - ringbuf->tail;
>>   
>> +	/* Can't wrap if space has already been reserved! */
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>>   	if (ringbuf->space < rem) {
>>   		int ret = ring_wait_for_space(ring, rem);
>>   		if (ret)
>> @@ -2180,16 +2186,74 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>>   	return 0;
>>   }
>>   
>> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
>> -				int bytes)
>> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
>> +{
>> +	/* NB: Until request management is fully tidied up and the OLR is
>> +	 * removed, there are too many ways to get false hits on this
>> +	 * anti-recursion check! */
>> +	/*WARN_ON(ringbuf->reserved_size);*/
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>> +	ringbuf->reserved_size = size;
>> +
>> +	/*
>> +	 * Really need to call _begin() here but that currently leads to
>> +	 * recursion problems! This will be fixed later but for now just
>> +	 * return and hope for the best. Note that there is only a real
>> +	 * problem if the creator of the request never actually calls _begin()
>> +	 * but if they are not submitting any work then why did they create
>> +	 * the request in the first place?
>> +	 */
>> +}
>> +
>> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
>> +{
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>> +	ringbuf->reserved_size   = 0;
>> +	ringbuf->reserved_in_use = false;
>> +}
>> +
>> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
>> +{
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>> +	ringbuf->reserved_in_use = true;
>> +	ringbuf->reserved_tail   = ringbuf->tail;
>> +}
>> +
>> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
>> +{
>> +	WARN_ON(!ringbuf->reserved_in_use);
>> +	WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
>> +	     "request reserved size too small: %d vs %d!\n",
>> +	     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
>> +
>> +	ringbuf->reserved_size   = 0;
>> +	ringbuf->reserved_in_use = false;
>> +}
>> +
>> +static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
>>   {
>>   	struct intel_ringbuffer *ringbuf = ring->buffer;
>>   	int ret;
>>   
>> +	if (!ringbuf->reserved_in_use)
>> +		bytes += ringbuf->reserved_size;
>> +
>>   	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>> +		WARN_ON(ringbuf->reserved_in_use);
>> +
>>   		ret = intel_wrap_ring_buffer(ring);
>>   		if (unlikely(ret))
>>   			return ret;
>> +
>> +		if (ringbuf->reserved_size) {
>> +			uint32_t size = ringbuf->reserved_size;
>> +
>> +			intel_ring_reserved_space_cancel(ringbuf);
>> +			intel_ring_reserved_space_reserve(ringbuf, size);
>> +		}
>>   	}
>>   
>>   	if (unlikely(ringbuf->space < bytes)) {
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index 39f6dfc..bf2ac28 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -105,6 +105,9 @@ struct intel_ringbuffer {
>>   	int space;
>>   	int size;
>>   	int effective_size;
>> +	int reserved_size;
>> +	int reserved_tail;
>> +	bool reserved_in_use;
>>   
>>   	/** We track the position of the requests in the ring buffer, and
>>   	 * when each is retired we increment last_retired_head as the GPU
>> @@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
>>   	return ring->outstanding_lazy_request;
>>   }
>>   
>> +/*
>> + * Arbitrary size for largest possible 'add request' sequence. The code paths
>> + * are complex and variable. Empirical measurement shows that the worst case
>> + * is ILK at 136 words. Reserving too much is better than reserving too little
>> + * as that allows for corner cases that might have been missed. So the figure
>> + * has been rounded up to 160 words.
>> + */
>> +#define MIN_SPACE_FOR_ADD_REQUEST	160
>> +
>> +/*
>> + * Reserve space in the ring to guarantee that the i915_add_request() call
>> + * will always have sufficient room to do its stuff. The request creation
>> + * code calls this automatically.
>> + */
>> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
>> +/* Cancel the reservation, e.g. because the request is being discarded. */
>> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
>> +/* Use the reserved space - for use by i915_add_request() only. */
>> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
>> +/* Finish with the reserved space - for use by i915_add_request() only. */
>> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
>> +
>>   #endif /* _INTEL_RINGBUFFER_H_ */
>> -- 
>> 1.7.9.5
>>

* Re: [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()
  2015-06-17 14:06   ` Daniel Vetter
  2015-06-17 14:21     ` Chris Wilson
@ 2015-06-18 10:57     ` John Harrison
  1 sibling, 0 replies; 120+ messages in thread
From: John Harrison @ 2015-06-18 10:57 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 17/06/2015 15:06, Daniel Vetter wrote:
> On Fri, May 29, 2015 at 05:44:16PM +0100, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The i915_gem_object_flush_active() call used to do lots. Over time it has done
>> less and less. Now all it does check the various associated requests to see if
>> they can be retired. Hence this patch renames the function and updates the
>> comments around it to match the current operation.
>>
>> For: VIZ-5115
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> When rebasing patches and especially like here when also renaming them a
> bit please leave some indication of what you've changed. Took me a while
> to figure out where one of my pending comments from the previous round
> went too.

The thing is that this isn't really a rebase of a previous patch. It is 
a completely different patch. The original version was removing this 
function entirely as it had become a trivial wrapper around a call to 
i915_gem_retire_requests_ring(). However, some of Chris Wilson's work in 
the meantime has meant that it is now a non-trivial wrapper. Hence the 
function cannot be removed. However, it is not doing what its name or 
comment says. So this new patch is just renaming it and updating the 
comment to be more accurate. Thus the posted comments about the previous 
patch do not apply to this patch - the change being commented on is no 
longer being made.

> And please don't just "v2: rebase", but please add some indicators against
> what it conflicted if it's obvious.
>
> Thanks, Daniel
>
>> ---
>>   drivers/gpu/drm/i915/i915_gem.c |   18 ++++++------------
>>   1 file changed, 6 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index f825942..081cbbf 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2956,12 +2956,10 @@ i915_gem_idle_work_handler(struct work_struct *work)
>>   }
>>   
>>   /**
>> - * Ensures that an object will eventually get non-busy by flushing any required
>> - * write domains, emitting any outstanding lazy request and retiring and
>> - * completed requests.
>> + * Check an object to see if any of its associated requests can be retired.
>>    */
>>   static int
>> -i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
>> +i915_gem_object_retire(struct drm_i915_gem_object *obj)
>>   {
>>   	int i;
>>   
>> @@ -3034,8 +3032,8 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>>   		return -ENOENT;
>>   	}
>>   
>> -	/* Need to make sure the object gets inactive eventually. */
>> -	ret = i915_gem_object_flush_active(obj);
>> +	/* Check if the object is pending clean up. */
>> +	ret = i915_gem_object_retire(obj);
>>   	if (ret)
>>   		goto out;
>>   
>> @@ -4526,12 +4524,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>>   		goto unlock;
>>   	}
>>   
>> -	/* Count all active objects as busy, even if they are currently not used
>> -	 * by the gpu. Users of this interface expect objects to eventually
>> -	 * become non-busy without any further actions, therefore emit any
>> -	 * necessary flushes here.
>> -	 */
>> -	ret = i915_gem_object_flush_active(obj);
>> +	/* Check if the object is pending clean up. */
>> +	ret = i915_gem_object_retire(obj);
>>   	if (ret)
>>   		goto unref;
>>   
>> -- 
>> 1.7.9.5
>>

* Re: [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()
  2015-06-17 14:21     ` Chris Wilson
@ 2015-06-18 11:03       ` John Harrison
  2015-06-18 11:10         ` Chris Wilson
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-18 11:03 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Intel-GFX

On 17/06/2015 15:21, Chris Wilson wrote:
> On Wed, Jun 17, 2015 at 04:06:05PM +0200, Daniel Vetter wrote:
>> On Fri, May 29, 2015 at 05:44:16PM +0100, John.C.Harrison@Intel.com wrote:
>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>
>>> The i915_gem_object_flush_active() call used to do lots. Over time it has done
> >>> less and less. Now all it does is check the various associated requests to see if
>>> they can be retired. Hence this patch renames the function and updates the
>>> comments around it to match the current operation.
>>>
>>> For: VIZ-5115
>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> When rebasing patches and especially like here when also renaming them a
>> bit please leave some indication of what you've changed. Took me a while
>> to figure out where one of my pending comments from the previous round
>> went too.
>>
>> And please don't just "v2: rebase", but please add some indicators against
>> what it conflicted if it's obvious.
> This function doesn't do an unconditional retire - the new name is much
> worse since it is inconsistent with how requests retire. In my make GEM
> umpteen times faster patches, I repurposed this function for reporting
> >the object's current activeness and called it bool i915_gem_object_active()
>   - though that is probably better as i915_gem_object_is_active().
> -Chris
>

Retiring is generally not an unconditional operation. 
i915_gem_retire_requests[_ring]() does not forcefully retire all 
requests, it only retires stuff that has completed. Same here. It could 
be called i915_gem_object_check_retire() or some such if you prefer. 
Either way, it is better than i915_gem_object_flush_active() as that is 
referring to flushing out the OLR which no longer exists.

If you are adding other functionality back in then feel free to rename 
it again as appropriate. But as this patch series stands, the function 
is a 'retire completed work associated with this object' operation and 
really needs to be named accordingly.
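
To be clear about what the new name is describing, the operation is
roughly this (sketch only - the function body itself is untouched by
the rename and elided from the diff above):

	static int
	i915_gem_object_retire(struct drm_i915_gem_object *obj)
	{
		int i;

		/* Retire only those associated requests that have already
		 * completed on the GPU; nothing is forced or flushed. */
		for (i = 0; i < I915_NUM_RINGS; i++) {
			if (obj->last_read_req[i] == NULL)
				continue;

			if (i915_gem_request_completed(obj->last_read_req[i], true))
				i915_gem_object_retire_request(obj,
							obj->last_read_req[i]);
		}

		return 0;
	}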


* Re: [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()
  2015-06-18 11:03       ` John Harrison
@ 2015-06-18 11:10         ` Chris Wilson
  2015-06-18 11:27           ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Chris Wilson @ 2015-06-18 11:10 UTC (permalink / raw)
  To: John Harrison; +Cc: Intel-GFX

On Thu, Jun 18, 2015 at 12:03:12PM +0100, John Harrison wrote:
> On 17/06/2015 15:21, Chris Wilson wrote:
> >On Wed, Jun 17, 2015 at 04:06:05PM +0200, Daniel Vetter wrote:
> >>On Fri, May 29, 2015 at 05:44:16PM +0100, John.C.Harrison@Intel.com wrote:
> >>>From: John Harrison <John.C.Harrison@Intel.com>
> >>>
> >>>The i915_gem_object_flush_active() call used to do lots. Over time it has done
> >>>less and less. Now all it does is check the various associated requests to see if
> >>>they can be retired. Hence this patch renames the function and updates the
> >>>comments around it to match the current operation.
> >>>
> >>>For: VIZ-5115
> >>>Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> >>When rebasing patches and especially like here when also renaming them a
> >>bit please leave some indication of what you've changed. Took me a while
> >>to figure out where one of my pending comments from the previous round
> >>went too.
> >>
> >>And please don't just "v2: rebase", but please add some indicators against
> >>what it conflicted if it's obvious.
> >This function doesn't do an unconditional retire - the new name is much
> >worse since it is inconsistent with how requests retire. In my make GEM
> >umpteen times faster patches, I repurposed this function for reporting
> >>the object's current activeness and called it bool i915_gem_object_active()
> >  - though that is probably better as i915_gem_object_is_active().
> >-Chris
> >
> 
> Retiring is generally not an unconditional operation.

In the code, I use <object>_retire to perform the retiring operation on
that object. I can rename i915_gem_retire_requests if that makes you
happier, but I don't think it needs to since retire_requests does not
imply to me that all requests are retired, just some indefinite value
(though positive indefinite at least!).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation
  2015-06-17 15:52         ` Chris Wilson
@ 2015-06-18 11:21           ` John Harrison
  2015-06-18 13:29             ` Daniel Vetter
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-18 11:21 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Intel-GFX

On 17/06/2015 16:52, Chris Wilson wrote:
> On Wed, Jun 17, 2015 at 04:54:42PM +0200, Daniel Vetter wrote:
>> On Wed, Jun 17, 2015 at 03:27:08PM +0100, Chris Wilson wrote:
>>> On Wed, Jun 17, 2015 at 03:31:59PM +0200, Daniel Vetter wrote:
>>>> On Fri, May 29, 2015 at 05:44:09PM +0100, John.C.Harrison@Intel.com wrote:
>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>
>>>>> Now that the *_ring_begin() functions no longer call the request allocation
>>>>> code, it is finally safe for the request allocation code to call *_ring_begin().
>>>>> This is important to guarantee that the space reserved for the subsequent
>>>>> i915_add_request() call does actually get reserved.
>>>>>
>>>>> v2: Renamed functions according to review feedback (Tomas Elf).
>>>>>
>>>>> For: VIZ-5115
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> >>>> Still has my question open from the previous round:
>>>>
>>>> http://mid.gmane.org/20150323091030.GL1349@phenom.ffwll.local
>>>>
>>>> Note that this isn't all that unlikely with GuC mode since there the
>>>> ringbuffer is substantially smaller (due to firmware limitations) than
>>>> what we allocate ourselves right now.
>>> Looking at this patch, I am still fundamentally opposed to reserving
>>> space for the request. Detecting a request that wraps and cancelling
> >>> that request (after the appropriate WARN for the overflow) is trivial and
>>> such a rare case (as it is a programming error) that it should only be
>>> handled in the slow path.
>> I thought the entire point here was that we don't have requests half-committed
>> because the final request ringcmds didn't fit in. And that does require
>> that we reserve a bit of space for that postamble.
>>
>> I guess if it's too much (atm it's super-pessimistic due to ilk) we can
>> make the per-platform reservation limits really minimal.
>>
>> Maybe we could go towards a rollback model long term of rewinding the
>> ringbuffer. But if there's no clear need I'd like to avoid that
>> complexity.
> Even if you didn't like the rollback model which helps handling the
> partial state from context switches and what not, if you run out of
> ringspace you can set the GPU as wedged. Issuing a request that fills
> the entire ringbuffer is a programming bug that needs to be caught very
> early in development.
> -Chris
>

I'm still confused by what you are saying in the above referenced email. 
Part of it is about the sanity checks failing to handle the wrapping 
case correctly which has been fixed in the base reserve space patch 
(patch 2 in the series). The rest is either saying that you think we are 
potentially wrapping too early and wasting a few bytes of the ring 
buffer or that something is actually broken?

Point 2: 100 bytes of reserve, 160 bytes of execbuf and 200 bytes 
remaining. You seem to think this will fail somehow? Why? The 
wait_for_space(160) in the execbuf code will cause a wrap because the 
100 bytes for the add_request reservation is added on and the wait 
is actually being done for 260 bytes. So yes, we wrap earlier than would 
otherwise have been necessary but that is the only way to absolutely 
guarantee that the add_request() call cannot fail when trying to do the 
wrap itself.
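
To put numbers to code (a simplified sketch of the logic patch 2 adds
to __intel_ring_prepare(); the two sketch_* helpers stand in for the
real wrap and wait-for-space functions):

	static int sketch_ring_prepare(struct intel_ringbuffer *ringbuf,
				       int bytes)
	{
		/* Caller asks for 160 bytes but a 100 byte reservation is
		 * outstanding, so every check is done against 260 bytes. */
		if (!ringbuf->reserved_in_use)
			bytes += ringbuf->reserved_size;	/* 160 -> 260 */

		/* Wrap early, while it is still safe to wait for space... */
		if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
			int ret = sketch_wrap_buffer(ringbuf);
			if (ret)
				return ret;
		}

		/* ...and wait for the full 260 bytes, so the add_request()
		 * epilogue can never run out of room later. */
		if (unlikely(ringbuf->space < bytes))
			return sketch_wait_for_space(ringbuf, bytes);

		return 0;
	}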

As Chris says, if the driver is attempting to create a single request 
that fills the entire ringbuffer then that is a bug that should be 
caught as soon as possible. Even with a GuC, the ring buffer is not 
small compared to the size of requests the driver currently produces. 
Part of the scheduler work is to limit the number of batch buffers that 
a given application/context can have outstanding in the ring buffer at 
any given time in order to prevent starvation of the rest of the system 
by one badly behaved app. Thus completely filling a large ring buffer 
becomes impossible anyway - the application will be blocked before it 
gets that far.

Note that with the removal of the OLR, all requests now have a definite 
start and a definite end. Thus the scheme could be extended to provide 
rollback of the ring buffer. Each new request takes a note of the ring 
pointers at creation time. If the request is cancelled it can reset the 
pointers to where they were before. Thus all half submitted work is 
discarded. That is a much bigger semantic change however, so I would 
really like to get the bare minimum anti-OLR patch set in first before 
trying to do fancy extra features.
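
As a rough illustration only (nothing below is in this series and the
'saved_ring_tail' field is invented for the example):

	/* Hypothetical rollback: snapshot the tail when the request is
	 * created, then rewind to it if the request is cancelled. */
	void i915_gem_request_rollback(struct drm_i915_gem_request *req)
	{
		/* Throw away everything emitted on behalf of this request
		 * since creation, including any partial context switch.
		 * (The ring's space accounting would need fixing up too.) */
		req->ringbuf->tail = req->saved_ring_tail;

		i915_gem_request_cancel(req);
	}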


* Re: [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()
  2015-06-18 11:10         ` Chris Wilson
@ 2015-06-18 11:27           ` John Harrison
  0 siblings, 0 replies; 120+ messages in thread
From: John Harrison @ 2015-06-18 11:27 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Intel-GFX

On 18/06/2015 12:10, Chris Wilson wrote:
> On Thu, Jun 18, 2015 at 12:03:12PM +0100, John Harrison wrote:
>> On 17/06/2015 15:21, Chris Wilson wrote:
>>> On Wed, Jun 17, 2015 at 04:06:05PM +0200, Daniel Vetter wrote:
>>>> On Fri, May 29, 2015 at 05:44:16PM +0100, John.C.Harrison@Intel.com wrote:
>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>
>>>>> The i915_gem_object_flush_active() call used to do lots. Over time it has done
> >>>>> less and less. Now all it does is check the various associated requests to see if
>>>>> they can be retired. Hence this patch renames the function and updates the
>>>>> comments around it to match the current operation.
>>>>>
>>>>> For: VIZ-5115
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>> When rebasing patches and especially like here when also renaming them a
>>>> bit please leave some indication of what you've changed. Took me a while
>>>> to figure out where one of my pending comments from the previous round
>>>> went too.
>>>>
>>>> And please don't just "v2: rebase", but please add some indicators against
>>>> what it conflicted if it's obvious.
>>> This function doesn't do an unconditional retire - the new name is much
>>> worse since it is inconsistent with how requests retire. In my make GEM
>>> umpteen times faster patches, I repurposed this function for reporting
> >>> the object's current activeness and called it bool i915_gem_object_active()
>>>   - though that is probably better as i915_gem_object_is_active().
>>> -Chris
>>>
>> Retiring is generally not an unconditional operation.
> In the code, I use <object>_retire to perform the retiring operation on
> that object. I can rename i915_gem_retire_requests if that makes you
> happier, but I don't think it needs to since retire_requests does not
> imply to me that all requests are retired, just some indefinite value
> (though positive indefinite at least!).
> -Chris
>

Fair enough. I guess I'm still thinking of the driver as it was when I 
first wrote the patch series which was before your re-write for 
read/read optimisations. Like I said, the exact new name isn't as 
important as at least giving it a new name. The old name is definitely 
not valid any more. Feel free to suggest something better.


* [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-09 16:00     ` Tomas Elf
@ 2015-06-18 12:10       ` John.C.Harrison
  0 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-06-18 12:10 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

It is a bad idea for i915_add_request() to fail. The work will already have been
sent to the ring and will be processed, but there will not be any tracking or
management of that work.

The only way the add request call can fail is if it can't write its epilogue
commands to the ring (cache flushing, seqno updates, interrupt signalling). The
reasons for that are mostly down to running out of ring buffer space and the
problems associated with trying to get some more. This patch prevents that
situation from happening in the first place.

When a request is created, it marks sufficient space as reserved for the
epilogue commands. Thus guaranteeing that by the time the epilogue is written,
there will be plenty of space for it. Note that a ring_begin() call is required
to actually reserve the space (and do any potential waiting). However, that is
not currently done at request creation time. This is because the ring_begin()
code can allocate a request. Hence calling begin() from the request allocation
code would lead to infinite recursion! Later patches in this series remove the
need for begin() to do the allocate. At that point, it becomes safe for the
allocate to call begin() and really reserve the space.

Until then, there is a potential for insufficient space to be available at the
point of calling i915_add_request(). However, that would only be in the case
where the request was created and immediately submitted without ever calling
ring_begin() and adding any work to that request. Which should never happen. And
even if it does, and if that request happens to fall into the tiny window of
opportunity for failing due to being out of ring space then does it really
matter because the request wasn't doing anything in the first place?

v2: Updated the 'reserved space too small' warning to include the offending
sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
re-initialisation of tracking state after a buffer wrap to keep the sanity
checks accurate.

v3: Incremented the reserved size to accommodate Ironlake (after finally
managing to run on an ILK system). Also fixed missing wrap code in LRC mode.

v4: Added extra comment and removed duplicate WARN (feedback from Tomas).

For: VIZ-5115
CC: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    1 +
 drivers/gpu/drm/i915/i915_gem.c         |   37 ++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        |   21 +++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |   71 ++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |   25 +++++++++++
 5 files changed, 153 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0347eb9..eba1857 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
 
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct intel_context *ctx);
+void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 
 static inline uint32_t
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 81f3512..85fa27b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	} else
 		ringbuf = ring->buffer;
 
+	/*
+	 * To ensure that this call will not fail, space for its emissions
+	 * should already have been reserved in the ring buffer. Let the ring
+	 * know that it is time to use that space up.
+	 */
+	intel_ring_reserved_space_use(ringbuf);
+
 	request_start = intel_ring_get_tail(ringbuf);
 	/*
 	 * Emit any outstanding flushes - execbuf can fail to emit the flush
@@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
 			   round_jiffies_up_relative(HZ));
 	intel_mark_busy(dev_priv->dev);
 
+	/* Sanity check that the reserved size was large enough. */
+	intel_ring_reserved_space_end(ringbuf);
+
 	return 0;
 }
 
@@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	if (ret)
 		goto err;
 
+	/*
+	 * Reserve space in the ring buffer for all the commands required to
+	 * eventually emit this request. This is to guarantee that the
+	 * i915_add_request() call can't fail. Note that the reserve may need
+	 * to be redone if the request is not actually submitted straight
+	 * away, e.g. because a GPU scheduler has deferred it.
+	 *
+	 * Note further that this call merely notes the reserve request. A
+	 * subsequent call to *_ring_begin() is required to actually ensure
+	 * that the reservation is available. Without the begin, if the
+	 * request creator immediately submitted the request without adding
+	 * any commands to it then there might not actually be sufficient
+	 * room for the submission commands. Unfortunately, the current
+	 * *_ring_begin() implementations potentially call back here to
+	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
+	 * infinite recursion! Until that back call path is removed, it is
+	 * necessary to do a manual _begin() outside.
+	 */
+	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
+
 	ring->outstanding_lazy_request = req;
 	return 0;
 
@@ -2674,6 +2704,13 @@ err:
 	return ret;
 }
 
+void i915_gem_request_cancel(struct drm_i915_gem_request *req)
+{
+	intel_ring_reserved_space_cancel(req->ringbuf);
+
+	i915_gem_request_unreference(req);
+}
+
 struct drm_i915_gem_request *
 i915_gem_find_active_request(struct intel_engine_cs *ring)
 {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6a5ed07..288631b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -687,6 +687,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
 	unsigned space;
 	int ret;
 
+	/* The whole point of reserving space is to not wait! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (intel_ring_space(ringbuf) >= bytes)
 		return 0;
 
@@ -747,6 +750,9 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
 	uint32_t __iomem *virt;
 	int rem = ringbuf->size - ringbuf->tail;
 
+	/* Can't wrap if space has already been reserved! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (ringbuf->space < rem) {
 		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
 
@@ -770,10 +776,25 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
 {
 	int ret;
 
+	/*
+	 * Add on the reserved size to the request to make sure that after
+	 * the intended commands have been emitted, there is guaranteed to
+	 * still be enough free space to send them to the hardware.
+	 */
+	if (!ringbuf->reserved_in_use)
+		bytes += ringbuf->reserved_size;
+
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
 		ret = logical_ring_wrap_buffer(ringbuf, ctx);
 		if (unlikely(ret))
 			return ret;
+
+		if (ringbuf->reserved_size) {
+			uint32_t size = ringbuf->reserved_size;
+
+			intel_ring_reserved_space_cancel(ringbuf);
+			intel_ring_reserved_space_reserve(ringbuf, size);
+		}
 	}
 
 	if (unlikely(ringbuf->space < bytes)) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d934f85..6d6abe6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2103,6 +2103,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
 	unsigned space;
 	int ret;
 
+	/* The whole point of reserving space is to not wait! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (intel_ring_space(ringbuf) >= n)
 		return 0;
 
@@ -2130,6 +2133,9 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
 	struct intel_ringbuffer *ringbuf = ring->buffer;
 	int rem = ringbuf->size - ringbuf->tail;
 
+	/* Can't wrap if space has already been reserved! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	if (ringbuf->space < rem) {
 		int ret = ring_wait_for_space(ring, rem);
 		if (ret)
@@ -2180,16 +2186,77 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
 	return 0;
 }
 
-static int __intel_ring_prepare(struct intel_engine_cs *ring,
-				int bytes)
+void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
+{
+	/* NB: Until request management is fully tidied up and the OLR is
+	 * removed, there are too many ways to get false hits on this
+	 * anti-recursion check! */
+	/*WARN_ON(ringbuf->reserved_size);*/
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_size = size;
+
+	/*
+	 * Really need to call _begin() here but that currently leads to
+	 * recursion problems! This will be fixed later but for now just
+	 * return and hope for the best. Note that there is only a real
+	 * problem if the creator of the request never actually calls _begin()
+	 * but if they are not submitting any work then why did they create
+	 * the request in the first place?
+	 */
+}
+
+void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_size   = 0;
+	ringbuf->reserved_in_use = false;
+}
+
+void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_in_use = true;
+	ringbuf->reserved_tail   = ringbuf->tail;
+}
+
+void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(!ringbuf->reserved_in_use);
+	WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
+	     "request reserved size too small: %d vs %d!\n",
+	     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
+
+	ringbuf->reserved_size   = 0;
+	ringbuf->reserved_in_use = false;
+}
+
+static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
 {
 	struct intel_ringbuffer *ringbuf = ring->buffer;
 	int ret;
 
+	/*
+	 * Add on the reserved size to the request to make sure that after
+	 * the intended commands have been emitted, there is guaranteed to
+	 * still be enough free space to send them to the hardware.
+	 */
+	if (!ringbuf->reserved_in_use)
+		bytes += ringbuf->reserved_size;
+
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
 		ret = intel_wrap_ring_buffer(ring);
 		if (unlikely(ret))
 			return ret;
+
+		if (ringbuf->reserved_size) {
+			uint32_t size = ringbuf->reserved_size;
+
+			intel_ring_reserved_space_cancel(ringbuf);
+			intel_ring_reserved_space_reserve(ringbuf, size);
+		}
 	}
 
 	if (unlikely(ringbuf->space < bytes)) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 39f6dfc..bf2ac28 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -105,6 +105,9 @@ struct intel_ringbuffer {
 	int space;
 	int size;
 	int effective_size;
+	int reserved_size;
+	int reserved_tail;
+	bool reserved_in_use;
 
 	/** We track the position of the requests in the ring buffer, and
 	 * when each is retired we increment last_retired_head as the GPU
@@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
 	return ring->outstanding_lazy_request;
 }
 
+/*
+ * Arbitrary size for largest possible 'add request' sequence. The code paths
+ * are complex and variable. Empirical measurement shows that the worst case
+ * is ILK at 136 words. Reserving too much is better than reserving too little
+ * as that allows for corner cases that might have been missed. So the figure
+ * has been rounded up to 160 words.
+ */
+#define MIN_SPACE_FOR_ADD_REQUEST	160
+
+/*
+ * Reserve space in the ring to guarantee that the i915_add_request() call
+ * will always have sufficient room to do its stuff. The request creation
+ * code calls this automatically.
+ */
+void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
+/* Cancel the reservation, e.g. because the request is being discarded. */
+void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
+/* Use the reserved space - for use by i915_add_request() only. */
+void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
+/* Finish with the reserved space - for use by i915_add_request() only. */
+void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
1.7.9.5


* [PATCH 15/55] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring
  2015-05-29 16:43 ` [PATCH 15/55] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring John.C.Harrison
@ 2015-06-18 12:11   ` John.C.Harrison
  0 siblings, 0 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-06-18 12:11 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The i915_gem_init_hw() function calls a bunch of smaller initialisation
functions, several of which have generic sections and per-ring sections. This
means multiple passes are done over the rings. Each pass writes data to the ring
which floats around in that ring's OLR until some random point in the future
when an add_request() is done by some random other piece of code.

This patch breaks i915_ppgtt_init_hw() in two, with the per-ring initialisation
now being done in i915_ppgtt_init_ring(). The ring looping is now done at the
top level in i915_gem_init_hw().

v2: Fix dumb loop variable re-use.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c     |   27 ++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_gtt.c |   28 +++++++++++++++-------------
 drivers/gpu/drm/i915/i915_gem_gtt.h |    1 +
 3 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ac893e3..dff21bd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5016,7 +5016,7 @@ i915_gem_init_hw(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring;
-	int ret, i;
+	int ret, i, j;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
@@ -5053,19 +5053,32 @@ i915_gem_init_hw(struct drm_device *dev)
 	 */
 	init_unused_rings(dev);
 
+	ret = i915_ppgtt_init_hw(dev);
+	if (ret) {
+		DRM_ERROR("PPGTT enable HW failed %d\n", ret);
+		goto out;
+	}
+
+	/* Need to do basic initialisation of all rings first: */
 	for_each_ring(ring, dev_priv, i) {
 		ret = ring->init_hw(ring);
 		if (ret)
 			goto out;
 	}
 
-	for (i = 0; i < NUM_L3_SLICES(dev); i++)
-		i915_gem_l3_remap(&dev_priv->ring[RCS], i);
+	/* Now it is safe to go back round and do everything else: */
+	for_each_ring(ring, dev_priv, i) {
+		if (ring->id == RCS) {
+			for (j = 0; j < NUM_L3_SLICES(dev); j++)
+				i915_gem_l3_remap(ring, j);
+		}
 
-	ret = i915_ppgtt_init_hw(dev);
-	if (ret && ret != -EIO) {
-		DRM_ERROR("PPGTT enable failed %d\n", ret);
-		i915_gem_cleanup_ringbuffer(dev);
+		ret = i915_ppgtt_init_ring(ring);
+		if (ret && ret != -EIO) {
+			DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
+			i915_gem_cleanup_ringbuffer(dev);
+			goto out;
+		}
 	}
 
 	ret = i915_gem_context_enable(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 17b7df0..b14ae63 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1543,11 +1543,6 @@ int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 
 int i915_ppgtt_init_hw(struct drm_device *dev)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_engine_cs *ring;
-	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
-	int i, ret = 0;
-
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -1566,16 +1561,23 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 	else
 		MISSING_CASE(INTEL_INFO(dev)->gen);
 
-	if (ppgtt) {
-		for_each_ring(ring, dev_priv, i) {
-			ret = ppgtt->switch_mm(ppgtt, ring);
-			if (ret != 0)
-				return ret;
-		}
-	}
+	return 0;
+}
 
-	return ret;
+int i915_ppgtt_init_ring(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
+
+	if (i915.enable_execlists)
+		return 0;
+
+	if (!ppgtt)
+		return 0;
+
+	return ppgtt->switch_mm(ppgtt, ring);
 }
+
 struct i915_hw_ppgtt *
 i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 0d46dd2..0caa9eb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -475,6 +475,7 @@ void i915_global_gtt_cleanup(struct drm_device *dev);
 
 int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt);
 int i915_ppgtt_init_hw(struct drm_device *dev);
+int i915_ppgtt_init_ring(struct intel_engine_cs *ring);
 void i915_ppgtt_release(struct kref *kref);
 struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_device *dev,
 					struct drm_i915_file_private *fpriv);
-- 
1.7.9.5


* [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-06-04 12:57     ` John Harrison
@ 2015-06-18 12:14       ` John.C.Harrison
  2015-06-18 12:21         ` Chris Wilson
  2015-06-18 16:36         ` 3.16 backlight kernel options Stéphane ANCELOT
  0 siblings, 2 replies; 120+ messages in thread
From: John.C.Harrison @ 2015-06-18 12:14 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the i915_gem_object_sync()
code path.

v2: Much more complex patch to share a single request between the sync and the
page flip. The _sync() function now supports lazy allocation of the request
structure. That is, if one is passed in then that will be used. If one is not,
then a request will be allocated and passed back out. Note that the _sync() code
does not necessarily require a request. Thus one will only be created until
certain situations. The reason the lazy allocation must be done within the
_sync() code itself is because the decision to need one or not is not really
something that code above can second guess (except in the case where one is
definitely not required because no ring is passed in).

The call chains above _sync() now support passing a request through, with most
callers passing in NULL and assuming that no request will be required (because
they also pass in NULL for the ring and therefore can't be generating any ring
code).

The exception is intel_crtc_page_flip(), which now supports having a request
returned from _sync(). If one is returned, that request is shared by the page flip
(if the page flip is of a type to need a request). If _sync() does not generate
a request but the page flip does need one, then the page flip path will create
its own request.

v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
Elf review request). Rebased onto newer tree that significantly changed the
synchronisation code.

v4: Updated comments from review feedback (Tomas Elf)

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
 drivers/gpu/drm/i915/i915_gem.c            |   48 +++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
 drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
 drivers/gpu/drm/i915/intel_drv.h           |    3 +-
 drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
 drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
 8 files changed, 57 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 64a10fa..f69e9cb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
-			 struct intel_engine_cs *to);
+			 struct intel_engine_cs *to,
+			 struct drm_i915_gem_request **to_req);
 void i915_vma_move_to_active(struct i915_vma *vma,
 			     struct intel_engine_cs *ring);
 int i915_gem_dumb_create(struct drm_file *file_priv,
@@ -2889,6 +2890,7 @@ int __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
 				     struct intel_engine_cs *pipelined,
+				     struct drm_i915_gem_request **pipelined_request,
 				     const struct i915_ggtt_view *view);
 void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj,
 					      const struct i915_ggtt_view *view);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e59369a..d7c7127 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3095,25 +3095,26 @@ out:
 static int
 __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 		       struct intel_engine_cs *to,
-		       struct drm_i915_gem_request *req)
+		       struct drm_i915_gem_request *from_req,
+		       struct drm_i915_gem_request **to_req)
 {
 	struct intel_engine_cs *from;
 	int ret;
 
-	from = i915_gem_request_get_ring(req);
+	from = i915_gem_request_get_ring(from_req);
 	if (to == from)
 		return 0;
 
-	if (i915_gem_request_completed(req, true))
+	if (i915_gem_request_completed(from_req, true))
 		return 0;
 
-	ret = i915_gem_check_olr(req);
+	ret = i915_gem_check_olr(from_req);
 	if (ret)
 		return ret;
 
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
 		struct drm_i915_private *i915 = to_i915(obj->base.dev);
-		ret = __i915_wait_request(req,
+		ret = __i915_wait_request(from_req,
 					  atomic_read(&i915->gpu_error.reset_counter),
 					  i915->mm.interruptible,
 					  NULL,
@@ -3121,15 +3122,23 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 		if (ret)
 			return ret;
 
-		i915_gem_object_retire_request(obj, req);
+		i915_gem_object_retire_request(obj, from_req);
 	} else {
 		int idx = intel_ring_sync_index(from, to);
-		u32 seqno = i915_gem_request_get_seqno(req);
+		u32 seqno = i915_gem_request_get_seqno(from_req);
+
+		WARN_ON(!to_req);
 
 		if (seqno <= from->semaphore.sync_seqno[idx])
 			return 0;
 
-		trace_i915_gem_ring_sync_to(from, to, req);
+		if (*to_req == NULL) {
+			ret = i915_gem_request_alloc(to, to->default_context, to_req);
+			if (ret)
+				return ret;
+		}
+
+		trace_i915_gem_ring_sync_to(from, to, from_req);
 		ret = to->semaphore.sync_to(to, from, seqno);
 		if (ret)
 			return ret;
@@ -3150,11 +3159,14 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
  *
  * @obj: object which may be in use on another ring.
  * @to: ring we wish to use the object on. May be NULL.
+ * @to_req: request we wish to use the object for. See below.
+ *          This will be allocated and returned if a request is
+ *          required but not passed in.
  *
  * This code is meant to abstract object synchronization with the GPU.
  * Calling with NULL implies synchronizing the object with the CPU
  * rather than a particular GPU ring. Conceptually we serialise writes
- * between engines inside the GPU. We only allow on engine to write
+ * between engines inside the GPU. We only allow one engine to write
  * into a buffer at any time, but multiple readers. To ensure each has
  * a coherent view of memory, we must:
  *
@@ -3165,11 +3177,22 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
  * - If we are a write request (pending_write_domain is set), the new
  *   request must wait for outstanding read requests to complete.
  *
+ * For CPU synchronisation (NULL to) no request is required. For syncing with
+ * rings to_req must be non-NULL. However, a request does not have to be
+ * pre-allocated. If *to_req is NULL and sync commands will be emitted then a
+ * request will be allocated automatically and returned through *to_req. Note
+ * that it is not guaranteed that commands will be emitted (because the system
+ * might already be idle). Hence there is no need to create a request that
+ * might never have any work submitted. Note further that if a request is
+ * returned in *to_req, it is the responsibility of the caller to submit
+ * that request (after potentially adding more work to it).
+ *
  * Returns 0 if successful, else propagates up the lower layer error.
  */
 int
 i915_gem_object_sync(struct drm_i915_gem_object *obj,
-		     struct intel_engine_cs *to)
+		     struct intel_engine_cs *to,
+		     struct drm_i915_gem_request **to_req)
 {
 	const bool readonly = obj->base.pending_write_domain == 0;
 	struct drm_i915_gem_request *req[I915_NUM_RINGS];
@@ -3191,7 +3214,7 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
 				req[n++] = obj->last_read_req[i];
 	}
 	for (i = 0; i < n; i++) {
-		ret = __i915_gem_object_sync(obj, to, req[i]);
+		ret = __i915_gem_object_sync(obj, to, req[i], to_req);
 		if (ret)
 			return ret;
 	}
@@ -4141,12 +4164,13 @@ int
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
 				     struct intel_engine_cs *pipelined,
+				     struct drm_i915_gem_request **pipelined_request,
 				     const struct i915_ggtt_view *view)
 {
 	u32 old_read_domains, old_write_domain;
 	int ret;
 
-	ret = i915_gem_object_sync(obj, pipelined);
+	ret = i915_gem_object_sync(obj, pipelined, pipelined_request);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 50b1ced..bea92ad 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -899,7 +899,7 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req,
 		struct drm_i915_gem_object *obj = vma->obj;
 
 		if (obj->active & other_rings) {
-			ret = i915_gem_object_sync(obj, req->ring);
+			ret = i915_gem_object_sync(obj, req->ring, &req);
 			if (ret)
 				return ret;
 		}
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 657a333..6528ada 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2338,7 +2338,8 @@ int
 intel_pin_and_fence_fb_obj(struct drm_plane *plane,
 			   struct drm_framebuffer *fb,
 			   const struct drm_plane_state *plane_state,
-			   struct intel_engine_cs *pipelined)
+			   struct intel_engine_cs *pipelined,
+			   struct drm_i915_gem_request **pipelined_request)
 {
 	struct drm_device *dev = fb->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -2403,7 +2404,7 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
 
 	dev_priv->mm.interruptible = false;
 	ret = i915_gem_object_pin_to_display_plane(obj, alignment, pipelined,
-						   &view);
+						   pipelined_request, &view);
 	if (ret)
 		goto err_interruptible;
 
@@ -11119,6 +11120,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	struct intel_unpin_work *work;
 	struct intel_engine_cs *ring;
 	bool mmio_flip;
+	struct drm_i915_gem_request *request = NULL;
 	int ret;
 
 	/*
@@ -11225,7 +11227,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	 */
 	ret = intel_pin_and_fence_fb_obj(crtc->primary, fb,
 					 crtc->primary->state,
-					 mmio_flip ? i915_gem_request_get_ring(obj->last_write_req) : ring);
+					 mmio_flip ? i915_gem_request_get_ring(obj->last_write_req) : ring, &request);
 	if (ret)
 		goto cleanup_pending;
 
@@ -11256,6 +11258,9 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 					intel_ring_get_request(ring));
 	}
 
+	if (request)
+		i915_add_request_no_flush(request->ring);
+
 	work->flip_queued_vblank = drm_crtc_vblank_count(crtc);
 	work->enable_stall_check = true;
 
@@ -11273,6 +11278,8 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 cleanup_unpin:
 	intel_unpin_fb_obj(fb, crtc->primary->state);
 cleanup_pending:
+	if (request)
+		i915_gem_request_cancel(request);
 	atomic_dec(&intel_crtc->unpin_work_count);
 	mutex_unlock(&dev->struct_mutex);
 cleanup:
@@ -13171,7 +13178,7 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 		if (ret)
 			DRM_DEBUG_KMS("failed to attach phys object\n");
 	} else {
-		ret = intel_pin_and_fence_fb_obj(plane, fb, new_state, NULL);
+		ret = intel_pin_and_fence_fb_obj(plane, fb, new_state, NULL, NULL);
 	}
 
 	if (ret == 0)
@@ -15218,7 +15225,7 @@ void intel_modeset_gem_init(struct drm_device *dev)
 		ret = intel_pin_and_fence_fb_obj(c->primary,
 						 c->primary->fb,
 						 c->primary->state,
-						 NULL);
+						 NULL, NULL);
 		mutex_unlock(&dev->struct_mutex);
 		if (ret) {
 			DRM_ERROR("failed to pin boot fb on pipe %d\n",
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 02d8317..73650ae 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1034,7 +1034,8 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
 int intel_pin_and_fence_fb_obj(struct drm_plane *plane,
 			       struct drm_framebuffer *fb,
 			       const struct drm_plane_state *plane_state,
-			       struct intel_engine_cs *pipelined);
+			       struct intel_engine_cs *pipelined,
+			       struct drm_i915_gem_request **pipelined_request);
 struct drm_framebuffer *
 __intel_framebuffer_create(struct drm_device *dev,
 			   struct drm_mode_fb_cmd2 *mode_cmd,
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 4e7e7da..dd9f3b2 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -151,7 +151,7 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 	}
 
 	/* Flush everything out, we'll be doing GTT only from now on */
-	ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL, NULL);
+	ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL, NULL, NULL);
 	if (ret) {
 		DRM_ERROR("failed to pin obj: %d\n", ret);
 		goto out_fb;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1d9d248..c29d6c5 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -638,7 +638,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
 		struct drm_i915_gem_object *obj = vma->obj;
 
 		if (obj->active & other_rings) {
-			ret = i915_gem_object_sync(obj, req->ring);
+			ret = i915_gem_object_sync(obj, req->ring, &req);
 			if (ret)
 				return ret;
 		}
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index e7534b9..0f8187a 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -724,7 +724,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 	if (ret != 0)
 		return ret;
 
-	ret = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL,
+	ret = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL, NULL,
 						   &i915_ggtt_view_normal);
 	if (ret != 0)
 		return ret;
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* Re: [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-06-18 12:14       ` John.C.Harrison
@ 2015-06-18 12:21         ` Chris Wilson
  2015-06-18 12:59           ` John Harrison
  2015-06-18 16:36         ` 3.16 backlight kernel options Stéphane ANCELOT
  1 sibling, 1 reply; 120+ messages in thread
From: Chris Wilson @ 2015-06-18 12:21 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX

On Thu, Jun 18, 2015 at 01:14:56PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The plan is to pass requests around as the basic submission tracking structure
> rather than rings and contexts. This patch updates the i915_gem_object_sync()
> code path.
> 
> v2: Much more complex patch to share a single request between the sync and the
> page flip. The _sync() function now supports lazy allocation of the request
> structure. That is, if one is passed in then that will be used. If one is not,
> then a request will be allocated and passed back out. Note that the _sync() code
> does not necessarily require a request. Thus one will only be created in
> certain situations. The reason the lazy allocation must be done within the
> _sync() code itself is because the decision to need one or not is not really
> something that code above can second guess (except in the case where one is
> definitely not required because no ring is passed in).
> 
> The call chains above _sync() now support passing a request through, with most
> callers passing in NULL and assuming that no request will be required (because
> they also pass in NULL for the ring and therefore can't be generating any ring
> code).
> 
> The exception is intel_crtc_page_flip() which now supports having a request
> returned from _sync(). If one is, then that request is shared by the page flip
> (if the page flip is of a type to need a request). If _sync() does not generate
> a request but the page flip does need one, then the page flip path will create
> its own request.
> 
> v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
> Elf review request). Rebased onto newer tree that significantly changed the
> synchronisation code.
> 
> v4: Updated comments from review feedback (Tomas Elf)
> 
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Reviewed-by: Tomas Elf <tomas.elf@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
>  drivers/gpu/drm/i915/i915_gem.c            |   48 +++++++++++++++++++++-------
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
>  drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
>  drivers/gpu/drm/i915/intel_drv.h           |    3 +-
>  drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
>  drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
>  drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
>  8 files changed, 57 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 64a10fa..f69e9cb 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>  
>  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> -			 struct intel_engine_cs *to);
> +			 struct intel_engine_cs *to,
> +			 struct drm_i915_gem_request **to_req);

Nope. Did you forget to reorder the code to ensure that the request is
allocated along with the context switch at the start of execbuf?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-06-18 12:21         ` Chris Wilson
@ 2015-06-18 12:59           ` John Harrison
  2015-06-18 14:24             ` Daniel Vetter
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-18 12:59 UTC (permalink / raw)
  To: Chris Wilson, Intel-GFX

On 18/06/2015 13:21, Chris Wilson wrote:
> On Thu, Jun 18, 2015 at 01:14:56PM +0100, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The plan is to pass requests around as the basic submission tracking structure
>> rather than rings and contexts. This patch updates the i915_gem_object_sync()
>> code path.
>>
>> v2: Much more complex patch to share a single request between the sync and the
>> page flip. The _sync() function now supports lazy allocation of the request
>> structure. That is, if one is passed in then that will be used. If one is not,
>> then a request will be allocated and passed back out. Note that the _sync() code
>> does not necessarily require a request. Thus one will only be created in
>> certain situations. The reason the lazy allocation must be done within the
>> _sync() code itself is because the decision to need one or not is not really
>> something that code above can second guess (except in the case where one is
>> definitely not required because no ring is passed in).
>>
>> The call chains above _sync() now support passing a request through, with most
>> callers passing in NULL and assuming that no request will be required (because
>> they also pass in NULL for the ring and therefore can't be generating any ring
>> code).
>>
>> The exception is intel_crtc_page_flip() which now supports having a request
>> returned from _sync(). If one is, then that request is shared by the page flip
>> (if the page flip is of a type to need a request). If _sync() does not generate
>> a request but the page flip does need one, then the page flip path will create
>> its own request.
>>
>> v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
>> Elf review request). Rebased onto newer tree that significantly changed the
>> synchronisation code.
>>
>> v4: Updated comments from review feedback (Tomas Elf)
>>
>> For: VIZ-5115
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> Reviewed-by: Tomas Elf <tomas.elf@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
>>   drivers/gpu/drm/i915/i915_gem.c            |   48 +++++++++++++++++++++-------
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
>>   drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
>>   drivers/gpu/drm/i915/intel_drv.h           |    3 +-
>>   drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
>>   drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
>>   drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
>>   8 files changed, 57 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 64a10fa..f69e9cb 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>>   
>>   int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>>   int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>> -			 struct intel_engine_cs *to);
>> +			 struct intel_engine_cs *to,
>> +			 struct drm_i915_gem_request **to_req);
> Nope. Did you forget to reorder the code to ensure that the request is
> allocated along with the context switch at the start of execbuf?
> -Chris
>
Not sure what you are objecting to? If you mean the lazily allocated 
request then that is for page flip code not execbuff code. If we get 
here from an execbuff call then the request will definitely have been 
allocated and will be passed in. Whereas the page flip code may or may 
not require a request (depending on whether MMIO or ring flips are in 
use. Likewise the sync code may or may not require a request (depending 
on whether there is anything to sync to or not). There is no point 
allocating and submitting an empty request in the MMIO/idle case. Hence 
the sync code needs to be able to use an existing request or create one 
if none already exists.
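
[For reference, the lazy-allocation step being described is visible in the
__i915_gem_object_sync() hunk near the top of this thread; a minimal sketch
of that pattern, with everything but the check-and-allocate elided:]

	/* Semaphore commands are about to be emitted, so a request is
	 * needed. Use the caller's request if one was passed in;
	 * otherwise allocate one and hand it back out via *to_req. */
	if (*to_req == NULL) {
		ret = i915_gem_request_alloc(to, to->default_context, to_req);
		if (ret)
			return ret;
	}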

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation
  2015-06-18 11:21           ` John Harrison
@ 2015-06-18 13:29             ` Daniel Vetter
  2015-06-19 16:34               ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-18 13:29 UTC (permalink / raw)
  To: John Harrison; +Cc: intel-gfx

On Thu, Jun 18, 2015 at 1:21 PM, John Harrison
<John.C.Harrison@intel.com> wrote:
> I'm still confused by what you are saying in the above referenced email.
> Part of it is about the sanity checks failing to handle the wrapping case
> correctly which has been fixed in the base reserve space patch (patch 2 in
> the series). The rest is either saying that you think we are potentially
> wrapping too early and wasting a few bytes of the ring buffer or that
> something is actually broken?

Yeah I didn't realize that this change was meant to fix the
ring->reserved_tail check since I didn't make that connection. It is
correct with that change, but the problem I see is that the
correctness of that debug aid isn't assured locally: no, we need both
that check _and_ the correct handling of the reservation tracking at
wrap-around. If the check just handles wrapping it'll robustly stay in
working shape even when the wrapping behaviour changes.

> Point 2: 100 bytes of reserve, 160 bytes of execbuf and 200 bytes remaining.
> You seem to think this will fail somehow? Why? The wait_for_space(160) in
> the execbuf code will cause a wrap because the 100 bytes for the
> add_request reservation is added on and the wait is actually being done for
> 260 bytes. So yes, we wrap earlier than would otherwise have been necessary
> but that is the only way to absolutely guarantee that the add_request() call
> cannot fail when trying to do the wrap itself.
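
[To make the arithmetic above concrete: a standalone sketch of the sizing
logic from the __intel_ring_prepare() hunk in patch 2, using the numbers in
John's example. The ring size is illustrative; only the max_bytes handling
mirrors the patch:]

	#include <stdio.h>

	#define I915_RING_FREE_SPACE 64

	int main(void)
	{
		int size = 32768;
		int effective_size = size - I915_RING_FREE_SPACE;
		int tail = effective_size - 200;	/* 200 bytes left at the tail */
		int reserved_size = 100;		/* add_request reservation */
		int bytes = 160;			/* execbuf commands */
		int max_bytes = bytes + reserved_size;	/* waits for 260, not 160 */

		/* 260 needed but only 200 left before the end: take the
		 * wrap path and wait for the whole tail plus the reserve. */
		if (tail + max_bytes > effective_size)
			max_bytes = reserved_size + I915_RING_FREE_SPACE
				  + size - tail;

		printf("wait_for_space(%d)\n", max_bytes);
		return 0;
	}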

There's no problem except that it's wasteful. And I tried to explain
that unconditionally force-wrapping for the entire reservation is
actually not needed, since the additional space needed to account for
the eventual wrapping is bounded by a factor of 2. It's much less in
practice since we split up the final request bits into multiple
smaller intel_ring_begin. And it feels a bit wasteful to throw that
space away (and make the gpu eat through MI_NOP) just because it makes
caring for the worst-case harder. And with GuC the 160 dwords is
actually a fairly substantial part of the ring.

Even more so when we completely switch to a transaction model for
requests, where we only need to wrap for individual commands and hence
could place intel_ring_begin per-cmd (which is mostly what we do
already anyway).

> As Chris says, if the driver is attempting to create a single request that
> fills the entire ringbuffer then that is a bug that should be caught as soon
> as possible. Even with a GuC, the ring buffer is not small compared to the
> size of requests the driver currently produces. Part of the scheduler work
> is to limit the number of batch buffers that a given application/context can
> have outstanding in the ring buffer at any given time in order to prevent
> starvation of the rest of the system by one badly behaved app. Thus
> completely filling a large ring buffer becomes impossible anyway - the
> application will be blocked before it gets that far.

My proposal for this reservation wrapping business would have been:
- Increase the reservation by 31 dwords (to account for the worst-case
wrap in pc_render_add_request).
- Rework the reservation overflow WARN_ON in reserve_space_end to work
correctly even when wrapping while the reservation has been in use.
- Move the addition of reserved_space below the point where we wrap
the ring and only check against total free space, neglecting wrapping.
- Remove all other complications you've added.

Result is no forced wrapping for reservation and a debug check which
should even survive random changes by monkeys since the logic for that
check is fully contained within reserve_space_end. And for the check
we should be able to reuse __intel_free_space.
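
[A sketch of what a wrap-proof version of that check could look like, using
a modular distance in the style of the driver's free-space helpers; this is
illustrative, not the reworked patch itself:]

	/* Bytes consumed since intel_ring_reserved_space_use() recorded
	 * reserved_tail: the distance from reserved_tail to tail modulo
	 * the ring size, which stays valid across a wrap. Any MI_NOOP pad
	 * emitted by a wrap inside the span is counted too, so the
	 * reservation has to budget for it. */
	static int reserved_space_used(struct intel_ringbuffer *ringbuf)
	{
		return (ringbuf->tail - ringbuf->reserved_tail + ringbuf->size)
			% ringbuf->size;
	}

	/* ...and intel_ring_reserved_space_end() would then do:
	 *	WARN(reserved_space_used(ringbuf) > ringbuf->reserved_size, ...);
	 */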

If I'm reading things correctly this shouldn't have any effect outside
of patch 2 and shouldn't cause any conflicts.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-06-18 12:59           ` John Harrison
@ 2015-06-18 14:24             ` Daniel Vetter
  2015-06-18 15:39               ` Chris Wilson
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-18 14:24 UTC (permalink / raw)
  To: John Harrison; +Cc: Intel-GFX

On Thu, Jun 18, 2015 at 01:59:13PM +0100, John Harrison wrote:
> On 18/06/2015 13:21, Chris Wilson wrote:
> >On Thu, Jun 18, 2015 at 01:14:56PM +0100, John.C.Harrison@Intel.com wrote:
> >>From: John Harrison <John.C.Harrison@Intel.com>
> >>
> >>The plan is to pass requests around as the basic submission tracking structure
> >>rather than rings and contexts. This patch updates the i915_gem_object_sync()
> >>code path.
> >>
> >>v2: Much more complex patch to share a single request between the sync and the
> >>page flip. The _sync() function now supports lazy allocation of the request
> >>structure. That is, if one is passed in then that will be used. If one is not,
> >>then a request will be allocated and passed back out. Note that the _sync() code
> >>does not necessarily require a request. Thus one will only be created in
> >>certain situations. The reason the lazy allocation must be done within the
> >>_sync() code itself is because the decision to need one or not is not really
> >>something that code above can second guess (except in the case where one is
> >>definitely not required because no ring is passed in).
> >>
> >>The call chains above _sync() now support passing a request through, with most
> >>callers passing in NULL and assuming that no request will be required (because
> >>they also pass in NULL for the ring and therefore can't be generating any ring
> >>code).
> >>
> >>The exception is intel_crtc_page_flip() which now supports having a request
> >>returned from _sync(). If one is, then that request is shared by the page flip
> >>(if the page flip is of a type to need a request). If _sync() does not generate
> >>a request but the page flip does need one, then the page flip path will create
> >>its own request.
> >>
> >>v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
> >>Elf review request). Rebased onto newer tree that significantly changed the
> >>synchronisation code.
> >>
> >>v4: Updated comments from review feedback (Tomas Elf)
> >>
> >>For: VIZ-5115
> >>Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> >>Reviewed-by: Tomas Elf <tomas.elf@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
> >>  drivers/gpu/drm/i915/i915_gem.c            |   48 +++++++++++++++++++++-------
> >>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
> >>  drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
> >>  drivers/gpu/drm/i915/intel_drv.h           |    3 +-
> >>  drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
> >>  drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
> >>  drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
> >>  8 files changed, 57 insertions(+), 23 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>index 64a10fa..f69e9cb 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>@@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
> >>  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> >>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> >>-			 struct intel_engine_cs *to);
> >>+			 struct intel_engine_cs *to,
> >>+			 struct drm_i915_gem_request **to_req);
> >Nope. Did you forget to reorder the code to ensure that the request is
> >allocated along with the context switch at the start of execbuf?
> >-Chris
> >
> Not sure what you are objecting to? If you mean the lazily allocated request
> then that is for page flip code not execbuff code. If we get here from an
> execbuff call then the request will definitely have been allocated and will
> be passed in. Whereas the page flip code may or may not require a request
> (depending on whether MMIO or ring flips are in use). Likewise the sync code
> may or may not require a request (depending on whether there is anything to
> sync to or not). There is no point allocating and submitting an empty
> request in the MMIO/idle case. Hence the sync code needs to be able to use
> an existing request or create one if none already exists.

I guess Chris' comment was that if you have a non-NULL to, then you better
have a non-NULL to_req. And since we link up requests to the engine
they'll run on, the former shouldn't be required any more. So either that's
true and we can remove the to or we don't understand something yet (and
perhaps that should be done as a follow-up).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-06-18 14:24             ` Daniel Vetter
@ 2015-06-18 15:39               ` Chris Wilson
  2015-06-18 16:16                 ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Chris Wilson @ 2015-06-18 15:39 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On Thu, Jun 18, 2015 at 04:24:53PM +0200, Daniel Vetter wrote:
> On Thu, Jun 18, 2015 at 01:59:13PM +0100, John Harrison wrote:
> > On 18/06/2015 13:21, Chris Wilson wrote:
> > >On Thu, Jun 18, 2015 at 01:14:56PM +0100, John.C.Harrison@Intel.com wrote:
> > >>From: John Harrison <John.C.Harrison@Intel.com>
> > >>
> > >>The plan is to pass requests around as the basic submission tracking structure
> > >>rather than rings and contexts. This patch updates the i915_gem_object_sync()
> > >>code path.
> > >>
> > >>v2: Much more complex patch to share a single request between the sync and the
> > >>page flip. The _sync() function now supports lazy allocation of the request
> > >>structure. That is, if one is passed in then that will be used. If one is not,
> > >>then a request will be allocated and passed back out. Note that the _sync() code
> > >>does not necessarily require a request. Thus one will only be created in
> > >>certain situations. The reason the lazy allocation must be done within the
> > >>_sync() code itself is because the decision to need one or not is not really
> > >>something that code above can second guess (except in the case where one is
> > >>definitely not required because no ring is passed in).
> > >>
> > >>The call chains above _sync() now support passing a request through, with most
> > >>callers passing in NULL and assuming that no request will be required (because
> > >>they also pass in NULL for the ring and therefore can't be generating any ring
> > >>code).
> > >>
> > >>The exception is intel_crtc_page_flip() which now supports having a request
> > >>returned from _sync(). If one is, then that request is shared by the page flip
> > >>(if the page flip is of a type to need a request). If _sync() does not generate
> > >>a request but the page flip does need one, then the page flip path will create
> > >>its own request.
> > >>
> > >>v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
> > >>Elf review request). Rebased onto newer tree that significantly changed the
> > >>synchronisation code.
> > >>
> > >>v4: Updated comments from review feedback (Tomas Elf)
> > >>
> > >>For: VIZ-5115
> > >>Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> > >>Reviewed-by: Tomas Elf <tomas.elf@intel.com>
> > >>---
> > >>  drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
> > >>  drivers/gpu/drm/i915/i915_gem.c            |   48 +++++++++++++++++++++-------
> > >>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
> > >>  drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
> > >>  drivers/gpu/drm/i915/intel_drv.h           |    3 +-
> > >>  drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
> > >>  drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
> > >>  drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
> > >>  8 files changed, 57 insertions(+), 23 deletions(-)
> > >>
> > >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > >>index 64a10fa..f69e9cb 100644
> > >>--- a/drivers/gpu/drm/i915/i915_drv.h
> > >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> > >>@@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
> > >>  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> > >>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> > >>-			 struct intel_engine_cs *to);
> > >>+			 struct intel_engine_cs *to,
> > >>+			 struct drm_i915_gem_request **to_req);
> > >Nope. Did you forget to reorder the code to ensure that the request is
> > >allocated along with the context switch at the start of execbuf?
> > >-Chris
> > >
> > Not sure what you are objecting to? If you mean the lazily allocated request
> > then that is for page flip code not execbuff code. If we get here from an
> > execbuff call then the request will definitely have been allocated and will
> > be passed in. Whereas the page flip code may or may not require a request
> > (depending on whether MMIO or ring flips are in use). Likewise the sync code
> > may or may not require a request (depending on whether there is anything to
> > sync to or not). There is no point allocating and submitting an empty
> > request in the MMIO/idle case. Hence the sync code needs to be able to use
> > an existing request or create one if none already exists.
> 
> I guess Chris' comment was that if you have a non-NULL to, then you better
> have a non-NULL to_req. And since we link up requests to the engine
> they'll run on, the former shouldn't be required any more. So either that's
> true and we can remove the to or we don't understand something yet (and
> perhaps that should be done as a follow-up).

I am sure I sent a patch that outlined in great detail how we need
only the request parameter in i915_gem_object_sync(), for handling both
execbuffer, pipelined pin_and_fence and synchronous pin_and_fence.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-06-18 15:39               ` Chris Wilson
@ 2015-06-18 16:16                 ` John Harrison
  2015-06-22 20:03                   ` Daniel Vetter
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-18 16:16 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Intel-GFX

On 18/06/2015 16:39, Chris Wilson wrote:
> On Thu, Jun 18, 2015 at 04:24:53PM +0200, Daniel Vetter wrote:
>> On Thu, Jun 18, 2015 at 01:59:13PM +0100, John Harrison wrote:
>>> On 18/06/2015 13:21, Chris Wilson wrote:
>>>> On Thu, Jun 18, 2015 at 01:14:56PM +0100, John.C.Harrison@Intel.com wrote:
>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>
>>>>> The plan is to pass requests around as the basic submission tracking structure
>>>>> rather than rings and contexts. This patch updates the i915_gem_object_sync()
>>>>> code path.
>>>>>
>>>>> v2: Much more complex patch to share a single request between the sync and the
>>>>> page flip. The _sync() function now supports lazy allocation of the request
>>>>> structure. That is, if one is passed in then that will be used. If one is not,
>>>>> then a request will be allocated and passed back out. Note that the _sync() code
>>>>> does not necessarily require a request. Thus one will only be created in
>>>>> certain situations. The reason the lazy allocation must be done within the
>>>>> _sync() code itself is because the decision to need one or not is not really
>>>>> something that code above can second guess (except in the case where one is
>>>>> definitely not required because no ring is passed in).
>>>>>
>>>>> The call chains above _sync() now support passing a request through, with most
>>>>> callers passing in NULL and assuming that no request will be required (because
>>>>> they also pass in NULL for the ring and therefore can't be generating any ring
>>>>> code).
>>>>>
>>>>> The exception is intel_crtc_page_flip() which now supports having a request
>>>>> returned from _sync(). If one is, then that request is shared by the page flip
>>>>> (if the page flip is of a type to need a request). If _sync() does not generate
>>>>> a request but the page flip does need one, then the page flip path will create
>>>>> its own request.
>>>>>
>>>>> v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
>>>>> Elf review request). Rebased onto newer tree that significantly changed the
>>>>> synchronisation code.
>>>>>
>>>>> v4: Updated comments from review feedback (Tomas Elf)
>>>>>
>>>>> For: VIZ-5115
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>> Reviewed-by: Tomas Elf <tomas.elf@intel.com>
>>>>> ---
>>>>>   drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
>>>>>   drivers/gpu/drm/i915/i915_gem.c            |   48 +++++++++++++++++++++-------
>>>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
>>>>>   drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
>>>>>   drivers/gpu/drm/i915/intel_drv.h           |    3 +-
>>>>>   drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
>>>>>   drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
>>>>>   drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
>>>>>   8 files changed, 57 insertions(+), 23 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index 64a10fa..f69e9cb 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>>>>>   int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>>>>>   int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>>>>> -			 struct intel_engine_cs *to);
>>>>> +			 struct intel_engine_cs *to,
>>>>> +			 struct drm_i915_gem_request **to_req);
>>>> Nope. Did you forget to reorder the code to ensure that the request is
>>>> allocated along with the context switch at the start of execbuf?
>>>> -Chris
>>>>
>>> Not sure what you are objecting to? If you mean the lazily allocated request
>>> then that is for page flip code not execbuff code. If we get here from an
>>> execbuff call then the request will definitely have been allocated and will
>>> be passed in. Whereas the page flip code may or may not require a request
>>> (depending on whether MMIO or ring flips are in use). Likewise the sync code
>>> may or may not require a request (depending on whether there is anything to
>>> sync to or not). There is no point allocating and submitting an empty
>>> request in the MMIO/idle case. Hence the sync code needs to be able to use
>>> an existing request or create one if none already exists.
>> I guess Chris' comment was that if you have a non-NULL to, then you better
>> have a non-NULL to_req. And since we link up requests to the engine
>> they'll run on, the former shouldn't be required any more. So either that's
>> true and we can remove the to or we don't understand something yet (and
>> perhaps that should be done as a follow-up).
> I am sure I sent a patch that outlined in great detail how that we need
> only the request parameter in i915_gem_object_sync(), for handling both
> execbuffer, pipelined pin_and_fence and synchronous pin_and_fence.
> -Chris
>

As the driver stands, the page flip code wants to synchronise with the 
framebuffer object but potentially without touching the ring and 
therefore without creating a request. If the synchronisation is a no-op 
(because there are no outstanding operations on the given object) then 
there is no need for a request anywhere in the call chain. Thus there is 
a need to pass in the ring together with an optional request and to be 
able to pass out a request that has been created internally.

 >  if you have a non-NULL to, then you better have a non-NULL to_req

I assume you mean 'a non-NULL *to_req'?

No, that is the whole point. If you have a non-null '*to_req' then 'to' 
must be non-null (and specifically must be the ring that '*to_req' is 
referencing). However, it is valid to have a non-null 'to' and a null 
'*to_req'.  In the case of MMIO flips, the page flip itself does not 
require a request as it does not go through the ring. However, it still 
passes in 'i915_gem_request_get_ring(obj->last_write_req)' as the ring 
to synchronise to. Thus it is potentially passing in a valid to pointer 
but without wanting to pre-allocate a request object. If the 
synchronisation requires writing a semaphore to the ring then a request 
will be created internally and passed back out for the page flip code to 
submit (and to re-use in the case of non-MMIO flips). But if the 
synchronisation is a no-op then no request ever gets created or 
submitted and nothing touches the ring at all.
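
[The resulting call pattern in intel_crtc_page_flip(), condensed from the
intel_display.c hunks earlier in this thread; error handling other than the
cancel is elided:]

	struct drm_i915_gem_request *request = NULL;

	/* May allocate into 'request' if the sync emits semaphore
	 * commands; a no-op sync leaves it NULL. */
	ret = intel_pin_and_fence_fb_obj(crtc->primary, fb,
					 crtc->primary->state,
					 mmio_flip ? i915_gem_request_get_ring(obj->last_write_req)
						   : ring,
					 &request);
	if (ret)
		goto cleanup_pending;

	/* ... queue the flip, sharing 'request' for ring-based flips ... */

	if (request)	/* submit only if something was emitted */
		i915_add_request_no_flush(request->ring);
	...
	cleanup_pending:
	if (request)	/* abandoned: release the ring-space reservation */
		i915_gem_request_cancel(request);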

John.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* 3.16 backlight kernel options
  2015-06-18 12:14       ` John.C.Harrison
  2015-06-18 12:21         ` Chris Wilson
@ 2015-06-18 16:36         ` Stéphane ANCELOT
  1 sibling, 0 replies; 120+ messages in thread
From: Stéphane ANCELOT @ 2015-06-18 16:36 UTC (permalink / raw)
  To: Intel-GFX

Hi,

Which option in the Linux kernel is required to be able to control the
brightness of the display?

Regards,
Steph

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 120+ messages in thread

* [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-05-29 16:43 ` [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands John.C.Harrison
  2015-06-02 18:14   ` Tomas Elf
  2015-06-04 12:06   ` John.C.Harrison
@ 2015-06-19 16:34   ` John.C.Harrison
  2015-06-22 20:12     ` Daniel Vetter
  2 siblings, 1 reply; 120+ messages in thread
From: John.C.Harrison @ 2015-06-19 16:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

It is a bad idea for i915_add_request() to fail. The work will already have been
sent to the ring and will be processed, but there will not be any tracking or
management of that work.

The only way the add request call can fail is if it can't write its epilogue
commands to the ring (cache flushing, seqno updates, interrupt signalling). The
reasons for that are mostly down to running out of ring buffer space and the
problems associated with trying to get some more. This patch prevents that
situation from happening in the first place.

When a request is created, it marks sufficient space as reserved for the
epilogue commands. Thus guaranteeing that by the time the epilogue is written,
there will be plenty of space for it. Note that a ring_begin() call is required
to actually reserve the space (and do any potential waiting). However, that is
not currently done at request creation time. This is because the ring_begin()
code can allocate a request. Hence calling begin() from the request allocation
code would lead to infinite recursion! Later patches in this series remove the
need for begin() to do the allocate. At that point, it becomes safe for the
allocate to call begin() and really reserve the space.

Until then, there is a potential for insufficient space to be available at the
point of calling i915_add_request(). However, that would only be in the case
where the request was created and immediately submitted without ever calling
ring_begin() and adding any work to that request. Which should never happen. And
even if it does, and if that request happens to fall into the tiny window of
opportunity for failing due to being out of ring space, then does it really
matter, because the request wasn't doing anything in the first place?
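
[For orientation, the intended lifecycle of the reservation as implemented
by the hunks below; a call-flow summary, not new API:]

	i915_gem_request_alloc()
	    intel_ring_reserved_space_reserve(ringbuf, MIN_SPACE_FOR_ADD_REQUEST);

	__i915_add_request()
	    intel_ring_reserved_space_use(ringbuf);	/* start using the reserve */
	    /* ... emit flushes, seqno write, interrupt commands ... */
	    intel_ring_reserved_space_end(ringbuf);	/* sanity-check the usage */

	i915_gem_request_cancel()			/* request abandoned */
	    intel_ring_reserved_space_cancel(ringbuf);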

v2: Updated the 'reserved space too small' warning to include the offending
sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
re-initialisation of tracking state after a buffer wrap to keep the sanity
checks accurate.

v3: Incremented the reserved size to accommodate Ironlake (after finally
managing to run on an ILK system). Also fixed missing wrap code in LRC mode.

v4: Added extra comment and removed duplicate WARN (feedback from Tomas).

v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
Daniel Vetter).

For: VIZ-5115
CC: Tomas Elf <tomas.elf@intel.com>
CC: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    1 +
 drivers/gpu/drm/i915/i915_gem.c         |   37 ++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        |   35 +++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.c |   98 +++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |   25 ++++++++
 5 files changed, 186 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0347eb9..eba1857 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
 
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct intel_context *ctx);
+void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 
 static inline uint32_t
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 81f3512..85fa27b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	} else
 		ringbuf = ring->buffer;
 
+	/*
+	 * To ensure that this call will not fail, space for its emissions
+	 * should already have been reserved in the ring buffer. Let the ring
+	 * know that it is time to use that space up.
+	 */
+	intel_ring_reserved_space_use(ringbuf);
+
 	request_start = intel_ring_get_tail(ringbuf);
 	/*
 	 * Emit any outstanding flushes - execbuf can fail to emit the flush
@@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
 			   round_jiffies_up_relative(HZ));
 	intel_mark_busy(dev_priv->dev);
 
+	/* Sanity check that the reserved size was large enough. */
+	intel_ring_reserved_space_end(ringbuf);
+
 	return 0;
 }
 
@@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	if (ret)
 		goto err;
 
+	/*
+	 * Reserve space in the ring buffer for all the commands required to
+	 * eventually emit this request. This is to guarantee that the
+	 * i915_add_request() call can't fail. Note that the reserve may need
+	 * to be redone if the request is not actually submitted straight
+	 * away, e.g. because a GPU scheduler has deferred it.
+	 *
+	 * Note further that this call merely notes the reserve request. A
+	 * subsequent call to *_ring_begin() is required to actually ensure
+	 * that the reservation is available. Without the begin, if the
+	 * request creator immediately submitted the request without adding
+	 * any commands to it then there might not actually be sufficient
+	 * room for the submission commands. Unfortunately, the current
+	 * *_ring_begin() implementations potentially call back here to
+	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
+	 * infinite recursion! Until that back call path is removed, it is
+	 * necessary to do a manual _begin() outside.
+	 */
+	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
+
 	ring->outstanding_lazy_request = req;
 	return 0;
 
@@ -2674,6 +2704,13 @@ err:
 	return ret;
 }
 
+void i915_gem_request_cancel(struct drm_i915_gem_request *req)
+{
+	intel_ring_reserved_space_cancel(req->ringbuf);
+
+	i915_gem_request_unreference(req);
+}
+
 struct drm_i915_gem_request *
 i915_gem_find_active_request(struct intel_engine_cs *ring)
 {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6a5ed07..bd62bd6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -690,6 +690,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
 	if (intel_ring_space(ringbuf) >= bytes)
 		return 0;
 
+	/* The whole point of reserving space is to not wait! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	list_for_each_entry(request, &ring->request_list, list) {
 		/*
 		 * The request queue is per-engine, so can contain requests
@@ -748,8 +751,12 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
 	int rem = ringbuf->size - ringbuf->tail;
 
 	if (ringbuf->space < rem) {
-		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
+		int ret;
+
+		/* Can't wait if space has already been reserved! */
+		WARN_ON(ringbuf->reserved_in_use);
 
+		ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
 		if (ret)
 			return ret;
 	}
@@ -768,7 +775,7 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
 static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
 				struct intel_context *ctx, int bytes)
 {
-	int ret;
+	int ret, max_bytes;
 
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
 		ret = logical_ring_wrap_buffer(ringbuf, ctx);
@@ -776,8 +783,28 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
 			return ret;
 	}
 
-	if (unlikely(ringbuf->space < bytes)) {
-		ret = logical_ring_wait_for_space(ringbuf, ctx, bytes);
+	/*
+	 * Add on the reserved size to the request to make sure that after
+	 * the intended commands have been emitted, there is guaranteed to
+	 * still be enough free space to send them to the hardware.
+	 */
+	max_bytes = bytes + ringbuf->reserved_size;
+
+	if (unlikely(ringbuf->space < max_bytes)) {
+		/*
+		 * Bytes is guaranteed to fit within the tail of the buffer,
+		 * but the reserved space may push it off the end. If so then
+		 * need to wait for the whole of the tail plus the reserved
+		 * size. That should guarantee that the actual request
+		 * (bytes) will fit between here and the end and the reserved
+		 * usage will fit either in the same or at the start. Either
+		 * way, if a wrap occurs it will not involve a wait and thus
+		 * cannot fail.
+		 */
+		if (unlikely(ringbuf->tail + max_bytes + I915_RING_FREE_SPACE > ringbuf->effective_size))
+			max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
+
+		ret = logical_ring_wait_for_space(ringbuf, ctx, max_bytes);
 		if (unlikely(ret))
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d934f85..1c125e9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2106,6 +2106,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
 	if (intel_ring_space(ringbuf) >= n)
 		return 0;
 
+	/* The whole point of reserving space is to not wait! */
+	WARN_ON(ringbuf->reserved_in_use);
+
 	list_for_each_entry(request, &ring->request_list, list) {
 		space = __intel_ring_space(request->postfix, ringbuf->tail,
 					   ringbuf->size);
@@ -2131,7 +2134,12 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
 	int rem = ringbuf->size - ringbuf->tail;
 
 	if (ringbuf->space < rem) {
-		int ret = ring_wait_for_space(ring, rem);
+		int ret;
+
+		/* Can't wait if space has already been reserved! */
+		WARN_ON(ringbuf->reserved_in_use);
+
+		ret = ring_wait_for_space(ring, rem);
 		if (ret)
 			return ret;
 	}
@@ -2180,11 +2188,69 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
 	return 0;
 }
 
-static int __intel_ring_prepare(struct intel_engine_cs *ring,
-				int bytes)
+void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
+{
+	/* NB: Until request management is fully tidied up and the OLR is
+	 * removed, there are too many ways to get false hits on this
+	 * anti-recursion check! */
+	/*WARN_ON(ringbuf->reserved_size);*/
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_size = size;
+
+	/*
+	 * Really need to call _begin() here but that currently leads to
+	 * recursion problems! This will be fixed later but for now just
+	 * return and hope for the best. Note that there is only a real
+	 * problem if the creator of the request never actually calls _begin()
+	 * but if they are not submitting any work then why did they create
+	 * the request in the first place?
+	 */
+}
+
+void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_size   = 0;
+	ringbuf->reserved_in_use = false;
+}
+
+void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(ringbuf->reserved_in_use);
+
+	ringbuf->reserved_in_use = true;
+	ringbuf->reserved_tail   = ringbuf->tail;
+}
+
+void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
+{
+	WARN_ON(!ringbuf->reserved_in_use);
+	if (ringbuf->tail > ringbuf->reserved_tail) {
+		WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
+		     "request reserved size too small: %d vs %d!\n",
+		     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
+	} else {
+		/*
+		 * The ring was wrapped while the reserved space was in use.
+		 * That means that some unknown amount of the ring tail was
+		 * no-op filled and skipped. Thus simply adding the ring size
+		 * to the tail and doing the above space check will not work.
+		 * Rather than attempt to track how much tail was skipped,
+		 * it is much simpler to say that also skipping the sanity
+		 * check every once in a while is not a big issue.
+		 */
+	}
+
+	ringbuf->reserved_size   = 0;
+	ringbuf->reserved_in_use = false;
+}
+
+static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
 {
 	struct intel_ringbuffer *ringbuf = ring->buffer;
-	int ret;
+	int ret, max_bytes;
 
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
 		ret = intel_wrap_ring_buffer(ring);
@@ -2192,8 +2258,28 @@ static int __intel_ring_prepare(struct intel_engine_cs *ring,
 			return ret;
 	}
 
-	if (unlikely(ringbuf->space < bytes)) {
-		ret = ring_wait_for_space(ring, bytes);
+	/*
+	 * Add on the reserved size to the request to make sure that after
+	 * the intended commands have been emitted, there is guaranteed to
+	 * still be enough free space to send them to the hardware.
+	 */
+	max_bytes = bytes + ringbuf->reserved_size;
+
+	if (unlikely(ringbuf->space < max_bytes)) {
+		/*
+		 * Bytes is guaranteed to fit within the tail of the buffer,
+		 * but the reserved space may push it off the end. If so then
+		 * need to wait for the whole of the tail plus the reserved
+		 * size. That should guarantee that the actual request
+		 * (bytes) will fit between here and the end and the reserved
+		 * usage will fit either in the same or at the start. Either
+		 * way, if a wrap occurs it will not involve a wait and thus
+		 * cannot fail.
+		 */
+		if (unlikely(ringbuf->tail + max_bytes > ringbuf->effective_size))
+			max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
+
+		ret = ring_wait_for_space(ring, max_bytes);
 		if (unlikely(ret))
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 39f6dfc..bf2ac28 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -105,6 +105,9 @@ struct intel_ringbuffer {
 	int space;
 	int size;
 	int effective_size;
+	int reserved_size;
+	int reserved_tail;
+	bool reserved_in_use;
 
 	/** We track the position of the requests in the ring buffer, and
 	 * when each is retired we increment last_retired_head as the GPU
@@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
 	return ring->outstanding_lazy_request;
 }
 
+/*
+ * Arbitrary size for largest possible 'add request' sequence. The code paths
+ * are complex and variable. Empirical measurement shows that the worst case
+ * is ILK at 136 words. Reserving too much is better than reserving too little
+ * as that allows for corner cases that might have been missed. So the figure
+ * has been rounded up to 160 words.
+ */
+#define MIN_SPACE_FOR_ADD_REQUEST	160
+
+/*
+ * Reserve space in the ring to guarantee that the i915_add_request() call
+ * will always have sufficient room to do its stuff. The request creation
+ * code calls this automatically.
+ */
+void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
+/* Cancel the reservation, e.g. because the request is being discarded. */
+void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
+/* Use the reserved space - for use by i915_add_request() only. */
+void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
+/* Finish with the reserved space - for use by i915_add_request() only. */
+void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 120+ messages in thread

* Re: [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation
  2015-06-18 13:29             ` Daniel Vetter
@ 2015-06-19 16:34               ` John Harrison
  0 siblings, 0 replies; 120+ messages in thread
From: John Harrison @ 2015-06-19 16:34 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On 18/06/2015 14:29, Daniel Vetter wrote:
> On Thu, Jun 18, 2015 at 1:21 PM, John Harrison
> <John.C.Harrison@intel.com> wrote:
>> I'm still confused by what you are saying in the above referenced email.
>> Part of it is about the sanity checks failing to handle the wrapping case
>> correctly which has been fixed in the base reserve space patch (patch 2 in
>> the series). The rest is either saying that you think we are potentially
>> wrapping too early and wasting a few bytes of the ring buffer or that
>> something is actually broken?
> Yeah I didn't realize that this change was meant to fix the
> ring->reserved_tail check since I didn't make that connection. It is
> correct with that change, but the problem I see is that the
> correctness of that debug aid isn't assured locally: no, we need both
> that check _and_ the correct handling of the reservation tracking at
> wrap-around. If the check just handles wrapping it'll robustly stay in
> working shape even when the wrapping behaviour changes.
>
>> Point 2: 100 bytes of reserve, 160 bytes of execbuf and 200 bytes remaining.
>> You seem to think this will fail somehow? Why? The wait_for_space(160) in
>> the execbuf code will cause a wrap because the 100 bytes for the
>> add_request reservation is added on and the wait is actually being done for
>> 260 bytes. So yes, we wrap earlier than would otherwise have been necessary
>> but that is the only way to absolutely guarantee that the add_request() call
>> cannot fail when trying to do the wrap itself.
> There's no problem except that it's wasteful. And I tried to explain
> that unconditionally force-wrapping for the entire reservation is
> actually not needed, since the additional space needed to account for
> the eventual wrapping is bounded by a factor of 2. It's much less in
> practice since we split up the final request bits into multiple
> smaller intel_ring_begin. And it feels a bit wasteful to throw that
> space away (and make the gpu eat through MI_NOP) just because it makes
> caring for the worst-case harder. And with GuC the 160 dwords is
> actually a fairly substantial part of the ring.
>
> Even more so when we completely switch to a transaction model for
> requests, where we only need to wrap for individual commands and hence
> could place intel_ring_begin per-cmd (which is mostly what we do
> already anyway).
>
>> As Chris says, if the driver is attempting to create a single request that
>> fills the entire ringbuffer then that is a bug that should be caught as soon
> >> as possible. Even with a GuC, the ring buffer is not small compared to the
>> size of requests the driver currently produces. Part of the scheduler work
>> is to limit the number of batch buffers that a given application/context can
>> have outstanding in the ring buffer at any given time in order to prevent
>> starvation of the rest of the system by one badly behaved app. Thus
>> completely filling a large ring buffer becomes impossible anyway - the
>> application will be blocked before it gets that far.
> My proposal for this reservation wrapping business would have been:
> - Increase the reservation by 31 dwords (to account for the worst-case
> wrap in pc_render_add_request).
> - Rework the reservation overflow WARN_ON in reserve_space_end to work
> correctly even when wrapping while the reservation has been in use.
> - Move the addition of reserved_space below the point where we wrap
> the ring and only check against total free space, neglecting wrapping.
> - Remove all other complications you've added.
>
> Result is no forced wrapping for reservation and a debug check which
> should even survive random changes by monkeys since the logic for that
> check is fully contained within reserve_space_end. And for the check
> we should be able to reuse __intel_free_space.
>
> If I'm reading things correctly this shouldn't have any effect outside
> of patch 2 and shouldn't cause any conflicts.
> -Daniel

See new patch #2...
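
In outline, the wrap-safe check Daniel is asking for might look like the
following (a sketch only, not his patch: it assumes the reservation has
already been enlarged to cover the worst-case wrap fill, per his first
bullet, and borrows the field names from the reservation patch quoted
further below):

	void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
	{
		/* Distance from where the reserve was activated to the
		 * current tail, computed modulo the ring size so that an
		 * intervening wrap (and its MI_NOOP fill) cannot produce
		 * a negative distance and defeat the check. */
		int used = (ringbuf->tail - ringbuf->reserved_tail +
			    ringbuf->size) % ringbuf->size;

		WARN(used > ringbuf->reserved_size,
		     "request reserved size too small: %d vs %d!\n",
		     used, ringbuf->reserved_size);

		ringbuf->reserved_size   = 0;
		ringbuf->reserved_in_use = false;
	}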

* Re: [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-06-18 16:16                 ` John Harrison
@ 2015-06-22 20:03                   ` Daniel Vetter
  2015-06-22 20:14                     ` Chris Wilson
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-22 20:03 UTC (permalink / raw)
  To: John Harrison; +Cc: Intel-GFX

On Thu, Jun 18, 2015 at 05:16:15PM +0100, John Harrison wrote:
> On 18/06/2015 16:39, Chris Wilson wrote:
> >On Thu, Jun 18, 2015 at 04:24:53PM +0200, Daniel Vetter wrote:
> >>On Thu, Jun 18, 2015 at 01:59:13PM +0100, John Harrison wrote:
> >>>On 18/06/2015 13:21, Chris Wilson wrote:
> >>>>On Thu, Jun 18, 2015 at 01:14:56PM +0100, John.C.Harrison@Intel.com wrote:
> >>>>>From: John Harrison <John.C.Harrison@Intel.com>
> >>>>>
> >>>>>The plan is to pass requests around as the basic submission tracking structure
> >>>>>rather than rings and contexts. This patch updates the i915_gem_object_sync()
> >>>>>code path.
> >>>>>
> >>>>>v2: Much more complex patch to share a single request between the sync and the
> >>>>>page flip. The _sync() function now supports lazy allocation of the request
> >>>>>structure. That is, if one is passed in then that will be used. If one is not,
> >>>>>then a request will be allocated and passed back out. Note that the _sync() code
> >>>>>does not necessarily require a request. Thus one will only be created in
> >>>>>certain situations. The reason the lazy allocation must be done within the
> >>>>>_sync() code itself is because the decision to need one or not is not really
> >>>>>something that code above can second guess (except in the case where one is
> >>>>>definitely not required because no ring is passed in).
> >>>>>
> >>>>>The call chains above _sync() now support passing a request through, with most
> >>>>>callers passing in NULL and assuming that no request will be required (because
> >>>>>they also pass in NULL for the ring and therefore can't be generating any ring
> >>>>>code).
> >>>>>
> >>>>>The exception is intel_crtc_page_flip() which now supports having a request
> >>>>>returned from _sync(). If one is, then that request is shared by the page flip
> >>>>>(if the page flip is of a type to need a request). If _sync() does not generate
> >>>>>a request but the page flip does need one, then the page flip path will create
> >>>>>its own request.
> >>>>>
> >>>>>v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
> >>>>>Elf review request). Rebased onto newer tree that significantly changed the
> >>>>>synchronisation code.
> >>>>>
> >>>>>v4: Updated comments from review feedback (Tomas Elf)
> >>>>>
> >>>>>For: VIZ-5115
> >>>>>Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> >>>>>Reviewed-by: Tomas Elf <tomas.elf@intel.com>
> >>>>>---
> >>>>>  drivers/gpu/drm/i915/i915_drv.h            |    4 ++-
> >>>>>  drivers/gpu/drm/i915/i915_gem.c            |   48 +++++++++++++++++++++-------
> >>>>>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
> >>>>>  drivers/gpu/drm/i915/intel_display.c       |   17 +++++++---
> >>>>>  drivers/gpu/drm/i915/intel_drv.h           |    3 +-
> >>>>>  drivers/gpu/drm/i915/intel_fbdev.c         |    2 +-
> >>>>>  drivers/gpu/drm/i915/intel_lrc.c           |    2 +-
> >>>>>  drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
> >>>>>  8 files changed, 57 insertions(+), 23 deletions(-)
> >>>>>
> >>>>>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>>>>index 64a10fa..f69e9cb 100644
> >>>>>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>>>>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>>>>@@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
> >>>>>  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> >>>>>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> >>>>>-			 struct intel_engine_cs *to);
> >>>>>+			 struct intel_engine_cs *to,
> >>>>>+			 struct drm_i915_gem_request **to_req);
> >>>>Nope. Did you forget to reorder the code to ensure that the request is
> >>>>allocated along with the context switch at the start of execbuf?
> >>>>-Chris
> >>>>
> >>>Not sure what you are objecting to? If you mean the lazily allocated request
> >>>then that is for the page flip code, not the execbuff code. If we get here from an
> >>>execbuff call then the request will definitely have been allocated and will
> >>>be passed in. Whereas the page flip code may or may not require a request
> >>>(depending on whether MMIO or ring flips are in use). Likewise the sync code
> >>>may or may not require a request (depending on whether there is anything to
> >>>sync to or not). There is no point allocating and submitting an empty
> >>>request in the MMIO/idle case. Hence the sync code needs to be able to use
> >>>an existing request or create one if none already exists.
> >>I guess Chris' comment was that if you have a non-NULL to, then you better
> >>have a non-NULL to_req. And since we link up requests to the engine
> >>they'll run on the former shouldn't be required any more. So either that's
> >>true and we can remove the to or we don't understand something yet (and
> >>perhaps that should be done as a follow-up).
> >I am sure I sent a patch that outlined in great detail how we need
> >only the request parameter in i915_gem_object_sync(), for handling both
> >execbuffer, pipelined pin_and_fence and synchronous pin_and_fence.
> >-Chris
> >
> 
> As the driver stands, the page flip code wants to synchronise with the
> framebuffer object but potentially without touching the ring and therefore
> without creating a request. If the synchronisation is a no-op (because there
> are no outstanding operations on the given object) then there is no need for
> a request anywhere in the call chain. Thus there is a need to pass in the
> ring together with an optional request and to be able to pass out a request
> that has been created internally.
> 
> >  if you have a non-NULL to, then you better have a non-NULL to_req
> 
> I assume you mean 'a non-NULL *to_req'?
> 
> No, that is the whole point. If you have a non-null '*to_req' then 'to' must
> be non-null (and specifically must be the ring that '*to_req' is
> referencing). However, it is valid to have a non-null 'to' and a null
> '*to_req'.  In the case of MMIO flips, the page flip itself does not require
> a request as it does not go through the ring. However, it still passes in
> 'i915_gem_request_get_ring(obj->last_write_req)' as the ring to synchronise
> to. Thus it is potentially passing in a valid to pointer but without wanting
> to pre-allocate a request object. If the synchronisation requires writing a
> semaphore to the ring then a request will be created internally and passed
> back out for the page flip code to submit (and to re-use in the case of
> non-MMIO flips). But if the synchronisation is a no-op then no request ever
> gets created or submitted and nothing touches the ring at all.

We use mmio flips by default if there's any ring switching going on, which
means except when a user sets a silly debug module option this will never
happen. Which means it's not too pretty to carry this complication around
for no real use at all. Otoh the flip code is in a massive churn because
of atomic, so not much point in cleaning that out if it'll all disappear
anyway. I'll smash a patch on top to note this TODO.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
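
The calling convention being argued over, reduced to a sketch (simplified
from the quoted patch; error handling and the MMIO/ring flip split
trimmed):

	struct drm_i915_gem_request *req = NULL;
	struct intel_engine_cs *ring =
		i915_gem_request_get_ring(obj->last_write_req);
	int ret;

	/* 'to' (the ring) may be non-NULL while *to_req is NULL: _sync()
	 * only allocates a request if it actually has to emit a
	 * semaphore into the ring. */
	ret = i915_gem_object_sync(obj, ring, &req);
	if (ret)
		return ret;

	if (req) {
		/* A request was created internally: ring flips reuse it,
		 * MMIO flips just submit it as-is. */
	}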

* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-19 16:34   ` John.C.Harrison
@ 2015-06-22 20:12     ` Daniel Vetter
  2015-06-23 11:38       ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-22 20:12 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX

On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> It is a bad idea for i915_add_request() to fail. The work will already have been
> sent to the ring and will be processed, but there will not be any tracking or
> management of that work.
> 
> The only way the add request call can fail is if it can't write its epilogue
> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
> reasons for that are mostly down to running out of ring buffer space and the
> problems associated with trying to get some more. This patch prevents that
> situation from happening in the first place.
> 
> When a request is created, it marks sufficient space as reserved for the
> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
> there will be plenty of space for it. Note that a ring_begin() call is required
> to actually reserve the space (and do any potential waiting). However, that is
> not currently done at request creation time. This is because the ring_begin()
> code can allocate a request. Hence calling begin() from the request allocation
> code would lead to infinite recursion! Later patches in this series remove the
> need for begin() to do the allocate. At that point, it becomes safe for the
> allocate to call begin() and really reserve the space.
> 
> Until then, there is a potential for insufficient space to be available at the
> point of calling i915_add_request(). However, that would only be in the case
> where the request was created and immediately submitted without ever calling
> ring_begin() and adding any work to that request. Which should never happen. And
> even if it does, and if that request happens to fall down the tiny window of
> opportunity for failing due to being out of ring space then does it really
> matter because the request wasn't doing anything in the first place?
> 
> v2: Updated the 'reserved space too small' warning to include the offending
> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
> re-initialisation of tracking state after a buffer wrap to keep the sanity
> checks accurate.
> 
> v3: Incremented the reserved size to accommodate Ironlake (after finally
> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
> 
> v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
> 
> v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
> Daniel Vetter).

This didn't actually implement what I suggested (wrapping is the worst
case, hence skipping the check for that is breaking the sanity check) and
so changed the patch from "correct, but a bit fragile" to broken. I've
merged the previous version instead.
-Daniel

> 
> For: VIZ-5115
> CC: Tomas Elf <tomas.elf@intel.com>
> CC: Daniel Vetter <daniel@ffwll.ch>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |    1 +
>  drivers/gpu/drm/i915/i915_gem.c         |   37 ++++++++++++
>  drivers/gpu/drm/i915/intel_lrc.c        |   35 +++++++++--
>  drivers/gpu/drm/i915/intel_ringbuffer.c |   98 +++++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/intel_ringbuffer.h |   25 ++++++++
>  5 files changed, 186 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 0347eb9..eba1857 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
>  
>  int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  			   struct intel_context *ctx);
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>  void i915_gem_request_free(struct kref *req_ref);
>  
>  static inline uint32_t
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 81f3512..85fa27b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
>  	} else
>  		ringbuf = ring->buffer;
>  
> +	/*
> +	 * To ensure that this call will not fail, space for its emissions
> +	 * should already have been reserved in the ring buffer. Let the ring
> +	 * know that it is time to use that space up.
> +	 */
> +	intel_ring_reserved_space_use(ringbuf);
> +
>  	request_start = intel_ring_get_tail(ringbuf);
>  	/*
>  	 * Emit any outstanding flushes - execbuf can fail to emit the flush
> @@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
>  			   round_jiffies_up_relative(HZ));
>  	intel_mark_busy(dev_priv->dev);
>  
> +	/* Sanity check that the reserved size was large enough. */
> +	intel_ring_reserved_space_end(ringbuf);
> +
>  	return 0;
>  }
>  
> @@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  	if (ret)
>  		goto err;
>  
> +	/*
> +	 * Reserve space in the ring buffer for all the commands required to
> +	 * eventually emit this request. This is to guarantee that the
> +	 * i915_add_request() call can't fail. Note that the reserve may need
> +	 * to be redone if the request is not actually submitted straight
> +	 * away, e.g. because a GPU scheduler has deferred it.
> +	 *
> +	 * Note further that this call merely notes the reserve request. A
> +	 * subsequent call to *_ring_begin() is required to actually ensure
> +	 * that the reservation is available. Without the begin, if the
> +	 * request creator immediately submitted the request without adding
> +	 * any commands to it then there might not actually be sufficient
> +	 * room for the submission commands. Unfortunately, the current
> +	 * *_ring_begin() implementations potentially call back here to
> +	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
> +	 * infinite recursion! Until that back call path is removed, it is
> +	 * necessary to do a manual _begin() outside.
> +	 */
> +	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> +
>  	ring->outstanding_lazy_request = req;
>  	return 0;
>  
> @@ -2674,6 +2704,13 @@ err:
>  	return ret;
>  }
>  
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req)
> +{
> +	intel_ring_reserved_space_cancel(req->ringbuf);
> +
> +	i915_gem_request_unreference(req);
> +}
> +
>  struct drm_i915_gem_request *
>  i915_gem_find_active_request(struct intel_engine_cs *ring)
>  {
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6a5ed07..bd62bd6 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -690,6 +690,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
>  	if (intel_ring_space(ringbuf) >= bytes)
>  		return 0;
>  
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>  	list_for_each_entry(request, &ring->request_list, list) {
>  		/*
>  		 * The request queue is per-engine, so can contain requests
> @@ -748,8 +751,12 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>  	int rem = ringbuf->size - ringbuf->tail;
>  
>  	if (ringbuf->space < rem) {
> -		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
> +		int ret;
> +
> +		/* Can't wait if space has already been reserved! */
> +		WARN_ON(ringbuf->reserved_in_use);
>  
> +		ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
>  		if (ret)
>  			return ret;
>  	}
> @@ -768,7 +775,7 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>  static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>  				struct intel_context *ctx, int bytes)
>  {
> -	int ret;
> +	int ret, max_bytes;
>  
>  	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>  		ret = logical_ring_wrap_buffer(ringbuf, ctx);
> @@ -776,8 +783,28 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>  			return ret;
>  	}
>  
> -	if (unlikely(ringbuf->space < bytes)) {
> -		ret = logical_ring_wait_for_space(ringbuf, ctx, bytes);
> +	/*
> +	 * Add on the reserved size to the request to make sure that after
> +	 * the intended commands have been emitted, there is guaranteed to
> +	 * still be enough free space to send them to the hardware.
> +	 */
> +	max_bytes = bytes + ringbuf->reserved_size;
> +
> +	if (unlikely(ringbuf->space < max_bytes)) {
> +		/*
> +		 * Bytes is guaranteed to fit within the tail of the buffer,
> +		 * but the reserved space may push it off the end. If so then
> +		 * need to wait for the whole of the tail plus the reserved
> +		 * size. That should guarantee that the actual request
> +		 * (bytes) will fit between here and the end and the reserved
> +		 * usage will fit either in the same or at the start. Either
> +		 * way, if a wrap occurs it will not involve a wait and thus
> +		 * cannot fail.
> +		 */
> +		if (unlikely(ringbuf->tail + max_bytes + I915_RING_FREE_SPACE > ringbuf->effective_size))
> +			max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
> +
> +		ret = logical_ring_wait_for_space(ringbuf, ctx, max_bytes);
>  		if (unlikely(ret))
>  			return ret;
>  	}
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index d934f85..1c125e9 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2106,6 +2106,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>  	if (intel_ring_space(ringbuf) >= n)
>  		return 0;
>  
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>  	list_for_each_entry(request, &ring->request_list, list) {
>  		space = __intel_ring_space(request->postfix, ringbuf->tail,
>  					   ringbuf->size);
> @@ -2131,7 +2134,12 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>  	int rem = ringbuf->size - ringbuf->tail;
>  
>  	if (ringbuf->space < rem) {
> -		int ret = ring_wait_for_space(ring, rem);
> +		int ret;
> +
> +		/* Can't wait if space has already been reserved! */
> +		WARN_ON(ringbuf->reserved_in_use);
> +
> +		ret = ring_wait_for_space(ring, rem);
>  		if (ret)
>  			return ret;
>  	}
> @@ -2180,11 +2188,69 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>  	return 0;
>  }
>  
> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
> -				int bytes)
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
> +{
> +	/* NB: Until request management is fully tidied up and the OLR is
> +	 * removed, there are too many ways to get false hits on this
> +	 * anti-recursion check! */
> +	/*WARN_ON(ringbuf->reserved_size);*/
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size = size;
> +
> +	/*
> +	 * Really need to call _begin() here but that currently leads to
> +	 * recursion problems! This will be fixed later but for now just
> +	 * return and hope for the best. Note that there is only a real
> +	 * problem if the create of the request never actually calls _begin()
> +	 * but if they are not submitting any work then why did they create
> +	 * the request in the first place?
> +	 */
> +}
> +
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size   = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_in_use = true;
> +	ringbuf->reserved_tail   = ringbuf->tail;
> +}
> +
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(!ringbuf->reserved_in_use);
> +	if (ringbuf->tail > ringbuf->reserved_tail) {
> +		WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
> +		     "request reserved size too small: %d vs %d!\n",
> +		     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
> +	} else {
> +		/*
> +		 * The ring was wrapped while the reserved space was in use.
> +		 * That means that some unknown amount of the ring tail was
> +		 * no-op filled and skipped. Thus simply adding the ring size
> +		 * to the tail and doing the above space check will not work.
> +		 * Rather than attempt to track how much tail was skipped,
> +		 * it is much simpler to say that also skipping the sanity
> +		 * check every once in a while is not a big issue.
> +		 */
> +	}
> +
> +	ringbuf->reserved_size   = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
>  {
>  	struct intel_ringbuffer *ringbuf = ring->buffer;
> -	int ret;
> +	int ret, max_bytes;
>  
>  	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>  		ret = intel_wrap_ring_buffer(ring);
> @@ -2192,8 +2258,28 @@ static int __intel_ring_prepare(struct intel_engine_cs *ring,
>  			return ret;
>  	}
>  
> -	if (unlikely(ringbuf->space < bytes)) {
> -		ret = ring_wait_for_space(ring, bytes);
> +	/*
> +	 * Add on the reserved size to the request to make sure that after
> +	 * the intended commands have been emitted, there is guaranteed to
> +	 * still be enough free space to send them to the hardware.
> +	 */
> +	max_bytes = bytes + ringbuf->reserved_size;
> +
> +	if (unlikely(ringbuf->space < max_bytes)) {
> +		/*
> +		 * Bytes is guaranteed to fit within the tail of the buffer,
> +		 * but the reserved space may push it off the end. If so then
> +		 * need to wait for the whole of the tail plus the reserved
> +		 * size. That should guarantee that the actual request
> +		 * (bytes) will fit between here and the end and the reserved
> +		 * usage will fit either in the same or at the start. Either
> +		 * way, if a wrap occurs it will not involve a wait and thus
> +		 * cannot fail.
> +		 */
> +		if (unlikely(ringbuf->tail + max_bytes > ringbuf->effective_size))
> +			max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
> +
> +		ret = ring_wait_for_space(ring, max_bytes);
>  		if (unlikely(ret))
>  			return ret;
>  	}
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 39f6dfc..bf2ac28 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -105,6 +105,9 @@ struct intel_ringbuffer {
>  	int space;
>  	int size;
>  	int effective_size;
> +	int reserved_size;
> +	int reserved_tail;
> +	bool reserved_in_use;
>  
>  	/** We track the position of the requests in the ring buffer, and
>  	 * when each is retired we increment last_retired_head as the GPU
> @@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
>  	return ring->outstanding_lazy_request;
>  }
>  
> +/*
> + * Arbitrary size for largest possible 'add request' sequence. The code paths
> + * are complex and variable. Empirical measurement shows that the worst case
> + * is ILK at 136 words. Reserving too much is better than reserving too little
> + * as that allows for corner cases that might have been missed. So the figure
> + * has been rounded up to 160 words.
> + */
> +#define MIN_SPACE_FOR_ADD_REQUEST	160
> +
> +/*
> + * Reserve space in the ring to guarantee that the i915_add_request() call
> + * will always have sufficient room to do its stuff. The request creation
> + * code calls this automatically.
> + */
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
> +/* Cancel the reservation, e.g. because the request is being discarded. */
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
> +/* Use the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
> +/* Finish with the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
> +
>  #endif /* _INTEL_RINGBUFFER_H_ */
> -- 
> 1.7.9.5
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
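
For reference, the request lifecycle the quoted commit message sets up, in
outline (pseudocode, not the literal interfaces; the direct _begin() call
from the allocation code only becomes possible later in the series):

	req = i915_gem_request_alloc(ring, ctx);
		/* notes reserved_size = MIN_SPACE_FOR_ADD_REQUEST */

	intel_ring_begin(ring, n);
		/* waits for n + reserved_size dwords, wrapping early if
		 * the combined size would not fit before the buffer end */
	intel_ring_emit(ring, ...);
		/* the caller's actual commands */

	__i915_add_request(ring, ...);
		/* _use() marks the reserve active, the epilogue commands
		 * are emitted into it, _end() sanity-checks the size */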

* Re: [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure
  2015-06-22 20:03                   ` Daniel Vetter
@ 2015-06-22 20:14                     ` Chris Wilson
  0 siblings, 0 replies; 120+ messages in thread
From: Chris Wilson @ 2015-06-22 20:14 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On Mon, Jun 22, 2015 at 10:03:20PM +0200, Daniel Vetter wrote:
> On Thu, Jun 18, 2015 at 05:16:15PM +0100, John Harrison wrote:
> > No, that is the whole point. If you have a non-null '*to_req' then 'to' must
> > be non-null (and specifically must be the ring that '*to_req' is
> > referencing). However, it is valid to have a non-null 'to' and a null
> > '*to_req'.  In the case of MMIO flips, the page flip itself does not require
> > a request as it does not go through the ring. However, it still passes in
> > 'i915_gem_request_get_ring(obj->last_write_req)' as the ring to synchronise
> > to. Thus it is potentially passing in a valid to pointer but without wanting
> > to pre-allocate a request object. If the synchronisation requires writing a
> > semaphore to the ring then a request will be created internally and passed
> > back out for the page flip code to submit (and to re-use in the case of
> > non-MMIO flips). But if the synchronisation is a no-op then no request ever
> > gets created or submitted and nothing touches the ring at all.

Your API is very badly designed.
 
> We use mmio flips by default if there's any ring switching going on, which
> means except when a user sets a silly debug module option this will never
> happend. Which means it's not too pretty to carry this complication around
> for no real use at all. Otoh the flip code is in a massive churn because
> of atomic, so not much point in cleaning that out if it'll all disapear
> anyway. I'll smash a patch on top to note this TODO.

It's only a complication of bad design, again.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 00/55] Remove the outstanding_lazy_request
  2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
                   ` (55 preceding siblings ...)
  2015-06-04 18:23 ` [PATCH 14/56] drm/i915: Make retire condition check for requests not objects John.C.Harrison
@ 2015-06-22 21:04 ` Daniel Vetter
  56 siblings, 0 replies; 120+ messages in thread
From: Daniel Vetter @ 2015-06-22 21:04 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX

On Fri, May 29, 2015 at 05:43:21PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The driver tracks GPU work using request structures. Unfortunately, this
> tracking is not currently explicit but is done by means of a catch-all request
> that floats around in the background hoovering up work until it gets submitted.
> This background request (ring->outstanding_lazy_request or OLR) is created at
> the point of actually writing to the ring rather than when a particular piece of
> GPU work is started. This scheme sort of hangs together but causes a number of
> issues. It can mean that multiple pieces of independent work are lumped together
> in the same request or that work is not officially submitted until much later
> than it was created.
> 
> This patch series completely removes the OLR and explicitly tracks each piece of
> work with it's own personal request structure from start to submission.
> 
> The patch set seems to fix the "'gem_ringfill --r render' + ctrl-c straight
> after boot" issue logged as BZ:88865. I haven't done any analysis of that
> particular issue but the descriptions I've seen appear to blame an inconsistent
> or mangled OLR.
> 
> Note also that by the end of this series, a number of differences between the
> legacy and execlist code paths have been removed. For example add_request() and
> emit_request() now have the same signature thus could be merged back to a single
> function pointer. Merging some of these together would also allow the removal of
> a bunch of 'if(execlists)' tests where the difference is simply to call the
> legacy function or the execlist one.
> 
> v2: Rebased to newer nightly tree, fixed up a few minor issues, added two extra
> patches - one to move the LRC ring begin around in the vein of other recent
> reshuffles, the other to clean up some issues with i915_add_request().
> 
> v3: Large re-work due to feedback from code review. Some patches have been
> removed, extra ones have been added and others have been changed significantly.
> It is recommended that all patches are reviewed from scratch rather than
> assuming only certain ones have changed and need re-inspecting. The exceptions
> are where the 'reviewed-by' tag has been kept because that patch was not
> significantly affected.
> 
> v4: Further updates due to review feedback and rebasing on top of significant
> changes to the underlying tree.
> 
> [Patches against drm-intel-nightly tree fetched 22/05/2015]
> 
> John Harrison (55):
>   drm/i915: Re-instate request->uniq becuase it is extremely useful
>   drm/i915: Reserve ring buffer space for i915_add_request() commands
>   drm/i915: i915_add_request must not fail
>   drm/i915: Early alloc request in execbuff
>   drm/i915: Set context in request from creation even in legacy mode
>   drm/i915: Merged the many do_execbuf() parameters into a structure
>   drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters
>   drm/i915: Update alloc_request to return the allocated request
>   drm/i915: Add request to execbuf params and add explicit cleanup
>   drm/i915: Update the dispatch tracepoint to use params->request
>   drm/i915: Update move_to_gpu() to take a request structure
>   drm/i915: Update execbuffer_move_to_active() to take a request structure
>   drm/i915: Add flag to i915_add_request() to skip the cache flush
>   drm/i915: Update i915_gpu_idle() to manage its own request
>   drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring
>   drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable()
>   drm/i915: Don't tag kernel batches as user batches
>   drm/i915: Add explicit request management to i915_gem_init_hw()
>   drm/i915: Update ppgtt_init_ring() & context_enable() to take requests
>   drm/i915: Update i915_switch_context() to take a request structure
>   drm/i915: Update do_switch() to take a request structure
>   drm/i915: Update deferred context creation to do explicit request management
>   drm/i915: Update init_context() to take a request structure
>   drm/i915: Update render_state_init() to take a request structure
>   drm/i915: Update i915_gem_object_sync() to take a request structure
>   drm/i915: Update overlay code to do explicit request management
>   drm/i915: Update queue_flip() to take a request structure
>   drm/i915: Update add_request() to take a request structure
>   drm/i915: Update [vma|object]_move_to_active() to take request structures
>   drm/i915: Update l3_remap to take a request structure
>   drm/i915: Update mi_set_context() to take a request structure
>   drm/i915: Update a bunch of execbuffer helpers to take request structures
>   drm/i915: Update workarounds_emit() to take request structures
>   drm/i915: Update flush_all_caches() to take request structures
>   drm/i915: Update switch_mm() to take a request structure
>   drm/i915: Update ring->flush() to take a requests structure
>   drm/i915: Update some flush helpers to take request structures
>   drm/i915: Update ring->emit_flush() to take a request structure
>   drm/i915: Update ring->add_request() to take a request structure
>   drm/i915: Update ring->emit_request() to take a request structure
>   drm/i915: Update ring->dispatch_execbuffer() to take a request structure
>   drm/i915: Update ring->emit_bb_start() to take a request structure
>   drm/i915: Update ring->sync_to() to take a request structure
>   drm/i915: Update ring->signal() to take a request structure
>   drm/i915: Update cacheline_align() to take a request structure
>   drm/i915: Update intel_ring_begin() to take a request structure
>   drm/i915: Update intel_logical_ring_begin() to take a request structure
>   drm/i915: Add *_ring_begin() to request allocation
>   drm/i915: Remove the now obsolete intel_ring_get_request()
>   drm/i915: Remove the now obsolete 'outstanding_lazy_request'
>   drm/i915: Move the request/file and request/pid association to creation time
>   drm/i915: Remove 'faked' request from LRC submission
>   drm/i915: Update a bunch of LRC functions to take requests
>   drm/i915: Remove the now obsolete 'i915_gem_check_olr()'
>   drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()

Applied the entire series except patch 1 (we seem to have managed making
tracepoints abi and I'm chickening out of this, or well don't want to
block the olr removal on it and so rebased a few patches that conflicted).
And the last two since the very last is superseded and the second last
seems to not deconfuse with the new function names after read/read.

Thanks, Daniel

> 
>  drivers/gpu/drm/i915/i915_drv.h              |   77 +++---
>  drivers/gpu/drm/i915/i915_gem.c              |  368 ++++++++++++++++----------
>  drivers/gpu/drm/i915/i915_gem_context.c      |   78 +++---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  128 +++++----
>  drivers/gpu/drm/i915/i915_gem_gtt.c          |   59 +++--
>  drivers/gpu/drm/i915/i915_gem_gtt.h          |    3 +-
>  drivers/gpu/drm/i915/i915_gem_render_state.c |   15 +-
>  drivers/gpu/drm/i915/i915_gem_render_state.h |    2 +-
>  drivers/gpu/drm/i915/i915_trace.h            |   41 +--
>  drivers/gpu/drm/i915/intel_display.c         |   60 +++--
>  drivers/gpu/drm/i915/intel_drv.h             |    3 +-
>  drivers/gpu/drm/i915/intel_fbdev.c           |    2 +-
>  drivers/gpu/drm/i915/intel_lrc.c             |  265 +++++++++----------
>  drivers/gpu/drm/i915/intel_lrc.h             |   16 +-
>  drivers/gpu/drm/i915/intel_overlay.c         |   63 +++--
>  drivers/gpu/drm/i915/intel_ringbuffer.c      |  303 +++++++++++++--------
>  drivers/gpu/drm/i915/intel_ringbuffer.h      |   53 ++--
>  17 files changed, 876 insertions(+), 660 deletions(-)
> 
> -- 
> 1.7.9.5
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 03/55] drm/i915: i915_add_request must not fail
  2015-05-29 16:43 ` [PATCH 03/55] drm/i915: i915_add_request must not fail John.C.Harrison
  2015-06-02 18:16   ` Tomas Elf
@ 2015-06-23 10:16   ` Chris Wilson
  2015-06-23 10:47     ` John Harrison
  1 sibling, 1 reply; 120+ messages in thread
From: Chris Wilson @ 2015-06-23 10:16 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX

On Fri, May 29, 2015 at 05:43:24PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The i915_add_request() function is called to keep track of work that has been
> written to the ring buffer. It adds epilogue commands to track progress (seqno
> updates and such), moves the request structure onto the right list and other
> such housekeeping tasks. However, the work itself has already been written to
> the ring and will get executed whether or not the add request call succeeds. So
> no matter what goes wrong, there isn't a whole lot of point in failing the call.
> 
> At the moment, this is fine(ish). If the add request does bail early on and not
> do the housekeeping, the request will still float around in the
> ring->outstanding_lazy_request field and be picked up next time. It means
> multiple pieces of work will be tagged as the same request and the driver can't
> actually wait for the first piece of work until something else has been
> submitted. But it all sort of hangs together.
> 
> This patch series is all about removing the OLR and guaranteeing that each piece
> of work gets its own personal request. That means that there is no more
> 'hoovering up of forgotten requests'. If the request does not get tracked then
> it will be leaked. Thus the add request call _must_ not fail. The previous patch
> should have already ensured that it _will_ not fail by removing the potential
> for running out of ring space. This patch enforces the rule by actually removing
> the early exit paths and the return code.
> 
> Note that if something does manage to fail and the epilogue commands don't get
> written to the ring, the driver will still hang together. The request will be
> added to the tracking lists. And as in the old case, any subsequent work will
> generate a new seqno which will suffice for marking the old one as complete.
> 
> v2: Improved WARNings (Tomas Elf review request).

Nak. Daniel please revert this mess.

Even in the current code it has a failure mode it cannot handle.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()'
  2015-05-29 16:44 ` [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()' John.C.Harrison
  2015-06-02 18:27   ` Tomas Elf
@ 2015-06-23 10:23   ` Chris Wilson
  2015-06-23 10:39     ` John Harrison
  1 sibling, 1 reply; 120+ messages in thread
From: Chris Wilson @ 2015-06-23 10:23 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX

On Fri, May 29, 2015 at 05:44:15PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> As there is no OLR to check, the check_olr() function is now a no-op and can be
> removed.

You ignored a genuine, and trivially easy to hit, compiler warning here.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 46/55] drm/i915: Update intel_ring_begin() to take a request structure
  2015-05-29 16:44 ` [PATCH 46/55] drm/i915: Update intel_ring_begin() " John.C.Harrison
@ 2015-06-23 10:24   ` Chris Wilson
  2015-06-23 10:37     ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Chris Wilson @ 2015-06-23 10:24 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX

On Fri, May 29, 2015 at 05:44:07PM +0100, John.C.Harrison@Intel.com wrote:
> -int intel_ring_begin(struct intel_engine_cs *ring,
> +int intel_ring_begin(struct drm_i915_gem_request *req,
>  		     int num_dwords)
>  {
> -	struct drm_i915_gem_request *req;
> -	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	struct intel_engine_cs *ring;
> +	struct drm_i915_private *dev_priv;
>  	int ret;
>  
> +	WARN_ON(req == NULL);
> +	ring = req->ring;

What was the point?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 46/55] drm/i915: Update intel_ring_begin() to take a request structure
  2015-06-23 10:24   ` Chris Wilson
@ 2015-06-23 10:37     ` John Harrison
  2015-06-23 13:25       ` Daniel Vetter
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-23 10:37 UTC (permalink / raw)
  To: Chris Wilson, Intel-GFX

On 23/06/2015 11:24, Chris Wilson wrote:
> On Fri, May 29, 2015 at 05:44:07PM +0100, John.C.Harrison@Intel.com wrote:
>> -int intel_ring_begin(struct intel_engine_cs *ring,
>> +int intel_ring_begin(struct drm_i915_gem_request *req,
>>   		     int num_dwords)
>>   {
>> -	struct drm_i915_gem_request *req;
>> -	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>> +	struct intel_engine_cs *ring;
>> +	struct drm_i915_private *dev_priv;
>>   	int ret;
>>   
>> +	WARN_ON(req == NULL);
>> +	ring = req->ring;
> What was the point?
> -Chris
>

The point is to remove the OLR. The significant change within 
intel_ring_begin is the next few lines:

-	/* Preallocate the olr before touching the ring */
-	ret = i915_gem_request_alloc(ring, ring->default_context, &req);


That is the part that causes problems by randomly creating a brand new 
request that no-one knows about and squirreling it away in the OLR to 
scoop up random bits of work. This is the whole point of the entire 
patch series - to ensure that all ring work is assigned to a known 
request by whoever instigated that work.

John.
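
Abridged from the quoted diff, the before/after shape of the change (a
sketch, eliding everything except the allocation behaviour):

	/* Before: begin() could invent a request behind the caller's
	 * back and stash it in the OLR. */
	int intel_ring_begin(struct intel_engine_cs *ring, int num_dwords)
	{
		struct drm_i915_gem_request *req;
		int ret;

		/* Preallocate the olr before touching the ring */
		ret = i915_gem_request_alloc(ring, ring->default_context,
					     &req);
		if (ret)
			return ret;
		...
	}

	/* After: the caller owns the request; begin() merely derives
	 * the ring from it and can never allocate anything. */
	int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
	{
		struct intel_engine_cs *ring = req->ring;
		...
	}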

* Re: [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()'
  2015-06-23 10:23   ` Chris Wilson
@ 2015-06-23 10:39     ` John Harrison
  0 siblings, 0 replies; 120+ messages in thread
From: John Harrison @ 2015-06-23 10:39 UTC (permalink / raw)
  To: Chris Wilson, Intel-GFX

On 23/06/2015 11:23, Chris Wilson wrote:
> On Fri, May 29, 2015 at 05:44:15PM +0100, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> As there is no OLR to check, the check_olr() function is now a no-op and can be
>> removed.
> You ignored a genuine, and trivially easy to hit, compiler warning here.
> -Chris
>

Would you care to elaborate? I have a local tweak to the makefile to
enable warnings as errors, so it definitely builds cleanly for me. Maybe 
there is a merge issue with other changes since the tree that patch was 
against?

John.


* Re: [PATCH 03/55] drm/i915: i915_add_request must not fail
  2015-06-23 10:16   ` Chris Wilson
@ 2015-06-23 10:47     ` John Harrison
  0 siblings, 0 replies; 120+ messages in thread
From: John Harrison @ 2015-06-23 10:47 UTC (permalink / raw)
  To: Chris Wilson, Intel-GFX

On 23/06/2015 11:16, Chris Wilson wrote:
> On Fri, May 29, 2015 at 05:43:24PM +0100, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The i915_add_request() function is called to keep track of work that has been
>> written to the ring buffer. It adds epilogue commands to track progress (seqno
>> updates and such), moves the request structure onto the right list and other
>> such house keeping tasks. However, the work itself has already been written to
>> the ring and will get executed whether or not the add request call succeeds. So
>> no matter what goes wrong, there isn't a whole lot of point in failing the call.
>>
>> At the moment, this is fine(ish). If the add request does bail early on and not
>> do the housekeeping, the request will still float around in the
>> ring->outstanding_lazy_request field and be picked up next time. It means
>> multiple pieces of work will be tagged as the same request and the driver can't
>> actually wait for the first piece of work until something else has been
>> submitted. But it all sort of hangs together.
>>
>> This patch series is all about removing the OLR and guaranteeing that each piece
>> of work gets its own personal request. That means that there is no more
>> 'hoovering up of forgotten requests'. If the request does not get tracked then
>> it will be leaked. Thus the add request call _must_ not fail. The previous patch
>> should have already ensured that it _will_ not fail by removing the potential
>> for running out of ring space. This patch enforces the rule by actually removing
>> the early exit paths and the return code.
>>
>> Note that if something does manage to fail and the epilogue commands don't get
>> written to the ring, the driver will still hang together. The request will be
>> added to the tracking lists. And as in the old case, any subsequent work will
>> generate a new seqno which will suffice for marking the old one as complete.
>>
>> v2: Improved WARNings (Tomas Elf review request).
> Nak. Daniel please revert this mess.
>
> Even in the current code it has a failure mode it cannot handle.
> -Chris
>

Can you explain further?

Is this a new failure mode? This patch was originally posted back in 
March and there were no comments then to say that is was not the right 
direction to take. The discussion was that this might not be the 
prettiest way to do things but it is the best solution given where the 
driver is at and what we are trying to do.

John.
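
The shape of the change being defended, as the commit message describes it
(a sketch, not the literal diff):

	/* Before: callers had to cope with a failure they could not
	 * undo, since the work was already written to the ring. */
	ret = __i915_add_request(ring, ...);
	if (ret) {
		/* the work executes anyway, but is now untracked */
	}

	/* After: patch 2 reserved space for the epilogue commands, so
	 * the early-exit paths and the return code are simply removed. */
	__i915_add_request(ring, ...);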

* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-22 20:12     ` Daniel Vetter
@ 2015-06-23 11:38       ` John Harrison
  2015-06-23 13:24         ` Daniel Vetter
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-23 11:38 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 22/06/2015 21:12, Daniel Vetter wrote:
> On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> It is a bad idea for i915_add_request() to fail. The work will already have been
>> sent to the ring and will be processed, but there will not be any tracking or
>> management of that work.
>>
>> The only way the add request call can fail is if it can't write its epilogue
>> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
>> reasons for that are mostly down to running out of ring buffer space and the
>> problems associated with trying to get some more. This patch prevents that
>> situation from happening in the first place.
>>
>> When a request is created, it marks sufficient space as reserved for the
>> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
>> there will be plenty of space for it. Note that a ring_begin() call is required
>> to actually reserve the space (and do any potential waiting). However, that is
>> not currently done at request creation time. This is because the ring_begin()
>> code can allocate a request. Hence calling begin() from the request allocation
>> code would lead to infinite recursion! Later patches in this series remove the
>> need for begin() to do the allocate. At that point, it becomes safe for the
>> allocate to call begin() and really reserve the space.
>>
>> Until then, there is a potential for insufficient space to be available at the
>> point of calling i915_add_request(). However, that would only be in the case
>> where the request was created and immediately submitted without ever calling
>> ring_begin() and adding any work to that request. Which should never happen. And
>> even if it does, and if that request happens to fall down the tiny window of
>> opportunity for failing due to being out of ring space then does it really
>> matter because the request wasn't doing anything in the first place?
>>
>> v2: Updated the 'reserved space too small' warning to include the offending
>> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
>> re-initialisation of tracking state after a buffer wrap to keep the sanity
>> checks accurate.
>>
>> v3: Incremented the reserved size to accommodate Ironlake (after finally
>> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>>
>> v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
>>
>> v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
>> Daniel Vetter).
> This didn't actually implement what I suggested (wrapping is the worst
> case, hence skipping the check for that is breaking the sanity check) and
> so changed the patch from "correct, but a bit fragile" to broken. I've
> merged the previous version instead.
> -Daniel
I'm confused. I thought your main issue was the early wrapping, not the
sanity check. The check is to ensure that the reservation is large 
enough to cover all the commands written during request submission. That 
should not be affected by whether a wrap occurs or not. Wrapping does 
not magically add an extra bunch of dwords to the emit_request() call. 
Whereas making the check work with the wrap condition requires adding in 
extra tracking state of exactly where the wrap occurred. That is extra 
code that only exists to catch something in the very rare case which 
should already have been caught in the very common case. I.e. if your 
reserved size is too small then you will hit the warning on every batch 
buffer submission.

John.
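
To make the disagreement concrete with the numbers from the earlier
exchange: with reserved_size = 100, an execbuf needing 160 bytes and 200
bytes left before the end of the ring, the merged (v4) logic ends up
waiting for roughly 160 + 100 = 260 bytes and so wraps immediately,
NOP-filling the remaining 200 bytes even though the 160-byte request
itself would have fitted. That guaranteed early wrap is what makes
add_request() unable to fail, at the cost of some wasted ring space. v5
avoided the early wrap but, as Daniel says above, at the price of a
sanity check that no longer covers the wrap case.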

>> For: VIZ-5115
>> CC: Tomas Elf <tomas.elf@intel.com>
>> CC: Daniel Vetter <daniel@ffwll.ch>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h         |    1 +
>>   drivers/gpu/drm/i915/i915_gem.c         |   37 ++++++++++++
>>   drivers/gpu/drm/i915/intel_lrc.c        |   35 +++++++++--
>>   drivers/gpu/drm/i915/intel_ringbuffer.c |   98 +++++++++++++++++++++++++++++--
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |   25 ++++++++
>>   5 files changed, 186 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 0347eb9..eba1857 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
>>   
>>   int i915_gem_request_alloc(struct intel_engine_cs *ring,
>>   			   struct intel_context *ctx);
>> +void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>>   void i915_gem_request_free(struct kref *req_ref);
>>   
>>   static inline uint32_t
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 81f3512..85fa27b 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
>>   	} else
>>   		ringbuf = ring->buffer;
>>   
>> +	/*
>> +	 * To ensure that this call will not fail, space for its emissions
>> +	 * should already have been reserved in the ring buffer. Let the ring
>> +	 * know that it is time to use that space up.
>> +	 */
>> +	intel_ring_reserved_space_use(ringbuf);
>> +
>>   	request_start = intel_ring_get_tail(ringbuf);
>>   	/*
>>   	 * Emit any outstanding flushes - execbuf can fail to emit the flush
>> @@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
>>   			   round_jiffies_up_relative(HZ));
>>   	intel_mark_busy(dev_priv->dev);
>>   
>> +	/* Sanity check that the reserved size was large enough. */
>> +	intel_ring_reserved_space_end(ringbuf);
>> +
>>   	return 0;
>>   }
>>   
>> @@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>>   	if (ret)
>>   		goto err;
>>   
>> +	/*
>> +	 * Reserve space in the ring buffer for all the commands required to
>> +	 * eventually emit this request. This is to guarantee that the
>> +	 * i915_add_request() call can't fail. Note that the reserve may need
>> +	 * to be redone if the request is not actually submitted straight
>> +	 * away, e.g. because a GPU scheduler has deferred it.
>> +	 *
>> +	 * Note further that this call merely notes the reserve request. A
>> +	 * subsequent call to *_ring_begin() is required to actually ensure
>> +	 * that the reservation is available. Without the begin, if the
>> +	 * request creator immediately submitted the request without adding
>> +	 * any commands to it then there might not actually be sufficient
>> +	 * room for the submission commands. Unfortunately, the current
>> +	 * *_ring_begin() implementations potentially call back here to
>> +	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
>> +	 * infinite recursion! Until that back call path is removed, it is
>> +	 * necessary to do a manual _begin() outside.
>> +	 */
>> +	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
>> +
>>   	ring->outstanding_lazy_request = req;
>>   	return 0;
>>   
>> @@ -2674,6 +2704,13 @@ err:
>>   	return ret;
>>   }
>>   
>> +void i915_gem_request_cancel(struct drm_i915_gem_request *req)
>> +{
>> +	intel_ring_reserved_space_cancel(req->ringbuf);
>> +
>> +	i915_gem_request_unreference(req);
>> +}
>> +
>>   struct drm_i915_gem_request *
>>   i915_gem_find_active_request(struct intel_engine_cs *ring)
>>   {
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 6a5ed07..bd62bd6 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -690,6 +690,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
>>   	if (intel_ring_space(ringbuf) >= bytes)
>>   		return 0;
>>   
>> +	/* The whole point of reserving space is to not wait! */
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>>   	list_for_each_entry(request, &ring->request_list, list) {
>>   		/*
>>   		 * The request queue is per-engine, so can contain requests
>> @@ -748,8 +751,12 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>>   	int rem = ringbuf->size - ringbuf->tail;
>>   
>>   	if (ringbuf->space < rem) {
>> -		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
>> +		int ret;
>> +
>> +		/* Can't wait if space has already been reserved! */
>> +		WARN_ON(ringbuf->reserved_in_use);
>>   
>> +		ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
>>   		if (ret)
>>   			return ret;
>>   	}
>> @@ -768,7 +775,7 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>>   static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>>   				struct intel_context *ctx, int bytes)
>>   {
>> -	int ret;
>> +	int ret, max_bytes;
>>   
>>   	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>>   		ret = logical_ring_wrap_buffer(ringbuf, ctx);
>> @@ -776,8 +783,28 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>>   			return ret;
>>   	}
>>   
>> -	if (unlikely(ringbuf->space < bytes)) {
>> -		ret = logical_ring_wait_for_space(ringbuf, ctx, bytes);
>> +	/*
>> +	 * Add on the reserved size to the request to make sure that after
>> +	 * the intended commands have been emitted, there is guaranteed to
>> +	 * still be enough free space to send them to the hardware.
>> +	 */
>> +	max_bytes = bytes + ringbuf->reserved_size;
>> +
>> +	if (unlikely(ringbuf->space < max_bytes)) {
>> +		/*
>> +		 * Bytes is guaranteed to fit within the tail of the buffer,
>> +		 * but the reserved space may push it off the end. If so then
>> +		 * need to wait for the whole of the tail plus the reserved
>> +		 * size. That should guarantee that the actual request
>> +		 * (bytes) will fit between here and the end and the reserved
>> +		 * usage will fit either in the same or at the start. Either
>> +		 * way, if a wrap occurs it will not involve a wait and thus
>> +		 * cannot fail.
>> +		 */
>> +		if (unlikely(ringbuf->tail + max_bytes + I915_RING_FREE_SPACE > ringbuf->effective_size))
>> +			max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
>> +
>> +		ret = logical_ring_wait_for_space(ringbuf, ctx, max_bytes);
>>   		if (unlikely(ret))
>>   			return ret;
>>   	}
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index d934f85..1c125e9 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2106,6 +2106,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>>   	if (intel_ring_space(ringbuf) >= n)
>>   		return 0;
>>   
>> +	/* The whole point of reserving space is to not wait! */
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>>   	list_for_each_entry(request, &ring->request_list, list) {
>>   		space = __intel_ring_space(request->postfix, ringbuf->tail,
>>   					   ringbuf->size);
>> @@ -2131,7 +2134,12 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>>   	int rem = ringbuf->size - ringbuf->tail;
>>   
>>   	if (ringbuf->space < rem) {
>> -		int ret = ring_wait_for_space(ring, rem);
>> +		int ret;
>> +
>> +		/* Can't wait if space has already been reserved! */
>> +		WARN_ON(ringbuf->reserved_in_use);
>> +
>> +		ret = ring_wait_for_space(ring, rem);
>>   		if (ret)
>>   			return ret;
>>   	}
>> @@ -2180,11 +2188,69 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>>   	return 0;
>>   }
>>   
>> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
>> -				int bytes)
>> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
>> +{
>> +	/* NB: Until request management is fully tidied up and the OLR is
>> +	 * removed, there are too many ways to get false hits on this
>> +	 * anti-recursion check! */
>> +	/*WARN_ON(ringbuf->reserved_size);*/
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>> +	ringbuf->reserved_size = size;
>> +
>> +	/*
>> +	 * Really need to call _begin() here but that currently leads to
>> +	 * recursion problems! This will be fixed later but for now just
>> +	 * return and hope for the best. Note that there is only a real
>> +	 * problem if the create of the request never actually calls _begin()
>> +	 * but if they are not submitting any work then why did they create
>> +	 * the request in the first place?
>> +	 */
>> +}
>> +
>> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
>> +{
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>> +	ringbuf->reserved_size   = 0;
>> +	ringbuf->reserved_in_use = false;
>> +}
>> +
>> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
>> +{
>> +	WARN_ON(ringbuf->reserved_in_use);
>> +
>> +	ringbuf->reserved_in_use = true;
>> +	ringbuf->reserved_tail   = ringbuf->tail;
>> +}
>> +
>> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
>> +{
>> +	WARN_ON(!ringbuf->reserved_in_use);
>> +	if (ringbuf->tail > ringbuf->reserved_tail) {
>> +		WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
>> +		     "request reserved size too small: %d vs %d!\n",
>> +		     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
>> +	} else {
>> +		/*
>> +		 * The ring was wrapped while the reserved space was in use.
>> +		 * That means that some unknown amount of the ring tail was
>> +		 * no-op filled and skipped. Thus simply adding the ring size
>> +		 * to the tail and doing the above space check will not work.
>> +		 * Rather than attempt to track how much tail was skipped,
>> +		 * it is much simpler to say that also skipping the sanity
>> +		 * check every once in a while is not a big issue.
>> +		 */
>> +	}
>> +
>> +	ringbuf->reserved_size   = 0;
>> +	ringbuf->reserved_in_use = false;
>> +}
>> +
>> +static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
>>   {
>>   	struct intel_ringbuffer *ringbuf = ring->buffer;
>> -	int ret;
>> +	int ret, max_bytes;
>>   
>>   	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>>   		ret = intel_wrap_ring_buffer(ring);
>> @@ -2192,8 +2258,28 @@ static int __intel_ring_prepare(struct intel_engine_cs *ring,
>>   			return ret;
>>   	}
>>   
>> -	if (unlikely(ringbuf->space < bytes)) {
>> -		ret = ring_wait_for_space(ring, bytes);
>> +	/*
>> +	 * Add on the reserved size to the request to make sure that after
>> +	 * the intended commands have been emitted, there is guaranteed to
>> +	 * still be enough free space to send them to the hardware.
>> +	 */
>> +	max_bytes = bytes + ringbuf->reserved_size;
>> +
>> +	if (unlikely(ringbuf->space < max_bytes)) {
>> +		/*
>> +		 * Bytes is guaranteed to fit within the tail of the buffer,
>> +		 * but the reserved space may push it off the end. If so then
>> +		 * need to wait for the whole of the tail plus the reserved
>> +		 * size. That should guarantee that the actual request
>> +		 * (bytes) will fit between here and the end and the reserved
>> +		 * usage will fit either in the same or at the start. Either
>> +		 * way, if a wrap occurs it will not involve a wait and thus
>> +		 * cannot fail.
>> +		 */
>> +		if (unlikely(ringbuf->tail + max_bytes > ringbuf->effective_size))
>> +			max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
>> +
>> +		ret = ring_wait_for_space(ring, max_bytes);
>>   		if (unlikely(ret))
>>   			return ret;
>>   	}
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index 39f6dfc..bf2ac28 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -105,6 +105,9 @@ struct intel_ringbuffer {
>>   	int space;
>>   	int size;
>>   	int effective_size;
>> +	int reserved_size;
>> +	int reserved_tail;
>> +	bool reserved_in_use;
>>   
>>   	/** We track the position of the requests in the ring buffer, and
>>   	 * when each is retired we increment last_retired_head as the GPU
>> @@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
>>   	return ring->outstanding_lazy_request;
>>   }
>>   
>> +/*
>> + * Arbitrary size for largest possible 'add request' sequence. The code paths
>> + * are complex and variable. Empirical measurement shows that the worst case
>> + * is ILK at 136 words. Reserving too much is better than reserving too little
>> + * as that allows for corner cases that might have been missed. So the figure
>> + * has been rounded up to 160 words.
>> + */
>> +#define MIN_SPACE_FOR_ADD_REQUEST	160
>> +
>> +/*
>> + * Reserve space in the ring to guarantee that the i915_add_request() call
>> + * will always have sufficient room to do its stuff. The request creation
>> + * code calls this automatically.
>> + */
>> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
>> +/* Cancel the reservation, e.g. because the request is being discarded. */
>> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
>> +/* Use the reserved space - for use by i915_add_request() only. */
>> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
>> +/* Finish with the reserved space - for use by i915_add_request() only. */
>> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
>> +
>>   #endif /* _INTEL_RINGBUFFER_H_ */
>> -- 
>> 1.7.9.5
>>
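
For reference, a minimal sketch of how a caller is expected to use the
new reserve/cancel API (signatures as in the diff above; the four
MI_NOOPs and the error handling are purely illustrative, and at this
point in the series intel_ring_begin() still takes the ring rather than
the request):

	struct drm_i915_gem_request *req;
	int ret;

	/* Allocation also reserves space for the add_request() epilogue. */
	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
	if (ret)
		return ret;

	ret = intel_ring_begin(ring, 4);
	if (ret) {
		/* Abandoning the request must release the reservation. */
		i915_gem_request_cancel(req);
		return ret;
	}

	intel_ring_emit(ring, MI_NOOP);
	intel_ring_emit(ring, MI_NOOP);
	intel_ring_emit(ring, MI_NOOP);
	intel_ring_emit(ring, MI_NOOP);
	intel_ring_advance(ring);

	/* The reservation guarantees the epilogue commands will fit. */
	i915_add_request(ring);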


* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-23 11:38       ` John Harrison
@ 2015-06-23 13:24         ` Daniel Vetter
  2015-06-23 15:43           ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-23 13:24 UTC (permalink / raw)
  To: John Harrison; +Cc: Intel-GFX

On Tue, Jun 23, 2015 at 12:38:01PM +0100, John Harrison wrote:
> On 22/06/2015 21:12, Daniel Vetter wrote:
> >On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
> >>From: John Harrison <John.C.Harrison@Intel.com>
> >>
> >>It is a bad idea for i915_add_request() to fail. The work will already have been
> >>sent to the ring and will be processed, but there will not be any tracking or
> >>management of that work.
> >>
> >>The only way the add request call can fail is if it can't write its epilogue
> >>commands to the ring (cache flushing, seqno updates, interrupt signalling). The
> >>reasons for that are mostly down to running out of ring buffer space and the
> >>problems associated with trying to get some more. This patch prevents that
> >>situation from happening in the first place.
> >>
> >>When a request is created, it marks sufficient space as reserved for the
> >>epilogue commands. Thus guaranteeing that by the time the epilogue is written,
> >>there will be plenty of space for it. Note that a ring_begin() call is required
> >>to actually reserve the space (and do any potential waiting). However, that is
> >>not currently done at request creation time. This is because the ring_begin()
> >>code can allocate a request. Hence calling begin() from the request allocation
> >>code would lead to infinite recursion! Later patches in this series remove the
> >>need for begin() to do the allocate. At that point, it becomes safe for the
> >>allocate to call begin() and really reserve the space.
> >>
> >>Until then, there is a potential for insufficient space to be available at the
> >>point of calling i915_add_request(). However, that would only be in the case
> >>where the request was created and immediately submitted without ever calling
> >>ring_begin() and adding any work to that request. Which should never happen. And
> >>even if it does, and if that request happens to fall down the tiny window of
> >>opportunity for failing due to being out of ring space then does it really
> >>matter because the request wasn't doing anything in the first place?
> >>
> >>v2: Updated the 'reserved space too small' warning to include the offending
> >>sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
> >>re-initialisation of tracking state after a buffer wrap to keep the sanity
> >>checks accurate.
> >>
> >>v3: Incremented the reserved size to accommodate Ironlake (after finally
> >>managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
> >>
> >>v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
> >>
> >>v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
> >>Daniel Vetter).
> >This didn't actually implement what I suggested (wrapping is the worst
> >case, hence skipping the check for that is breaking the sanity check) and
> >so changed the patch from "correct, but a bit fragile" to broken. I've
> >merged the previous version instead.
> >-Daniel
> I'm confused. I thought your main issue was the early wrapping not the
> sanity check. The check is to ensure that the reservation is large enough to
> cover all the commands written during request submission. That should not be
> affected by whether a wrap occurs or not. Wrapping does not magically add an
> extra bunch of dwords to the emit_request() call. Whereas making the check
> work with the wrap condition requires adding in extra tracking state of
> exactly where the wrap occurred. That is extra code that only exists to
> catch something in the very rare case which should already have been caught
> in the very common case. I.e. if your reserved size is too small then you
> will hit the warning on every batch buffer submission.

The problem is that if you allow a wrap in the reserve size then the
ringspace requirements are bigger than if you don't wrap. And since the
add request is split up into many intel_ring_begin that's possible. Hence
if you allow wrapping in the reserved space, then the most important case
for the debug check is to make sure that it catches any kind of
reservation overflow while wrapping. The not-wrapped case is probably the
boring one.

And indeed eventually we should overflow since according to your comment
the worst case add request on ilk is 136 dwords. And the largest
intel_ring_begin in there is 32 dwords, which means at most we'll throw
away 31 dwords when wrapping. Which means the 160 dwords of reservation
are not enough since we'd need 167 dwords of space for the worst case. But
since the space_end debug check was a no-op for the wrapped case you won't
catch this one.

Wrt keeping track of wrapping while the reservation is in use, the
following should do that without any need of additional tracking:


	int used_size = ringbuf->tail - ringbuf->reserved_tail;

	if (used_size < 0)
		used_size += ringbuf->size;

	WARN(used_size > ringbuf->reserved_size,
	     "request reserved size too small: %d vs %d!\n",
	     used_size, ringbuf->reserved_size);

I was mistaken that you can reuse __intel_ring_space (since that has
slightly different requirements), but this gives you a nicely localized
check for reservation overflow which works even when you wrap. Ofc it
won't work if an add_request is bigger than the entire ring, but that's
impossible anyway since we can at most reserve ringbuf->size -
I915_RING_FREE_SPACE.

Or do I still miss something?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 46/55] drm/i915: Update intel_ring_begin() to take a request structure
  2015-06-23 10:37     ` John Harrison
@ 2015-06-23 13:25       ` Daniel Vetter
  2015-06-23 15:27         ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-23 13:25 UTC (permalink / raw)
  To: John Harrison; +Cc: Intel-GFX

On Tue, Jun 23, 2015 at 11:37:53AM +0100, John Harrison wrote:
> On 23/06/2015 11:24, Chris Wilson wrote:
> >On Fri, May 29, 2015 at 05:44:07PM +0100, John.C.Harrison@Intel.com wrote:
> >>-int intel_ring_begin(struct intel_engine_cs *ring,
> >>+int intel_ring_begin(struct drm_i915_gem_request *req,
> >>  		     int num_dwords)
> >>  {
> >>-	struct drm_i915_gem_request *req;
> >>-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >>+	struct intel_engine_cs *ring;
> >>+	struct drm_i915_private *dev_priv;
> >>  	int ret;
> >>+	WARN_ON(req == NULL);
> >>+	ring = req->ring;
> >What was the point?
> >-Chris
> >
> 
> The point is to remove the OLR. The significant change within
> intel_ring_begin is the next few lines:
> 
> -	/* Preallocate the olr before touching the ring */
> -	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
> 
> 
> That is the part that causes problems by randomly creating a brand new
> request that no-one knows about and squirreling it away in the OLR to scoop
> up random bits of work. This is the whole point of the entire patch series -
> to ensure that all ring work is assigned to a known request by whoever
> instigated that work.
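
(A hedged sketch of the behavioural difference, with the new signature
taken from the hunk quoted above and everything else abbreviated:)

	/* Before: begin() may silently allocate a floating OLR that
	 * scoops up whatever work follows. */
	ret = intel_ring_begin(ring, 4);

	/* After: the caller creates and owns the request explicitly,
	 * so all subsequent emission is tied to a known request. */
	ret = i915_gem_request_alloc(ring, ctx, &req);
	if (ret == 0)
		ret = intel_ring_begin(req, 4);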

I think the point was that a WARN_ON(pointer) followed by an immediate
deref of said pointer doesn't add value. This is the one case where a
BUG_ON is the right choice. Or just let the kernel oops on the NULL deref
without warning first that it'll happen.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 46/55] drm/i915: Update intel_ring_begin() to take a request structure
  2015-06-23 13:25       ` Daniel Vetter
@ 2015-06-23 15:27         ` John Harrison
  2015-06-23 15:34           ` Daniel Vetter
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-23 15:27 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 23/06/2015 14:25, Daniel Vetter wrote:
> On Tue, Jun 23, 2015 at 11:37:53AM +0100, John Harrison wrote:
>> On 23/06/2015 11:24, Chris Wilson wrote:
>>> On Fri, May 29, 2015 at 05:44:07PM +0100, John.C.Harrison@Intel.com wrote:
>>>> -int intel_ring_begin(struct intel_engine_cs *ring,
>>>> +int intel_ring_begin(struct drm_i915_gem_request *req,
>>>>   		     int num_dwords)
>>>>   {
>>>> -	struct drm_i915_gem_request *req;
>>>> -	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>>>> +	struct intel_engine_cs *ring;
>>>> +	struct drm_i915_private *dev_priv;
>>>>   	int ret;
>>>> +	WARN_ON(req == NULL);
>>>> +	ring = req->ring;
>>> What was the point?
>>> -Chris
>>>
>> The point is to remove the OLR. The significant change within
>> intel_ring_begin is the next few lines:
>>
>> -	/* Preallocate the olr before touching the ring */
>> -	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
>>
>>
>> That is the part that causes problems by randomly creating a brand new
>> request that no-one knows about and squirreling it away in the OLR to scoop
>> up random bits of work. This is the whole point of the entire patch series -
>> to ensure that all ring work is assigned to a known request by whoever
>> instigated that work.
> I think the point was that a WARN_ON(pointer) followed by an immediate
> deref of said pointer doesn't add value. This is the one case where a
> BUG_ON is the right choice. Or just let the kernel oops on the NULL deref
> without warning first that it'll happen.
> -Daniel
I thought the edict from yourself was that BUG_ONs should never be used; 
it should always be a WARN_ON, and if the driver oopses afterwards then 
so be it. The WARN_ON does add value in that it gives you a line number 
(and file) which in turn gives you exactly what variable was null. 
Whereas the oops just gives you an offset into a function. Unless you 
happen to have the exact same binary that generated the bug report, the 
chances of identifying what was null can be slim.

John.


* Re: [PATCH 46/55] drm/i915: Update intel_ring_begin() to take a request structure
  2015-06-23 15:27         ` John Harrison
@ 2015-06-23 15:34           ` Daniel Vetter
  0 siblings, 0 replies; 120+ messages in thread
From: Daniel Vetter @ 2015-06-23 15:34 UTC (permalink / raw)
  To: John Harrison; +Cc: Intel-GFX

On Tue, Jun 23, 2015 at 04:27:45PM +0100, John Harrison wrote:
> On 23/06/2015 14:25, Daniel Vetter wrote:
> >On Tue, Jun 23, 2015 at 11:37:53AM +0100, John Harrison wrote:
> >>On 23/06/2015 11:24, Chris Wilson wrote:
> >>>On Fri, May 29, 2015 at 05:44:07PM +0100, John.C.Harrison@Intel.com wrote:
> >>>>-int intel_ring_begin(struct intel_engine_cs *ring,
> >>>>+int intel_ring_begin(struct drm_i915_gem_request *req,
> >>>>  		     int num_dwords)
> >>>>  {
> >>>>-	struct drm_i915_gem_request *req;
> >>>>-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >>>>+	struct intel_engine_cs *ring;
> >>>>+	struct drm_i915_private *dev_priv;
> >>>>  	int ret;
> >>>>+	WARN_ON(req == NULL);
> >>>>+	ring = req->ring;
> >>>What was the point?
> >>>-Chris
> >>>
> >>The point is to remove the OLR. The significant change within
> >>intel_ring_begin is the next few lines:
> >>
> >>-	/* Preallocate the olr before touching the ring */
> >>-	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
> >>
> >>
> >>That is the part that causes problems by randomly creating a brand new
> >>request that no-one knows about and squirreling it away in the OLR to scoop
> >>up random bits of work. This is the whole point of the entire patch series -
> >>to ensure that all ring work is assigned to a known request by whoever
> >>instigated that work.
> >I think the point was that a WARN_ON(pointer) followed by an immediate
> >deref of said pointer doesn't add value. This is the one case where a
> >BUG_ON is the right choice. Or just let the kernel oops on the NULL deref
> >without warning first that it'll happen.
> >-Daniel
> I thought the edict from yourself was that BUG_ONs should never be used; it
> should always be a WARN_ON, and if the driver oopses afterwards then so be
> it. The WARN_ON does add value in that it gives you a line number (and file)
> which in turn gives you exactly what variable was null. Whereas the oops
> just gives you an offset into a function. Unless you happen to have the
> exact same binary that generated the bug report, the chances of identifying
> what was null can be slim.

I might have gone over the top with screaming against WARN_ONs, so let me
clarify: by default use WARN_ON, since if there's a decent chance that the
driver can limp along, that greatly improves the chances that devs or users
can get useful backtraces out of a dying machine.

But if you can prove that the driver will oops anyway you can add a BUG_ON
to improve debug output for these cases.

Personally I'm on the fence over whether an if (WARN_ON()) return
-EINVAL is better or not; I leave that to patch authors. But WARN_ON +
Oops is a bit too much.
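
For reference, a quick sketch of the two options being weighed (the
null check mirrors the hunk quoted above; the error code is
illustrative only):

	/* Option 1: die immediately with a clear message. */
	BUG_ON(req == NULL);

	/* Option 2: warn with file/line and bail out so the driver
	 * can limp along. */
	if (WARN_ON(req == NULL))
		return -EINVAL;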
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-23 13:24         ` Daniel Vetter
@ 2015-06-23 15:43           ` John Harrison
  2015-06-23 20:00             ` Daniel Vetter
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-23 15:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 23/06/2015 14:24, Daniel Vetter wrote:
> On Tue, Jun 23, 2015 at 12:38:01PM +0100, John Harrison wrote:
>> On 22/06/2015 21:12, Daniel Vetter wrote:
>>> On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>
>>>> It is a bad idea for i915_add_request() to fail. The work will already have been
> >>>>sent to the ring and will be processed, but there will not be any tracking or
>>>> management of that work.
>>>>
>>>> The only way the add request call can fail is if it can't write its epilogue
>>>> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
>>>> reasons for that are mostly down to running out of ring buffer space and the
>>>> problems associated with trying to get some more. This patch prevents that
>>>> situation from happening in the first place.
>>>>
>>>> When a request is created, it marks sufficient space as reserved for the
>>>> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
>>>> there will be plenty of space for it. Note that a ring_begin() call is required
>>>> to actually reserve the space (and do any potential waiting). However, that is
>>>> not currently done at request creation time. This is because the ring_begin()
>>>> code can allocate a request. Hence calling begin() from the request allocation
>>>> code would lead to infinite recursion! Later patches in this series remove the
>>>> need for begin() to do the allocate. At that point, it becomes safe for the
>>>> allocate to call begin() and really reserve the space.
>>>>
>>>> Until then, there is a potential for insufficient space to be available at the
>>>> point of calling i915_add_request(). However, that would only be in the case
>>>> where the request was created and immediately submitted without ever calling
>>>> ring_begin() and adding any work to that request. Which should never happen. And
>>>> even if it does, and if that request happens to fall down the tiny window of
>>>> opportunity for failing due to being out of ring space then does it really
>>>> matter because the request wasn't doing anything in the first place?
>>>>
>>>> v2: Updated the 'reserved space too small' warning to include the offending
>>>> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
>>>> re-initialisation of tracking state after a buffer wrap to keep the sanity
>>>> checks accurate.
>>>>
>>>> v3: Incremented the reserved size to accommodate Ironlake (after finally
>>>> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>>>>
>>>> v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
>>>>
>>>> v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
>>>> Daniel Vetter).
>>> This didn't actually implement what I suggested (wrapping is the worst
>>> case, hence skipping the check for that is breaking the sanity check) and
>>> so changed the patch from "correct, but a bit fragile" to broken. I've
>>> merged the previous version instead.
>>> -Daniel
>> I'm confused. I thought your main issue was the early wrapping not the
>> sanity check. The check is to ensure that the reservation is large enough to
>> cover all the commands written during request submission. That should not be
>> affected by whether a wrap occurs or not. Wrapping does not magically add an
>> extra bunch of dwords to the emit_request() call. Whereas making the check
>> work with the wrap condition requires adding in extra tracking state of
>> exactly where the wrap occurred. That is extra code that only exists to
>> catch something in the very rare case which should already have been caught
>> in the very common case. I.e. if your reserved size is too small then you
>> will hit the warning on every batch buffer submission.
> The problem is that if you allow a wrap in the reserve size then the
> ringspace requirements are bigger than if you don't wrap. And since the
> add request is split up into many intel_ring_begin that's possible. Hence
> if you allow wrapping in the reserved space, then the most important case
> for the debug check is to make sure that it catches any kind of
> reservation overflow while wrapping. The not-wrapped case is probably the
> boring one.
>
> And indeed eventually we should overflow since according to your comment
> the worst case add request on ilk is 136 dwords. And the largest
> intel_ring_begin in there is 32 dwords, which means at most we'll throw
> away 31 dwords when wrapping. Which means the 160 dwords of reservation
> are not enough since we'd need 167 dwords of space for the worst case. But
> since the space_end debug check was a no-op for the wrapped case you won't
> catch this one.

The minimum reservation size in this case is still only 136. The prepare 
code checks for the 32 words actually requested and wraps if necessary. 
It then checks for 136+32 words of space. If that would cause a wrap it 
will then add on the amount of space actually left in the ring and wait 
for that bigger total. That guarantees that it has waited for the 136 at 
the start of the ring. The caller is then free to fill in the 32 words 
and there is still guaranteed to be a minimum of 136 words available 
(with or without wrapping) before any further wait for space is 
necessary. Thus the add_request() code is safe from fear of failure 
irrespective of where any wrap might occur.
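
As a concrete illustration (invented numbers, counting in words):

	/*
	 * Ring size = 1024, effective size ~1016, tail = 1000,
	 * request = 32 words, reserve = 136 words.
	 *
	 * 1. 1000 + 32 runs past the effective size, so wrap first:
	 *    wait for the 24 remaining words if needed, NOP-fill them
	 *    and set tail = 0.
	 * 2. Now check for 32 + 136 = 168 words of space. If that
	 *    total would itself run past the end (say tail were 900),
	 *    wait instead for the reserve plus everything up to the
	 *    end of the ring, so that the 136 are known to be free at
	 *    the start after any later wrap.
	 * 3. The caller emits its 32 words; at least 136 words then
	 *    remain available, wrapped or not, without another wait.
	 */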


>
> Wrt keeping track of wrapping while the reservation is in use, the
> following should do that without any need of additional tracking:
>
>
> 	int used_size = ringbuf->tail - ringbuf->reserved_tail;
>
> 	if (used_size < 0)
> 		used_size += ringbuf->size;
>
> 	WARN(used_size < ringbuf->reserved_size,
> 	     "request reserved size too small: %d vs %d!\n",
> 	     used_size, ringbuf->reserved_size);
>
> I was mistaken that you can reuse __intel_ring_space (since that has
> slightly different requirements), but this gives you a nicely localized
> check for reservation overflow which works even when you wrap. Ofc it
> won't work if an add_request is bigger than the entire ring, but that's
> impossible anyway since we can at most reserve ringbuf->size -
> I915_RING_FREE_SPACE.
The problem with the above calculation is that it includes the wasted 
space at the end of the ring. Thus it will complain the reserved size 
was too small when in fact it was just fine.
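
(An invented example of that false positive: reserved_size = 136 and 
reserved_tail = 1000 in a 1024-word ring. The wrap NOP-fills 24 words 
and the request then genuinely uses only 120 words, giving tail = 120 
and used_size = 120 - 1000 + 1024 = 144 > 136, so the WARN fires even 
though the reservation was respected.)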


> Or do I still miss something?
> -Daniel


* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-23 15:43           ` John Harrison
@ 2015-06-23 20:00             ` Daniel Vetter
  2015-06-24 12:18               ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-23 20:00 UTC (permalink / raw)
  To: John Harrison; +Cc: Intel-GFX

On Tue, Jun 23, 2015 at 04:43:24PM +0100, John Harrison wrote:
> On 23/06/2015 14:24, Daniel Vetter wrote:
> >On Tue, Jun 23, 2015 at 12:38:01PM +0100, John Harrison wrote:
> >>On 22/06/2015 21:12, Daniel Vetter wrote:
> >>>On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
> >>>>From: John Harrison <John.C.Harrison@Intel.com>
> >>>>
> >>>>It is a bad idea for i915_add_request() to fail. The work will already have been
> >>>>sent to the ring and will be processed, but there will not be any tracking or
> >>>>management of that work.
> >>>>
> >>>>The only way the add request call can fail is if it can't write its epilogue
> >>>>commands to the ring (cache flushing, seqno updates, interrupt signalling). The
> >>>>reasons for that are mostly down to running out of ring buffer space and the
> >>>>problems associated with trying to get some more. This patch prevents that
> >>>>situation from happening in the first place.
> >>>>
> >>>>When a request is created, it marks sufficient space as reserved for the
> >>>>epilogue commands. Thus guaranteeing that by the time the epilogue is written,
> >>>>there will be plenty of space for it. Note that a ring_begin() call is required
> >>>>to actually reserve the space (and do any potential waiting). However, that is
> >>>>not currently done at request creation time. This is because the ring_begin()
> >>>>code can allocate a request. Hence calling begin() from the request allocation
> >>>>code would lead to infinite recursion! Later patches in this series remove the
> >>>>need for begin() to do the allocate. At that point, it becomes safe for the
> >>>>allocate to call begin() and really reserve the space.
> >>>>
> >>>>Until then, there is a potential for insufficient space to be available at the
> >>>>point of calling i915_add_request(). However, that would only be in the case
> >>>>where the request was created and immediately submitted without ever calling
> >>>>ring_begin() and adding any work to that request. Which should never happen. And
> >>>>even if it does, and if that request happens to fall down the tiny window of
> >>>>opportunity for failing due to being out of ring space then does it really
> >>>>matter because the request wasn't doing anything in the first place?
> >>>>
> >>>>v2: Updated the 'reserved space too small' warning to include the offending
> >>>>sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
> >>>>re-initialisation of tracking state after a buffer wrap to keep the sanity
> >>>>checks accurate.
> >>>>
> >>>>v3: Incremented the reserved size to accommodate Ironlake (after finally
> >>>>managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
> >>>>
> >>>>v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
> >>>>
> >>>>v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
> >>>>Daniel Vetter).
> >>>This didn't actually implement what I suggested (wrapping is the worst
> >>>case, hence skipping the check for that is breaking the sanity check) and
> >>>so changed the patch from "correct, but a bit fragile" to broken. I've
> >>>merged the previous version instead.
> >>>-Daniel
> >>I'm confused. I thought your main issue was the early wrapping not the
> >>sanity check. The check is to ensure that the reservation is large enough to
> >>cover all the commands written during request submission. That should not be
> >>affected by whether a wrap occurs or not. Wrapping does not magically add an
> >>extra bunch of dwords to the emit_request() call. Whereas making the check
> >>work with the wrap condition requires adding in extra tracking state of
> >>exactly where the wrap occurred. That is extra code that only exists to
> >>catch something in the very rare case which should already have been caught
> >>in the very common case. I.e. if your reserved size is too small then you
> >>will hit the warning on every batch buffer submission.
> >The problem is that if you allow a wrap in the reserve size then the
> >ringspace requirements are bigger than if you don't wrap. And since the
> >add request is split up into many intel_ring_begin that's possible. Hence
> >if you allow wrapping in the reserved space, then the most important case
> >for the debug check is to make sure that it catches any kind of
> >reservation overflow while wrapping. The not-wrapped case is probably the
> >boring one.
> >
> >And indeed eventually we should overflow since according to your comment
> >the worst case add request on ilk is 136 dwords. And the largest
> >intel_ring_begin in there is 32 dwords, which means at most we'll throw
> >away 31 dwords when wrapping. Which means the 160 dwords of reservation
> >are not enough since we'd need 167 dwords of space for the worst case. But
> >since the space_end debug check was a no-op for the wrapped case you won't
> >catch this one.
> 
> The minimum reservation size in this case is still only 136. The prepare
> code checks for the 32 words actually requested and wraps if necessary. It
> then checks for 136+32 words of space. If that would cause a wrap it will
> then add on the amount of space actually left in the ring and wait for that
> bigger total. That guarantees that it has waited for the 136 at the start of
> the ring. The caller is then free to fill in the 32 words and there is still
> guaranteed to be a minimum of 136 words available (with or without wrapping)
> before any further wait for space is necessary. Thus the add_request() code
> is safe from fear of failure irrespective of where any wrap might occur.
> 
> 
> >
> >Wrt keeping track of wrapping while the reservation is in use, the
> >following should do that without any need of additional tracking:
> >
> >
> >	int used_size = ringbuf->tail - ringbuf->reserved_tail;
> >
> >	if (used_size < 0)
> >		used_size += ringbuf->size;
> >
> >	WARN(used_size > ringbuf->reserved_size,
> >	     "request reserved size too small: %d vs %d!\n",
> >	     used_size, ringbuf->reserved_size);
> >
> >I was mistaken that you can reuse __intel_ring_space (since that has
> >slightly different requirements), but this gives you a nicely localized
> >check for reservation overflow which works even when you wrap. Ofc it
> >won't work if an add_request is bigger than the entire ring, but that's
> >impossible anyway since we can at most reserve ringbuf->size -
> >I915_RING_FREE_SPACE.
> The problem with the above calculation is that it includes the wasted space
> at the end of the ring. Thus it will complain the reserved size was too
> small when in fact it was just fine.

Ok I again misunderstood your patch a bit since it didn't quite do what I
expected, and I stand corrected that v5 works too. But I still seem to fail
to get my main concern across. I'll see whether I can whip up a patch as a
short demonstration; maybe that helps to unconfuse this discussion.

For now I think we're covered with either v4 or v5 so sticking with either
is ok with me.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-23 20:00             ` Daniel Vetter
@ 2015-06-24 12:18               ` John Harrison
  2015-06-24 12:45                 ` Daniel Vetter
  0 siblings, 1 reply; 120+ messages in thread
From: John Harrison @ 2015-06-24 12:18 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 23/06/2015 21:00, Daniel Vetter wrote:
> On Tue, Jun 23, 2015 at 04:43:24PM +0100, John Harrison wrote:
>> On 23/06/2015 14:24, Daniel Vetter wrote:
>>> On Tue, Jun 23, 2015 at 12:38:01PM +0100, John Harrison wrote:
>>>> On 22/06/2015 21:12, Daniel Vetter wrote:
>>>>> On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>
>>>>>> It is a bad idea for i915_add_request() to fail. The work will already have been
>>>>>> sent to the ring and will be processed, but there will not be any tracking or
>>>>>> management of that work.
>>>>>>
>>>>>> The only way the add request call can fail is if it can't write its epilogue
>>>>>> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
>>>>>> reasons for that are mostly down to running out of ring buffer space and the
>>>>>> problems associated with trying to get some more. This patch prevents that
>>>>>> situation from happening in the first place.
>>>>>>
>>>>>> When a request is created, it marks sufficient space as reserved for the
>>>>>> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
>>>>>> there will be plenty of space for it. Note that a ring_begin() call is required
>>>>>> to actually reserve the space (and do any potential waiting). However, that is
>>>>>> not currently done at request creation time. This is because the ring_begin()
>>>>>> code can allocate a request. Hence calling begin() from the request allocation
>>>>>> code would lead to infinite recursion! Later patches in this series remove the
>>>>>> need for begin() to do the allocate. At that point, it becomes safe for the
>>>>>> allocate to call begin() and really reserve the space.
>>>>>>
>>>>>> Until then, there is a potential for insufficient space to be available at the
>>>>>> point of calling i915_add_request(). However, that would only be in the case
>>>>>> where the request was created and immediately submitted without ever calling
>>>>>> ring_begin() and adding any work to that request. Which should never happen. And
>>>>>> even if it does, and if that request happens to fall down the tiny window of
>>>>>> opportunity for failing due to being out of ring space then does it really
>>>>>> matter because the request wasn't doing anything in the first place?
>>>>>>
>>>>>> v2: Updated the 'reserved space too small' warning to include the offending
>>>>>> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
>>>>>> re-initialisation of tracking state after a buffer wrap to keep the sanity
>>>>>> checks accurate.
>>>>>>
>>>>>> v3: Incremented the reserved size to accommodate Ironlake (after finally
>>>>>> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>>>>>>
>>>>>> v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
>>>>>>
>>>>>> v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
>>>>>> Daniel Vetter).
>>>>> This didn't actually implement what I suggested (wrapping is the worst
>>>>> case, hence skipping the check for that is breaking the sanity check) and
>>>>> so changed the patch from "correct, but a bit fragile" to broken. I've
>>>>> merged the previous version instead.
>>>>> -Daniel
>>>> I'm confused. I thought your main issue was the early wrapping not the
>>>> sanity check. The check is to ensure that the reservation is large enough to
>>>> cover all the commands written during request submission. That should not be
>>>> affected by whether a wrap occurs or not. Wrapping does not magically add an
>>>> extra bunch of dwords to the emit_request() call. Whereas making the check
>>>> work with the wrap condition requires adding in extra tracking state of
>>>> exactly where the wrap occurred. That is extra code that only exists to
>>>> catch something in the very rare case which should already have been caught
>>>> in the very common case. I.e. if your reserved size is too small then you
>>>> will hit the warning on every batch buffer submission.
>>> The problem is that if you allow a wrap in the reserve size then the
>>> ringspace requirements are bigger than if you don't wrap. And since the
>>> add request is split up into many intel_ring_begin that's possible. Hence
>>> if you allow wrapping in the reserved space, then the most important case
>>> for the debug check is to make sure that it catches any kind of
>>> reservation overflow while wrapping. The not-wrapped case is probably the
>>> boring one.
>>>
>>> And indeed eventually we should overflow since according to your comment
>>> the worst case add request on ilk is 136 dwords. And the largest
>>> intel_ring_begin in there is 32 dwords, which means at most we'll throw
>>> away 31 dwords when wrapping. Which means the 160 dwords of reservation
>>> are not enough since we'd need 167 dwords of space for the worst case. But
>>> since the space_end debug check was a no-op for the wrapped case you won't
>>> catch this one.
>> The minimum reservation size in this case is still only 136. The prepare
>> code checks for the 32 words actually requested and wraps if necessary. It
>> then checks for 136+32 words of space. If that would cause a wrap it will
>> then add on the amount of space actually left in the ring and wait for that
>> bigger total. That guarantees that it has waited for the 136 at the start of
>> the ring. The caller is then free to fill in the 32 words and there is still
>> guaranteed to be a minimum of 136 words available (with or without wrapping)
>> before any further wait for space is necessary. Thus the add_request() code
>> is safe from fear of failure irrespective of where any wrap might occur.
>>
>>
>>> Wrt keeping track of wrapping while the reservation is in use, the
>>> following should do that without any need of additional tracking:
>>>
>>>
>>> 	int used_size = ringbuf->tail - ringbuf->reserved_tail;
>>>
>>> 	if (used_size < 0)
>>> 		used_size += ringbuf->size;
>>>
>>> 	WARN(used_size > ringbuf->reserved_size,
>>> 	     "request reserved size too small: %d vs %d!\n",
>>> 	     used_size, ringbuf->reserved_size);
>>>
>>> I was mistaken that you can reuse __intel_ring_space (since that has
>>> slightly different requirements), but this gives you a nicely localized
>>> check for reservation overflow which works even when you wrap. Ofc it
>>> won't work if an add_request is bigger than the entire ring, but that's
>>> impossible anyway since we can at most reserve ringbuf->size -
>>> I915_RING_FREE_SPACE.
>> The problem with the above calculation is that it includes the wasted space
>> at the end of the ring. Thus it will complain the reserved size was too
>> small when in fact it was just fine.
> >Ok I again misunderstood your patch a bit since it didn't quite do what I
> >expected, and I stand corrected that v5 works too. But I still seem to fail
> >to get my main concern across. I'll see whether I can whip up a patch as a
> >short demonstration; maybe that helps to unconfuse this discussion.
>
> For now I think we're covered with either v4 or v5 so sticking with either
> is ok with me.
> -Daniel

I think v5 is much better. It reduces the ring space wastage which I 
thought was your main concern.

The problem with a more simplistic approach that just doubles the 
minimum reserve size to ensure that it will fit before or after a wrap 
is that you are doubling the reserve size. That too is rather wasteful 
of ring space. It also means that you only find out that the reserve 
size is too small when you hit the maximum usage coincident with a worst 
case wrap point. Whereas the v5 method means that you notice a too small 
reserve whether wrapping or not.

John.


* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-24 12:18               ` John Harrison
@ 2015-06-24 12:45                 ` Daniel Vetter
  2015-06-24 17:05                   ` John Harrison
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel Vetter @ 2015-06-24 12:45 UTC (permalink / raw)
  To: John Harrison; +Cc: Intel-GFX

On Wed, Jun 24, 2015 at 01:18:48PM +0100, John Harrison wrote:
> On 23/06/2015 21:00, Daniel Vetter wrote:
> >On Tue, Jun 23, 2015 at 04:43:24PM +0100, John Harrison wrote:
> >>On 23/06/2015 14:24, Daniel Vetter wrote:
> >>>On Tue, Jun 23, 2015 at 12:38:01PM +0100, John Harrison wrote:
> >>>>On 22/06/2015 21:12, Daniel Vetter wrote:
> >>>>>On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
> >>>>>>From: John Harrison <John.C.Harrison@Intel.com>
> >>>>>>
> >>>>>>It is a bad idea for i915_add_request() to fail. The work will already have been
> >>>>>>sent to the ring and will be processed, but there will not be any tracking or
> >>>>>>management of that work.
> >>>>>>
> >>>>>>The only way the add request call can fail is if it can't write its epilogue
> >>>>>>commands to the ring (cache flushing, seqno updates, interrupt signalling). The
> >>>>>>reasons for that are mostly down to running out of ring buffer space and the
> >>>>>>problems associated with trying to get some more. This patch prevents that
> >>>>>>situation from happening in the first place.
> >>>>>>
> >>>>>>When a request is created, it marks sufficient space as reserved for the
> >>>>>>epilogue commands. Thus guaranteeing that by the time the epilogue is written,
> >>>>>>there will be plenty of space for it. Note that a ring_begin() call is required
> >>>>>>to actually reserve the space (and do any potential waiting). However, that is
> >>>>>>not currently done at request creation time. This is because the ring_begin()
> >>>>>>code can allocate a request. Hence calling begin() from the request allocation
> >>>>>>code would lead to infinite recursion! Later patches in this series remove the
> >>>>>>need for begin() to do the allocate. At that point, it becomes safe for the
> >>>>>>allocate to call begin() and really reserve the space.
> >>>>>>
> >>>>>>Until then, there is a potential for insufficient space to be available at the
> >>>>>>point of calling i915_add_request(). However, that would only be in the case
> >>>>>>where the request was created and immediately submitted without ever calling
> >>>>>>ring_begin() and adding any work to that request. Which should never happen. And
> >>>>>>even if it does, and if that request happens to fall down the tiny window of
> >>>>>>opportunity for failing due to being out of ring space then does it really
> >>>>>>matter because the request wasn't doing anything in the first place?
> >>>>>>
> >>>>>>v2: Updated the 'reserved space too small' warning to include the offending
> >>>>>>sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
> >>>>>>re-initialisation of tracking state after a buffer wrap to keep the sanity
> >>>>>>checks accurate.
> >>>>>>
> >>>>>>v3: Incremented the reserved size to accommodate Ironlake (after finally
> >>>>>>managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
> >>>>>>
> >>>>>>v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
> >>>>>>
> >>>>>>v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
> >>>>>>Daniel Vetter).
> >>>>>This didn't actually implement what I suggested (wrapping is the worst
> >>>>>case, hence skipping the check for that is breaking the sanity check) and
> >>>>>so changed the patch from "correct, but a bit fragile" to broken. I've
> >>>>>merged the previous version instead.
> >>>>>-Daniel
> >>>>I'm confused. I thought your main issue was the early wrapping not the
> >>>>sanity check. The check is to ensure that the reservation is large enough to
> >>>>cover all the commands written during request submission. That should not be
> >>>>affected by whether a wrap occurs or not. Wrapping does not magically add an
> >>>>extra bunch of dwords to the emit_request() call. Whereas making the check
> >>>>work with the wrap condition requires adding in extra tracking state of
> >>>>exactly where the wrap occurred. That is extra code that only exists to
> >>>>catch something in the very rare case which should already have been caught
> >>>>in the very common case. I.e. if your reserved size is too small then you
> >>>>will hit the warning on every batch buffer submission.
> >>>The problem is that if you allow a wrap in the reserve size then the
> >>>ringspace requirements are bigger than if you don't wrap. And since the
> >>>add request is split up into many intel_ring_begin that's possible. Hence
> >>>if you allow wrapping in the reserved space, then the most important case
> >>>for the debug check is to make sure that it catches any kind of
> >>>reservation overflow while wrapping. The not-wrapped case is probably the
> >>>boring one.
> >>>
> >>>And indeed eventually we should overflow since according to your comment
> >>>the worst case add request on ilk is 136 dwords. And the largest
> >>>intel_ring_begin in there is 32 dwords, which means at most we'll throw
> >>>away 31 dwords when wrapping. Which means the 160 dwords of reservation
> >>>are not enough since we'd need 167 dwords of space for the worst case. But
> >>>since the space_end debug check was a no-op for the wrapped case you won't
> >>>catch this one.
> >>The minimum reservation size in this case is still only 136. The prepare
> >>code checks for the 32 words actually requested and wraps if necessary. It
> >>then checks for 136+32 words of space. If that would cause a wrap it will
> >>then add on the amount of space actually left in the ring and wait for that
> >>bigger total. That guarantees that it has waited for the 136 at the start of
> >>the ring. The caller is then free to fill in the 32 words and there is still
> >>guaranteed to be a minimum of 136 words available (with or without wrapping)
> >>before any further wait for space is necessary. Thus the add_request() code
> >>is safe from fear of failure irrespective of where any wrap might occur.
> >>
> >>
> >>>Wrt keeping track of wrapping while the reservation is in use, the
> >>>following should do that without any need of additional tracking:
> >>>
> >>>
> >>>	int used_size = ringbuf->tail - ringbuf->reserved_tail;
> >>>
> >>>	if (used_size < 0)
> >>>		used_size += ringbuf->size;
> >>>
> >>>	WARN(used_size > ringbuf->reserved_size,
> >>>	     "request reserved size too small: %d vs %d!\n",
> >>>	     used_size, ringbuf->reserved_size);
> >>>
> >>>I was mistaken that you can reuse __intel_ring_space (since that has
> >>>slightly different requirements), but this gives you a nicely localized
> >>>check for reservation overflow which works even when you wrap. Ofc it
> >>>won't work if an add_request is bigger than the entire ring, but that's
> >>>impossible anyway since we can at most reserve ringbuf->size -
> >>>I915_RING_FREE_SPACE.
> >>The problem with the above calculation is that it includes the wasted space
> >>at the end of the ring. Thus it will complain the reserved size was too
> >>small when in fact it was just fine.
> >Ok I again misunderstood your patch a bit since it didn't quite do what I
> >expected, and I stand corrected that v5 works too. But I still seem to fail
> >to get my main concern across. I'll see whether I can whip up a patch as a
> >short demonstration; maybe that helps to unconfuse this discussion.
> >
> >For now I think we're covered with either v4 or v5 so sticking with either
> >is ok with me.
> >-Daniel
> 
> I think v5 is much better. It reduces the ring space wastage which I thought
> was your main concern.

Ok with me too - I simply didn't pick it up when merging yesterday because
I couldn't immediately convince myself it's correct, but really wanted to
pull in your series. Unfortunately it's now buried below piles of
patches, so can you please do a delta patch?

> The problem with a more simplistic approach that just doubles the minimum
> reserve size to ensure that it will fit before or after a wrap is that you
> are doubling the reserve size. That too is rather wasteful of ring space. It
> also means that you only find out that the reserve size is too small when
> you hit the maximum usage coincident with a worst case wrap point. Whereas
> the v5 method means that you notice a too small reserve whether wrapping or
> not.

We don't need to double the reservation since the add_request tail is
split up into many individual intel_ring_begin. And we'd only need to wrap
for the largest of those, which is substantially less than the entire
reservation. Furthermore with the reservation these commands can't ever
fail, so for those we know are only used in the add_request tail we could
go to a wrap-only intel_ring_begin which never waits and have one at a
dword cmd boundary. That means we'd need to overestimate the needed
ringbuffer space by just a few dwords (namely the size of the longest CS
cmd we emit under reservation). Which is around 6 dwords or so iirc. And
to avoid changing ilk we could just special case that in reserve_space().
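
(A rough sketch of what such a wrap-only begin might look like; the
name and the MI_NOOP fill are assumptions modelled on
intel_wrap_ring_buffer(), not code from this series:)

	static void intel_ring_begin_nowait(struct intel_ringbuffer *ringbuf,
					    int num_dwords)
	{
		int bytes = num_dwords * sizeof(u32);

		if (ringbuf->tail + bytes > ringbuf->effective_size) {
			u32 __iomem *virt = ringbuf->virtual_start +
					    ringbuf->tail;
			int rem = (ringbuf->size - ringbuf->tail) / 4;

			/* The reservation guarantees the room is already
			 * there, so just NOP out the tail and wrap; no
			 * wait, hence no failure path. */
			while (rem--)
				iowrite32(MI_NOOP, virt++);

			ringbuf->tail = 0;
		}
	}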

In practice I don't think there would be any difference with your v5 since
especially with the scheduler we shouldn't ever overfill rings really. But
the clear upside is that the reserve_space_end check would be independent
of any implementation details of how reservation vs. wrapping is done
exactly. And hence robust against any future fumbles in this area. Looking
at our history of the relevant code we can expect a lot of those.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
  2015-06-24 12:45                 ` Daniel Vetter
@ 2015-06-24 17:05                   ` John Harrison
  0 siblings, 0 replies; 120+ messages in thread
From: John Harrison @ 2015-06-24 17:05 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 24/06/2015 13:45, Daniel Vetter wrote:
> On Wed, Jun 24, 2015 at 01:18:48PM +0100, John Harrison wrote:
>> On 23/06/2015 21:00, Daniel Vetter wrote:
>>> On Tue, Jun 23, 2015 at 04:43:24PM +0100, John Harrison wrote:
>>>> On 23/06/2015 14:24, Daniel Vetter wrote:
>>>>> On Tue, Jun 23, 2015 at 12:38:01PM +0100, John Harrison wrote:
>>>>>> On 22/06/2015 21:12, Daniel Vetter wrote:
>>>>>>> On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
>>>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>>>
>>>>>>>> It is a bad idea for i915_add_request() to fail. The work will already have been
>>>>>>>> send to the ring and will be processed, but there will not be any tracking or
>>>>>>>> management of that work.
>>>>>>>>
>>>>>>>> The only way the add request call can fail is if it can't write its epilogue
>>>>>>>> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
>>>>>>>> reasons for that are mostly down to running out of ring buffer space and the
>>>>>>>> problems associated with trying to get some more. This patch prevents that
>>>>>>>> situation from happening in the first place.
>>>>>>>>
>>>>>>>> When a request is created, it marks sufficient space as reserved for the
>>>>>>>> epilogue commands. Thus guaranteeing that by the time the epilogue is written,
>>>>>>>> there will be plenty of space for it. Note that a ring_begin() call is required
>>>>>>>> to actually reserve the space (and do any potential waiting). However, that is
>>>>>>>> not currently done at request creation time. This is because the ring_begin()
>>>>>>>> code can allocate a request. Hence calling begin() from the request allocation
>>>>>>>> code would lead to infinite recursion! Later patches in this series remove the
>>>>>>>> need for begin() to do the allocate. At that point, it becomes safe for the
>>>>>>>> allocate to call begin() and really reserve the space.
>>>>>>>>
>>>>>>>> Until then, there is a potential for insufficient space to be available at the
>>>>>>>> point of calling i915_add_request(). However, that would only be in the case
>>>>>>>> where the request was created and immediately submitted without ever calling
>>>>>>>> ring_begin() and adding any work to that request. Which should never happen. And
>>>>>>>> even if it does, and if that request happens to fall down the tiny window of
>>>>>>>> opportunity for failing due to being out of ring space then does it really
>>>>>>>> matter because the request wasn't doing anything in the first place?
>>>>>>>>
>>>>>>>> v2: Updated the 'reserved space too small' warning to include the offending
>>>>>>>> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
>>>>>>>> re-initialisation of tracking state after a buffer wrap to keep the sanity
>>>>>>>> checks accurate.
>>>>>>>>
>>>>>>>> v3: Incremented the reserved size to accommodate Ironlake (after finally
>>>>>>>> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>>>>>>>>
>>>>>>>> v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
>>>>>>>>
>>>>>>>> v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
>>>>>>>> Daniel Vetter).
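
As a hypothetical sketch of the reserve/cancel lifecycle the commit
message above describes, reusing the illustrative struct ringbuf from
the sketch earlier in the thread (names are made up, not the patch's
actual API):

	/* At request creation: hold back space for the epilogue. */
	static void reserve_space(struct ringbuf *ring, int size)
	{
		ring->reserved_size = size;
	}

	/* If the request is abandoned before ever being submitted. */
	static void reserve_space_cancel(struct ringbuf *ring)
	{
		ring->reserved_size = 0;
	}

	/*
	 * In i915_add_request(): release the hold so the epilogue
	 * commands (flushes, seqno write, interrupt signalling) can
	 * consume the guaranteed space and so cannot fail.
	 */
	static void reserve_space_use(struct ringbuf *ring)
	{
		ring->reserved_size = 0;
	}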
>>>>>>> This didn't actually implement what I suggested (wrapping is the worst
>>>>>>> case, hence skipping the check for that is breaking the sanity check) and
>>>>>>> so changed the patch from "correct, but a bit fragile" to broken. I've
>>>>>>> merged the previous version instead.
>>>>>>> -Daniel
>>>>>> I'm confused. I thought your main issue was the early wrapping not the
>>>>>> sanity check. The check is to ensure that the reservation is large enough to
>>>>>> cover all the commands written during request submission. That should not be
>>>>>> affected by whether a wrap occurs or not. Wrapping does not magically add an
>>>>>> extra bunch of dwords to the emit_request() call. Whereas making the check
>>>>>> work with the wrap condition requires adding in extra tracking state of
>>>>>> exactly where the wrap occurred. That is extra code that only exists to
>>>>>> catch something in the very rare case which should already have been caught
>>>>>> in the very common case. I.e. if your reserved size is too small then you
>>>>>> will hit the warning on every batch buffer submission.
>>>>> The problem is that if you allow a wrap in the reserve size then the
>>>>> ringspace requirements are bigger than if you don't wrap. And since the
>>>>> add request is split up into many intel_ring_begin that's possible. Hence
>>>>> if you allow wrapping in the reserved space, then the most important case
>>>>> for the debug check is to make sure that it catches any kind of
>>>>> reservation overflow while wrapping. The not-wrapped case is probably the
>>>>> boring one.
>>>>>
>>>>> And indeed eventually we should overflow since according to your comment
>>>>> the worst case add request on ilk is 136 dwords. And the largest
>>>>> intel_ring_begin in there is 32 dwords, which means at most we'll throw
>>>>> away 31 dwords when wrapping. Which means the 160 dwords of reservation
>>>>> are not enough since we'd need 167 dwords of space for the worst case. But
>>>>> since the space_end debug check was a no-op for the wrapped case you won't
>>>>> catch this one.
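
Spelling out the worst case in those numbers (one dword is 4 bytes):

	worst-case add_request on ilk:               136 dwords
	largest single intel_ring_begin within it:    32 dwords
	maximum padding discarded on a wrap:          31 dwords
	needed if the reserve may wrap:        136 + 31 = 167 dwords
	reserve actually made:                       160 dwords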
>>>> The minimum reservation size in this case is still only 136. The prepare
>>>> code checks for the 32 words actually requested and wraps if necessary. It
>>>> then checks for 136+32 words of space. If that would cause a wrap it will
>>>> then add on the amount of space actually left in the ring and wait for that
>>>> bigger total. That guarantees that it has waited for the 136 at the start of
>>>> the ring. The caller is then free to fill in the 32 words and there is still
>>>> guaranteed to be a minimum of 136 words available (with or without wrapping)
>>>> before any further wait for space is necessary. Thus the add_request() code
>>>> is safe from fear of failure irrespective of where any wrap might occur.
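
A sketch of that v5 wait logic with illustrative names, not the actual
patch code; wait_for_space() is an assumed helper that blocks until the
given number of bytes are free, and real code would also order the wait
before the padding and handle error returns:

	/* Assumed helper, declaration only for the sketch. */
	void wait_for_space(struct ringbuf *ring, int bytes);

	static void ring_prepare(struct ringbuf *ring, int bytes)
	{
		int total = bytes + ring->reserved_size;

		/* Wrap first if the requested words themselves do not
		 * fit before the end of the ring. */
		if (ring->tail + bytes > ring->size)
			ring_pad_and_wrap(ring);

		/*
		 * If request plus reserve would straddle the wrap
		 * point, also wait out the unusable space at the end
		 * of the ring: that guarantees the full reserve is
		 * free from offset 0, wherever the wrap lands.
		 */
		if (ring->tail + total > ring->size)
			total += ring->size - ring->tail;

		wait_for_space(ring, total);
	}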
>>>>
>>>>
>>>>> Wrt keeping track of wrapping while the reservation is in use, the
>>>>> following should do that without any need of additional tracking:
>>>>>
>>>>>
>>>>> 	/* Bytes consumed since the reservation was staked out. */
>>>>> 	int used_size = ringbuf->tail - ringbuf->reserved_tail;
>>>>>
>>>>> 	/* The tail may have wrapped past reserved_tail. */
>>>>> 	if (used_size < 0)
>>>>> 		used_size += ringbuf->size;
>>>>>
>>>>> 	WARN(used_size > ringbuf->reserved_size,
>>>>> 	     "request reserved size too small: %d vs %d!\n",
>>>>> 	     used_size, ringbuf->reserved_size);
>>>>>
>>>>> I was mistaken that you can reuse __intel_ring_space (since that has
>>>>> slightly different requirements), but this gives you a nicely localized
>>>>> check for reservation overflow which works even when you wrap. Ofc it
>>>>> won't work if an add_request is bigger than the entire ring, but that's
>>>>> impossible anyway since we can at most reserve ringbuf->size -
>>>>> I915_RING_FREE_SPACE.
>>>> The problem with the above calculation is that it includes the wasted space
>>>> at the end of the ring. Thus it will complain the reserved size was too
>>>> small when in fact it was just fine.
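
With the thread's numbers, that false positive looks like:

	reserve made:                        160 dwords
	epilogue actually emitted:           136 dwords  (fits)
	padding discarded at the wrap:        31 dwords
	used_size as computed above:   136 + 31 = 167 > 160

so the warning fires even though the commands themselves fit within the
reserve once wrapped.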
>>> Ok I again misunderstood your patch a bit since it didn't quite do what I
>>> expected, and I stand corrected that v5 works too. But I still seem to fail
>>> to get my main concern across. I'll see whether I can whip up a patch as a
>>> short demonstration; maybe that helps to unconfuse this discussion.
>>>
>>> For now I think we're covered with either v4 or v5 so sticking with either
>>> is ok with me.
>>> -Daniel
>> I think v5 is much better. It reduces the ring space wastage, which I thought
>> was your main concern.
> Ok with me too - I simply didn't pick it up when merging yesterday because
> I couldn't immediately convince myself it's correct, but really wanted to
> pull in your series. Unfortunately it's now buried below piles of
> patches, so can you please do a delta patch?
Delta patch posted: '[PATCH] drm/i915: Reserve space improvements'.


>
>> The problem with a more simplistic approach that just doubles the minimum
>> reserve size to ensure that it will fit before or after a wrap is that you
>> are doubling the reserve size. That too is rather wasteful of ring space. It
>> also means that you only find out when the reserve size is too small when
>> you hit the maximum usage coincident with a worst case wrap point. Whereas
>> the v5 method means that you notice a too small reserve whether wrapping or
>> not.
> We don't need to double the reservation since the add_request tail is
> split up into many individual intel_ring_begin. And we'd only need to wrap
> for the largest of those, which is substantially less than the entire
> reservation. Furthermore with the reservation these commands can't ever
> fail, so for those we know are only used in the add_request tail we could
> go to a wrap-only intel_ring_begin which never waits and have one at a
> dword cmd boundary. That means we'd need to overestimate the needed
> ringbuffer space by just a few dwords (namely the size of the longest CS
> cmd we emit under reservation). Which is around 6 dwords or so iirc. And
> to avoid changing ilk we could just special case that in reserve_space().
>
> In practice I don't think there would be any difference with your v5 since
> especially with the scheduler we shouldn't ever overfill rings really. But
> the clear upside is that the reserve_space_end check would be independent
> of any implementation details of how reservation vs. wrapping is done
> exactly. And hence robust against any future fumbles in this area. Looking
> at our history of the relevant code we can expect a lot of those.
> -Daniel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


end of thread

Thread overview: 120+ messages
2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
2015-05-29 16:43 ` [PATCH 01/55] drm/i915: Re-instate request->uniq becuase it is extremely useful John.C.Harrison
2015-06-03 11:14   ` Tomas Elf
2015-05-29 16:43 ` [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands John.C.Harrison
2015-06-02 18:14   ` Tomas Elf
2015-06-04 12:06   ` John.C.Harrison
2015-06-09 16:00     ` Tomas Elf
2015-06-18 12:10       ` John.C.Harrison
2015-06-17 14:04     ` Daniel Vetter
2015-06-18 10:43       ` John Harrison
2015-06-19 16:34   ` John.C.Harrison
2015-06-22 20:12     ` Daniel Vetter
2015-06-23 11:38       ` John Harrison
2015-06-23 13:24         ` Daniel Vetter
2015-06-23 15:43           ` John Harrison
2015-06-23 20:00             ` Daniel Vetter
2015-06-24 12:18               ` John Harrison
2015-06-24 12:45                 ` Daniel Vetter
2015-06-24 17:05                   ` John Harrison
2015-05-29 16:43 ` [PATCH 03/55] drm/i915: i915_add_request must not fail John.C.Harrison
2015-06-02 18:16   ` Tomas Elf
2015-06-04 14:07     ` John Harrison
2015-06-05 10:55       ` Tomas Elf
2015-06-23 10:16   ` Chris Wilson
2015-06-23 10:47     ` John Harrison
2015-05-29 16:43 ` [PATCH 04/55] drm/i915: Early alloc request in execbuff John.C.Harrison
2015-05-29 16:43 ` [PATCH 05/55] drm/i915: Set context in request from creation even in legacy mode John.C.Harrison
2015-05-29 16:43 ` [PATCH 06/55] drm/i915: Merged the many do_execbuf() parameters into a structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 07/55] drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters John.C.Harrison
2015-05-29 16:43 ` [PATCH 08/55] drm/i915: Update alloc_request to return the allocated request John.C.Harrison
2015-05-29 16:43 ` [PATCH 09/55] drm/i915: Add request to execbuf params and add explicit cleanup John.C.Harrison
2015-05-29 16:43 ` [PATCH 10/55] drm/i915: Update the dispatch tracepoint to use params->request John.C.Harrison
2015-05-29 16:43 ` [PATCH 11/55] drm/i915: Update move_to_gpu() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 12/55] drm/i915: Update execbuffer_move_to_active() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 13/55] drm/i915: Add flag to i915_add_request() to skip the cache flush John.C.Harrison
2015-06-02 18:19   ` Tomas Elf
2015-05-29 16:43 ` [PATCH 14/55] drm/i915: Update i915_gpu_idle() to manage its own request John.C.Harrison
2015-05-29 16:43 ` [PATCH 15/55] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring John.C.Harrison
2015-06-18 12:11   ` John.C.Harrison
2015-05-29 16:43 ` [PATCH 16/55] drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable() John.C.Harrison
2015-05-29 16:43 ` [PATCH 17/55] drm/i915: Don't tag kernel batches as user batches John.C.Harrison
2015-05-29 16:43 ` [PATCH 18/55] drm/i915: Add explicit request management to i915_gem_init_hw() John.C.Harrison
2015-06-02 18:20   ` Tomas Elf
2015-05-29 16:43 ` [PATCH 19/55] drm/i915: Update ppgtt_init_ring() & context_enable() to take requests John.C.Harrison
2015-05-29 16:43 ` [PATCH 20/55] drm/i915: Update i915_switch_context() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 21/55] drm/i915: Update do_switch() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 22/55] drm/i915: Update deferred context creation to do explicit request management John.C.Harrison
2015-06-02 18:22   ` Tomas Elf
2015-05-29 16:43 ` [PATCH 23/55] drm/i915: Update init_context() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 24/55] drm/i915: Update render_state_init() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 25/55] drm/i915: Update i915_gem_object_sync() " John.C.Harrison
2015-06-02 18:26   ` Tomas Elf
2015-06-04 12:57     ` John Harrison
2015-06-18 12:14       ` John.C.Harrison
2015-06-18 12:21         ` Chris Wilson
2015-06-18 12:59           ` John Harrison
2015-06-18 14:24             ` Daniel Vetter
2015-06-18 15:39               ` Chris Wilson
2015-06-18 16:16                 ` John Harrison
2015-06-22 20:03                   ` Daniel Vetter
2015-06-22 20:14                     ` Chris Wilson
2015-06-18 16:36         ` 3.16 backlight kernel options Stéphane ANCELOT
2015-05-29 16:43 ` [PATCH 26/55] drm/i915: Update overlay code to do explicit request management John.C.Harrison
2015-05-29 16:43 ` [PATCH 27/55] drm/i915: Update queue_flip() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 28/55] drm/i915: Update add_request() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 29/55] drm/i915: Update [vma|object]_move_to_active() to take request structures John.C.Harrison
2015-05-29 16:43 ` [PATCH 30/55] drm/i915: Update l3_remap to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 31/55] drm/i915: Update mi_set_context() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 32/55] drm/i915: Update a bunch of execbuffer helpers to take request structures John.C.Harrison
2015-05-29 16:43 ` [PATCH 33/55] drm/i915: Update workarounds_emit() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 34/55] drm/i915: Update flush_all_caches() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 35/55] drm/i915: Update switch_mm() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 36/55] drm/i915: Update ring->flush() to take a requests structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 37/55] drm/i915: Update some flush helpers to take request structures John.C.Harrison
2015-05-29 16:43 ` [PATCH 38/55] drm/i915: Update ring->emit_flush() to take a request structure John.C.Harrison
2015-05-29 16:44 ` [PATCH 39/55] drm/i915: Update ring->add_request() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 40/55] drm/i915: Update ring->emit_request() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 41/55] drm/i915: Update ring->dispatch_execbuffer() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 42/55] drm/i915: Update ring->emit_bb_start() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 43/55] drm/i915: Update ring->sync_to() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 44/55] drm/i915: Update ring->signal() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 45/55] drm/i915: Update cacheline_align() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 46/55] drm/i915: Update intel_ring_begin() " John.C.Harrison
2015-06-23 10:24   ` Chris Wilson
2015-06-23 10:37     ` John Harrison
2015-06-23 13:25       ` Daniel Vetter
2015-06-23 15:27         ` John Harrison
2015-06-23 15:34           ` Daniel Vetter
2015-05-29 16:44 ` [PATCH 47/55] drm/i915: Update intel_logical_ring_begin() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation John.C.Harrison
2015-06-17 13:31   ` Daniel Vetter
2015-06-17 14:27     ` Chris Wilson
2015-06-17 14:54       ` Daniel Vetter
2015-06-17 15:52         ` Chris Wilson
2015-06-18 11:21           ` John Harrison
2015-06-18 13:29             ` Daniel Vetter
2015-06-19 16:34               ` John Harrison
2015-05-29 16:44 ` [PATCH 49/55] drm/i915: Remove the now obsolete intel_ring_get_request() John.C.Harrison
2015-05-29 16:44 ` [PATCH 50/55] drm/i915: Remove the now obsolete 'outstanding_lazy_request' John.C.Harrison
2015-05-29 16:44 ` [PATCH 51/55] drm/i915: Move the request/file and request/pid association to creation time John.C.Harrison
2015-06-03 11:15   ` Tomas Elf
2015-05-29 16:44 ` [PATCH 52/55] drm/i915: Remove 'faked' request from LRC submission John.C.Harrison
2015-05-29 16:44 ` [PATCH 53/55] drm/i915: Update a bunch of LRC functions to take requests John.C.Harrison
2015-05-29 16:44 ` [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()' John.C.Harrison
2015-06-02 18:27   ` Tomas Elf
2015-06-23 10:23   ` Chris Wilson
2015-06-23 10:39     ` John Harrison
2015-05-29 16:44 ` [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active() John.C.Harrison
2015-06-02 18:27   ` Tomas Elf
2015-06-17 14:06   ` Daniel Vetter
2015-06-17 14:21     ` Chris Wilson
2015-06-18 11:03       ` John Harrison
2015-06-18 11:10         ` Chris Wilson
2015-06-18 11:27           ` John Harrison
2015-06-18 10:57     ` John Harrison
2015-06-04 18:23 ` [PATCH 14/56] drm/i915: Make retire condition check for requests not objects John.C.Harrison
2015-06-04 18:24   ` John Harrison
2015-06-09 15:56   ` Tomas Elf
2015-06-17 15:01     ` Daniel Vetter
2015-06-22 21:04 ` [PATCH 00/55] Remove the outstanding_lazy_request Daniel Vetter
