From: Chris Wilson <chris@chris-wilson.co.uk>
To: Dave Gordon <david.s.gordon@intel.com>
Cc: intel-gfx@lists.freedesktop.org, mika.kuoppala@intel.com
Subject: Re: [PATCH 14/18] drm/i915/guc: Prepare for nonblocking execbuf submission
Date: Mon, 5 Sep 2016 09:08:25 +0100
Message-ID: <20160905080825.GA22557@nuc-i3427.alporthouse.com>
In-Reply-To: <747a1369-6a56-17f6-6b3b-8660032f671a@intel.com>

On Fri, Sep 02, 2016 at 07:20:52PM +0100, Dave Gordon wrote:
> On 30/08/16 09:18, Chris Wilson wrote:
> >Currently the presumption is that the request construction and its
> >submission to the GuC are all under the same holding of struct_mutex. We
> >wish to relax this to separate the request construction and the later
> >submission to the GuC. This requires us to reserve some space in the
> >GuC command queue for the future submission. For flexibility to handle
> >out-of-order request submission we do not preallocate the next slot in
> >the GuC command queue during request construction, just ensuring that
> >there is enough space later.
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >---
> > drivers/gpu/drm/i915/i915_guc_submission.c | 55 ++++++++++++++----------------
> > drivers/gpu/drm/i915/intel_guc.h           |  3 ++
> > 2 files changed, 29 insertions(+), 29 deletions(-)
> 
> Hmm .. the functional changes look mostly OK, apart from some
> locking, but there seems to be a great deal of unnecessary churn,
> such as combining statements which had been kept separate for
> clarity :(
> 
> >diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >index 2332f9c98bdd..a047e61adc2a 100644
> >--- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >@@ -434,20 +434,23 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request *request)
> > {
> > 	const size_t wqi_size = sizeof(struct guc_wq_item);
> > 	struct i915_guc_client *gc = request->i915->guc.execbuf_client;
> >-	struct guc_process_desc *desc;
> >+	struct guc_process_desc *desc = gc->client_base + gc->proc_desc_offset;
> > 	u32 freespace;
> >+	int ret;
> >
> >-	GEM_BUG_ON(gc == NULL);
> >-
> >-	desc = gc->client_base + gc->proc_desc_offset;
> >-
> >+	spin_lock(&gc->lock);
> > 	freespace = CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size);
> >-	if (likely(freespace >= wqi_size))
> >-		return 0;
> >-
> >-	gc->no_wq_space += 1;
> >+	freespace -= gc->wq_rsvd;
> >+	if (likely(freespace >= wqi_size)) {
> >+		gc->wq_rsvd += wqi_size;
> >+		ret = 0;
> >+	} else {
> >+		gc->no_wq_space++;
> >+		ret = -EAGAIN;
> >+	}
> >+	spin_unlock(&gc->lock);
> >
> >-	return -EAGAIN;
> >+	return ret;
> > }
> >
> > static void guc_add_workqueue_item(struct i915_guc_client *gc,
> >@@ -457,22 +460,9 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc,
> > 	const size_t wqi_size = sizeof(struct guc_wq_item);
> > 	const u32 wqi_len = wqi_size/sizeof(u32) - 1;
> > 	struct intel_engine_cs *engine = rq->engine;
> >-	struct guc_process_desc *desc;
> > 	struct guc_wq_item *wqi;
> > 	void *base;
> >-	u32 freespace, tail, wq_off, wq_page;
> >-
> >-	desc = gc->client_base + gc->proc_desc_offset;
> >-
> >-	/* Free space is guaranteed, see i915_guc_wq_check_space() above */
> 
> This comment seems to have been lost. It still applies (mutatis
> mutandis), so it should be relocated to some part of the new version
> ...
> 
> >-	freespace = CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size);
> >-	GEM_BUG_ON(freespace < wqi_size);
> >-
> >-	/* The GuC firmware wants the tail index in QWords, not bytes */
> >-	tail = rq->tail;
> >-	GEM_BUG_ON(tail & 7);
> >-	tail >>= 3;
> >-	GEM_BUG_ON(tail > WQ_RING_TAIL_MAX);
> 
> This *commented* sequence of statements seems clearer than the
> replacement below ...
> 
> >+	u32 wq_off, wq_page;
> >
> > 	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
> > 	 * should not have the case where structure wqi is across page, neither
> >@@ -482,18 +472,19 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc,
> > 	 * workqueue buffer dw by dw.
> > 	 */
> > 	BUILD_BUG_ON(wqi_size != 16);
> 
> This would be a good place to note that:
> 
> /* Reserved space is guaranteed, see i915_guc_wq_check_space() above */
> 
> >+	GEM_BUG_ON(gc->wq_rsvd < wqi_size);
> >
> > 	/* postincrement WQ tail for next time */
> > 	wq_off = gc->wq_tail;
> >+	GEM_BUG_ON(wq_off & (wqi_size - 1));
> > 	gc->wq_tail += wqi_size;
> > 	gc->wq_tail &= gc->wq_size - 1;
> >-	GEM_BUG_ON(wq_off & (wqi_size - 1));
> >+	gc->wq_rsvd -= wqi_size;
> >
> > 	/* WQ starts from the page after doorbell / process_desc */
> > 	wq_page = (wq_off + GUC_DB_SIZE) >> PAGE_SHIFT;
> >-	wq_off &= PAGE_SIZE - 1;
> > 	base = kmap_atomic(i915_gem_object_get_page(gc->vma->obj, wq_page));
> >-	wqi = (struct guc_wq_item *)((char *)base + wq_off);
> >+	wqi = (struct guc_wq_item *)((char *)base + offset_in_page(wq_off));
> >
> > 	/* Now fill in the 4-word work queue item */
> > 	wqi->header = WQ_TYPE_INORDER |
> >@@ -504,7 +495,7 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc,
> > 	/* The GuC wants only the low-order word of the context descriptor */
> > 	wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx, engine);
> >
> >-	wqi->ring_tail = tail << WQ_RING_TAIL_SHIFT;
> >+	wqi->ring_tail = rq->tail >> 3 << WQ_RING_TAIL_SHIFT;
> 
> This line is particularly ugly. I think it's the >> chevron << effect.
> Parenthesisation would help, but it would be nicer as separate lines.
> Also, there's no explanation of the "3" here, unlike the original
> version above.

Correct. The code was and still is ugly.
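
A minimal sketch of the separate-statement form asked for above, written as a
standalone helper so it can be compiled outside the driver. The helper name
wq_ring_tail_field() is hypothetical, assert() stands in for GEM_BUG_ON(), and
the two macro values are assumptions for illustration only; the driver's
firmware interface header is the authority for them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define WQ_RING_TAIL_SHIFT 20
#define WQ_RING_TAIL_MAX   0x7FF

/*
 * Hypothetical helper: convert a ring tail offset in bytes into the value
 * stored in the work item's ring_tail field.
 */
static uint32_t wq_ring_tail_field(uint32_t tail_bytes)
{
	uint32_t tail_qwords;

	/* The GuC firmware wants the tail index in QWords, not bytes */
	assert((tail_bytes & 7) == 0);            /* must be QWord aligned */
	tail_qwords = tail_bytes >> 3;            /* 8 bytes per QWord */
	assert(tail_qwords <= WQ_RING_TAIL_MAX);  /* must fit in the field */

	return tail_qwords << WQ_RING_TAIL_SHIFT;
}
```

Keeping the shift by 3 on its own line leaves room for the comment that
explains it, and avoids the ">> 3 <<" run in a single expression.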

> > 	wqi->fence_id = rq->fence.seqno;
> >
> > 	kunmap_atomic(base);
> >@@ -591,8 +582,10 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
> > 	struct i915_guc_client *client = guc->execbuf_client;
> > 	int b_ret;
> >
> >+	spin_lock(&client->lock);
> > 	guc_add_workqueue_item(client, rq);
> > 	b_ret = guc_ring_doorbell(client);
> >+	spin_unlock(&client->lock);
> >
> > 	client->submissions[engine_id] += 1;
> 
> Outside the spinlock? Do we still have the BKL during submit(), just
> not during i915_guc_wq_check_space() ? If so, then
> guc_ring_doorbell() doesn't strictly need to be inside the spinlock
> (or the lock could be inside guc_add_workqueue_item()); but if not
> then the update of submissions[] should be inside.

Nope, this code is just garbage, so I didn't bother polishing it.
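
For reference, the reserve-then-consume scheme and the locking question above
can be modelled in userspace roughly as follows. This is a sketch under stated
assumptions, not the driver code: struct wq_model and the wq_*() functions are
hypothetical names, a pthread mutex stands in for the spinlock, and the
submissions counter is updated inside the lock, which is the second of the two
options raised in the review. The CIRC_* macros follow the definitions in
<linux/circ_buf.h>.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* As in <linux/circ_buf.h>; size must be a power of two. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

#define WQI_SIZE 16u /* bytes per work item */

/* Hypothetical userspace model of the client's work queue state. */
struct wq_model {
	pthread_mutex_t lock;
	uint32_t head;        /* consumer position (the firmware side) */
	uint32_t tail;        /* producer position */
	uint32_t size;        /* queue size in bytes, power of two */
	uint32_t rsvd;        /* bytes reserved but not yet written */
	uint32_t no_space;    /* number of failed reservations */
	uint64_t submissions; /* number of items written */
};

static void wq_init(struct wq_model *wq, uint32_t size)
{
	pthread_mutex_init(&wq->lock, NULL);
	wq->head = 0;
	wq->tail = 0;
	wq->size = size;
	wq->rsvd = 0;
	wq->no_space = 0;
	wq->submissions = 0;
}

/* Request construction: reserve room for one item without picking a slot. */
static int wq_reserve(struct wq_model *wq)
{
	uint32_t freespace;
	int ret;

	pthread_mutex_lock(&wq->lock);
	freespace = CIRC_SPACE(wq->tail, wq->head, wq->size);
	/* Compare against rsvd + item size so nothing is subtracted unsigned. */
	if (freespace >= wq->rsvd + WQI_SIZE) {
		wq->rsvd += WQI_SIZE;
		ret = 0;
	} else {
		wq->no_space++;
		ret = -EAGAIN;
	}
	pthread_mutex_unlock(&wq->lock);

	return ret;
}

/* Later submission: consume one reservation and advance the tail. */
static void wq_submit(struct wq_model *wq)
{
	pthread_mutex_lock(&wq->lock);
	/* Reserved space is guaranteed, see wq_reserve() above */
	assert(wq->rsvd >= WQI_SIZE);
	wq->tail = (wq->tail + WQI_SIZE) & (wq->size - 1);
	wq->rsvd -= WQI_SIZE;
	wq->submissions++; /* inside the lock in this model */
	pthread_mutex_unlock(&wq->lock);
}
```

With a 64-byte queue this admits three outstanding items: CIRC_SPACE() always
keeps one byte free, so a fourth 16-byte reservation fails with -EAGAIN until
the consumer advances head.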

> > 	client->retcode = b_ret;
> >@@ -770,6 +763,8 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
> > 	if (!client)
> > 		return NULL;
> >
> >+	spin_lock_init(&client->lock);
> >+
> > 	client->owner = ctx;
> > 	client->guc = guc;
> > 	client->engines = engines;
> >@@ -1019,9 +1014,11 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
> > 		engine->submit_request = i915_guc_submit;
> >
> > 		/* Replay the current set of previously submitted requests */
> >-		list_for_each_entry(request, &engine->request_list, link)
> >+		list_for_each_entry(request, &engine->request_list, link) {
> >+			client->wq_rsvd += sizeof(struct guc_wq_item);
> 
> Presumably this is being called in some context that ensures that we
> don't need to hold the spinlock while updating wq_rsvd ? Maybe that
> should be mentioned ...

Obvious.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

