From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Subject: [PATCH 015/262] drm/i915/execlists: Process one CSB interrupt at a time
Date: Thu, 17 May 2018 07:03:31 +0100	[thread overview]
Message-ID: <20180517060738.19193-15-chris@chris-wilson.co.uk> (raw)
In-Reply-To: <20180517060738.19193-1-chris@chris-wilson.co.uk>

In the next patch, we will process the CSB events directly from the CS
interrupt handler, which is called once for each interrupt. Hence, we
will no longer need to loop until the has-interrupt bit is clear, and
in the meantime can remove that small optimisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c | 274 +++++++++++++++----------------
 1 file changed, 135 insertions(+), 139 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6c050d553d4e..0c7108249244 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -964,6 +964,11 @@ static void process_csb(struct intel_engine_cs *engine)
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
 	struct drm_i915_private *i915 = engine->i915;
+
+	/* The HWSP contains a (cacheable) mirror of the CSB */
+	const u32 *buf =
+		&engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
+	unsigned int head, tail;
 	bool fw = false;
 
 	/*
@@ -972,164 +977,155 @@ static void process_csb(struct intel_engine_cs *engine)
 	 */
 	GEM_BUG_ON(!engine->i915->gt.awake);
 
-	do {
-		/* The HWSP contains a (cacheable) mirror of the CSB */
-		const u32 *buf =
-			&engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
-		unsigned int head, tail;
+	/* Clear before reading to catch new interrupts */
+	clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+	smp_mb__after_atomic();
 
-		/* Clear before reading to catch new interrupts */
-		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-		smp_mb__after_atomic();
+	if (unlikely(execlists->csb_use_mmio)) {
+		intel_uncore_forcewake_get(i915, execlists->fw_domains);
+		fw = true;
 
-		if (unlikely(execlists->csb_use_mmio)) {
-			if (!fw) {
-				intel_uncore_forcewake_get(i915, execlists->fw_domains);
-				fw = true;
-			}
+		buf = (u32 * __force)
+			(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
 
-			buf = (u32 * __force)
-				(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0)));
+		head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
+		tail = GEN8_CSB_WRITE_PTR(head);
+		head = GEN8_CSB_READ_PTR(head);
+		execlists->csb_head = head;
+	} else {
+		const int write_idx =
+			intel_hws_csb_write_index(i915) -
+			I915_HWS_CSB_BUF0_INDEX;
 
-			head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
-			tail = GEN8_CSB_WRITE_PTR(head);
-			head = GEN8_CSB_READ_PTR(head);
-			execlists->csb_head = head;
-		} else {
-			const int write_idx =
-				intel_hws_csb_write_index(i915) -
-				I915_HWS_CSB_BUF0_INDEX;
+		head = execlists->csb_head;
+		tail = READ_ONCE(buf[write_idx]);
+		rmb(); /* Hopefully paired with a wmb() in HW */
+	}
+	GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n",
+		  engine->name,
+		  head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
+		  tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
 
-			head = execlists->csb_head;
-			tail = READ_ONCE(buf[write_idx]);
-			rmb(); /* Hopefully paired with a wmb() in HW */
-		}
-		GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n",
-			  engine->name,
-			  head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
-			  tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
+	while (head != tail) {
+		struct i915_request *rq;
+		unsigned int status;
+		unsigned int count;
 
-		while (head != tail) {
-			struct i915_request *rq;
-			unsigned int status;
-			unsigned int count;
+		if (++head == GEN8_CSB_ENTRIES)
+			head = 0;
 
-			if (++head == GEN8_CSB_ENTRIES)
-				head = 0;
+		/*
+		 * We are flying near dragons again.
+		 *
+		 * We hold a reference to the request in execlist_port[]
+		 * but no more than that. We are operating in softirq
+		 * context and so cannot hold any mutex or sleep. That
+		 * prevents us stopping the requests we are processing
+		 * in port[] from being retired simultaneously (the
+		 * breadcrumb will be complete before we see the
+		 * context-switch). As we only hold the reference to the
+		 * request, any pointer chasing underneath the request
+		 * is subject to a potential use-after-free. Thus we
+		 * store all of the bookkeeping within port[] as
+		 * required, and avoid using unguarded pointers beneath
+		 * request itself. The same applies to the atomic
+		 * status notifier.
+		 */
 
-			/*
-			 * We are flying near dragons again.
-			 *
-			 * We hold a reference to the request in execlist_port[]
-			 * but no more than that. We are operating in softirq
-			 * context and so cannot hold any mutex or sleep. That
-			 * prevents us stopping the requests we are processing
-			 * in port[] from being retired simultaneously (the
-			 * breadcrumb will be complete before we see the
-			 * context-switch). As we only hold the reference to the
-			 * request, any pointer chasing underneath the request
-			 * is subject to a potential use-after-free. Thus we
-			 * store all of the bookkeeping within port[] as
-			 * required, and avoid using unguarded pointers beneath
-			 * request itself. The same applies to the atomic
-			 * status notifier.
-			 */
+		status = READ_ONCE(buf[2 * head]); /* maybe mmio! */
+		GEM_TRACE("%s csb[%d]: status=0x%08x:0x%08x, active=0x%x\n",
+			  engine->name, head,
+			  status, buf[2*head + 1],
+			  execlists->active);
+
+		if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE |
+			      GEN8_CTX_STATUS_PREEMPTED))
+			execlists_set_active(execlists,
+					     EXECLISTS_ACTIVE_HWACK);
+		if (status & GEN8_CTX_STATUS_ACTIVE_IDLE)
+			execlists_clear_active(execlists,
+					       EXECLISTS_ACTIVE_HWACK);
+
+		if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
+			continue;
 
-			status = READ_ONCE(buf[2 * head]); /* maybe mmio! */
-			GEM_TRACE("%s csb[%d]: status=0x%08x:0x%08x, active=0x%x\n",
-				  engine->name, head,
-				  status, buf[2*head + 1],
-				  execlists->active);
-
-			if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE |
-				      GEN8_CTX_STATUS_PREEMPTED))
-				execlists_set_active(execlists,
-						     EXECLISTS_ACTIVE_HWACK);
-			if (status & GEN8_CTX_STATUS_ACTIVE_IDLE)
-				execlists_clear_active(execlists,
-						       EXECLISTS_ACTIVE_HWACK);
-
-			if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
-				continue;
+		/* We should never get a COMPLETED | IDLE_ACTIVE! */
+		GEM_BUG_ON(status & GEN8_CTX_STATUS_IDLE_ACTIVE);
 
-			/* We should never get a COMPLETED | IDLE_ACTIVE! */
-			GEM_BUG_ON(status & GEN8_CTX_STATUS_IDLE_ACTIVE);
+		if (status & GEN8_CTX_STATUS_COMPLETE &&
+		    buf[2*head + 1] == execlists->preempt_complete_status) {
+			GEM_TRACE("%s preempt-idle\n", engine->name);
+			complete_preempt_context(execlists);
+			continue;
+		}
 
-			if (status & GEN8_CTX_STATUS_COMPLETE &&
-			    buf[2*head + 1] == execlists->preempt_complete_status) {
-				GEM_TRACE("%s preempt-idle\n", engine->name);
-				complete_preempt_context(execlists);
-				continue;
-			}
+		if (status & GEN8_CTX_STATUS_PREEMPTED &&
+		    execlists_is_active(execlists,
+					EXECLISTS_ACTIVE_PREEMPT))
+			continue;
 
-			if (status & GEN8_CTX_STATUS_PREEMPTED &&
-			    execlists_is_active(execlists,
-						EXECLISTS_ACTIVE_PREEMPT))
-				continue;
+		GEM_BUG_ON(!execlists_is_active(execlists,
+						EXECLISTS_ACTIVE_USER));
 
-			GEM_BUG_ON(!execlists_is_active(execlists,
-							EXECLISTS_ACTIVE_USER));
+		rq = port_unpack(port, &count);
+		GEM_TRACE("%s out[0]: ctx=%d.%d, global=%d (fence %llx:%d) (current %d), prio=%d\n",
+			  engine->name,
+			  port->context_id, count,
+			  rq ? rq->global_seqno : 0,
+			  rq ? rq->fence.context : 0,
+			  rq ? rq->fence.seqno : 0,
+			  intel_engine_get_seqno(engine),
+			  rq ? rq_prio(rq) : 0);
+
+		/* Check the context/desc id for this event matches */
+		GEM_DEBUG_BUG_ON(buf[2 * head + 1] != port->context_id);
+
+		GEM_BUG_ON(count == 0);
+		if (--count == 0) {
+			/*
+			 * On the final event corresponding to the
+			 * submission of this context, we expect either
+			 * an element-switch event or a completion
+			 * event (and on completion, the active-idle
+			 * marker). No more preemptions, lite-restore
+			 * or otherwise.
+			 */
+			GEM_BUG_ON(status & GEN8_CTX_STATUS_PREEMPTED);
+			GEM_BUG_ON(port_isset(&port[1]) &&
+				   !(status & GEN8_CTX_STATUS_ELEMENT_SWITCH));
+			GEM_BUG_ON(!port_isset(&port[1]) &&
+				   !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
 
-			rq = port_unpack(port, &count);
-			GEM_TRACE("%s out[0]: ctx=%d.%d, global=%d (fence %llx:%d) (current %d), prio=%d\n",
-				  engine->name,
-				  port->context_id, count,
-				  rq ? rq->global_seqno : 0,
-				  rq ? rq->fence.context : 0,
-				  rq ? rq->fence.seqno : 0,
-				  intel_engine_get_seqno(engine),
-				  rq ? rq_prio(rq) : 0);
+			/*
+			 * We rely on the hardware being strongly
+			 * ordered, that the breadcrumb write is
+			 * coherent (visible from the CPU) before the
+			 * user interrupt and CSB is processed.
+			 */
+			GEM_BUG_ON(!i915_request_completed(rq));
 
-			/* Check the context/desc id for this event matches */
-			GEM_DEBUG_BUG_ON(buf[2 * head + 1] != port->context_id);
+			execlists_context_schedule_out(rq,
+						       INTEL_CONTEXT_SCHEDULE_OUT);
+			i915_request_put(rq);
 
-			GEM_BUG_ON(count == 0);
-			if (--count == 0) {
-				/*
-				 * On the final event corresponding to the
-				 * submission of this context, we expect either
-				 * an element-switch event or a completion
-				 * event (and on completion, the active-idle
-				 * marker). No more preemptions, lite-restore
-				 * or otherwise.
-				 */
-				GEM_BUG_ON(status & GEN8_CTX_STATUS_PREEMPTED);
-				GEM_BUG_ON(port_isset(&port[1]) &&
-					   !(status & GEN8_CTX_STATUS_ELEMENT_SWITCH));
-				GEM_BUG_ON(!port_isset(&port[1]) &&
-					   !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
+			GEM_TRACE("%s completed ctx=%d\n",
+				  engine->name, port->context_id);
 
-				/*
-				 * We rely on the hardware being strongly
-				 * ordered, that the breadcrumb write is
-				 * coherent (visible from the CPU) before the
-				 * user interrupt and CSB is processed.
-				 */
-				GEM_BUG_ON(!i915_request_completed(rq));
-
-				execlists_context_schedule_out(rq,
-							       INTEL_CONTEXT_SCHEDULE_OUT);
-				i915_request_put(rq);
-
-				GEM_TRACE("%s completed ctx=%d\n",
-					  engine->name, port->context_id);
-
-				port = execlists_port_complete(execlists, port);
-				if (port_isset(port))
-					execlists_user_begin(execlists, port);
-				else
-					execlists_user_end(execlists);
-			} else {
-				port_set(port, port_pack(rq, count));
-			}
+			port = execlists_port_complete(execlists, port);
+			if (port_isset(port))
+				execlists_user_begin(execlists, port);
+			else
+				execlists_user_end(execlists);
+		} else {
+			port_set(port, port_pack(rq, count));
 		}
+	}
 
-		if (head != execlists->csb_head) {
-			execlists->csb_head = head;
-			writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, head << 8),
-			       i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
-		}
-	} while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted));
+	if (head != execlists->csb_head) {
+		execlists->csb_head = head;
+		writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, head << 8),
+		       i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
+	}
 
 	if (unlikely(fw))
 		intel_uncore_forcewake_put(i915, execlists->fw_domains);
-- 
2.17.0
