From: oscar.mateo@intel.com
To: intel-gfx@lists.freedesktop.org
Cc: Ben Widawsky <ben@bwidawsk.net>,
	Ben Widawsky <benjamin.widawsky@intel.com>
Subject: [PATCH 24/50] drm/i915/bdw: Populate LR contexts (somewhat)
Date: Fri,  9 May 2014 13:08:54 +0100
Message-ID: <1399637360-4277-25-git-send-email-oscar.mateo@intel.com>
In-Reply-To: <1399637360-4277-1-git-send-email-oscar.mateo@intel.com>

From: Ben Widawsky <benjamin.widawsky@intel.com>

For the most part, logical ring context objects are similar to hardware
contexts in that the backing object is meant to be opaque. There are
some exceptions where we need to poke certain offsets of the object for
initialization, updating the tail pointer or updating the PDPs.

For our basic execlist implementation we only need our PPGTT PDs and
ringbuffer addresses in order to set up the context. With the previous
patches we have both, so start prepping the context to be loaded.

Before running a context for the first time you must populate some
fields in the context object. These fields begin at LRCA + 1 PAGE, i.e.
page 1 (counting from 0) of the context image. These same fields will
be read and written as contexts are saved and restored once the system
is up and running.
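
As a rough illustration of the "LRCA + 1 PAGE" rule above, the sketch
below computes the address of a given dword in the register state page.
It is not part of the patch: lrc_reg_state_addr() and LRC_PAGE_SIZE are
hypothetical names, and a 4 KiB page is assumed (matching
GEN8_LR_CONTEXT_ALIGN).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, the same value as GEN8_LR_CONTEXT_ALIGN. */
#define LRC_PAGE_SIZE 4096u

/*
 * Hypothetical helper: address of dword 'dword' within the register
 * state page, which is page 1 (0-based) of the context image.
 */
static uint64_t lrc_reg_state_addr(uint64_t lrca, uint32_t dword)
{
	/* The context image is expected to be page aligned. */
	assert((lrca % LRC_PAGE_SIZE) == 0);
	/* The index must stay inside the single register state page. */
	assert(dword < LRC_PAGE_SIZE / sizeof(uint32_t));

	return lrca + LRC_PAGE_SIZE + (uint64_t)dword * sizeof(uint32_t);
}
```

For example, with an LRCA of 0x10000 the ring tail value (dword
CTX_RING_TAIL+1 = 7) would sit at 0x1101c.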

Many of these fields are completely reused from previous global
registers: ringbuffer head/tail/control, context control matches some
previous MI_SET_CONTEXT flags, and page directories. There are other
fields which we don't touch which we may want in the future.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11)
for other engines.
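
To make the header values concrete, here is a minimal standalone sketch
of how such an LRI header dword is encoded. MI_LOAD_REGISTER_IMM and
MI_LRI_FORCE_POSTED follow the definitions in this patch; the MI_INSTR
definition is reproduced from i915_reg.h as an assumption (it is not
shown in this diff), and lri_header() and lri_set_pair() are
hypothetical helpers used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match i915_reg.h; not part of this diff. */
#define MI_INSTR(opcode, flags)	(((uint32_t)(opcode) << 23) | (flags))
/* As in i915_reg.h: x is the number of (reg, value) pairs, x <= 16. */
#define MI_LOAD_REGISTER_IMM(x)	MI_INSTR(0x22, 2 * (x) - 1)
#define MI_LRI_FORCE_POSTED	(1u << 12)

/* Hypothetical helper: LRI header for 'num_regs' pairs, Force Posted set. */
static uint32_t lri_header(unsigned int num_regs)
{
	assert(num_regs >= 1 && num_regs <= 16);
	return MI_LOAD_REGISTER_IMM(num_regs) | MI_LRI_FORCE_POSTED;
}

/* Hypothetical helper: store one (register offset, value) pair. */
static void lri_set_pair(uint32_t *state, unsigned int slot,
			 uint32_t reg, uint32_t value)
{
	state[slot] = reg;
	state[slot + 1] = value;
}
```

Under these assumptions the render header (14 pairs) encodes as
0x1100101b and the header for the other engines (11 pairs) as
0x11001015.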

Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>

v3: Several rebases and general changes to the code.

v4: Squash with "Extract LR context object populating"
Also, Damien's review comments:
- Set the Force Posted bit on the LRI header, as the BSpec suggests we do.
- Prevent a warning when compiling a 32-bit kernel without HIGHMEM64.
- Add a clarifying comment to the context population code.
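
The 32-bit warning mentioned above presumably comes from shifting a
32-bit dma_addr_t right by 32, which is at least as wide as the type.
A minimal sketch of the split into upper and lower dwords, widening
first as the patch does with its (u64) cast, could look like this
(pdp_upper_dword() and pdp_lower_dword() are hypothetical names, not
driver functions):

```c
#include <stdint.h>

/*
 * Hypothetical helpers: split a page directory DMA address into the
 * two dwords written to the PDPn_UDW / PDPn_LDW slots. Taking the
 * address as a 64-bit value keeps the shift well defined even when
 * the caller's address type is only 32 bits wide.
 */
static uint32_t pdp_upper_dword(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

static uint32_t pdp_lower_dword(uint64_t addr)
{
	return (uint32_t)(addr & 0xffffffffu);
}
```

With a 32-bit address the upper dword simply comes out as 0, which is
what the context image should contain in that configuration.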

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h  |   1 +
 drivers/gpu/drm/i915/intel_lrc.c | 159 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 160 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 03ffc57..33d007d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -269,6 +269,7 @@
  *   address/value pairs. Don't overdue it, though, x <= 2^4 must hold!
  */
 #define MI_LOAD_REGISTER_IMM(x)	MI_INSTR(0x22, 2*(x)-1)
+#define   MI_LRI_FORCE_POSTED		(1<<12)
 #define MI_STORE_REGISTER_MEM(x) MI_INSTR(0x24, 2*(x)-1)
 #define MI_STORE_REGISTER_MEM_GEN8(x) MI_INSTR(0x24, 3*(x)-1)
 #define   MI_SRM_LRM_GLOBAL_GTT		(1<<22)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0f2c5cb..5a85496 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -46,6 +46,152 @@
 
 #define GEN8_LR_CONTEXT_ALIGN 4096
 
+#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
+#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
+
+#define CTX_LRI_HEADER_0		0x01
+#define CTX_CONTEXT_CONTROL		0x02
+#define CTX_RING_HEAD			0x04
+#define CTX_RING_TAIL			0x06
+#define CTX_RING_BUFFER_START		0x08
+#define CTX_RING_BUFFER_CONTROL	0x0a
+#define CTX_BB_HEAD_U			0x0c
+#define CTX_BB_HEAD_L			0x0e
+#define CTX_BB_STATE			0x10
+#define CTX_SECOND_BB_HEAD_U		0x12
+#define CTX_SECOND_BB_HEAD_L		0x14
+#define CTX_SECOND_BB_STATE		0x16
+#define CTX_BB_PER_CTX_PTR		0x18
+#define CTX_RCS_INDIRECT_CTX		0x1a
+#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
+#define CTX_LRI_HEADER_1		0x21
+#define CTX_CTX_TIMESTAMP		0x22
+#define CTX_PDP3_UDW			0x24
+#define CTX_PDP3_LDW			0x26
+#define CTX_PDP2_UDW			0x28
+#define CTX_PDP2_LDW			0x2a
+#define CTX_PDP1_UDW			0x2c
+#define CTX_PDP1_LDW			0x2e
+#define CTX_PDP0_UDW			0x30
+#define CTX_PDP0_LDW			0x32
+#define CTX_LRI_HEADER_2		0x41
+#define CTX_R_PWR_CLK_STATE		0x42
+#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
+
+static int
+intel_populate_lr_context(struct i915_hw_context *ctx,
+			  struct intel_engine *ring)
+{
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].obj;
+	struct drm_i915_gem_object *ring_obj = ctx->engine[ring->id].ringbuf->obj;
+	struct i915_hw_ppgtt *ppgtt;
+	struct page *page;
+	uint32_t *reg_state;
+	int ret;
+
+	ppgtt = ctx_to_ppgtt(ctx);
+
+	ret = i915_gem_object_set_to_cpu_domain(ctx_obj, true);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Could not set to CPU domain\n");
+		return ret;
+	}
+
+	ret = i915_gem_object_get_pages(ctx_obj);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Could not get object pages\n");
+		return ret;
+	}
+
+	i915_gem_object_pin_pages(ctx_obj);
+
+	/* The second page of the context object contains some fields which must
+	 * be set up prior to the first execution. */
+	page = i915_gem_object_get_page(ctx_obj, 1);
+	reg_state = kmap_atomic(page);
+
+	/* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
+	 * commands followed by (reg, value) pairs. The values we are setting here are
+	 * only for the first context restore: on a subsequent save, the GPU will
+	 * recreate this batchbuffer with new values (including all the missing
+	 * MI_LOAD_REGISTER_IMM commands that we are not initializing here). */
+	if (ring->id == RCS)
+		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
+	else
+		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
+	reg_state[CTX_LRI_HEADER_0] |= MI_LRI_FORCE_POSTED;
+	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
+	reg_state[CTX_CONTEXT_CONTROL+1] = (1<<3) | MI_RESTORE_INHIBIT;
+	reg_state[CTX_CONTEXT_CONTROL+1] |= reg_state[CTX_CONTEXT_CONTROL+1] << 16;
+	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
+	reg_state[CTX_RING_HEAD+1] = 0;
+	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
+	reg_state[CTX_RING_TAIL+1] = 0;
+	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
+	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
+	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
+	reg_state[CTX_RING_BUFFER_CONTROL+1] = (31 * PAGE_SIZE) | RING_VALID;
+	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
+	reg_state[CTX_BB_HEAD_U+1] = 0;
+	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
+	reg_state[CTX_BB_HEAD_L+1] = 0;
+	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
+	reg_state[CTX_BB_STATE+1] = (1<<5);
+	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
+	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
+	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
+	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
+	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
+	reg_state[CTX_SECOND_BB_STATE+1] = 0;
+	if (ring->id == RCS) {
+		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
+		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
+		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
+		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
+		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
+		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
+	}
+	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
+	reg_state[CTX_LRI_HEADER_1] |= MI_LRI_FORCE_POSTED;
+	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
+	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
+	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
+	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
+	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
+	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
+	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
+	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
+	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
+	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
+	reg_state[CTX_PDP3_UDW+1] = (u64)ppgtt->pd_dma_addr[3] >> 32;
+	reg_state[CTX_PDP3_LDW+1] = ppgtt->pd_dma_addr[3];
+	reg_state[CTX_PDP2_UDW+1] = (u64)ppgtt->pd_dma_addr[2] >> 32;
+	reg_state[CTX_PDP2_LDW+1] = ppgtt->pd_dma_addr[2];
+	reg_state[CTX_PDP1_UDW+1] = (u64)ppgtt->pd_dma_addr[1] >> 32;
+	reg_state[CTX_PDP1_LDW+1] = ppgtt->pd_dma_addr[1];
+	reg_state[CTX_PDP0_UDW+1] = (u64)ppgtt->pd_dma_addr[0] >> 32;
+	reg_state[CTX_PDP0_LDW+1] = ppgtt->pd_dma_addr[0];
+	if (ring->id == RCS) {
+		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
+		reg_state[CTX_LRI_HEADER_2] |= MI_LRI_FORCE_POSTED;
+		reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
+		reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
+#if 0
+		/* Offsets not yet defined for these */
+		reg_state[CTX_GPGPU_CSR_BASE_ADDRESS] = 0;
+		reg_state[CTX_GPGPU_CSR_BASE_ADDRESS+1] = 0;
+#endif
+	}
+
+	kunmap_atomic(reg_state);
+
+	ctx_obj->dirty = 1;
+	set_page_dirty(page);
+	i915_gem_object_unpin_pages(ctx_obj);
+
+	return 0;
+}
+
 static uint32_t get_lr_context_size(struct intel_engine *ring)
 {
 	int ret = 0;
@@ -135,6 +281,19 @@ int gen8_create_lr_context(struct i915_hw_context *ctx,
 	ctx->engine[ring->id].ringbuf = ringbuf;
 	ctx->engine[ring->id].obj = ctx_obj;
 
+	ret = intel_populate_lr_context(ctx, ring);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Failed to populate LRC: %d\n", ret);
+		ctx->engine[ring->id].ringbuf = NULL;
+		ctx->engine[ring->id].obj = NULL;
+		intel_destroy_ring_buffer(ringbuf);
+		if (file_priv)
+			kfree(ringbuf);
+		i915_gem_object_ggtt_unpin(ctx_obj);
+		drm_gem_object_unreference(&ctx_obj->base);
+		return ret;
+	}
+
 	return 0;
 }
 
-- 
1.9.0

