* [PATCH] drm/i915: Support to enable TRTT on GEN9
@ 2016-01-09 11:30 akash.goel
  2016-01-10 17:39 ` Chris Wilson
                   ` (10 more replies)
  0 siblings, 11 replies; 59+ messages in thread
From: akash.goel @ 2016-01-09 11:30 UTC
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Gen9 has an additional address translation hardware support in form of
Tiled Resource Translation Table (TR-TT) which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.
Sparse resources are created as virtual-only allocations. Regions of the
resource that the application intends to use are bound to the physical memory
on the fly and can be re-bound to different memory allocations over the
lifetime of the resource.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT will be required
for a new PPGTT instance, but TR-TT may not be enabled for every context.
1/16th of the 48bit PPGTT space is earmarked for the translation by TR-TT;
which such chunk to use is conveyed to HW through a register.
Any GFX address, which lies in that reserved 44 bit range will be translated
through TR-TT first and then through PPGTT to get the actual physical address,
so the output of translation from TR-TT will be a PPGTT offset.
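
As an illustration of the reserved range, here is a minimal userspace sketch
of the membership test. It assumes the segment sits at the top of the 48 bit
space, as GEN9_TRTT_SEGMENT_START in this patch does; the SKETCH_/sketch_
names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed placement: topmost 1/16th of the 48 bit PPGTT space. */
#define SKETCH_TRTT_SEGMENT_SIZE	(1ULL << 44)
#define SKETCH_TRTT_SEGMENT_START	((1ULL << 48) - SKETCH_TRTT_SEGMENT_SIZE)

static_assert(SKETCH_TRTT_SEGMENT_SIZE * 16 == (1ULL << 48),
	      "segment must be 1/16th of the 48 bit space");

/* True if gfx_addr would be translated through TR-TT before PPGTT. */
static bool sketch_addr_in_trtt_segment(uint64_t gfx_addr)
{
	return gfx_addr >= SKETCH_TRTT_SEGMENT_START &&
	       gfx_addr - SKETCH_TRTT_SEGMENT_START < SKETCH_TRTT_SEGMENT_SIZE;
}
```

Addresses outside this range go through PPGTT only.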

TRTT is constructed as a 3 level tile Table. Each tile is 64KB in size which
leaves behind 44-16=28 address bits. 28bits are partitioned as 9+9+10, and
each level is contained within a 4KB page hence L3 and L2 are composed of
512 64b entries and L1 is composed of 1024 32b entries.
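
The arithmetic above can be sketched as a small helper that splits an offset
within the TR-TT segment into table indices. This is only an illustration:
the assumption that L3 consumes the most significant index bits is mine, and
the trtt_walk/trtt_split_offset names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit widths from the description: 64KB tiles, 9+9+10 index bits. */
#define TRTT_TILE_SHIFT	16
#define TRTT_L1_BITS	10
#define TRTT_L2_BITS	9
#define TRTT_L3_BITS	9

static_assert(TRTT_L3_BITS + TRTT_L2_BITS + TRTT_L1_BITS +
	      TRTT_TILE_SHIFT == 44, "indices plus tile offset cover 44 bits");
static_assert((1u << TRTT_L3_BITS) * sizeof(uint64_t) == 4096,
	      "512 64b entries fill a 4KB page");
static_assert((1u << TRTT_L1_BITS) * sizeof(uint32_t) == 4096,
	      "1024 32b entries fill a 4KB page");

struct trtt_walk {
	uint32_t l3_index;
	uint32_t l2_index;
	uint32_t l1_index;
	uint32_t tile_offset;
};

/* Split a 44 bit offset inside the TR-TT segment (assumed L3-first order). */
static struct trtt_walk trtt_split_offset(uint64_t segment_offset)
{
	struct trtt_walk w;

	w.tile_offset = segment_offset & ((1u << TRTT_TILE_SHIFT) - 1);
	segment_offset >>= TRTT_TILE_SHIFT;
	w.l1_index = segment_offset & ((1u << TRTT_L1_BITS) - 1);
	segment_offset >>= TRTT_L1_BITS;
	w.l2_index = segment_offset & ((1u << TRTT_L2_BITS) - 1);
	segment_offset >>= TRTT_L2_BITS;
	w.l3_index = segment_offset & ((1u << TRTT_L3_BITS) - 1);
	return w;
}
```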

There is a provision to keep TR-TT Tables in virtual space, where the pages of
TRTT tables will be mapped to PPGTT.
Currently this is the supported mode; in this mode UMD will have full control
over TR-TT management, with bare minimum support from KMD.
So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
The entries of L1 table will contain the PPGTT offset of BOs actually backing
the Sparse resources.
The assumption here is that UMD only will do the complete PPGTT address space
management and use the Soft Pin API for all the buffer objects associated with
a given Context. So UMD will also have to allocate the L3/L2/L1 table pages
as a regular GEM BO only & assign them a PPGTT address through the Soft Pin API.
UMD would have to emit the MI_STORE_DATA_IMM commands in the batch buffer to
program the relevant entries of L3/L2/L1 tables.

Any space in TR-TT segment not bound to any Sparse texture will be handled
through the Invalid tile. User is expected to initialize the entries of a new
L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding to
the holes in the Sparse texture resource will be set with the Null tile pattern.
The improper programming of TRTT should only lead to a recoverable GPU hang,
eventually leading to banning of the culprit context without victimizing others.

The association of any Sparse resource with the BOs will be known only to UMD,
and only the Sparse resources shall be assigned an offset from the TR-TT segment
by UMD. The use of TR-TT segment or mapping of Sparse resources will be
abstracted from the KMD, UMD can do the address assignment from TR-TT segment
autonomously and KMD will be oblivious of it.
The BOs must not be assigned an address from TR-TT segment, they will be mapped
to PPGTT in a regular way by KMD, using the Soft Pin offset provided by UMD.

This patch provides an interface through which UMD can convey KMD to enable
TR-TT for a given context. A new I915_CONTEXT_PARAM_ENABLE_TRTT param has been
added to I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
UMD will have to pass the GFX address of L3 table page, pattern value for the
Null & invalid Tile registers.
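
A rough sketch of how UMD might fill the setparam argument is given below.
The structs are local mirrors (sketch_ prefix, hypothetical) of
drm_i915_gem_context_param and the drm_i915_gem_context_trtt_param added by
this patch; real code would include the patched i915_drm.h and pass the
filled argument to the I915_GEM_CONTEXT_SETPARAM ioctl, which is not done here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CONTEXT_PARAM_ENABLE_TRTT 0x4	/* value used in this patch */

/* Local mirror of struct drm_i915_gem_context_trtt_param. */
struct sketch_trtt_param {
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

/* Local mirror of struct drm_i915_gem_context_param. */
struct sketch_context_param {
	uint32_t ctx_id;
	uint32_t size;
	uint64_t param;
	uint64_t value;
};

static_assert(sizeof(struct sketch_trtt_param) == 16,
	      "no implicit padding expected in the ABI struct");

/* Fill the argument the way the ioctl handler in this patch reads it. */
static void sketch_fill_enable_trtt(struct sketch_context_param *arg,
				    uint32_t ctx_id,
				    const struct sketch_trtt_param *trtt)
{
	memset(arg, 0, sizeof(*arg));
	arg->ctx_id = ctx_id;
	arg->size = sizeof(*trtt);		/* checked against sizeof(trtt_params) */
	arg->param = SKETCH_CONTEXT_PARAM_ENABLE_TRTT;
	arg->value = (uint64_t)(uintptr_t)trtt;	/* user pointer to the params */
}
```

As written in this patch, the kernel side returns -EEXIST if TR-TT was already
enabled for the context, so UMD would issue this once per context.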

Testcase: igt/gem_trtt

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  3 ++
 drivers/gpu/drm/i915/i915_drv.h         | 12 +++++++
 drivers/gpu/drm/i915/i915_gem_context.c | 45 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c     | 57 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h     |  6 ++++
 drivers/gpu/drm/i915/i915_reg.h         | 19 +++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 41 ++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h             |  8 +++++
 8 files changed, 191 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 988a380..c247c25 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -172,6 +172,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_SOFTPIN:
 		value = 1;
 		break;
+	case I915_PARAM_HAS_TRTT:
+		value = HAS_TRTT(dev);
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c6dd4db..12c612e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -839,6 +839,7 @@ struct i915_ctx_hang_stats {
 #define DEFAULT_CONTEXT_HANDLE 0
 
 #define CONTEXT_NO_ZEROMAP (1<<0)
+#define CONTEXT_USE_TRTT   (1<<1)
 /**
  * struct intel_context - as the name implies, represents a context.
  * @ref: reference count.
@@ -881,6 +882,15 @@ struct intel_context {
 		int pin_count;
 	} engine[I915_NUM_RINGS];
 
+	/* TRTT info */
+	struct {
+		uint32_t invd_tile_val;
+		uint32_t null_tile_val;
+		uint64_t l3_table_address;
+		struct i915_vma *vma;
+		bool update_trtt_params;
+	} trtt_info;
+
 	struct list_head link;
 };
 
@@ -2626,6 +2636,8 @@ struct drm_i915_cmd_table {
 				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
 				 !IS_BROXTON(dev))
 
+#define HAS_TRTT(dev)		(IS_GEN9(dev))
+
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 900ffd0..ae9fc34 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -146,6 +146,9 @@ static void i915_gem_context_clean(struct intel_context *ctx)
 		if (WARN_ON(__i915_vma_unbind_no_wait(vma)))
 			break;
 	}
+
+	if (ctx->flags & CONTEXT_USE_TRTT)
+		i915_gem_destroy_trtt_vma(ctx->trtt_info.vma);
 }
 
 void i915_gem_context_free(struct kref *ctx_ref)
@@ -512,6 +515,35 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static int
+i915_setup_trtt_ctx(struct intel_context *ctx,
+		    struct drm_i915_gem_context_trtt_param *trtt_params)
+{
+	if (ctx->flags & CONTEXT_USE_TRTT)
+		return -EEXIST;
+
+	/* basic sanity checks for the l3 table pointer */
+	if ((ctx->trtt_info.l3_table_address >= GEN9_TRTT_SEGMENT_START) &&
+	    (ctx->trtt_info.l3_table_address <
+			(GEN9_TRTT_SEGMENT_START + GEN9_TRTT_SEGMENT_SIZE)))
+		return -EINVAL;
+
+	if (ctx->trtt_info.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK)
+		return -EINVAL;
+
+	ctx->trtt_info.vma = i915_gem_setup_trtt_vma(&ctx->ppgtt->base);
+	if (IS_ERR(ctx->trtt_info.vma))
+		return PTR_ERR(ctx->trtt_info.vma);
+
+	ctx->trtt_info.null_tile_val = trtt_params->null_tile_val;
+	ctx->trtt_info.invd_tile_val = trtt_params->invd_tile_val;
+	ctx->trtt_info.l3_table_address = trtt_params->l3_table_address;
+	ctx->trtt_info.update_trtt_params = 1;
+
+	ctx->flags |= CONTEXT_USE_TRTT;
+	return 0;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
@@ -952,6 +984,7 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct drm_i915_gem_context_param *args = data;
+	struct drm_i915_gem_context_trtt_param trtt_params;
 	struct intel_context *ctx;
 	int ret;
 
@@ -983,6 +1016,18 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
 		}
 		break;
+	case I915_CONTEXT_PARAM_ENABLE_TRTT:
+		if (args->size < sizeof(trtt_params))
+			ret = -EINVAL;
+		else if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+			ret = -ENODEV;
+		else if (copy_from_user(&trtt_params,
+					to_user_ptr(args->value),
+					sizeof(trtt_params)))
+			ret = -EFAULT;
+		else
+			ret = i915_setup_trtt_ctx(ctx, &trtt_params);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 56f4f2e..28fc1ea 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2146,6 +2146,13 @@ int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 
 int i915_ppgtt_init_hw(struct drm_device *dev)
 {
+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+
+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+			   GEN9_TRTT_BYPASS_DISABLE);
+	}
+
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -3328,6 +3335,56 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
 
 }
 
+void i915_gem_destroy_trtt_vma(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+
+	WARN_ON(!list_empty(&vma->vma_link));
+	WARN_ON(!list_empty(&vma->mm_list));
+	WARN_ON(!list_empty(&vma->exec_list));
+
+	drm_mm_remove_node(&vma->node);
+	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
+	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
+}
+
+struct i915_vma *
+i915_gem_setup_trtt_vma(struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+	int ret;
+
+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
+	if (vma == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->vma_link);
+	INIT_LIST_HEAD(&vma->mm_list);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+	/* Mark the vma as perennially pinned */
+	vma->pin_count = 1;
+
+	/* Reserve from the 48 bit PPGTT space */
+	vma->node.start = GEN9_TRTT_SEGMENT_START;
+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	if (ret) {
+		ret = i915_gem_evict_for_vma(vma);
+		if (ret == 0)
+			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	}
+	if (ret) {
+		DRM_ERROR("Reservation for TRTT segment failed: %i\n", ret);
+		i915_gem_destroy_trtt_vma(vma);
+		return ERR_PTR(ret);
+	}
+
+	return vma;
+}
+
 static struct scatterlist *
 rotate_pages(dma_addr_t *in, unsigned int offset,
 	     unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b448ad8..acb942d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -129,6 +129,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+/* Lies at the top of 48 bit PPGTT space */
+#define GEN9_TRTT_SEGMENT_START		((1ULL << 48) - (1ULL << 44))
+#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << 44)
+
 enum i915_ggtt_view_type {
 	I915_GGTT_VIEW_NORMAL = 0,
 	I915_GGTT_VIEW_ROTATED,
@@ -559,4 +563,6 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+struct i915_vma *i915_gem_setup_trtt_vma(struct i915_address_space *vm);
+void i915_gem_destroy_trtt_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 007ae83..5859be6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE	(1<<0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE		0xF0
+#define   GEN9_TRVA_DATA_VALUE		0xF
+
+#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1<<1)
+#define   GEN9_TRTT_ENABLE		(1<<0)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 8096c6a..a8b795d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -183,6 +183,12 @@
 #define CTX_LRI_HEADER_2		0x41
 #define CTX_R_PWR_CLK_STATE		0x42
 #define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
+#define CTX_TRTT_L3_PTR_DW0		0x202
+#define CTX_TRTT_L3_PTR_DW1		0x204
+#define CTX_TRTT_NULL_TILE		0x206
+#define CTX_TRTT_INVD_TILE		0x208
+#define CTX_TRTT_VA_MASKDATA		0x20A
+#define CTX_TRTT_TBL_CTL		0x20C
 
 #define GEN8_CTX_VALID (1<<0)
 #define GEN8_CTX_FORCE_PD_RESTORE (1<<1)
@@ -228,6 +234,8 @@ enum {
 static int intel_lr_context_pin(struct drm_i915_gem_request *rq);
 static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring,
 		struct drm_i915_gem_object *default_ctx_obj);
+static void populate_lr_context_trtt(struct intel_context *ctx,
+		uint32_t *reg_state);
 
 
 /**
@@ -390,6 +398,14 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
 		ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
 	}
 
+	if (ring->id == RCS && rq->ctx->trtt_info.update_trtt_params) {
+		/* The same page of the context object also contain fields
+		 * related for TRTT setup.
+		 */
+		populate_lr_context_trtt(rq->ctx, reg_state);
+		rq->ctx->trtt_info.update_trtt_params = 0;
+	}
+
 	kunmap_atomic(reg_state);
 
 	return 0;
@@ -2247,6 +2263,31 @@ make_rpcs(struct drm_device *dev)
 	return rpcs;
 }
 
+static void
+populate_lr_context_trtt(struct intel_context *ctx, uint32_t *reg_state)
+{
+	unsigned long masked_l3_gfx_address =
+		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+
+	ASSIGN_CTX_REG(reg_state, CTX_TRTT_L3_PTR_DW0, GEN9_TRTT_L3_POINTER_DW0,
+		       lower_32_bits(masked_l3_gfx_address));
+
+	ASSIGN_CTX_REG(reg_state, CTX_TRTT_L3_PTR_DW1, GEN9_TRTT_L3_POINTER_DW1,
+		       upper_32_bits(masked_l3_gfx_address));
+
+	ASSIGN_CTX_REG(reg_state, CTX_TRTT_NULL_TILE, GEN9_TRTT_NULL_TILE_REG,
+		       ctx->trtt_info.null_tile_val);
+
+	ASSIGN_CTX_REG(reg_state, CTX_TRTT_INVD_TILE, GEN9_TRTT_INVD_TILE_REG,
+		       ctx->trtt_info.invd_tile_val);
+
+	ASSIGN_CTX_REG(reg_state, CTX_TRTT_VA_MASKDATA, GEN9_TRTT_VA_MASKDATA,
+		       GEN9_TRVA_MASK_VALUE | GEN9_TRVA_DATA_VALUE);
+
+	ASSIGN_CTX_REG(reg_state, CTX_TRTT_TBL_CTL, GEN9_TRTT_TABLE_CONTROL,
+		       GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
+}
+
 static int
 populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_obj,
 		    struct intel_engine_cs *ring, struct intel_ringbuffer *ringbuf)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index acf2102..6d6f448 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -357,6 +357,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_GPU_RESET	 35
 #define I915_PARAM_HAS_RESOURCE_STREAMER 36
 #define I915_PARAM_HAS_EXEC_SOFTPIN	 37
+#define I915_PARAM_HAS_TRTT		 38
 
 typedef struct drm_i915_getparam {
 	__s32 param;
@@ -1140,7 +1141,14 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
+#define I915_CONTEXT_PARAM_ENABLE_TRTT	0x4
 	__u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+	__u64 l3_table_address;
+	__u32 invd_tile_val;
+	__u32 null_tile_val;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH] drm/i915: Support to enable TRTT on GEN9
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
@ 2016-01-10 17:39 ` Chris Wilson
  2016-01-11  7:39   ` Goel, Akash
  2016-01-11 11:19 ` ✓ success: Fi.CI.BAT Patchwork
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2016-01-10 17:39 UTC
  To: akash.goel; +Cc: intel-gfx

On Sat, Jan 09, 2016 at 05:00:21PM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Gen9 has an additional address translation hardware support in form of
> Tiled Resource Translation Table (TR-TT) which provides an extra level
> of abstraction over PPGTT.
> This is useful for mapping Sparse/Tiled texture resources.
> Sparse resources are created as virtual-only allocations. Regions of the
> resource that the application intends to use is bound to the physical memory
> on the fly and can be re-bound to different memory allocations over the
> lifetime of the resource.
> 
> TR-TT is tightly coupled with PPGTT, a new instance of TR-TT will be required
> for a new PPGTT instance, but TR-TT may not be enabled for every context.
> 1/16th of the 48bit PPGTT space is earmarked for the translation by TR-TT,
> which such chunk to use is conveyed to HW through a register.
> Any GFX address, which lies in that reserved 44 bit range will be translated
> through TR-TT first and then through PPGTT to get the actual physical address,
> so the output of translation from TR-TT will be a PPGTT offset.
> 
> TRTT is constructed as a 3 level tile Table. Each tile is 64KB in size which
> leaves behind 44-16=28 address bits. 28bits are partitioned as 9+9+10, and
> each level is contained within a 4KB page hence L3 and L2 is composed of
> 512 64b entries and L1 is composed of 1024 32b entries.
> 
> There is a provision to keep TR-TT Tables in virtual space, where the pages of
> TRTT tables will be mapped to PPGTT.
> Currently this is the supported mode, in this mode UMD will have a full control
> on TR-TT management, with bare minimum support from KMD.
> So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
> similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
> The entries of L1 table will contain the PPGTT offset of BOs actually backing
> the Sparse resources.

> The assumption here is that UMD only will do the complete PPGTT address space
> management and use the Soft Pin API for all the buffer objects associated with
> a given Context.

That is a poor assumption, and not one required for this to work.

> So UMD will also have to allocate the L3/L2/L1 table pages
> as a regular GEM BO only & assign them a PPGTT address through the Soft Pin API.
> UMD would have to emit the MI_STORE_DATA_IMM commands in the batch buffer to
> program the relevant entries of L3/L2/L1 tables.

This only applies to the TR-TT L1-L3 cache, right?

> Any space in TR-TT segment not bound to any Sparse texture, will be handled
> through Invalid tile, User is expected to initialize the entries of a new
> L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding to
> the holes in the Sparse texture resource will be set with the Null tile pattern
> The improper programming of TRTT should only lead to a recoverable GPU hang,
> eventually leading to banning of the culprit context without victimizing others.
> 
> The association of any Sparse resource with the BOs will be known only to UMD,
> and only the Sparse resources shall be assigned an offset from the TR-TT segment
> by UMD. The use of TR-TT segment or mapping of Sparse resources will be
> abstracted from the KMD,

s/abstracted from/transparent to/ s/,/;/

> UMD can do the address assignment from TR-TT segment

s/can/will/

> autonomously and KMD will be oblivious of it.
> The BOs must not be assigned an address from TR-TT segment, they will be mapped

s/The BOs/Any object/

> to PPGTT in a regular way by KMD

s/using the Soft Pin offset provided by UMD// as this is irrelevant.

> This patch provides an interface through which UMD can convey KMD to enable
> TR-TT for a given context. A new I915_CONTEXT_PARAM_ENABLE_TRTT param has been
> added to I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
> UMD will have to pass the GFX address of L3 table page,

+along with the

> pattern value for the
> Null & invalid Tile registers.
> 
> Testcase: igt/gem_trtt
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_dma.c         |  3 ++
>  drivers/gpu/drm/i915/i915_drv.h         | 12 +++++++
>  drivers/gpu/drm/i915/i915_gem_context.c | 45 ++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_gtt.c     | 57 +++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_gtt.h     |  6 ++++
>  drivers/gpu/drm/i915/i915_reg.h         | 19 +++++++++++
>  drivers/gpu/drm/i915/intel_lrc.c        | 41 ++++++++++++++++++++++++
>  include/uapi/drm/i915_drm.h             |  8 +++++
>  8 files changed, 191 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 988a380..c247c25 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -172,6 +172,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>  	case I915_PARAM_HAS_EXEC_SOFTPIN:
>  		value = 1;
>  		break;
> +	case I915_PARAM_HAS_TRTT:
> +		value = HAS_TRTT(dev);
> +		break;

Should we do this here, or just query the context? In fact you are
missing the context getparam path any way.

>  	default:
>  		DRM_DEBUG("Unknown parameter %d\n", param->param);
>  		return -EINVAL;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c6dd4db..12c612e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -839,6 +839,7 @@ struct i915_ctx_hang_stats {
>  #define DEFAULT_CONTEXT_HANDLE 0
>  
>  #define CONTEXT_NO_ZEROMAP (1<<0)
> +#define CONTEXT_USE_TRTT   (1<<1)

Make flags unsigned whilst you are here, and fix the holes!

>  /**
>   * struct intel_context - as the name implies, represents a context.
>   * @ref: reference count.
> @@ -881,6 +882,15 @@ struct intel_context {
>  		int pin_count;
>  	} engine[I915_NUM_RINGS];
>  
> +	/* TRTT info */
> +	struct {

Give this a name now, we will be thankful in the future.

> +		uint32_t invd_tile_val;
> +		uint32_t null_tile_val;
> +		uint64_t l3_table_address;
> +		struct i915_vma *vma;
> +		bool update_trtt_params;
> +	} trtt_info;
> +
>  	struct list_head link;
>  };
>  
> @@ -2626,6 +2636,8 @@ struct drm_i915_cmd_table {
>  				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
>  				 !IS_BROXTON(dev))
>  
> +#define HAS_TRTT(dev)		(IS_GEN9(dev))
> +
>  #define INTEL_PCH_DEVICE_ID_MASK		0xff00
>  #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
>  #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 900ffd0..ae9fc34 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -146,6 +146,9 @@ static void i915_gem_context_clean(struct intel_context *ctx)
>  		if (WARN_ON(__i915_vma_unbind_no_wait(vma)))
>  			break;
>  	}
> +
> +	if (ctx->flags & CONTEXT_USE_TRTT)
> +		i915_gem_destroy_trtt_vma(ctx->trtt_info.vma);

Should be in context free.

>  }
>  
>  void i915_gem_context_free(struct kref *ctx_ref)
> @@ -512,6 +515,35 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
>  	return ctx;
>  }
>  
> +static int
> +i915_setup_trtt_ctx(struct intel_context *ctx,
> +		    struct drm_i915_gem_context_trtt_param *trtt_params)
> +{
> +	if (ctx->flags & CONTEXT_USE_TRTT)
> +		return -EEXIST;
> +
> +	/* basic sanity checks for the l3 table pointer */
> +	if ((ctx->trtt_info.l3_table_address >= GEN9_TRTT_SEGMENT_START) &&
> +	    (ctx->trtt_info.l3_table_address <
> +			(GEN9_TRTT_SEGMENT_START + GEN9_TRTT_SEGMENT_SIZE)))

Presumably l3_table has an actual size and you want to do a range
overlap test, not just the start address.
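
The range overlap test suggested here could look like the following sketch.
It assumes the L3 table occupies a single 4KB page (per the commit message)
and that the value under test is the address supplied by userspace; the
SKETCH_/sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TRTT_SEGMENT_SIZE	(1ULL << 44)
#define SKETCH_TRTT_SEGMENT_START	((1ULL << 48) - SKETCH_TRTT_SEGMENT_SIZE)
#define SKETCH_TRTT_L3_TABLE_SIZE	4096ULL	/* assumed: one 4KB page */

static_assert(SKETCH_TRTT_SEGMENT_START + SKETCH_TRTT_SEGMENT_SIZE ==
	      (1ULL << 48), "segment ends at the top of the 48 bit space");

/* True if [l3_addr, l3_addr + 4KB) intersects the TR-TT segment. */
static bool sketch_l3_overlaps_trtt(uint64_t l3_addr)
{
	const uint64_t seg_end =
		SKETCH_TRTT_SEGMENT_START + SKETCH_TRTT_SEGMENT_SIZE;

	/* Treat an address whose end would wrap around as overlapping. */
	if (l3_addr > UINT64_MAX - SKETCH_TRTT_L3_TABLE_SIZE)
		return true;

	return l3_addr < seg_end &&
	       l3_addr + SKETCH_TRTT_L3_TABLE_SIZE > SKETCH_TRTT_SEGMENT_START;
}
```

A table that ends exactly at the segment start is accepted, while one that
straddles the boundary is rejected.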

> +		return -EINVAL;
> +
> +	if (ctx->trtt_info.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK)
> +		return -EINVAL;

These are worth adding DRM_DEBUG() or even better start using dev_debug()
so that we can debug userspace startup issues.

> @@ -952,6 +984,7 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  {
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
>  	struct drm_i915_gem_context_param *args = data;
> +	struct drm_i915_gem_context_trtt_param trtt_params;
>  	struct intel_context *ctx;
>  	int ret;
>  
> @@ -983,6 +1016,18 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
>  		}
>  		break;
> +	case I915_CONTEXT_PARAM_ENABLE_TRTT:

Bump this case to i915_setup_trtt_ctx.
i.e. just have
	ret = i915_setup_trtt_ctx(ctx, args);
	break;

Otherwise this function will become very unwieldy very quickly.

>  int i915_ppgtt_init_hw(struct drm_device *dev)
>  {
> +	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
> +		struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
> +			   GEN9_TRTT_BYPASS_DISABLE);

Shouldn't this be a context specific register? In which case you need to
set it in the context image instead.

Hmm, given you already do the context image tweaks, how does this work with
non-trtt contexts?

> +struct i915_vma *
> +i915_gem_setup_trtt_vma(struct i915_address_space *vm)
> +{
> +	struct i915_vma *vma;
> +	int ret;
> +
> +	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
> +	if (vma == NULL)
> +		return ERR_PTR(-ENOMEM);
> +
> +	INIT_LIST_HEAD(&vma->vma_link);
> +	INIT_LIST_HEAD(&vma->mm_list);
> +	INIT_LIST_HEAD(&vma->exec_list);
> +	vma->vm = vm;
> +	i915_ppgtt_get(i915_vm_to_ppgtt(vm));

Tempted to write a patch to allow

	vma->vm = i915_ppgtt_get(i915_vm_to_ppgtt(vm));
?

> +	/* Mark the vma as perennially pinned */

s/perennially/permanently/

We don't want to lose the reservation as opposed to having it grow back
next year.

> +	vma->pin_count = 1;
> +
> +	/* Reserve from the 48 bit PPGTT space */
> +	vma->node.start = GEN9_TRTT_SEGMENT_START;
> +	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
> +	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> +	if (ret) {
> +		ret = i915_gem_evict_for_vma(vma);
> +		if (ret == 0)
> +			ret = drm_mm_reserve_node(&vm->mm, &vma->node);

Good. I think we want i915_vm_reserve_node(vm, START, SIZE, &vma) - but
have a look at the other callsites to see if we have a common interface.
Looks like this would improve i915_vgpu.

> +struct drm_i915_gem_context_trtt_param {
> +	__u64 l3_table_address;
> +	__u32 invd_tile_val;
> +	__u32 null_tile_val;
> +};

Passes the ABI structure sanity checks.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Support to enable TRTT on GEN9
  2016-01-10 17:39 ` Chris Wilson
@ 2016-01-11  7:39   ` Goel, Akash
  2016-01-11  8:49     ` Chris Wilson
  0 siblings, 1 reply; 59+ messages in thread
From: Goel, Akash @ 2016-01-11  7:39 UTC
  To: Chris Wilson, intel-gfx; +Cc: akash.goel



On 1/10/2016 11:09 PM, Chris Wilson wrote:
> On Sat, Jan 09, 2016 at 05:00:21PM +0530, akash.goel@intel.com wrote:
>> From: Akash Goel <akash.goel@intel.com>
>>
>> Gen9 has an additional address translation hardware support in form of
>> Tiled Resource Translation Table (TR-TT) which provides an extra level
>> of abstraction over PPGTT.
>> This is useful for mapping Sparse/Tiled texture resources.
>> Sparse resources are created as virtual-only allocations. Regions of the
>> resource that the application intends to use is bound to the physical memory
>> on the fly and can be re-bound to different memory allocations over the
>> lifetime of the resource.
>>
>> TR-TT is tightly coupled with PPGTT, a new instance of TR-TT will be required
>> for a new PPGTT instance, but TR-TT may not be enabled for every context.
>> 1/16th of the 48bit PPGTT space is earmarked for the translation by TR-TT,
>> which such chunk to use is conveyed to HW through a register.
>> Any GFX address, which lies in that reserved 44 bit range will be translated
>> through TR-TT first and then through PPGTT to get the actual physical address,
>> so the output of translation from TR-TT will be a PPGTT offset.
>>
>> TRTT is constructed as a 3 level tile Table. Each tile is 64KB in size which
>> leaves behind 44-16=28 address bits. 28bits are partitioned as 9+9+10, and
>> each level is contained within a 4KB page hence L3 and L2 is composed of
>> 512 64b entries and L1 is composed of 1024 32b entries.
>>
>> There is a provision to keep TR-TT Tables in virtual space, where the pages of
>> TRTT tables will be mapped to PPGTT.
>> Currently this is the supported mode, in this mode UMD will have a full control
>> on TR-TT management, with bare minimum support from KMD.
>> So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
>> similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
>> The entries of L1 table will contain the PPGTT offset of BOs actually backing
>> the Sparse resources.
>
>> The assumption here is that UMD only will do the complete PPGTT address space
>> management and use the Soft Pin API for all the buffer objects associated with
>> a given Context.
>
> That is a poor assumption, and not one required for this to work.
>
This is not a strict requirement.
But I thought that conflicts would be minimized if UMD itself did the 
full address space management.
At least UMD has to ensure that the PPGTT offset of the L3 table remains 
the same throughout.

>> So UMD will also have to allocate the L3/L2/L1 table pages
>> as a regular GEM BO only & assign them a PPGTT address through the Soft Pin API.
>> UMD would have to emit the MI_STORE_DATA_IMM commands in the batch buffer to
>> program the relevant entries of L3/L2/L1 tables.
>
> This only applies to the TR-TT L1-L3 cache, right?
>
Yes, it applies only to the TR-TT L1-L3 tables.
The backing pages of L3/L2/L1 tables shall be allocated as a BO, which 
should be assigned a PPGTT address.
The table entries could also be written directly by UMD by mmapping the 
table BOs, but adding MI_STORE_DATA_IMM commands in the batch buffer 
itself would help to achieve serialization (implicitly).

>> Any space in TR-TT segment not bound to any Sparse texture, will be handled
>> through Invalid tile, User is expected to initialize the entries of a new
>> L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding to
>> the holes in the Sparse texture resource will be set with the Null tile pattern
>> The improper programming of TRTT should only lead to a recoverable GPU hang,
>> eventually leading to banning of the culprit context without victimizing others.
>>
>> The association of any Sparse resource with the BOs will be known only to UMD,
>> and only the Sparse resources shall be assigned an offset from the TR-TT segment
>> by UMD. The use of TR-TT segment or mapping of Sparse resources will be
>> abstracted from the KMD,
>
> s/abstracted from/transparent to/ s/,/;/
>
Ok will rephrase as 'transparent to KMD;'
>> UMD can do the address assignment from TR-TT segment
>
> s/can/will/
>
Ok

>> autonomously and KMD will be oblivious of it.
>> The BOs must not be assigned an address from TR-TT segment, they will be mapped
>
> s/The BOs/Any object/
>
Ok will use 'Any object'
>> to PPGTT in a regular way by KMD
>
> s/using the Soft Pin offset provided by UMD// as this is irrelevant.
>
Do you mean that it is needless or inappropriate to state that KMD 
will use the Soft Pin offset provided by UMD, since it doesn't matter 
whether the Soft Pin offset is used or KMD itself assigns an address?

>> This patch provides an interface through which UMD can convey KMD to enable
>> TR-TT for a given context. A new I915_CONTEXT_PARAM_ENABLE_TRTT param has been
>> added to I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
>> UMD will have to pass the GFX address of L3 table page,
>
> +along with the
>
Ok.

>> pattern value for the
>> Null & invalid Tile registers.
>>
>> Testcase: igt/gem_trtt
>>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_dma.c         |  3 ++
>>   drivers/gpu/drm/i915/i915_drv.h         | 12 +++++++
>>   drivers/gpu/drm/i915/i915_gem_context.c | 45 ++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_gem_gtt.c     | 57 +++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_gem_gtt.h     |  6 ++++
>>   drivers/gpu/drm/i915/i915_reg.h         | 19 +++++++++++
>>   drivers/gpu/drm/i915/intel_lrc.c        | 41 ++++++++++++++++++++++++
>>   include/uapi/drm/i915_drm.h             |  8 +++++
>>   8 files changed, 191 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
>> index 988a380..c247c25 100644
>> --- a/drivers/gpu/drm/i915/i915_dma.c
>> +++ b/drivers/gpu/drm/i915/i915_dma.c
>> @@ -172,6 +172,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>>   	case I915_PARAM_HAS_EXEC_SOFTPIN:
>>   		value = 1;
>>   		break;
>> +	case I915_PARAM_HAS_TRTT:
>> +		value = HAS_TRTT(dev);
>> +		break;
>
> Should we do this here, or just query the context? In fact you are
> missing the context getparam path any way.
>
Sorry, do you mean that the user can already detect TR-TT support from 
the -ENODEV error on context setparam, so there is no need for an 
explicit getparam case?

Would the context getparam path really be useful for TR-TT?
If it is needed, would it be better to rename 
I915_CONTEXT_PARAM_ENABLE_TRTT to I915_CONTEXT_PARAM_TRTT_INFO?


>>   	default:
>>   		DRM_DEBUG("Unknown parameter %d\n", param->param);
>>   		return -EINVAL;
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index c6dd4db..12c612e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -839,6 +839,7 @@ struct i915_ctx_hang_stats {
>>   #define DEFAULT_CONTEXT_HANDLE 0
>>
>>   #define CONTEXT_NO_ZEROMAP (1<<0)
>> +#define CONTEXT_USE_TRTT   (1<<1)
>
> Make flags unsigned whilst you are here, and fix the holes!
>

Ok, will change the type of the 'flags' field inside 'intel_context' to unsigned.
Sorry, but apart from this, is anything else required here?

>>   /**
>>    * struct intel_context - as the name implies, represents a context.
>>    * @ref: reference count.
>> @@ -881,6 +882,15 @@ struct intel_context {
>>   		int pin_count;
>>   	} engine[I915_NUM_RINGS];
>>
>> +	/* TRTT info */
>> +	struct {
>
> Give this a name now, we will be thankful in the future.
>
Would ctx_trtt_params be fine ?

>> +		uint32_t invd_tile_val;
>> +		uint32_t null_tile_val;
>> +		uint64_t l3_table_address;
>> +		struct i915_vma *vma;
>> +		bool update_trtt_params;
>> +	} trtt_info;
>> +
>>   	struct list_head link;
>>   };
>>
>> @@ -2626,6 +2636,8 @@ struct drm_i915_cmd_table {
>>   				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
>>   				 !IS_BROXTON(dev))
>>
>> +#define HAS_TRTT(dev)		(IS_GEN9(dev))
>> +
>>   #define INTEL_PCH_DEVICE_ID_MASK		0xff00
>>   #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
>>   #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>> index 900ffd0..ae9fc34 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -146,6 +146,9 @@ static void i915_gem_context_clean(struct intel_context *ctx)
>>   		if (WARN_ON(__i915_vma_unbind_no_wait(vma)))
>>   			break;
>>   	}
>> +
>> +	if (ctx->flags & CONTEXT_USE_TRTT)
>> +		i915_gem_destroy_trtt_vma(ctx->trtt_info.vma);
>
> Should be in context free.

Fine, will move it to the gem_context_free()

>
>>   }
>>
>>   void i915_gem_context_free(struct kref *ctx_ref)
>> @@ -512,6 +515,35 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
>>   	return ctx;
>>   }
>>
>> +static int
>> +i915_setup_trtt_ctx(struct intel_context *ctx,
>> +		    struct drm_i915_gem_context_trtt_param *trtt_params)
>> +{
>> +	if (ctx->flags & CONTEXT_USE_TRTT)
>> +		return -EEXIST;
>> +
>> +	/* basic sanity checks for the l3 table pointer */
>> +	if ((ctx->trtt_info.l3_table_address >= GEN9_TRTT_SEGMENT_START) &&
>> +	    (ctx->trtt_info.l3_table_address <
>> +			(GEN9_TRTT_SEGMENT_START + GEN9_TRTT_SEGMENT_SIZE)))
>
> Presumably l3_table has an actual size and you want to do a range
> overlap test, not just the start address.
>
Yes, the intent is to do a range overlap test. But since the L3 table size is 
fixed at 4KB, I thought there was no real need to also include the size in 
the range check, considering that allocations are always in multiples of 4KB.

>> +		return -EINVAL;
>> +
>> +	if (ctx->trtt_info.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK)
>> +		return -EINVAL;
>
> These are worth adding DRM_DEBUG() or even better start using dev_debug()
> so that we can debug userspace startup issues.
>
Fine, I think DRM_DEBUG_DRIVER will be more appropriate compared to 
DRM_DEBUG.
Or
	dev_dbg(dev->dev, "invalid l3 table address\n");

>> @@ -952,6 +984,7 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>>   {
>>   	struct drm_i915_file_private *file_priv = file->driver_priv;
>>   	struct drm_i915_gem_context_param *args = data;
>> +	struct drm_i915_gem_context_trtt_param trtt_params;
>>   	struct intel_context *ctx;
>>   	int ret;
>>
>> @@ -983,6 +1016,18 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>>   			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
>>   		}
>>   		break;
>> +	case I915_CONTEXT_PARAM_ENABLE_TRTT:
>
> Bump this case to i915_setup_trtt_ctx.
> i.e. just have
> 	ret = i915_setup_trtt_ctx(ctx, args);
> 	break;
>
> Otherwise this function will become very unwieldy very quickly.
>
Fine, this will be much better.

>>   int i915_ppgtt_init_hw(struct drm_device *dev)
>>   {
>> +	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
>> +		struct drm_i915_private *dev_priv = dev->dev_private;
>> +
>> +		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
>> +			   GEN9_TRTT_BYPASS_DISABLE);
>
> Shouldn't this be a context specific register? In which case you need to
> set it in the context image instead.
>
> Hmm. Given you already do the context image tweaks, how does this work with
> non-trtt contexts?
>

GEN9_TR_CHICKEN_BIT_VECTOR is not a context specific register.
It globally enables TR-TT support in Hw. Still TR-TT enabling on per 
context basis is required.
Non-trtt contexts are not affected by this setting.

>> +struct i915_vma *
>> +i915_gem_setup_trtt_vma(struct i915_address_space *vm)
>> +{
>> +	struct i915_vma *vma;
>> +	int ret;
>> +
>> +	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
>> +	if (vma == NULL)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	INIT_LIST_HEAD(&vma->vma_link);
>> +	INIT_LIST_HEAD(&vma->mm_list);
>> +	INIT_LIST_HEAD(&vma->exec_list);
>> +	vma->vm = vm;
>> +	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
>
> Tempted to write a patch to allow
>
> 	vma->vm = i915_ppgtt_get(i915_vm_to_ppgtt(vm));
> ?
>
>> +	/* Mark the vma as perennially pinned */
>
> s/perennially/permanently/
>

Thanks, will rephrase as 'permanently pinned'.
> We don't want to lose the reservation as opposed to having it grow back
> next year.
>
>> +	vma->pin_count = 1;
>> +
>> +	/* Reserve from the 48 bit PPGTT space */
>> +	vma->node.start = GEN9_TRTT_SEGMENT_START;
>> +	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
>> +	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
>> +	if (ret) {
>> +		ret = i915_gem_evict_for_vma(vma);
>> +		if (ret == 0)
>> +			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
>
> Good. I think we want i915_vm_reserve_node(vm, START, SIZE, &vma) - but
> have a look at the other callsites to see if we have a common interface.
> Looks like this would improve i915_vgpu.
>

Ok so need to define a new wrapper function,
	i915_vm_reserve_node(vm, START, SIZE, &vma).

After looking at the other callsites of drm_mm_reserve_node, including 
i915_vgpu, I think it would be better to have the prototype as,
    i915_vm_reserve_node(vm, &node);

However, should this be done as a separate patch?

>> +struct drm_i915_gem_context_trtt_param {
>> +	__u64 l3_table_address;
>> +	__u32 invd_tile_val;
>> +	__u32 null_tile_val;
>> +};
>
> Passes the ABI structure sanity checks.

Should we also allow the user to choose the location of the TR-TT segment 
(its size is fixed at 1<<44 anyway)?

Best regards
Akash
> -Chris
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] drm/i915: Support to enable TRTT on GEN9
  2016-01-11  7:39   ` Goel, Akash
@ 2016-01-11  8:49     ` Chris Wilson
  2016-01-11 12:29       ` Goel, Akash
  0 siblings, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2016-01-11  8:49 UTC (permalink / raw)
  To: Goel, Akash; +Cc: intel-gfx

On Mon, Jan 11, 2016 at 01:09:50PM +0530, Goel, Akash wrote:
> 
> 
> On 1/10/2016 11:09 PM, Chris Wilson wrote:
> >On Sat, Jan 09, 2016 at 05:00:21PM +0530, akash.goel@intel.com wrote:
> >>From: Akash Goel <akash.goel@intel.com>
> >>
> >>Gen9 has an additional address translation hardware support in form of
> >>Tiled Resource Translation Table (TR-TT) which provides an extra level
> >>of abstraction over PPGTT.
> >>This is useful for mapping Sparse/Tiled texture resources.
> >>Sparse resources are created as virtual-only allocations. Regions of the
> >>resource that the application intends to use is bound to the physical memory
> >>on the fly and can be re-bound to different memory allocations over the
> >>lifetime of the resource.
> >>
> >>TR-TT is tightly coupled with PPGTT, a new instance of TR-TT will be required
> >>for a new PPGTT instance, but TR-TT may not enabled for every context.
> >>1/16th of the 48bit PPGTT space is earmarked for the translation by TR-TT,
> >>which such chunk to use is conveyed to HW through a register.
> >>Any GFX address, which lies in that reserved 44 bit range will be translated
> >>through TR-TT first and then through PPGTT to get the actual physical address,
> >>so the output of translation from TR-TT will be a PPGTT offset.
> >>
> >>TRTT is constructed as a 3 level tile Table. Each tile is 64KB is size which
> >>leaves behind 44-16=28 address bits. 28bits are partitioned as 9+9+10, and
> >>each level is contained within a 4KB page hence L3 and L2 is composed of
> >>512 64b entries and L1 is composed of 1024 32b entries.
> >>
> >>There is a provision to keep TR-TT Tables in virtual space, where the pages of
> >>TRTT tables will be mapped to PPGTT.
> >>Currently this is the supported mode, in this mode UMD will have a full control
> >>on TR-TT management, with bare minimum support from KMD.
> >>So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
> >>similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
> >>The entries of L1 table will contain the PPGTT offset of BOs actually backing
> >>the Sparse resources.
> >
> >>The assumption here is that UMD only will do the complete PPGTT address space
> >>management and use the Soft Pin API for all the buffer objects associated with
> >>a given Context.
> >
> >That is a poor assumption, and not one required for this to work.
> >
> This is not a strict requirement.
> But I thought that conflicts will be minimized if UMD itself can do
> the full address space management.
> At least UMD has to ensure that PPGTT offset of L3 table remains
> same throughout.

Yes, userspace must control that object, and that would require softpin
to preserve it across execbuffer calls. The kernel does not require that
all addresses be handled in userspace afterwards, that's the language I
wish to avoid. (Hence I don't like using "assumption" as that just
invites userspace to break the kernel.)
 
> >>So UMD will also have to allocate the L3/L2/L1 table pages
> >>as a regular GEM BO only & assign them a PPGTT address through the Soft Pin API.
> >>UMD would have to emit the MI_STORE_DATA_IMM commands in the batch buffer to
> >>program the relevant entries of L3/L2/L1 tables.
> >
> >This only applies to the TR-TT L1-L3 cache, right?
> >
> Yes applies only to the TR-TT L1-L3 tables.
> The backing pages of L3/L2/L1 tables shall be allocated as a BO,
> which should be assigned a PPGTT address.
> The table entries could be written directly also by UMD by mmapping
> the table BOs, but adding MI_STORE_DATA_IMM commands in the batch
> buffer itself would help to achieve serialization (implicitly).

Can you tighten up the phrasing here? My first read was that you intend
for all PTE writes to be in userspace, which is scary.

"UMD will then allocate the L3/L2/L1 page tables for TR-TT as a regular
bo, and will use softpin to assign it to the l3_table_address when used.
UMD will also maintain the entries in the TR-TT page tables using
regular batch commands (MI_STORE_DATA_IMM), or via mmapping of the page
table bo."

> >>autonomously and KMD will be oblivious of it.
> >>The BOs must not be assigned an address from TR-TT segment, they will be mapped
> >
> >s/The BOs/Any object/
> >
> Ok will use 'Any object'
> >>to PPGTT in a regular way by KMD
> >
> >s/using the Soft Pin offset provided by UMD// as this is irrelevant.
> >
> You mean to say that it is needless or inappropriate to state that
> KMD will use the Soft PIN offset provided by UMD, it doesn't matter
> that whether the Soft PIN offset is used or KMD itself assigns an
> address.

I just want to avoid implying that userspace must use softpin on every
single bo for this to work. (Mainly because I don't really want
userspace to have to do full address space management, as we will always
have to do the double check inside the kernel. Unless there is a real
need (e.g. svm), I'd rather improve the kernel allocator/verification, rather
than try and circumvent it.)

> >>@@ -172,6 +172,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
> >>  	case I915_PARAM_HAS_EXEC_SOFTPIN:
> >>  		value = 1;
> >>  		break;
> >>+	case I915_PARAM_HAS_TRTT:
> >>+		value = HAS_TRTT(dev);
> >>+		break;
> >
> >Should we do this here, or just query the context? In fact you are
> >missing the context getparam path any way.
> >
> Sorry, do you mean to say that with -ENODEV error also, on context
> setparam, User can make out the TR-TT support, so no need to have an
> explicit getparam case.
> 
> Would the context getparam path be really useful for TR-TT?.
> If its needed, then would be better to rename
> I915_CONTEXT_PARAM_ENABLE_TRTT to I915_CONTEXT_PARAM_TRTT_INFO ?

The question I have is do we want:

GETPARAM + CONTEXT_SETPARAM

or

CONTEXT_GETPARAM + CONTEXT_SETPARAM

the latter seems more symmetric and flexible, and we can use as a double
check later on that we set the right address etc.

Indeed, hindsight says ENABLE_TRTT is a bad name :)

I915_CONTEXT_PARAM_TRTT (let's assume for now  this will be the last,
any future PARAM_TRTT can think of a good name for its extension).

> >>  	default:
> >>  		DRM_DEBUG("Unknown parameter %d\n", param->param);
> >>  		return -EINVAL;
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>index c6dd4db..12c612e 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>@@ -839,6 +839,7 @@ struct i915_ctx_hang_stats {
> >>  #define DEFAULT_CONTEXT_HANDLE 0
> >>
> >>  #define CONTEXT_NO_ZEROMAP (1<<0)
> >>+#define CONTEXT_USE_TRTT   (1<<1)
> >
> >Make flags unsigned whilst you are here, and fix the holes!
> >
> 
> Ok will change the type of 'flags' field inside 'intel_context' to unsigned.
> Sorry, but apart from this anything else required here ?

No, it's just belated anger about the silly "int flags".

> 
> >>  /**
> >>   * struct intel_context - as the name implies, represents a context.
> >>   * @ref: reference count.
> >>@@ -881,6 +882,15 @@ struct intel_context {
> >>  		int pin_count;
> >>  	} engine[I915_NUM_RINGS];
> >>
> >>+	/* TRTT info */
> >>+	struct {
> >
> >Give this a name now, we will be thankful in the future.
> >
> Would ctx_trtt_params be fine ?

struct intel_context_trtt

(Avoid using params for internals, let's keep those for uAPI - that
helps us distinguish pieces of code / context.)

> >>
> >>  void i915_gem_context_free(struct kref *ctx_ref)
> >>@@ -512,6 +515,35 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
> >>  	return ctx;
> >>  }
> >>
> >>+static int
> >>+i915_setup_trtt_ctx(struct intel_context *ctx,
> >>+		    struct drm_i915_gem_context_trtt_param *trtt_params)
> >>+{
> >>+	if (ctx->flags & CONTEXT_USE_TRTT)
> >>+		return -EEXIST;
> >>+
> >>+	/* basic sanity checks for the l3 table pointer */
> >>+	if ((ctx->trtt_info.l3_table_address >= GEN9_TRTT_SEGMENT_START) &&
> >>+	    (ctx->trtt_info.l3_table_address <
> >>+			(GEN9_TRTT_SEGMENT_START + GEN9_TRTT_SEGMENT_SIZE)))
> >
> >Presumably l3_table has an actual size and you want to do a range
> >overlap test, not just the start address.
> >
> Yes intend to do a range overlap test only. But since L3 table size
> is fixed as 4KB, thought there is no real need to also include the
> size in the range check, considering the allocations are always in
> multiple of 4KB.

Ok. You have a choice of writing that up as a comment, or just doing
the page overlap test :) Honestly, I would just go for the range test
since this is a one-off init path and the reader then doesn't even have
to think.
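
A minimal sketch of the range overlap test discussed here, for illustration only. The segment start (top 1/16th of the 48-bit space) and the helper name are assumptions; the actual GEN9_TRTT_SEGMENT_START value is not shown in this thread. The point is that a start-only check would miss a 4KB table that straddles the lower boundary of the segment if the address were not page aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration: a 1<<44 byte segment placed at the
 * top of a 48-bit address space, and a single 4KB page for the L3 table. */
#define TRTT_SEGMENT_SIZE   (1ULL << 44)
#define TRTT_SEGMENT_START  ((1ULL << 48) - TRTT_SEGMENT_SIZE)
#define TRTT_L3_TABLE_SIZE  4096ULL

/* Hypothetical helper: returns true if [start, start + size) must be
 * rejected, i.e. it is empty, wraps around, or intersects the segment. */
static bool range_conflicts_with_trtt(uint64_t start, uint64_t size)
{
	uint64_t seg_end = TRTT_SEGMENT_START + TRTT_SEGMENT_SIZE;

	/* Reject empty or wrapping ranges outright. */
	if (size == 0 || start > UINT64_MAX - size)
		return true;

	/* Standard half-open interval intersection test. */
	return start < seg_end && start + size > TRTT_SEGMENT_START;
}
```

With this form the reader does not need to reason about alignment to see that the whole table lies outside the segment.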

> >>+		return -EINVAL;
> >>+
> >>+	if (ctx->trtt_info.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK)
> >>+		return -EINVAL;
> >
> >These are worth adding DRM_DEBUG() or even better start using dev_debug()
> >so that we can debug userspace startup issues.
> >
> Fine, I think DRM_DEBUG_DRIVER will be more appropriate compared to
> DRM_DEBUG.

No, these are userspace errors for which we use DRM_DEBUG.
DRM_DEBUG_DRIVER is for a driver error :)

> Or
> 	dev_dbg(dev->dev, "invalid l3 table address\n");

Include as much info as you can (without giving away kernel internals),
since the user gave us the l3_address that we reject, report it. That
makes it easier to spot if it is the same address as they expected.

dev_dbg() would be my preference.

#define i915_dbg(DEV, args...) dev_dbg(__I915__(DEV)->dev->dev, ##args)
(not the prettiest yet, the pointer dancing is in the wrong direction!)

and let's get the ball rolling.
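
As a rough userspace illustration of "report the address we reject", the sketch below validates an L3 table address and prints the offending value. It is not kernel code: fprintf() stands in for dev_dbg(), and the 4KB alignment rule is an assumption standing in for GEN9_TRTT_L3_GFXADDR_MASK, whose value is not shown in this thread.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed for illustration: the L3 table must start on a 4KB boundary. */
#define L3_TABLE_ALIGN 4096ULL

/* Hypothetical helper: returns 0 if addr is acceptable, -EINVAL otherwise.
 * The rejected address is echoed back so that userspace can compare it
 * with the value it believes it passed in. */
static int check_l3_table_address(uint64_t addr)
{
	if (addr & (L3_TABLE_ALIGN - 1)) {
		fprintf(stderr,
			"invalid l3 table address 0x%016" PRIx64 ": not 4KB aligned\n",
			addr);
		return -EINVAL;
	}

	return 0;
}
```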

> >>  int i915_ppgtt_init_hw(struct drm_device *dev)
> >>  {
> >>+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
> >>+		struct drm_i915_private *dev_priv = dev->dev_private;
> >>+
> >>+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
> >>+			   GEN9_TRTT_BYPASS_DISABLE);
> >
> >Shouldn't this be a context specific register? In which case you need to
> >set it in the context image instead.
> >
> >Hmm. given you already do the context image tweaks, how does work with
> >non-trtt contexts?
> >
> 
> GEN9_TR_CHICKEN_BIT_VECTOR is not a context specific register.
> It globally enables TR-TT support in Hw. Still TR-TT enabling on per
> context basis is required.
> Non-trtt contexts are not affected by this setting.

Please add that as a comment here. What are the downsides, potential
regressions? It's behind a chicken bit after all...

> Ok so need to define a new wrapper function,
> 	i915_vm_reserve_node(vm, START, SIZE, &vma).
> 
> After looking at the other callsites of drm_mm_reserve_node,
> including i915_vgpu, I think it would be better to have the
> prototype as,
>    i915_vm_reserve_node(vm, &node);
> 
> However this should be done as a separate patch ?

Yes, I was just recognising the code duplication and found 3 places
where we could use it - 3 being the magic number to refactor.
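
For illustration, the "reserve, evict on conflict, then retry" pattern that such a wrapper would factor out can be sketched against a toy allocator. Everything below (toy_vm, toy_reserve, toy_evict_range, toy_vm_reserve_node) is hypothetical and only mimics the shape of the drm_mm_reserve_node() / i915_gem_evict_for_vma() sequence in the patch; it is not the i915 implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_NODES 8

struct toy_node {
	uint64_t start, size;
	bool pinned, used;
};

struct toy_vm {
	struct toy_node nodes[TOY_MAX_NODES];
};

/* Half-open interval overlap; address wrap-around is ignored in this toy. */
static bool toy_overlap(const struct toy_node *n, uint64_t start, uint64_t size)
{
	return n->used && n->start < start + size && start < n->start + n->size;
}

/* Reserve an exact range; fails with -ENOSPC if anything is in the way. */
static int toy_reserve(struct toy_vm *vm, uint64_t start, uint64_t size,
		       bool pinned)
{
	struct toy_node *slot = NULL;

	for (size_t i = 0; i < TOY_MAX_NODES; i++) {
		if (toy_overlap(&vm->nodes[i], start, size))
			return -ENOSPC;
		if (!vm->nodes[i].used && !slot)
			slot = &vm->nodes[i];
	}
	if (!slot)
		return -ENOMEM;

	*slot = (struct toy_node){ .start = start, .size = size,
				   .pinned = pinned, .used = true };
	return 0;
}

/* Drop every unpinned node in the range; refuse if a pinned one overlaps. */
static int toy_evict_range(struct toy_vm *vm, uint64_t start, uint64_t size)
{
	for (size_t i = 0; i < TOY_MAX_NODES; i++)
		if (toy_overlap(&vm->nodes[i], start, size) && vm->nodes[i].pinned)
			return -EBUSY;

	for (size_t i = 0; i < TOY_MAX_NODES; i++)
		if (toy_overlap(&vm->nodes[i], start, size))
			vm->nodes[i].used = false;
	return 0;
}

/* The wrapper: try to reserve, evict on conflict, then retry once. */
static int toy_vm_reserve_node(struct toy_vm *vm, uint64_t start, uint64_t size)
{
	int ret = toy_reserve(vm, start, size, true);

	if (ret == -ENOSPC) {
		ret = toy_evict_range(vm, start, size);
		if (ret == 0)
			ret = toy_reserve(vm, start, size, true);
	}
	return ret;
}
```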
 
> >>+struct drm_i915_gem_context_trtt_param {
> >>+	__u64 l3_table_address;
> >>+	__u32 invd_tile_val;
> >>+	__u32 null_tile_val;
> >>+};
> >
> >Passes the ABI structure sanity checks.
> 
> Should we allow User to also choose the location of TR-TT segment
> (size is anyways fixed as 1<<44).

The kernel is much more agnostic with your approach than I anticipated,
so from our pov, allowing the user to shoot themselves in the foot is
ok.

There is only one sensible location, but that one location may be
sensible for a few things.

i.e. it shouldn't be below 1<<40 so that you can do full aliasing
between CPU and GPU addresses, and you want to avoid cutting your
address space in two, so it has to go at the ends, ergo it should be at
the very top.
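
To make the arithmetic concrete, here is a small sketch of where the sixteen possible 1<<44 chunks of a 48-bit space start, and which one is "the very top". The macro and function names are made up for illustration; only the 48-bit space and the 1<<44 segment size come from this thread.

```c
#include <stdint.h>

#define PPGTT_ADDRESS_BITS 48
#define TRTT_SEGMENT_BITS  44

/* Start of the n-th 1/16th chunk (n in 0..15) of the 48-bit space. */
static uint64_t trtt_chunk_start(unsigned int n)
{
	return (uint64_t)(n & 0xf) << TRTT_SEGMENT_BITS;
}

/* The topmost chunk: it ends exactly at 1<<48, so the rest of the
 * address space below it stays contiguous. */
static uint64_t trtt_top_chunk_start(void)
{
	unsigned int last = (1u << (PPGTT_ADDRESS_BITS - TRTT_SEGMENT_BITS)) - 1;

	return trtt_chunk_start(last);
}
```

The top chunk starts well above 1<<40, so it satisfies both constraints mentioned above.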
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ success: Fi.CI.BAT
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
  2016-01-10 17:39 ` Chris Wilson
@ 2016-01-11 11:19 ` Patchwork
  2016-01-22 15:44 ` ✗ Fi.CI.BAT: warning for drm/i915: Support to enable TRTT on GEN9 (rev2) Patchwork
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-11 11:19 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Summary ==

Built on ff88655b3a5467bbc3be8c67d3e05ebf182557d3 drm-intel-nightly: 2016y-01m-11d-07h-30m-16s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                pass       -> DMESG-WARN (skl-i5k-2) UNSTABLE
                dmesg-warn -> PASS       (bdw-ultra)
                dmesg-warn -> PASS       (skl-i7k-2) UNSTABLE
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b:
                dmesg-warn -> PASS       (byt-nuc)

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:114  dwarn:3   dfail:0   fail:0   skip:24 
byt-nuc          total:141  pass:119  dwarn:7   dfail:0   fail:0   skip:15 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:100  dwarn:4   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1121/


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] drm/i915: Support to enable TRTT on GEN9
  2016-01-11  8:49     ` Chris Wilson
@ 2016-01-11 12:29       ` Goel, Akash
  2016-01-22 15:34         ` [PATCH v2] " akash.goel
  0 siblings, 1 reply; 59+ messages in thread
From: Goel, Akash @ 2016-01-11 12:29 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: akash.goel



On 1/11/2016 2:19 PM, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 01:09:50PM +0530, Goel, Akash wrote:
>>
>>
>> On 1/10/2016 11:09 PM, Chris Wilson wrote:
>>> On Sat, Jan 09, 2016 at 05:00:21PM +0530, akash.goel@intel.com wrote:
>>>> From: Akash Goel <akash.goel@intel.com>
>>>>
>>>> Gen9 has an additional address translation hardware support in form of
>>>> Tiled Resource Translation Table (TR-TT) which provides an extra level
>>>> of abstraction over PPGTT.
>>>> This is useful for mapping Sparse/Tiled texture resources.
>>>> Sparse resources are created as virtual-only allocations. Regions of the
>>>> resource that the application intends to use is bound to the physical memory
>>>> on the fly and can be re-bound to different memory allocations over the
>>>> lifetime of the resource.
>>>>
>>>> TR-TT is tightly coupled with PPGTT, a new instance of TR-TT will be required
>>>> for a new PPGTT instance, but TR-TT may not enabled for every context.
>>>> 1/16th of the 48bit PPGTT space is earmarked for the translation by TR-TT,
>>>> which such chunk to use is conveyed to HW through a register.
>>>> Any GFX address, which lies in that reserved 44 bit range will be translated
>>>> through TR-TT first and then through PPGTT to get the actual physical address,
>>>> so the output of translation from TR-TT will be a PPGTT offset.
>>>>
>>>> TRTT is constructed as a 3 level tile Table. Each tile is 64KB is size which
>>>> leaves behind 44-16=28 address bits. 28bits are partitioned as 9+9+10, and
>>>> each level is contained within a 4KB page hence L3 and L2 is composed of
>>>> 512 64b entries and L1 is composed of 1024 32b entries.
>>>>
>>>> There is a provision to keep TR-TT Tables in virtual space, where the pages of
>>>> TRTT tables will be mapped to PPGTT.
>>>> Currently this is the supported mode, in this mode UMD will have a full control
>>>> on TR-TT management, with bare minimum support from KMD.
>>>> So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
>>>> similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
>>>> The entries of L1 table will contain the PPGTT offset of BOs actually backing
>>>> the Sparse resources.
>>>
>>>> The assumption here is that UMD only will do the complete PPGTT address space
>>>> management and use the Soft Pin API for all the buffer objects associated with
>>>> a given Context.
>>>
>>> That is a poor assumption, and not one required for this to work.
>>>
>> This is not a strict requirement.
>> But I thought that conflicts will be minimized if UMD itself can do
>> the full address space management.
>> At least UMD has to ensure that PPGTT offset of L3 table remains
>> same throughout.
>
> Yes, userspace must control that object, and that would require softpin
> to preserve it across execbuffer calls. The kernel does not require that
> all addresses be handled in userspace afterwards, that's the language I
> wish to avoid. (Hence I don't like using "assumption" as that just
> invites userspace to break the kernel.)
>
Fine, I will remove the word "assumption". Instead, can I put it as "UMD may 
do the complete PPGTT address space management, on the pretext that it 
could help to minimize the conflicts"?

>>>> So UMD will also have to allocate the L3/L2/L1 table pages
>>>> as a regular GEM BO only & assign them a PPGTT address through the Soft Pin API.
>>>> UMD would have to emit the MI_STORE_DATA_IMM commands in the batch buffer to
>>>> program the relevant entries of L3/L2/L1 tables.
>>>
>>> This only applies to the TR-TT L1-L3 cache, right?
>>>
>> Yes applies only to the TR-TT L1-L3 tables.
>> The backing pages of L3/L2/L1 tables shall be allocated as a BO,
>> which should be assigned a PPGTT address.
>> The table entries could be written directly also by UMD by mmapping
>> the table BOs, but adding MI_STORE_DATA_IMM commands in the batch
>> buffer itself would help to achieve serialization (implicitly).
>
> Can you tighten up the phrasing here? My first read was that you intend
> for all PTE writes to be in userspace, which is scary.
>
> "UMD will then allocate the L3/L2/L1 page tables for TR-TT as a regular
> bo, and will use softpin to assign it to the l3_table_address when used.
> UMD will also maintain the entries in the TR-TT page tables using
> regular batch commands (MI_STORE_DATA_IMM), or via mmapping of the page
> table bo."
>

Yes, UMD will have to use softpin to assign l3_table_address to L3 table BO.
Similarly the softpin will be needed for L2/L1 table BOs.
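
For reference, the 9+9+10 split from the commit message implies the following index computation when UMD decides which L3/L2/L1 entry to program for a given offset inside the 44-bit TR-TT segment. This is only a sketch: the struct and function names are hypothetical, and the bit ordering (L3 index in the topmost bits, L1 index just above the 64KB tile offset) is an assumption inferred from the table hierarchy.

```c
#include <stdint.h>

/* Hypothetical result type: one index per table level plus the byte
 * offset inside the 64KB tile. */
struct trtt_indices {
	unsigned int l3;          /* 9 bits: 512 entries of 64 bits  */
	unsigned int l2;          /* 9 bits: 512 entries of 64 bits  */
	unsigned int l1;          /* 10 bits: 1024 entries of 32 bits */
	unsigned int tile_offset; /* 16 bits: offset within the tile  */
};

/* Split an offset within the 44-bit segment as 9 + 9 + 10 + 16 bits. */
static struct trtt_indices trtt_split(uint64_t segment_offset)
{
	struct trtt_indices idx;

	idx.tile_offset = segment_offset & 0xffff;
	idx.l1 = (segment_offset >> 16) & 0x3ff;
	idx.l2 = (segment_offset >> 26) & 0x1ff;
	idx.l3 = (segment_offset >> 35) & 0x1ff;

	return idx;
}
```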

>>>> autonomously and KMD will be oblivious of it.
>>>> The BOs must not be assigned an address from TR-TT segment, they will be mapped
>>>
>>> s/The BOs/Any object/
>>>
>> Ok will use 'Any object'
>>>> to PPGTT in a regular way by KMD
>>>
>>> s/using the Soft Pin offset provided by UMD// as this is irrelevant.
>>>
>> You mean to say that it is needless or inappropriate to state that
>> KMD will use the Soft PIN offset provided by UMD, it doesn't matter
>> that whether the Soft PIN offset is used or KMD itself assigns an
>> address.
>
> I just want to avoid implying that userspace must use softpin on every
> single bo for this to work. (Mainly because I don't really want
> userspace to have to do full address space management, as we will always
> have to do the double check inside the kernel. Unless there is a real
> need (e.g. svm), I'd rather improve the kernel allocator/verification, rather
> than try and circumvent it.)
>
>>>> @@ -172,6 +172,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>>>>   	case I915_PARAM_HAS_EXEC_SOFTPIN:
>>>>   		value = 1;
>>>>   		break;
>>>> +	case I915_PARAM_HAS_TRTT:
>>>> +		value = HAS_TRTT(dev);
>>>> +		break;
>>>
>>> Should we do this here, or just query the context? In fact you are
>>> missing the context getparam path any way.
>>>
>> Sorry, do you mean to say that with -ENODEV error also, on context
>> setparam, User can make out the TR-TT support, so no need to have an
>> explicit getparam case.
>>
>> Would the context getparam path be really useful for TR-TT?.
>> If its needed, then would be better to rename
>> I915_CONTEXT_PARAM_ENABLE_TRTT to I915_CONTEXT_PARAM_TRTT_INFO ?
>
> The question I have is do we want:
>
> GETPARAM + CONTEXT_SETPARAM
>
> or
>
> CONTEXT_GETPARAM + CONTEXT_SETPARAM
>
> the latter seems more symmetric and flexible, and we can use as a double
> check later on that we set the right address etc.
>
> Indeed, hindsight says ENABLE_TRTT is a bad name :)
>
> I915_CONTEXT_PARAM_TRTT (let's assume for now  this will be the last,
> any future PARAM_TRTT can think of a good name for its extension).
>

fine, will rename as I915_CONTEXT_PARAM_TRTT.
And use the CONTEXT_GETPARAM + CONTEXT_SETPARAM pair.

Would the following change be fine ?

@@ -974,6 +987,24 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
  		else
  			args->value = to_i915(dev)->gtt.base.total;
  		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+			ret = -ENODEV;
+		else if (args->size < sizeof(trtt_params))
+			ret = -EINVAL;
+		else {
+			trtt_params.l3_table_address =
+				ctx->trtt_info.l3_table_address;
+			trtt_params.null_tile_val =
+				ctx->trtt_info.null_tile_val;
+			trtt_params.invd_tile_val =
+				ctx->trtt_info.invd_tile_val;
+
+			if (__copy_to_user(to_user_ptr(args->value),
+					   &trtt_params,
+					   sizeof(trtt_params)))
+				ret = -EFAULT;
+		}
+		break;
  	default:
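
The symmetry of the CONTEXT_SETPARAM + CONTEXT_GETPARAM pair can be illustrated with a small userspace mock. This is only a sketch under stated assumptions: mock_context, mock_set_trtt and mock_get_trtt are hypothetical stand-ins, not i915 code, and the error codes simply follow the ones discussed in this thread (-EEXIST when TR-TT is already enabled, -EINVAL when the supplied size is too small).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same layout as the proposed drm_i915_gem_context_trtt_param. */
struct trtt_param {
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

/* Hypothetical stand-in for the per-context TR-TT state. */
struct mock_context {
	bool trtt_enabled;
	struct trtt_param trtt;
};

/* Mock of the setparam path: TR-TT can be set up only once per context. */
static int mock_set_trtt(struct mock_context *ctx,
			 const struct trtt_param *p, uint32_t size)
{
	if (size < sizeof(*p))
		return -EINVAL;
	if (ctx->trtt_enabled)
		return -EEXIST;

	ctx->trtt = *p;
	ctx->trtt_enabled = true;
	return 0;
}

/* Mock of the getparam path: hands back whatever was stored, so the
 * caller can double check the address and tile values it set. */
static int mock_get_trtt(const struct mock_context *ctx,
			 struct trtt_param *out, uint32_t size)
{
	if (size < sizeof(*out))
		return -EINVAL;

	*out = ctx->trtt;
	return 0;
}
```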

>>>>   	default:
>>>>   		DRM_DEBUG("Unknown parameter %d\n", param->param);
>>>>   		return -EINVAL;
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>> index c6dd4db..12c612e 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -839,6 +839,7 @@ struct i915_ctx_hang_stats {
>>>>   #define DEFAULT_CONTEXT_HANDLE 0
>>>>
>>>>   #define CONTEXT_NO_ZEROMAP (1<<0)
>>>> +#define CONTEXT_USE_TRTT   (1<<1)
>>>
>>> Make flags unsigned whilst you are here, and fix the holes!
>>>
>>
>> Ok will change the type of 'flags' field inside 'intel_context' to unsigned.
>> Sorry, but apart from this anything else required here ?
>
> No, it's just belated anger about the silly "int flags".
>

Fine, I will change the type as part of this patch itself.

>>
>>>>   /**
>>>>    * struct intel_context - as the name implies, represents a context.
>>>>    * @ref: reference count.
>>>> @@ -881,6 +882,15 @@ struct intel_context {
>>>>   		int pin_count;
>>>>   	} engine[I915_NUM_RINGS];
>>>>
>>>> +	/* TRTT info */
>>>> +	struct {
>>>
>>> Give this a name now, we will be thankful in the future.
>>>
>> Would ctx_trtt_params be fine ?
>
> struct intel_context_trtt
>
> (Avoid using params for internals, let's keep those for uAPI - that
> helps us distinguish pieces of code / context.)
>
Thanks, intel_context_trtt is more appropriate.

>>>>
>>>>   void i915_gem_context_free(struct kref *ctx_ref)
>>>> @@ -512,6 +515,35 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
>>>>   	return ctx;
>>>>   }
>>>>
>>>> +static int
>>>> +i915_setup_trtt_ctx(struct intel_context *ctx,
>>>> +		    struct drm_i915_gem_context_trtt_param *trtt_params)
>>>> +{
>>>> +	if (ctx->flags & CONTEXT_USE_TRTT)
>>>> +		return -EEXIST;
>>>> +
>>>> +	/* basic sanity checks for the l3 table pointer */
>>>> +	if ((ctx->trtt_info.l3_table_address >= GEN9_TRTT_SEGMENT_START) &&
>>>> +	    (ctx->trtt_info.l3_table_address <
>>>> +			(GEN9_TRTT_SEGMENT_START + GEN9_TRTT_SEGMENT_SIZE)))
>>>
>>> Presumably l3_table has an actual size and you want to do a range
>>> overlap test, not just the start address.
>>>
>> Yes intend to do a range overlap test only. But since L3 table size
>> is fixed as 4KB, thought there is no real need to also include the
>> size in the range check, considering the allocations are always in
>> multiple of 4KB.
>
> Ok. You have a choice of writing that up as a comment, or just doing
> the page overlap test :) Honestly, I would just go for the range test
> since this is a one-off init path and the reader then doesn't even have
> to think.
>

Fine, will do the page overlap test.
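
For illustration only (not part of the patch), the page overlap test discussed
above could look roughly like the sketch below. It assumes half-open ranges, the
fixed 4KB L3 table size mentioned earlier, and addresses that fit in 48 bits;
the helper and macro names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values follow the discussion in this thread. */
#define TRTT_SEGMENT_SIZE (1ULL << 44)  /* fixed TR-TT segment size */
#define L3_TABLE_SIZE     4096ULL       /* L3 table occupies one 4KB page */

/*
 * Half-open interval overlap: [a, a + a_len) and [b, b + b_len) intersect
 * if and only if each range starts before the other one ends.
 * Assumes the sums do not wrap, which holds for 48-bit addresses.
 */
static bool ranges_overlap(uint64_t a, uint64_t a_len,
			   uint64_t b, uint64_t b_len)
{
	assert(a_len > 0 && b_len > 0);
	return a < b + b_len && b < a + a_len;
}

/* True if the 4KB L3 table page intersects the TR-TT segment. */
static bool l3_table_conflicts_with_segment(uint64_t l3_addr,
					    uint64_t seg_base)
{
	return ranges_overlap(l3_addr, L3_TABLE_SIZE,
			      seg_base, TRTT_SEGMENT_SIZE);
}
```

With strict comparisons on both sides, a table page that ends exactly where the
segment begins is not reported as a conflict.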

>>>> +		return -EINVAL;
>>>> +
>>>> +	if (ctx->trtt_info.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK)
>>>> +		return -EINVAL;
>>>
>>> These are worth adding DRM_DEBUG() or even better start using dev_debug()
>>> so that we can debug userspace startup issues.
>>>
>> Fine, I think DRM_DEBUG_DRIVER will be more appropriate compared to
>> DRM_DEBUG.
>
> No, these are userspace errors for which we use DRM_DEBUG.
> DRM_DEBUG_DRIVER is for a driver error :)
>
>> Or
>> 	dev_dbg(dev->dev, "invalid l3 table address\n");
>
> Include as much info as you can (without giving away kernel internals),
> since the user gave us the l3_address that we reject, report it. That
> makes it easier to spot if it is the same address as they expected.
>
> dev_dbg() would be my preference.
>
> #define i915_dbg(DEV, args...) dev_dbg(__I915__(DEV)->dev->dev, ##args)
> (not the prettiest yet, the pointer dancing is in the wrong direction!)
>
> and let's get the ball rolling.
>

This will also go in as a separate patch.
One question here: by using dev_dbg(), do we intend to avoid the dependency on
the value of the drm.debug parameter and always get certain error messages?
Sorry, I just want to understand the rationale behind it.

>>>>   int i915_ppgtt_init_hw(struct drm_device *dev)
>>>>   {
>>>> +	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
>>>> +		struct drm_i915_private *dev_priv = dev->dev_private;
>>>> +
>>>> +		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
>>>> +			   GEN9_TRTT_BYPASS_DISABLE);
>>>
>>> Shouldn't this be a context specific register? In which case you need to
>>> set it in the context image instead.
>>>
>>> Hmm. Given you already do the context image tweaks, how does this work with
>>> non-trtt contexts?
>>>
>>
>> GEN9_TR_CHICKEN_BIT_VECTOR is not a context specific register.
>> It globally enables TR-TT support in Hw. Still TR-TT enabling on per
>> context basis is required.
>> Non-trtt contexts are not affected by this setting.
>
> Please add that as a comment here. What are the downsides, potential
> regressions? It's behind a chicken bit after all...
>
Fine, will add a comment here for clarity.
I am not aware of any downsides; this setting should come into the picture only
when TR-TT is enabled for a context.

>> Ok so need to define a new wrapper function,
>> 	i915_vm_reserve_node(vm, START, SIZE, &vma).
>>
>> After looking at the other callsites of drm_mm_reserve_node,
>> including i915_vgpu, I think it would be better to have the
>> prototype as,
>>     i915_vm_reserve_node(vm, &node);
>>
>> However this should be done as a separate patch ?
>
> Yes, I was just recognising the code duplication and found 3 places
> where we could use it - 3 being the magic number to refactor.
>
>>>> +struct drm_i915_gem_context_trtt_param {
>>>> +	__u64 l3_table_address;
>>>> +	__u32 invd_tile_val;
>>>> +	__u32 null_tile_val;
>>>> +};
>>>
>>> Passes the ABI structure sanity checks.
>>
>> Should we allow User to also choose the location of TR-TT segment
>> (size is anyways fixed as 1<<44).
>
> The kernel is much more agnostic with your approach than I anticipated,
> so from our pov, allowing the user to shoot themselves in the foot is
> ok.
>
> There is only one sensible location, but that one location may be
> sensible for a few things.
>
> i.e. it shouldn't be below 1<<40 so that you can do full aliasing
> between CPU and GPU addresses, and you want to avoid cutting your
> address space in two, so it has to go at the ends, ergo it should be at
> the very top.

So the topmost region is the most suitable location, and hence it should
always be used.

Best regards
Akash

> -Chris
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2] drm/i915: Support to enable TRTT on GEN9
  2016-01-22 15:34         ` [PATCH v2] " akash.goel
@ 2016-01-22 15:33           ` kbuild test robot
  2016-03-03  4:54           ` [PATCH v3] " akash.goel
  1 sibling, 0 replies; 59+ messages in thread
From: kbuild test robot @ 2016-01-22 15:33 UTC (permalink / raw)
  Cc: Akash Goel, intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1554 bytes --]

Hi Akash,

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20160122]
[cannot apply to v4.4]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/akash-goel-intel-com/drm-i915-Support-to-enable-TRTT-on-GEN9/20160122-232410
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-x019-01201142 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_gem_context.c: In function 'intel_context_set_trtt':
>> drivers/gpu/drm/i915/i915_gem_context.c:579:3: error: implicit declaration of function 'i915_dbg' [-Werror=implicit-function-declaration]
      i915_dbg(dev, "segment base address not correctly aligned\n");
      ^
   cc1: some warnings being treated as errors

vim +/i915_dbg +579 drivers/gpu/drm/i915/i915_gem_context.c

   573					to_user_ptr(args->value),
   574					sizeof(trtt_params)))
   575			return -EFAULT;
   576	
   577		/* basic sanity checks for the segment location & l3 table pointer */
   578		if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
 > 579			i915_dbg(dev, "segment base address not correctly aligned\n");
   580			return -EINVAL;
   581		}
   582	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 22319 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v2] drm/i915: Support to enable TRTT on GEN9
  2016-01-11 12:29       ` Goel, Akash
@ 2016-01-22 15:34         ` akash.goel
  2016-01-22 15:33           ` kbuild test robot
  2016-03-03  4:54           ` [PATCH v3] " akash.goel
  0 siblings, 2 replies; 59+ messages in thread
From: akash.goel @ 2016-01-22 15:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Gen9 has additional address translation hardware support in the form of
a Tiled Resource Translation Table (TR-TT), which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.
Sparse resources are created as virtual-only allocations. Regions of the
resource that the application intends to use are bound to physical memory
on the fly and can be re-bound to different memory allocations over the
lifetime of the resource.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT is required
for each new PPGTT instance, but TR-TT may not be enabled for every context.
1/16th of the 48-bit PPGTT space is earmarked for translation by TR-TT;
which such chunk to use is conveyed to HW through a register.
Any GFX address that lies in that reserved 44-bit range will be translated
through TR-TT first and then through PPGTT to get the actual physical address,
so the output of translation from TR-TT will be a PPGTT offset.
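
As a rough illustration of the 1/16th carve-out (not code from this patch), the
sketch below assumes the chunk is identified by the top 4 bits of a 48-bit
address, which matches the segment_base_addr >> 44 computation used when
programming the register. All function and macro names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PPGTT_ADDR_BITS 48
#define TRTT_SEG_SHIFT  44	/* segment size is 1 << 44 */

/* Index (0..15) of the 1/16th chunk that a 48-bit GFX address falls in. */
static unsigned int trtt_chunk_index(uint64_t gfx_addr)
{
	assert(gfx_addr < (1ULL << PPGTT_ADDR_BITS));
	return (unsigned int)(gfx_addr >> TRTT_SEG_SHIFT) & 0xF;
}

/* True if gfx_addr lies in the chunk starting at seg_base. */
static bool addr_in_trtt_segment(uint64_t gfx_addr, uint64_t seg_base)
{
	return trtt_chunk_index(gfx_addr) == trtt_chunk_index(seg_base);
}

/* Base of the topmost chunk: 2^48 - 2^44. */
static uint64_t trtt_topmost_segment_base(void)
{
	return (1ULL << PPGTT_ADDR_BITS) - (1ULL << TRTT_SEG_SHIFT);
}
```

Only addresses for which addr_in_trtt_segment() holds would go through the
extra TR-TT translation step; everything else is translated by PPGTT alone.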

TR-TT is constructed as a 3-level tile table. Each tile is 64KB in size, which
leaves 44-16=28 address bits. These 28 bits are partitioned as 9+9+10, and
each level is contained within a 4KB page; hence L3 and L2 are each composed
of 512 64-bit entries and L1 is composed of 1024 32-bit entries.
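
The 9+9+10 split can be sketched as below. This is only an illustration: it
assumes L3 takes the most significant 9 bits of the 28-bit tile number, L2 the
next 9 and L1 the lowest 10, which is inferred from the entry counts above and
not confirmed against the hardware documentation. The struct and function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TRTT_TILE_SHIFT 16	/* 64KB tiles */
#define TRTT_L1_BITS    10	/* 1024 entries of 32 bits each */
#define TRTT_L2_BITS    9	/* 512 entries of 64 bits each */
#define TRTT_L3_BITS    9	/* 512 entries of 64 bits each */

/* Each table level fits exactly in one 4KB page. */
_Static_assert((1u << TRTT_L1_BITS) * 4 == 4096, "L1 table is one page");
_Static_assert((1u << TRTT_L2_BITS) * 8 == 4096, "L2 table is one page");
_Static_assert((1u << TRTT_L3_BITS) * 8 == 4096, "L3 table is one page");

struct trtt_indices {
	unsigned int l3, l2, l1;
};

/* Split a 44-bit offset within the TR-TT segment into table indices. */
static struct trtt_indices trtt_split(uint64_t seg_offset)
{
	struct trtt_indices idx;
	uint64_t tile;

	assert(seg_offset < (1ULL << 44));
	tile = seg_offset >> TRTT_TILE_SHIFT;	/* 28-bit tile number */

	idx.l1 = (unsigned int)(tile & ((1u << TRTT_L1_BITS) - 1));
	idx.l2 = (unsigned int)((tile >> TRTT_L1_BITS) &
				((1u << TRTT_L2_BITS) - 1));
	idx.l3 = (unsigned int)((tile >> (TRTT_L1_BITS + TRTT_L2_BITS)) &
				((1u << TRTT_L3_BITS) - 1));
	return idx;
}
```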

There is a provision to keep the TR-TT tables in virtual space, where the pages
of the TR-TT tables are mapped to PPGTT.
Currently this is the supported mode; in this mode UMD has full control over
TR-TT management, with bare minimum support from KMD.
So the entries of the L3 table contain the PPGTT offsets of L2 table pages, and
similarly the entries of an L2 table contain the PPGTT offsets of L1 table pages.
The entries of an L1 table contain the PPGTT offsets of the BOs actually backing
the Sparse resources.
UMD has to allocate the L3/L2/L1 table pages as regular BOs and assign them
a PPGTT address through the Soft Pin API (for example, use soft pin
to assign l3_table_address to the L3 table BO, when used).
UMD also programs the entries in the TR-TT page tables, either using regular
batch commands (MI_STORE_DATA_IMM) or by mmapping the page table BOs.
UMD may do the complete PPGTT address space management, on the grounds that it
could help minimize conflicts.

Any space in the TR-TT segment not bound to a Sparse texture is handled
through the Invalid tile; the user is expected to initialize the entries of a
new L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding
to the holes in the Sparse texture resource are set to the Null tile pattern.
Improper programming of TR-TT should only lead to a recoverable GPU hang,
eventually leading to banning of the culprit context without victimizing others.

The association of any Sparse resource with the BOs is known only to UMD,
and only the Sparse resources shall be assigned an offset from the TR-TT segment
by UMD. The use of the TR-TT segment and the mapping of Sparse resources are
transparent to KMD; UMD does the address assignment from the TR-TT segment
autonomously and KMD is oblivious to it.
Any other object must not be assigned an address from the TR-TT segment; such
objects are mapped to PPGTT in the regular way by KMD.

This patch provides an interface through which UMD can ask KMD to enable
TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param has been
added to the I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
UMD has to pass the GFX address of the L3 table page and the start location of
the TR-TT segment, along with the pattern values for the Null & Invalid tile
registers.
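
For reference, the layout of the parameter block can be checked with a small
standalone sketch such as the one below. The struct is a local mirror of the
one proposed in this patch, using fixed-width types in place of __u64/__u32;
the mirror's name and the helper function are hypothetical. It only illustrates
that the fields pack into 24 bytes without holes, and does not call the ioctl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the proposed uAPI struct (name is hypothetical). */
struct trtt_param_mirror {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

/* Two 64-bit fields followed by two 32-bit fields: 24 bytes, no padding. */
_Static_assert(sizeof(struct trtt_param_mirror) == 24, "unexpected size");
_Static_assert(offsetof(struct trtt_param_mirror, l3_table_address) == 8,
	       "unexpected l3_table_address offset");
_Static_assert(offsetof(struct trtt_param_mirror, invd_tile_val) == 16,
	       "unexpected invd_tile_val offset");
_Static_assert(offsetof(struct trtt_param_mirror, null_tile_val) == 20,
	       "unexpected null_tile_val offset");

/* Fill the block that UMD would pass by pointer in the context param. */
static struct trtt_param_mirror
trtt_param_make(uint64_t seg_base, uint64_t l3_addr,
		uint32_t invd_val, uint32_t null_val)
{
	struct trtt_param_mirror p = {
		.segment_base_addr = seg_base,
		.l3_table_address = l3_addr,
		.invd_tile_val = invd_val,
		.null_tile_val = null_val,
	};
	return p;
}
```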

v2:
 - Support context_getparam for TRTT also and dispense with a separate
   GETPARAM case for TRTT (Chris).
 - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
   from user space (Chris).
 - Move all the argument checking for TRTT in context_setparam to the
   set_trtt function (Chris).
 - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
 - Rename certain functions to rightly reflect their purpose, rename
   the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
   rephrase a few lines in the commit message body, add more comments (Chris).
 - Extend ABI to allow the user to specify the TRTT segment location also.
 - Fix for selective enabling of TRTT on per context basis, explicitly
   disable TR-TT at the start of a new context.

Testcase: igt/gem_trtt

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  17 +++++-
 drivers/gpu/drm/i915/i915_gem_context.c |  98 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c     |  62 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 +++
 drivers/gpu/drm/i915/i915_reg.h         |  19 ++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 103 +++++++++++++++++++++++++++++++-
 include/uapi/drm/i915_drm.h             |   8 +++
 7 files changed, 311 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cef82c5..749513f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -843,6 +843,7 @@ struct i915_ctx_hang_stats {
 #define DEFAULT_CONTEXT_HANDLE 0
 
 #define CONTEXT_NO_ZEROMAP (1<<0)
+#define CONTEXT_USE_TRTT   (1<<1)
 /**
  * struct intel_context - as the name implies, represents a context.
  * @ref: reference count.
@@ -867,7 +868,7 @@ struct intel_context {
 	int user_handle;
 	uint8_t remap_slice;
 	struct drm_i915_private *i915;
-	int flags;
+	unsigned int flags;
 	struct drm_i915_file_private *file_priv;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_hw_ppgtt *ppgtt;
@@ -885,6 +886,18 @@ struct intel_context {
 		int pin_count;
 	} engine[I915_NUM_RINGS];
 
+	/* TRTT info (redirection tables for userspace,
+	 * for sparse resource management)
+	 */
+	struct intel_context_trtt {
+		uint32_t invd_tile_val;
+		uint32_t null_tile_val;
+		uint64_t l3_table_address;
+		uint64_t segment_base_addr;
+		struct i915_vma *vma;
+		bool update_trtt_params;
+	} trtt_info;
+
 	struct list_head link;
 };
 
@@ -2638,6 +2651,8 @@ struct drm_i915_cmd_table {
 				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
 				 !IS_BROXTON(dev))
 
+#define HAS_TRTT(dev)		(IS_GEN9(dev))
+
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c25083c..89c7de5 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev)
 	return ret;
 }
 
+static void intel_context_free_trtt(struct intel_context *ctx)
+{
+	if (ctx->trtt_info.vma == NULL)
+		return;
+
+	intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+}
+
 static void i915_gem_context_clean(struct intel_context *ctx)
 {
 	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
@@ -164,6 +172,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	 */
 	i915_gem_context_clean(ctx);
 
+	intel_context_free_trtt(ctx);
+
 	i915_ppgtt_put(ctx->ppgtt);
 
 	if (ctx->legacy_hw_ctx.rcs_state)
@@ -516,6 +526,88 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static int
+intel_context_get_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (args->size < sizeof(trtt_params))
+		args->size = sizeof(trtt_params);
+	else {
+		trtt_params.segment_base_addr =
+			ctx->trtt_info.segment_base_addr;
+		trtt_params.l3_table_address =
+			ctx->trtt_info.l3_table_address;
+		trtt_params.null_tile_val =
+			ctx->trtt_info.null_tile_val;
+		trtt_params.invd_tile_val =
+			ctx->trtt_info.invd_tile_val;
+
+		if (__copy_to_user(to_user_ptr(args->value),
+				   &trtt_params,
+				   sizeof(trtt_params)))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int
+intel_context_set_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (ctx->flags & CONTEXT_USE_TRTT)
+		return -EEXIST;
+	else if (args->size < sizeof(trtt_params))
+		return -EINVAL;
+	else if (copy_from_user(&trtt_params,
+				to_user_ptr(args->value),
+				sizeof(trtt_params)))
+		return -EFAULT;
+
+	/* basic sanity checks for the segment location & l3 table pointer */
+	if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
+		i915_dbg(dev, "segment base address not correctly aligned\n");
+		return -EINVAL;
+	}
+
+	if (((trtt_params.l3_table_address + PAGE_SIZE) >=
+	     trtt_params.segment_base_addr) &&
+	    (trtt_params.l3_table_address <
+		    (trtt_params.segment_base_addr + GEN9_TRTT_SEGMENT_SIZE))) {
+		i915_dbg(dev, "l3 table address conflicts with trtt segment\n");
+		return -EINVAL;
+	}
+
+	if (trtt_params.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) {
+		i915_dbg(dev, "invalid l3 table address\n");
+		return -EINVAL;
+	}
+
+	ctx->trtt_info.vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
+						trtt_params.segment_base_addr);
+	if (IS_ERR(ctx->trtt_info.vma))
+		return PTR_ERR(ctx->trtt_info.vma);
+
+	ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
+	ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
+	ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
+	ctx->trtt_info.segment_base_addr = trtt_params.segment_base_addr;
+	ctx->trtt_info.update_trtt_params = 1;
+
+	ctx->flags |= CONTEXT_USE_TRTT;
+	return 0;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
@@ -942,6 +1034,9 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		else
 			args->value = to_i915(dev)->gtt.base.total;
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_get_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -987,6 +1082,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
 		}
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_set_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 56f4f2e..a23a179 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2146,6 +2146,17 @@ int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 
 int i915_ppgtt_init_hw(struct drm_device *dev)
 {
+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		/*
+		 * Globally enable TR-TT support in Hw.
+		 * Still TR-TT enabling on per context basis is required.
+		 * Non-trtt contexts are not affected by this setting.
+		 */
+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+			   GEN9_TRTT_BYPASS_DISABLE);
+	}
+
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -3328,6 +3339,57 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
 
 }
 
+void intel_trtt_context_destroy_vma(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+
+	WARN_ON(!list_empty(&vma->vma_link));
+	WARN_ON(!list_empty(&vma->mm_list));
+	WARN_ON(!list_empty(&vma->exec_list));
+
+	drm_mm_remove_node(&vma->node);
+	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
+	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
+}
+
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr)
+{
+	struct i915_vma *vma;
+	int ret;
+
+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
+	if (vma == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->vma_link);
+	INIT_LIST_HEAD(&vma->mm_list);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+	/* Mark the vma as permanently pinned */
+	vma->pin_count = 1;
+
+	/* Reserve from the 48 bit PPGTT space */
+	vma->node.start = segment_base_addr;
+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	if (ret) {
+		ret = i915_gem_evict_for_vma(vma);
+		if (ret == 0)
+			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	}
+	if (ret) {
+		DRM_ERROR("Reservation for TRTT segment failed: %i\n", ret);
+		intel_trtt_context_destroy_vma(vma);
+		return ERR_PTR(ret);
+	}
+
+	return vma;
+}
+
 static struct scatterlist *
 rotate_pages(dma_addr_t *in, unsigned int offset,
 	     unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b448ad8..7c49965 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -129,6 +129,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+/* Fixed size segment */
+#define GEN9_TRTT_SEG_SIZE_SHIFT	44
+#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
+
 enum i915_ggtt_view_type {
 	I915_GGTT_VIEW_NORMAL = 0,
 	I915_GGTT_VIEW_ROTATED,
@@ -559,4 +563,8 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr);
+void intel_trtt_context_destroy_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0a98889..e7bc721 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE	(1<<0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE		0xF0
+#define   GEN9_TRVA_DATA_MASK		0xF
+
+#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1<<1)
+#define   GEN9_TRTT_ENABLE		(1<<0)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f7fac5f..3ae1361 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1578,6 +1578,70 @@ static int gen9_init_render_ring(struct intel_engine_cs *ring)
 	return init_workarounds_ring(ring);
 }
 
+static int gen9_init_context_trtt(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf, 0);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
+static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
+{
+	struct intel_context *ctx = req->ctx;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	unsigned long masked_l3_gfx_address =
+		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+	uint32_t trva_data_value =
+		(ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
+		GEN9_TRVA_DATA_MASK;
+	const int num_lri_cmds = 6;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
+	intel_logical_ring_emit(ringbuf, upper_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRVA_MASK_VALUE | trva_data_value);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
 {
 	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
@@ -1631,6 +1695,17 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
 		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
 	}
 
+	/*
+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
+	 * of directly updating the context image, as this will ensure that
+	 * update happens in a serialized manner for the context and also
+	 * lite-restore scenario will get handled.
+	 */
+	if ((req->ring->id == RCS) && req->ctx->trtt_info.update_trtt_params) {
+		gen9_emit_trtt_regs(req);
+		req->ctx->trtt_info.update_trtt_params = 0;
+	}
+
 	ret = intel_logical_ring_begin(req, 4);
 	if (ret)
 		return ret;
@@ -1910,6 +1985,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 	return intel_lr_context_render_state_init(req);
 }
 
+static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
+{
+	int ret;
+
+	/*
+	 * Explicitly disable TR-TT at the start of a new context.
+	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
+	 * context the TR-TT settings of the outgoing context could get
+	 * spilled on to the new incoming context as only the Ring Context
+	 * part is loaded on the first submission of a new context, due to
+	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
+	 */
+	ret = gen9_init_context_trtt(req);
+	if (ret)
+		return ret;
+
+	return gen8_init_rcs_context(req);
+}
+
 /**
  * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
  *
@@ -2006,11 +2100,14 @@ static int logical_render_ring_init(struct drm_device *dev)
 	if (HAS_L3_DPF(dev))
 		ring->irq_keep_mask |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
 
-	if (INTEL_INFO(dev)->gen >= 9)
+	if (INTEL_INFO(dev)->gen >= 9) {
 		ring->init_hw = gen9_init_render_ring;
-	else
+		ring->init_context = gen9_init_rcs_context;
+	} else {
 		ring->init_hw = gen8_init_render_ring;
-	ring->init_context = gen8_init_rcs_context;
+		ring->init_context = gen8_init_rcs_context;
+	}
+
 	ring->cleanup = intel_fini_pipe_control;
 	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
 		ring->get_seqno = bxt_a_get_seqno;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index acf2102..f6a2835 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1140,7 +1140,15 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
+#define I915_CONTEXT_PARAM_TRTT		0x4
 	__u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+	__u64 segment_base_addr;
+	__u64 l3_table_address;
+	__u32 invd_tile_val;
+	__u32 null_tile_val;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* ✗ Fi.CI.BAT: warning for drm/i915: Support to enable TRTT on GEN9 (rev2)
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
  2016-01-10 17:39 ` Chris Wilson
  2016-01-11 11:19 ` ✓ success: Fi.CI.BAT Patchwork
@ 2016-01-22 15:44 ` Patchwork
  2016-01-25 12:12 ` Patchwork
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-22 15:44 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Summary ==

Built on 8fe9e785ae04fa7c37f7935cff12d62e38054b60 drm-intel-nightly: 2016y-01m-21d-11h-02m-42s UTC integration manifest

Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                pass       -> DMESG-WARN (ilk-hp8440p)

bdw-nuci7        total:140  pass:131  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:143  pass:137  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:143  pass:119  dwarn:0   dfail:0   fail:0   skip:24 
byt-nuc          total:143  pass:128  dwarn:0   dfail:0   fail:0   skip:15 
hsw-brixbox      total:143  pass:136  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:143  pass:139  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:143  pass:103  dwarn:2   dfail:0   fail:0   skip:38 
ivb-t430s        total:143  pass:137  dwarn:0   dfail:0   fail:0   skip:6  
skl-i5k-2        total:143  pass:134  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:143  pass:129  dwarn:0   dfail:0   fail:0   skip:14 
snb-x220t        total:143  pass:129  dwarn:0   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1252/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✗ Fi.CI.BAT: warning for drm/i915: Support to enable TRTT on GEN9 (rev2)
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
                   ` (2 preceding siblings ...)
  2016-01-22 15:44 ` ✗ Fi.CI.BAT: warning for drm/i915: Support to enable TRTT on GEN9 (rev2) Patchwork
@ 2016-01-25 12:12 ` Patchwork
  2016-01-25 12:14 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-25 12:12 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Summary ==

Built on 8fe9e785ae04fa7c37f7935cff12d62e38054b60 drm-intel-nightly: 2016y-01m-21d-11h-02m-42s UTC integration manifest

Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                pass       -> DMESG-WARN (ilk-hp8440p)

bdw-nuci7        total:140  pass:131  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:143  pass:137  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:143  pass:119  dwarn:0   dfail:0   fail:0   skip:24 
byt-nuc          total:143  pass:128  dwarn:0   dfail:0   fail:0   skip:15 
hsw-brixbox      total:143  pass:136  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:143  pass:139  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:143  pass:103  dwarn:2   dfail:0   fail:0   skip:38 
ivb-t430s        total:143  pass:137  dwarn:0   dfail:0   fail:0   skip:6  
skl-i5k-2        total:143  pass:134  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:143  pass:129  dwarn:0   dfail:0   fail:0   skip:14 
snb-x220t        total:143  pass:129  dwarn:0   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1252/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Support to enable TRTT on GEN9 (rev2)
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
                   ` (3 preceding siblings ...)
  2016-01-25 12:12 ` Patchwork
@ 2016-01-25 12:14 ` Patchwork
  2016-03-03  6:42 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev3) Patchwork
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-25 12:14 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Summary ==

Built on 8fe9e785ae04fa7c37f7935cff12d62e38054b60 drm-intel-nightly: 2016y-01m-21d-11h-02m-42s UTC integration manifest

Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                pass       -> DMESG-WARN (ilk-hp8440p) UNSTABLE

bdw-nuci7        total:140  pass:131  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:143  pass:137  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:143  pass:119  dwarn:0   dfail:0   fail:0   skip:24 
byt-nuc          total:143  pass:128  dwarn:0   dfail:0   fail:0   skip:15 
hsw-brixbox      total:143  pass:136  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:143  pass:139  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:143  pass:103  dwarn:2   dfail:0   fail:0   skip:38 
ivb-t430s        total:143  pass:137  dwarn:0   dfail:0   fail:0   skip:6  
skl-i5k-2        total:143  pass:134  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:143  pass:129  dwarn:0   dfail:0   fail:0   skip:14 
snb-x220t        total:143  pass:129  dwarn:0   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1252/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3] drm/i915: Support to enable TRTT on GEN9
  2016-03-03  4:54           ` [PATCH v3] " akash.goel
@ 2016-03-03  4:54             ` kbuild test robot
  2016-03-09 11:30             ` [PATCH v4] " akash.goel
  1 sibling, 0 replies; 59+ messages in thread
From: kbuild test robot @ 2016-03-03  4:54 UTC (permalink / raw)
  Cc: Akash Goel, intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1552 bytes --]

Hi Akash,

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20160302]
[cannot apply to v4.5-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/akash-goel-intel-com/drm-i915-Support-to-enable-TRTT-on-GEN9/20160303-124341
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x019-201609 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_gem_context.c: In function 'intel_context_set_trtt':
>> drivers/gpu/drm/i915/i915_gem_context.c:570:3: error: implicit declaration of function 'i915_dbg' [-Werror=implicit-function-declaration]
      i915_dbg(dev, "segment base address not correctly aligned\n");
      ^
   cc1: some warnings being treated as errors

vim +/i915_dbg +570 drivers/gpu/drm/i915/i915_gem_context.c

   564					to_user_ptr(args->value),
   565					sizeof(trtt_params)))
   566			return -EFAULT;
   567	
   568		/* basic sanity checks for the segment location & l3 table pointer */
   569		if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
 > 570			i915_dbg(dev, "segment base address not correctly aligned\n");
   571			return -EINVAL;
   572		}
   573	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 23845 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v3] drm/i915: Support to enable TRTT on GEN9
  2016-01-22 15:34         ` [PATCH v2] " akash.goel
  2016-01-22 15:33           ` kbuild test robot
@ 2016-03-03  4:54           ` akash.goel
  2016-03-03  4:54             ` kbuild test robot
  2016-03-09 11:30             ` [PATCH v4] " akash.goel
  1 sibling, 2 replies; 59+ messages in thread
From: akash.goel @ 2016-03-03  4:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Gen9 has additional address translation hardware support in the form of the
Tiled Resource Translation Table (TR-TT), which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.
Sparse resources are created as virtual-only allocations. Regions of the
resource that the application intends to use are bound to the physical memory
on the fly and can be re-bound to different memory allocations over the
lifetime of the resource.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT will be required
for a new PPGTT instance, but TR-TT may not be enabled for every context.
1/16th of the 48bit PPGTT space is earmarked for the translation by TR-TT;
which such chunk to use is conveyed to HW through a register.
Any GFX address which lies in that reserved 44 bit range will be translated
through TR-TT first and then through PPGTT to get the actual physical address,
so the output of translation from TR-TT will be a PPGTT offset.
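
The segment selection described above can be sketched in C as follows. This is
only an illustration under stated assumptions: the shift and mask values are
copied from the defines in this patch, while the helper names
trtt_segment_index() and trtt_addr_in_segment() are hypothetical and do not
exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from this patch: a fixed 2^44 byte segment, 16 per 48-bit space */
#define TRTT_SEG_SIZE_SHIFT	44
#define TRTT_SEGMENT_SIZE	(1ULL << TRTT_SEG_SIZE_SHIFT)
#define TRVA_DATA_MASK		0xFULL

/* Hypothetical helper: which of the 16 segments a 48-bit address falls in */
static inline uint32_t trtt_segment_index(uint64_t gfx_addr)
{
	return (uint32_t)((gfx_addr >> TRTT_SEG_SIZE_SHIFT) & TRVA_DATA_MASK);
}

/*
 * Hypothetical helper: true if gfx_addr lies in the segment starting at
 * segment_base, i.e. the address would be translated through TR-TT first.
 */
static inline bool trtt_addr_in_segment(uint64_t gfx_addr,
					uint64_t segment_base)
{
	/* The segment base is expected to be aligned to the segment size */
	assert((segment_base & (TRTT_SEGMENT_SIZE - 1)) == 0);

	return trtt_segment_index(gfx_addr) ==
	       trtt_segment_index(segment_base);
}
```

The segment index computed here corresponds to the 4-bit data value that the
patch derives from segment_base_addr before writing GEN9_TRTT_VA_MASKDATA.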

TRTT is constructed as a 3 level tile Table. Each tile is 64KB in size, which
leaves behind 44-16=28 address bits. The 28 bits are partitioned as 9+9+10, and
each level is contained within a 4KB page, hence L3 and L2 are composed of
512 64b entries and L1 is composed of 1024 32b entries.
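
A minimal C sketch of that 9+9+10 split is given below. It assumes that the L3
index occupies the most significant bits of the 44-bit segment offset and the
64KB tile offset the least significant ones; the text above only gives the bit
counts, so this ordering is an assumption. The struct and function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TRTT_TILE_SHIFT	16	/* 64KB tile */
#define TRTT_L1_BITS	10	/* 1024 entries of 32 bits */
#define TRTT_L2_BITS	9	/* 512 entries of 64 bits */
#define TRTT_L3_BITS	9	/* 512 entries of 64 bits */

/* Every table level fits exactly in one 4KB page */
static_assert((1u << TRTT_L1_BITS) * sizeof(uint32_t) == 4096, "L1 is 4KB");
static_assert((1u << TRTT_L2_BITS) * sizeof(uint64_t) == 4096, "L2 is 4KB");
static_assert((1u << TRTT_L3_BITS) * sizeof(uint64_t) == 4096, "L3 is 4KB");

/* Hypothetical container for the decomposed segment offset */
struct trtt_indices {
	uint32_t l3;
	uint32_t l2;
	uint32_t l1;
	uint32_t tile_offset;
};

/* Split a 44-bit offset within the TR-TT segment into table indices */
static inline struct trtt_indices trtt_split_offset(uint64_t seg_offset)
{
	struct trtt_indices idx;

	idx.tile_offset =
		(uint32_t)(seg_offset & ((1ULL << TRTT_TILE_SHIFT) - 1));
	seg_offset >>= TRTT_TILE_SHIFT;

	idx.l1 = (uint32_t)(seg_offset & ((1ULL << TRTT_L1_BITS) - 1));
	seg_offset >>= TRTT_L1_BITS;

	idx.l2 = (uint32_t)(seg_offset & ((1ULL << TRTT_L2_BITS) - 1));
	seg_offset >>= TRTT_L2_BITS;

	idx.l3 = (uint32_t)(seg_offset & ((1ULL << TRTT_L3_BITS) - 1));

	return idx;
}
```

The static assertions restate the arithmetic from the paragraph above: 512
entries of 8 bytes and 1024 entries of 4 bytes both come to 4096 bytes.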

There is a provision to keep TR-TT Tables in virtual space, where the pages of
TRTT tables will be mapped to PPGTT.
Currently this is the supported mode; in this mode UMD will have full control
over TR-TT management, with bare minimum support from KMD.
So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
The entries of L1 table will contain the PPGTT offset of BOs actually backing
the Sparse resources.
UMD will have to allocate the L3/L2/L1 table pages as a regular BO only &
assign them a PPGTT address through the Soft Pin API (for example, use soft pin
to assign l3_table_address to the L3 table BO, when used).
UMD will also program the entries in the TR-TT page tables using regular batch
commands (MI_STORE_DATA_IMM), or via mmapping of the page table BOs.
UMD may do the complete PPGTT address space management, on the premise that it
could help minimize the conflicts.

Any space in the TR-TT segment not bound to any Sparse texture will be handled
through the Invalid tile. User is expected to initialize the entries of a new
L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding to
the holes in the Sparse texture resource will be set with the Null tile pattern.
Improper programming of TRTT should only lead to a recoverable GPU hang,
eventually leading to banning of the culprit context without victimizing others.

The association of any Sparse resource with the BOs will be known only to UMD,
and only the Sparse resources shall be assigned an offset from the TR-TT segment
by UMD. The use of the TR-TT segment or mapping of Sparse resources will be
transparent to the KMD; UMD will do the address assignment from the TR-TT
segment autonomously and KMD will be oblivious of it.
No other object may be assigned an address from the TR-TT segment; such objects
will be mapped to PPGTT in the regular way by KMD.

This patch provides an interface through which UMD can ask KMD to enable
TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param has been
added to the I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
UMD will have to pass the GFX address of the L3 table page and the start
location of the TR-TT segment, along with the pattern values for the Null &
Invalid Tile registers.
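
As a rough illustration, a UMD could pre-check its parameters before issuing
the ioctl along the following lines. The sketch mirrors the sanity checks done
by intel_context_set_trtt() in this version of the patch; the struct is a local
stand-in for drm_i915_gem_context_trtt_param, and the names trtt_param_sketch
and trtt_params_look_valid() are hypothetical. It does not replace the checks
done by KMD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the defines added by this patch */
#define TRTT_SEGMENT_SIZE	(1ULL << 44)
#define TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000ULL
#define TRTT_PAGE_SIZE		4096ULL	/* assumes 4KB PAGE_SIZE */

/* Local stand-in with the same layout as the uapi struct in this patch */
struct trtt_param_sketch {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

static_assert(sizeof(struct trtt_param_sketch) == 24,
	      "two 64-bit and two 32-bit fields, no padding");

/* Returns false where this version of the patch would return -EINVAL */
static bool trtt_params_look_valid(const struct trtt_param_sketch *p)
{
	/* Segment base must be aligned to the 2^44 byte segment size */
	if (p->segment_base_addr & (TRTT_SEGMENT_SIZE - 1))
		return false;

	/* L3 table must be 64KB aligned and within the 48-bit space */
	if (p->l3_table_address & ~TRTT_L3_GFXADDR_MASK)
		return false;

	/* L3 table page must not sit inside (or end at) the TR-TT segment */
	if ((p->l3_table_address + TRTT_PAGE_SIZE) >= p->segment_base_addr &&
	    p->l3_table_address < (p->segment_base_addr + TRTT_SEGMENT_SIZE))
		return false;

	return true;
}
```

With this patch applied, the real structure would be passed through the value
field of struct drm_i915_gem_context_param, with param set to
I915_CONTEXT_PARAM_TRTT and size set to the size of the structure.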

v2:
 - Support context_getparam for TRTT also and dispense with a separate
   GETPARAM case for TRTT (Chris).
 - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
   from user space (Chris).
 - Move all the argument checking for TRTT in context_setparam to the
   set_trtt function (Chris).
 - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
 - Rename certain functions to rightly reflect their purpose, rename
   the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
   rephrase few lines in the commit message body, add more comments (Chris).
 - Extend ABI to allow User to specify the TRTT segment location also.
 - Fix for selective enabling of TRTT on per context basis, explicitly
   disable TR-TT at the start of a new context.

v3:
 - Check the return value of gen9_emit_trtt_regs (Chris)
 - Update the kernel doc for intel_context structure.
 - Rebased.

Testcase: igt/gem_trtt

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  17 ++++-
 drivers/gpu/drm/i915/i915_gem_context.c |  98 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c     |  62 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 +++
 drivers/gpu/drm/i915/i915_reg.h         |  19 ++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 106 +++++++++++++++++++++++++++++++-
 include/uapi/drm/i915_drm.h             |   8 +++
 7 files changed, 314 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f7b6caf..e45bc57 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -856,6 +856,7 @@ struct i915_ctx_hang_stats {
 #define DEFAULT_CONTEXT_HANDLE 0
 
 #define CONTEXT_NO_ZEROMAP (1<<0)
+#define CONTEXT_USE_TRTT   (1<<1)
 /**
  * struct intel_context - as the name implies, represents a context.
  * @ref: reference count.
@@ -870,6 +871,8 @@ struct i915_ctx_hang_stats {
  * @ppgtt: virtual memory space used by this context.
  * @legacy_hw_ctx: render context backing object and whether it is correctly
  *                initialized (legacy ring submission mechanism only).
+ * @trtt_info: Programming parameters for tr-tt (redirection tables for
+ *             userspace, for sparse resource management)
  * @link: link in the global list of contexts.
  *
  * Contexts are memory images used by the hardware to store copies of their
@@ -880,7 +883,7 @@ struct intel_context {
 	int user_handle;
 	uint8_t remap_slice;
 	struct drm_i915_private *i915;
-	int flags;
+	unsigned int flags;
 	struct drm_i915_file_private *file_priv;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_hw_ppgtt *ppgtt;
@@ -901,6 +904,16 @@ struct intel_context {
 		uint32_t *lrc_reg_state;
 	} engine[I915_NUM_RINGS];
 
+	/* TRTT info */
+	struct intel_context_trtt {
+		uint32_t invd_tile_val;
+		uint32_t null_tile_val;
+		uint64_t l3_table_address;
+		uint64_t segment_base_addr;
+		struct i915_vma *vma;
+		bool update_trtt_params;
+	} trtt_info;
+
 	struct list_head link;
 };
 
@@ -2703,6 +2716,8 @@ struct drm_i915_cmd_table {
 				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
 				 !IS_BROXTON(dev))
 
+#define HAS_TRTT(dev)		(IS_GEN9(dev))
+
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5dd84e1..948731c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev)
 	return ret;
 }
 
+static void intel_context_free_trtt(struct intel_context *ctx)
+{
+	if (ctx->trtt_info.vma == NULL)
+		return;
+
+	intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+}
+
 static void i915_gem_context_clean(struct intel_context *ctx)
 {
 	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
@@ -164,6 +172,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	 */
 	i915_gem_context_clean(ctx);
 
+	intel_context_free_trtt(ctx);
+
 	i915_ppgtt_put(ctx->ppgtt);
 
 	if (ctx->legacy_hw_ctx.rcs_state)
@@ -507,6 +517,88 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static int
+intel_context_get_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (args->size < sizeof(trtt_params))
+		args->size = sizeof(trtt_params);
+	else {
+		trtt_params.segment_base_addr =
+			ctx->trtt_info.segment_base_addr;
+		trtt_params.l3_table_address =
+			ctx->trtt_info.l3_table_address;
+		trtt_params.null_tile_val =
+			ctx->trtt_info.null_tile_val;
+		trtt_params.invd_tile_val =
+			ctx->trtt_info.invd_tile_val;
+
+		if (__copy_to_user(to_user_ptr(args->value),
+				   &trtt_params,
+				   sizeof(trtt_params)))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int
+intel_context_set_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (ctx->flags & CONTEXT_USE_TRTT)
+		return -EEXIST;
+	else if (args->size < sizeof(trtt_params))
+		return -EINVAL;
+	else if (copy_from_user(&trtt_params,
+				to_user_ptr(args->value),
+				sizeof(trtt_params)))
+		return -EFAULT;
+
+	/* basic sanity checks for the segment location & l3 table pointer */
+	if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
+		i915_dbg(dev, "segment base address not correctly aligned\n");
+		return -EINVAL;
+	}
+
+	if (((trtt_params.l3_table_address + PAGE_SIZE) >=
+	     trtt_params.segment_base_addr) &&
+	    (trtt_params.l3_table_address <
+		    (trtt_params.segment_base_addr + GEN9_TRTT_SEGMENT_SIZE))) {
+		i915_dbg(dev, "l3 table address conflicts with trtt segment\n");
+		return -EINVAL;
+	}
+
+	if (trtt_params.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) {
+		i915_dbg(dev, "invalid l3 table address\n");
+		return -EINVAL;
+	}
+
+	ctx->trtt_info.vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
+						trtt_params.segment_base_addr);
+	if (IS_ERR(ctx->trtt_info.vma))
+		return PTR_ERR(ctx->trtt_info.vma);
+
+	ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
+	ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
+	ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
+	ctx->trtt_info.segment_base_addr = trtt_params.segment_base_addr;
+	ctx->trtt_info.update_trtt_params = true;
+
+	ctx->flags |= CONTEXT_USE_TRTT;
+	return 0;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
@@ -939,6 +1031,9 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		else
 			args->value = to_i915(dev)->gtt.base.total;
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_get_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -984,6 +1079,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
 		}
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_set_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7b8de85..d23c5c8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 {
 	gtt_write_workarounds(dev);
 
+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		/*
+		 * Globally enable TR-TT support in Hw.
+		 * Still TR-TT enabling on per context basis is required.
+		 * Non-trtt contexts are not affected by this setting.
+		 */
+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+			   GEN9_TRTT_BYPASS_DISABLE);
+	}
+
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -3368,6 +3379,57 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
 
 }
 
+void intel_trtt_context_destroy_vma(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+
+	WARN_ON(!list_empty(&vma->obj_link));
+	WARN_ON(!list_empty(&vma->vm_link));
+	WARN_ON(!list_empty(&vma->exec_list));
+
+	drm_mm_remove_node(&vma->node);
+	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
+	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
+}
+
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr)
+{
+	struct i915_vma *vma;
+	int ret;
+
+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
+	if (vma == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->obj_link);
+	INIT_LIST_HEAD(&vma->vm_link);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+	/* Mark the vma as permanently pinned */
+	vma->pin_count = 1;
+
+	/* Reserve from the 48 bit PPGTT space */
+	vma->node.start = segment_base_addr;
+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	if (ret) {
+		ret = i915_gem_evict_for_vma(vma);
+		if (ret == 0)
+			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	}
+	if (ret) {
+		DRM_ERROR("Reservation for TRTT segment failed: %i\n", ret);
+		intel_trtt_context_destroy_vma(vma);
+		return ERR_PTR(ret);
+	}
+
+	return vma;
+}
+
 static struct scatterlist *
 rotate_pages(const dma_addr_t *in, unsigned int offset,
 	     unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index dc208c0..2374cb1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+/* Fixed size segment */
+#define GEN9_TRTT_SEG_SIZE_SHIFT	44
+#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
+
 enum i915_ggtt_view_type {
 	I915_GGTT_VIEW_NORMAL = 0,
 	I915_GGTT_VIEW_ROTATED,
@@ -562,4 +566,8 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr);
+void intel_trtt_context_destroy_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 71abf57..59cf062 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE	(1<<0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE		0xF0
+#define   GEN9_TRVA_DATA_MASK		0xF
+
+#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1<<1)
+#define   GEN9_TRTT_ENABLE		(1<<0)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 27c9ee3..2b2ae15 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1640,6 +1640,70 @@ static int gen9_init_render_ring(struct intel_engine_cs *ring)
 	return init_workarounds_ring(ring);
 }
 
+static int gen9_init_context_trtt(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf, 0);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
+static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
+{
+	struct intel_context *ctx = req->ctx;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	uint64_t masked_l3_gfx_address =
+		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+	uint32_t trva_data_value =
+		(ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
+		GEN9_TRVA_DATA_MASK;
+	const int num_lri_cmds = 6;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
+	intel_logical_ring_emit(ringbuf, upper_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRVA_MASK_VALUE | trva_data_value);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
 {
 	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
@@ -1693,6 +1757,20 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
 		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
 	}
 
+	/*
+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
+	 * of directly updating the context image, as this will ensure that
+	 * update happens in a serialized manner for the context and also
+	 * lite-restore scenario will get handled.
+	 */
+	if ((req->ring->id == RCS) && req->ctx->trtt_info.update_trtt_params) {
+		ret = gen9_emit_trtt_regs(req);
+		if (ret)
+			return ret;
+
+		req->ctx->trtt_info.update_trtt_params = false;
+	}
+
 	ret = intel_logical_ring_begin(req, 4);
 	if (ret)
 		return ret;
@@ -1994,6 +2072,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 	return intel_lr_context_render_state_init(req);
 }
 
+static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
+{
+	int ret;
+
+	/*
+	 * Explicitly disable TR-TT at the start of a new context.
+	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
+	 * context the TR-TT settings of the outgoing context could get
+	 * spilled on to the new incoming context as only the Ring Context
+	 * part is loaded on the first submission of a new context, due to
+	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
+	 */
+	ret = gen9_init_context_trtt(req);
+	if (ret)
+		return ret;
+
+	return gen8_init_rcs_context(req);
+}
+
 /**
  * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
  *
@@ -2125,11 +2222,14 @@ static int logical_render_ring_init(struct drm_device *dev)
 	logical_ring_default_vfuncs(dev, ring);
 
 	/* Override some for render ring. */
-	if (INTEL_INFO(dev)->gen >= 9)
+	if (INTEL_INFO(dev)->gen >= 9) {
 		ring->init_hw = gen9_init_render_ring;
-	else
+		ring->init_context = gen9_init_rcs_context;
+	} else {
 		ring->init_hw = gen8_init_render_ring;
-	ring->init_context = gen8_init_rcs_context;
+		ring->init_context = gen8_init_rcs_context;
+	}
+
 	ring->cleanup = intel_fini_pipe_control;
 	ring->emit_flush = gen8_emit_flush_render;
 	ring->emit_request = gen8_emit_request_render;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a5524cc..604da23 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
+#define I915_CONTEXT_PARAM_TRTT		0x4
 	__u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+	__u64 segment_base_addr;
+	__u64 l3_table_address;
+	__u32 invd_tile_val;
+	__u32 null_tile_val;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev3)
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
                   ` (4 preceding siblings ...)
  2016-01-25 12:14 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2016-03-03  6:42 ` Patchwork
  2016-03-09 11:10 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev4) Patchwork
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-03-03  6:42 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Support to enable TRTT on GEN9 (rev3)
URL   : https://patchwork.freedesktop.org/series/2321/
State : failure

== Summary ==

  LD      net/ipv6/built-in.o
  CC [M]  drivers/net/ethernet/intel/e1000e/phy.o
  CC [M]  drivers/net/ethernet/intel/e1000e/param.o
  LD      net/built-in.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ethtool.o
  LD      drivers/net/ethernet/synopsys/built-in.o
  CC [M]  drivers/net/ethernet/intel/e1000e/netdev.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_ethtool.o
  CC [M]  drivers/net/ethernet/realtek/8139too.o
  CC [M]  drivers/net/ethernet/realtek/r8169.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ptp.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_82575.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mac.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_nvm.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_phy.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mbx.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_i210.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_ptp.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_hwmon.o
  LD [M]  drivers/net/ethernet/intel/e1000/e1000.o
  LD [M]  drivers/net/ethernet/intel/igbvf/igbvf.o
  LD      drivers/usb/host/xhci-hcd.o
  LD      drivers/usb/host/built-in.o
  LD      drivers/usb/built-in.o
  LD [M]  drivers/net/ethernet/intel/igb/igb.o
  LD [M]  drivers/net/ethernet/intel/e1000e/e1000e.o
  LD      drivers/net/ethernet/built-in.o
  LD      drivers/net/built-in.o
Makefile:950: recipe for target 'drivers' failed
make: *** [drivers] Error 2

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev4)
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
                   ` (5 preceding siblings ...)
  2016-03-03  6:42 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev3) Patchwork
@ 2016-03-09 11:10 ` Patchwork
  2016-03-10  7:10 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev5) Patchwork
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-03-09 11:10 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Support to enable TRTT on GEN9 (rev4)
URL   : https://patchwork.freedesktop.org/series/2321/
State : failure

== Summary ==

  CC      drivers/usb/host/xhci-hub.o
  CC [M]  drivers/net/phy/bcm-phy-lib.o
  LD      drivers/tty/serial/8250/8250_base.o
  CC [M]  drivers/net/phy/broadcom.o
  LD      drivers/tty/serial/8250/built-in.o
  CC [M]  drivers/net/usb/cdc_eem.o
  LD      drivers/tty/serial/built-in.o
  CC      drivers/usb/host/xhci-dbg.o
  LD      drivers/tty/built-in.o
  CC [M]  drivers/net/usb/smsc75xx.o
  CC      drivers/usb/host/xhci-trace.o
  CC [M]  drivers/net/phy/bcm7xxx.o
  CC      drivers/usb/host/xhci-pci.o
  CC [M]  drivers/net/usb/smsc95xx.o
  CC [M]  drivers/net/phy/bcm87xx.o
  CC [M]  drivers/net/phy/realtek.o
  CC [M]  drivers/net/phy/fixed_phy.o
  LD      drivers/net/phy/libphy.o
  LD      drivers/net/phy/built-in.o
  CC [M]  drivers/net/usb/mcs7830.o
  CC [M]  drivers/net/usb/usbnet.o
  CC [M]  drivers/net/usb/cdc_ncm.o
  LD [M]  drivers/net/ethernet/intel/e1000e/e1000e.o
  LD      drivers/net/ethernet/built-in.o
  LD      drivers/usb/host/xhci-hcd.o
  LD      drivers/usb/host/built-in.o
  LD      drivers/usb/built-in.o
  LD      drivers/net/built-in.o
Makefile:950: recipe for target 'drivers' failed
make: *** [drivers] Error 2

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v4] drm/i915: Support to enable TRTT on GEN9
  2016-03-03  4:54           ` [PATCH v3] " akash.goel
  2016-03-03  4:54             ` kbuild test robot
@ 2016-03-09 11:30             ` akash.goel
  2016-03-09 12:04               ` Chris Wilson
  2016-03-09 14:18               ` [PATCH v4] " kbuild test robot
  1 sibling, 2 replies; 59+ messages in thread
From: akash.goel @ 2016-03-09 11:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Gen9 has additional address translation hardware support in the form of the
Tiled Resource Translation Table (TR-TT), which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.
Sparse resources are created as virtual-only allocations. Regions of the
resource that the application intends to use are bound to the physical memory
on the fly and can be re-bound to different memory allocations over the
lifetime of the resource.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT will be required
for a new PPGTT instance, but TR-TT may not be enabled for every context.
1/16th of the 48bit PPGTT space is earmarked for translation by TR-TT;
which such chunk to use is conveyed to HW through a register.
Any GFX address which lies in that reserved 44 bit range will be translated
through TR-TT first and then through PPGTT to get the actual physical address,
so the output of translation from TR-TT will be a PPGTT offset.
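
As a rough illustration of the segment selection described above (not part of
the patch; the helper names are made up for this example, and the assumption
is that bits 47:44 of a 48bit GFX address identify the 1/16th chunk):

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as GEN9_TRTT_SEG_SIZE_SHIFT and GEN9_TRTT_SEGMENT_SIZE in this patch. */
#define TRTT_SEG_SIZE_SHIFT 44
#define TRTT_SEGMENT_SIZE   (1ULL << TRTT_SEG_SIZE_SHIFT)

/* Hypothetical helper: which of the 16 chunks (0..15) a 48bit address falls in. */
static unsigned int trtt_chunk_index(uint64_t gfx_addr)
{
	return (unsigned int)((gfx_addr >> TRTT_SEG_SIZE_SHIFT) & 0xF);
}

/* Hypothetical helper: true if gfx_addr lies inside the segment at segment_base. */
static bool trtt_addr_in_segment(uint64_t gfx_addr, uint64_t segment_base)
{
	return gfx_addr >= segment_base &&
	       gfx_addr - segment_base < TRTT_SEGMENT_SIZE;
}
```

For a segment based at 0xF00000000000 the chunk index is 15, which matches the
4bit value that gen9_emit_trtt_regs() in this patch derives as trva_data_value.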

TRTT is constructed as a 3 level tile table. Each tile is 64KB in size, which
leaves behind 44-16=28 address bits. The 28 bits are partitioned as 9+9+10, and
each level is contained within a 4KB page, hence L3 and L2 are each composed of
512 64b entries and L1 is composed of 1024 32b entries.
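
The 9+9+10+16 partitioning can be sketched as follows. This is only an
illustration under the assumption that L3 takes the most significant 9 bits of
the 44 bit segment offset and that the tile offset takes the low 16 bits; the
struct and function names are hypothetical and do not appear in the patch:

```c
#include <stdint.h>

/* Each table level must fit in one 4KB page. */
_Static_assert(512 * sizeof(uint64_t) == 4096, "L3/L2: 512 x 64b entries");
_Static_assert(1024 * sizeof(uint32_t) == 4096, "L1: 1024 x 32b entries");

struct trtt_indices {
	unsigned int l3;     /* 9 bits: entry in the L3 table */
	unsigned int l2;     /* 9 bits: entry in the L2 table */
	unsigned int l1;     /* 10 bits: entry in the L1 table */
	unsigned int offset; /* 16 bits: byte offset inside the 64KB tile */
};

/* Split a GFX address inside the TR-TT segment into its table indices. */
static struct trtt_indices trtt_split(uint64_t gfx_addr)
{
	uint64_t seg_off = gfx_addr & ((1ULL << 44) - 1);
	struct trtt_indices idx;

	idx.offset = (unsigned int)(seg_off & 0xFFFF);
	idx.l1 = (unsigned int)((seg_off >> 16) & 0x3FF);
	idx.l2 = (unsigned int)((seg_off >> 26) & 0x1FF);
	idx.l3 = (unsigned int)((seg_off >> 35) & 0x1FF);
	return idx;
}
```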

There is a provision to keep TR-TT tables in virtual space, where the pages of
the TRTT tables will be mapped to PPGTT.
Currently this is the supported mode. In this mode UMD will have full control
over TR-TT management, with bare minimum support from KMD.
So the entries of the L3 table will contain the PPGTT offset of L2 table pages,
and similarly entries of the L2 table will contain the PPGTT offset of L1 table
pages.
The entries of the L1 table will contain the PPGTT offset of the BOs actually
backing the Sparse resources.
UMD will have to allocate the L3/L2/L1 table pages as regular BOs &
assign them a PPGTT address through the Soft Pin API (for example, use soft pin
to assign l3_table_address to the L3 table BO, when used).
UMD will also program the entries in the TR-TT page tables using regular batch
commands (MI_STORE_DATA_IMM), or via mmapping of the page table BOs.
UMD may do the complete PPGTT address space management, on the grounds that it
could help minimize conflicts.

Any space in the TR-TT segment not bound to any Sparse texture will be handled
through the Invalid tile. The user is expected to initialize the entries of a
new L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding
to the holes in the Sparse texture resource will be set with the Null tile
pattern.
Improper programming of TRTT should only lead to a recoverable GPU hang,
eventually leading to banning of the culprit context without victimizing others.

The association of any Sparse resource with the BOs will be known only to UMD,
and only the Sparse resources shall be assigned an offset from the TR-TT segment
by UMD. The use of the TR-TT segment or mapping of Sparse resources will be
transparent to the KMD; UMD will do the address assignment from the TR-TT
segment autonomously and KMD will be oblivious of it.
No other object may be assigned an address from the TR-TT segment; such objects
will be mapped to PPGTT in the regular way by KMD.

This patch provides an interface through which UMD can ask KMD to enable
TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param has been
added to the I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
UMD will have to pass the GFX address of the L3 table page and the start
location of the TR-TT segment, along with the pattern values for the Null &
Invalid tile registers.
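
A minimal sketch of how userspace might pre-validate these values before
issuing the ioctl, mirroring the sanity checks in intel_context_set_trtt()
from this patch. The struct below is a local copy of the proposed
drm_i915_gem_context_trtt_param layout, the function name is hypothetical,
and a 4KB PAGE_SIZE is assumed:

```c
#include <errno.h>
#include <stdint.h>

#define TRTT_SEGMENT_SIZE    (1ULL << 44)      /* GEN9_TRTT_SEGMENT_SIZE */
#define TRTT_L3_GFXADDR_MASK 0xFFFFFFFF0000ULL /* GEN9_TRTT_L3_GFXADDR_MASK */
#define ASSUMED_PAGE_SIZE    4096ULL           /* assumption: 4KB pages */

/* Local stand-in for the proposed struct drm_i915_gem_context_trtt_param. */
struct trtt_param {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

/* Returns 0 if the parameters pass the same checks as the patch, else -EINVAL. */
static int trtt_param_check(const struct trtt_param *p)
{
	/* The segment base must be aligned to the fixed 44 bit segment size. */
	if (p->segment_base_addr & (TRTT_SEGMENT_SIZE - 1))
		return -EINVAL;

	/* The L3 table page must not overlap the TR-TT segment itself. */
	if (p->l3_table_address + ASSUMED_PAGE_SIZE >= p->segment_base_addr &&
	    p->l3_table_address < p->segment_base_addr + TRTT_SEGMENT_SIZE)
		return -EINVAL;

	/* The L3 pointer must be 64KB aligned and fit in 48 bits. */
	if (p->l3_table_address & ~TRTT_L3_GFXADDR_MASK)
		return -EINVAL;

	return 0;
}
```

Note that, as written in the patch, the overlap check also rejects an L3 table
that ends exactly at the segment base, since it compares with >=.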

v2:
 - Support context_getparam for TRTT also and dispense with a separate
   GETPARAM case for TRTT (Chris).
 - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
   from user space (Chris).
 - Move all the argument checking for TRTT in context_setparam to the
   set_trtt function (Chris).
 - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
 - Rename certain functions to rightly reflect their purpose, rename
   the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
   rephrase a few lines in the commit message body, add more comments (Chris).
 - Extend ABI to allow the user to specify the TRTT segment location also.
 - Fix for selective enabling of TRTT on per context basis, explicitly
   disable TR-TT at the start of a new context.

v3:
 - Check the return value of gen9_emit_trtt_regs (Chris)
 - Update the kernel doc for intel_context structure.
 - Rebased.

v4:
 - Fix the warnings reported by 'checkpatch.pl --strict' (Michel)
 - Fix the context_getparam implementation avoiding the reset of size field,
   affecting the TRTT case.

Testcase: igt/gem_trtt

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  17 ++++-
 drivers/gpu/drm/i915/i915_gem_context.c |  99 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c     |  62 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 +++
 drivers/gpu/drm/i915/i915_reg.h         |  19 ++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 106 +++++++++++++++++++++++++++++++-
 include/uapi/drm/i915_drm.h             |   8 +++
 7 files changed, 314 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f7b6caf..d648fdc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -856,6 +856,7 @@ struct i915_ctx_hang_stats {
 #define DEFAULT_CONTEXT_HANDLE 0
 
 #define CONTEXT_NO_ZEROMAP (1<<0)
+#define CONTEXT_USE_TRTT   (1 << 1)
 /**
  * struct intel_context - as the name implies, represents a context.
  * @ref: reference count.
@@ -870,6 +871,8 @@ struct i915_ctx_hang_stats {
  * @ppgtt: virtual memory space used by this context.
  * @legacy_hw_ctx: render context backing object and whether it is correctly
  *                initialized (legacy ring submission mechanism only).
+ * @trtt_info: Programming parameters for tr-tt (redirection tables for
+ *             userspace, for sparse resource management)
  * @link: link in the global list of contexts.
  *
  * Contexts are memory images used by the hardware to store copies of their
@@ -880,7 +883,7 @@ struct intel_context {
 	int user_handle;
 	uint8_t remap_slice;
 	struct drm_i915_private *i915;
-	int flags;
+	unsigned int flags;
 	struct drm_i915_file_private *file_priv;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_hw_ppgtt *ppgtt;
@@ -901,6 +904,16 @@ struct intel_context {
 		uint32_t *lrc_reg_state;
 	} engine[I915_NUM_RINGS];
 
+	/* TRTT info */
+	struct intel_context_trtt {
+		u32 invd_tile_val;
+		u32 null_tile_val;
+		u64 l3_table_address;
+		u64 segment_base_addr;
+		struct i915_vma *vma;
+		bool update_trtt_params;
+	} trtt_info;
+
 	struct list_head link;
 };
 
@@ -2703,6 +2716,8 @@ struct drm_i915_cmd_table {
 				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
 				 !IS_BROXTON(dev))
 
+#define HAS_TRTT(dev)		(IS_GEN9(dev))
+
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5dd84e1..0e4c6c2 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev)
 	return ret;
 }
 
+static void intel_context_free_trtt(struct intel_context *ctx)
+{
+	if (!ctx->trtt_info.vma)
+		return;
+
+	intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+}
+
 static void i915_gem_context_clean(struct intel_context *ctx)
 {
 	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
@@ -164,6 +172,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	 */
 	i915_gem_context_clean(ctx);
 
+	intel_context_free_trtt(ctx);
+
 	i915_ppgtt_put(ctx->ppgtt);
 
 	if (ctx->legacy_hw_ctx.rcs_state)
@@ -507,6 +517,88 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static int
+intel_context_get_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) {
+		return -ENODEV;
+	} else if (args->size < sizeof(trtt_params)) {
+		args->size = sizeof(trtt_params);
+	} else {
+		trtt_params.segment_base_addr =
+			ctx->trtt_info.segment_base_addr;
+		trtt_params.l3_table_address =
+			ctx->trtt_info.l3_table_address;
+		trtt_params.null_tile_val =
+			ctx->trtt_info.null_tile_val;
+		trtt_params.invd_tile_val =
+			ctx->trtt_info.invd_tile_val;
+
+		if (__copy_to_user(to_user_ptr(args->value),
+				   &trtt_params,
+				   sizeof(trtt_params)))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int
+intel_context_set_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (ctx->flags & CONTEXT_USE_TRTT)
+		return -EEXIST;
+	else if (args->size < sizeof(trtt_params))
+		return -EINVAL;
+	else if (copy_from_user(&trtt_params,
+				to_user_ptr(args->value),
+				sizeof(trtt_params)))
+		return -EFAULT;
+
+	/* basic sanity checks for the segment location & l3 table pointer */
+	if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
+		i915_dbg(dev, "segment base address not correctly aligned\n");
+		return -EINVAL;
+	}
+
+	if (((trtt_params.l3_table_address + PAGE_SIZE) >=
+	     trtt_params.segment_base_addr) &&
+	    (trtt_params.l3_table_address <
+		    (trtt_params.segment_base_addr + GEN9_TRTT_SEGMENT_SIZE))) {
+		i915_dbg(dev, "l3 table address conflicts with trtt segment\n");
+		return -EINVAL;
+	}
+
+	if (trtt_params.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) {
+		i915_dbg(dev, "invalid l3 table address\n");
+		return -EINVAL;
+	}
+
+	ctx->trtt_info.vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
+						trtt_params.segment_base_addr);
+	if (IS_ERR(ctx->trtt_info.vma))
+		return PTR_ERR(ctx->trtt_info.vma);
+
+	ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
+	ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
+	ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
+	ctx->trtt_info.segment_base_addr = trtt_params.segment_base_addr;
+	ctx->trtt_info.update_trtt_params = 1;
+
+	ctx->flags |= CONTEXT_USE_TRTT;
+	return 0;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
@@ -923,7 +1015,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		return PTR_ERR(ctx);
 	}
 
-	args->size = 0;
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		args->value = ctx->hang_stats.ban_period_seconds;
@@ -939,6 +1030,9 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		else
 			args->value = to_i915(dev)->gtt.base.total;
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_get_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -984,6 +1078,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
 		}
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_set_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7b8de85..8de0319 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 {
 	gtt_write_workarounds(dev);
 
+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		/*
+		 * Globally enable TR-TT support in Hw.
+		 * Still TR-TT enabling on per context basis is required.
+		 * Non-trtt contexts are not affected by this setting.
+		 */
+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+			   GEN9_TRTT_BYPASS_DISABLE);
+	}
+
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -3368,6 +3379,57 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
 
 }
 
+void intel_trtt_context_destroy_vma(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+
+	WARN_ON(!list_empty(&vma->obj_link));
+	WARN_ON(!list_empty(&vma->vm_link));
+	WARN_ON(!list_empty(&vma->exec_list));
+
+	drm_mm_remove_node(&vma->node);
+	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
+	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
+}
+
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr)
+{
+	struct i915_vma *vma;
+	int ret;
+
+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
+	if (!vma)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->obj_link);
+	INIT_LIST_HEAD(&vma->vm_link);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+	/* Mark the vma as permanently pinned */
+	vma->pin_count = 1;
+
+	/* Reserve from the 48 bit PPGTT space */
+	vma->node.start = segment_base_addr;
+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	if (ret) {
+		ret = i915_gem_evict_for_vma(vma);
+		if (ret == 0)
+			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	}
+	if (ret) {
+		DRM_ERROR("Reservation for TRTT segment failed: %i\n", ret);
+		intel_trtt_context_destroy_vma(vma);
+		return ERR_PTR(ret);
+	}
+
+	return vma;
+}
+
 static struct scatterlist *
 rotate_pages(const dma_addr_t *in, unsigned int offset,
 	     unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index dc208c0..2374cb1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+/* Fixed size segment */
+#define GEN9_TRTT_SEG_SIZE_SHIFT	44
+#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
+
 enum i915_ggtt_view_type {
 	I915_GGTT_VIEW_NORMAL = 0,
 	I915_GGTT_VIEW_ROTATED,
@@ -562,4 +566,8 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr);
+void intel_trtt_context_destroy_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 71abf57..0f32021 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE	(1 << 0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE		0xF0
+#define   GEN9_TRVA_DATA_MASK		0xF
+
+#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1 << 1)
+#define   GEN9_TRTT_ENABLE		(1 << 0)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 27c9ee3..4186e2c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1640,6 +1640,70 @@ static int gen9_init_render_ring(struct intel_engine_cs *ring)
 	return init_workarounds_ring(ring);
 }
 
+static int gen9_init_context_trtt(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf, 0);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
+static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
+{
+	struct intel_context *ctx = req->ctx;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	u64 masked_l3_gfx_address =
+		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+	u32 trva_data_value =
+		(ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
+		GEN9_TRVA_DATA_MASK;
+	const int num_lri_cmds = 6;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
+	intel_logical_ring_emit(ringbuf, upper_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRVA_MASK_VALUE | trva_data_value);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
 {
 	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
@@ -1693,6 +1757,20 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
 		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
 	}
 
+	/*
+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
+	 * of directly updating the context image, as this will ensure that
+	 * update happens in a serialized manner for the context and also
+	 * lite-restore scenario will get handled.
+	 */
+	if ((req->ring->id == RCS) && req->ctx->trtt_info.update_trtt_params) {
+		ret = gen9_emit_trtt_regs(req);
+		if (ret)
+			return ret;
+
+		req->ctx->trtt_info.update_trtt_params = false;
+	}
+
 	ret = intel_logical_ring_begin(req, 4);
 	if (ret)
 		return ret;
@@ -1994,6 +2072,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 	return intel_lr_context_render_state_init(req);
 }
 
+static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
+{
+	int ret;
+
+	/*
+	 * Explicitly disable TR-TT at the start of a new context.
+	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
+	 * context the TR-TT settings of the outgoing context could get
+	 * spilled on to the new incoming context as only the Ring Context
+	 * part is loaded on the first submission of a new context, due to
+	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
+	 */
+	ret = gen9_init_context_trtt(req);
+	if (ret)
+		return ret;
+
+	return gen8_init_rcs_context(req);
+}
+
 /**
  * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
  *
@@ -2125,11 +2222,14 @@ static int logical_render_ring_init(struct drm_device *dev)
 	logical_ring_default_vfuncs(dev, ring);
 
 	/* Override some for render ring. */
-	if (INTEL_INFO(dev)->gen >= 9)
+	if (INTEL_INFO(dev)->gen >= 9) {
 		ring->init_hw = gen9_init_render_ring;
-	else
+		ring->init_context = gen9_init_rcs_context;
+	} else {
 		ring->init_hw = gen8_init_render_ring;
-	ring->init_context = gen8_init_rcs_context;
+		ring->init_context = gen8_init_rcs_context;
+	}
+
 	ring->cleanup = intel_fini_pipe_control;
 	ring->emit_flush = gen8_emit_flush_render;
 	ring->emit_request = gen8_emit_request_render;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a5524cc..604da23 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
+#define I915_CONTEXT_PARAM_TRTT		0x4
 	__u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+	__u64 segment_base_addr;
+	__u64 l3_table_address;
+	__u32 invd_tile_val;
+	__u32 null_tile_val;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v4] drm/i915: Support to enable TRTT on GEN9
  2016-03-09 11:30             ` [PATCH v4] " akash.goel
@ 2016-03-09 12:04               ` Chris Wilson
  2016-03-09 14:50                 ` Goel, Akash
  2016-03-09 14:18               ` [PATCH v4] " kbuild test robot
  1 sibling, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2016-03-09 12:04 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Wed, Mar 09, 2016 at 05:00:24PM +0530, akash.goel@intel.com wrote:
> +static int
> +intel_context_get_trtt(struct intel_context *ctx,
> +		       struct drm_i915_gem_context_param *args)
> +{
> +	struct drm_i915_gem_context_trtt_param trtt_params;
> +	struct drm_device *dev = ctx->i915->dev;
> +
> +	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) {

Both of these actually inspect dev_priv (and magically convert dev into
dev_priv).

> +		return -ENODEV;
> +	} else if (args->size < sizeof(trtt_params)) {
> +		args->size = sizeof(trtt_params);
> +	} else {
> +		trtt_params.segment_base_addr =
> +			ctx->trtt_info.segment_base_addr;
> +		trtt_params.l3_table_address =
> +			ctx->trtt_info.l3_table_address;
> +		trtt_params.null_tile_val =
> +			ctx->trtt_info.null_tile_val;
> +		trtt_params.invd_tile_val =
> +			ctx->trtt_info.invd_tile_val;
> +
> +		if (__copy_to_user(to_user_ptr(args->value),
> +				   &trtt_params,
> +				   sizeof(trtt_params)))
> +			return -EFAULT;

args->size = sizeof(trtt_params);

In case the user passed in size > sizeof(trtt_params), we want to report
how many bytes we wrote.
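
The size reporting suggested above can be sketched in plain C, with memcpy
standing in for __copy_to_user. The struct and function names here are
hypothetical stand-ins for illustration, not the driver's types:

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the proposed struct drm_i915_gem_context_trtt_param. */
struct trtt_param {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

/* Simplified stand-in for the size/value pair of the context param args. */
struct ctx_param {
	uint32_t size; /* in: buffer size offered, out: bytes required/written */
	void *value;   /* caller's buffer */
};

static int get_trtt(const struct trtt_param *cur, struct ctx_param *args)
{
	if (args->size < sizeof(*cur)) {
		/* Buffer too small: report the required size, copy nothing. */
		args->size = sizeof(*cur);
		return 0;
	}

	memcpy(args->value, cur, sizeof(*cur));
	/* Report the bytes actually written, even if more were offered. */
	args->size = sizeof(*cur);
	return 0;
}
```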

> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +intel_context_set_trtt(struct intel_context *ctx,
> +		       struct drm_i915_gem_context_param *args)
> +{
> +	struct drm_i915_gem_context_trtt_param trtt_params;
> +	struct drm_device *dev = ctx->i915->dev;
> +
> +	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))

Ditto (dev_priv)

> +		return -ENODEV;
> +	else if (ctx->flags & CONTEXT_USE_TRTT)
> +		return -EEXIST;

What locks are we holding here?

> +	else if (args->size < sizeof(trtt_params))
> +		return -EINVAL;
> +	else if (copy_from_user(&trtt_params,
> +				to_user_ptr(args->value),
> +				sizeof(trtt_params)))

Because whatever they are, we can't hold them here!

(Imagine/write a test that passes in the trtt_params inside a GTT mmapping.)

> @@ -923,7 +1015,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>  		return PTR_ERR(ctx);
>  	}
>  
> -	args->size = 0;

Awooga. Does every path then set it?

> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 7b8de85..8de0319 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
>  {
>  	gtt_write_workarounds(dev);
>  
> +	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
> +		struct drm_i915_private *dev_priv = dev->dev_private;
> +		/*
> +		 * Globally enable TR-TT support in Hw.
> +		 * Still TR-TT enabling on per context basis is required.
> +		 * Non-trtt contexts are not affected by this setting.
> +		 */
> +		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
> +			   GEN9_TRTT_BYPASS_DISABLE);
> +	}
> +
>  	/* In the case of execlists, PPGTT is enabled by the context descriptor
>  	 * and the PDPs are contained within the context itself.  We don't
>  	 * need to do anything here. */
> @@ -3368,6 +3379,57 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
>  
>  }
>  
> +void intel_trtt_context_destroy_vma(struct i915_vma *vma)
> +{
> +	struct i915_address_space *vm = vma->vm;
> +
> +	WARN_ON(!list_empty(&vma->obj_link));
> +	WARN_ON(!list_empty(&vma->vm_link));
> +	WARN_ON(!list_empty(&vma->exec_list));

WARN_ON(!vma->pin_count);

> +
> +	drm_mm_remove_node(&vma->node);
> +	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
> +	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
> +}
> +
> +struct i915_vma *
> +intel_trtt_context_allocate_vma(struct i915_address_space *vm,
> +				uint64_t segment_base_addr)
> +{
> +	struct i915_vma *vma;
> +	int ret;
> +
> +	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
> +	if (!vma)
> +		return ERR_PTR(-ENOMEM);
> +
> +	INIT_LIST_HEAD(&vma->obj_link);
> +	INIT_LIST_HEAD(&vma->vm_link);
> +	INIT_LIST_HEAD(&vma->exec_list);
> +	vma->vm = vm;
> +	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
> +
> +	/* Mark the vma as permanently pinned */
> +	vma->pin_count = 1;
> +
> +	/* Reserve from the 48 bit PPGTT space */
> +	vma->node.start = segment_base_addr;
> +	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
> +	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> +	if (ret) {
> +		ret = i915_gem_evict_for_vma(vma);

Given that this has a known GPF, you need a test case that tries to
evict an active/hanging object in order to make room for the trtt.

> +static int gen9_init_context_trtt(struct drm_i915_gem_request *req)

Since TRTT is render only, call this gen9_init_rcs_context_trtt()

>  static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
>  {
>  	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
> @@ -1693,6 +1757,20 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
>  		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
>  	}
>  
> +	/*
> +	 * Emitting LRIs to update the TRTT registers is most reliable, instead
> +	 * of directly updating the context image, as this will ensure that
> +	 * update happens in a serialized manner for the context and also
> +	 * lite-restore scenario will get handled.
> +	 */
> +	if ((req->ring->id == RCS) && req->ctx->trtt_info.update_trtt_params) {
> +		ret = gen9_emit_trtt_regs(req);
> +		if (ret)
> +			return ret;
> +
> +		req->ctx->trtt_info.update_trtt_params = false;

Bah. Since we can only update the params once (EEXIST otherwise),
we emit the change when the user sets the new params.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4] drm/i915: Support to enable TRTT on GEN9
  2016-03-09 11:30             ` [PATCH v4] " akash.goel
  2016-03-09 12:04               ` Chris Wilson
@ 2016-03-09 14:18               ` kbuild test robot
  1 sibling, 0 replies; 59+ messages in thread
From: kbuild test robot @ 2016-03-09 14:18 UTC (permalink / raw)
  Cc: Akash Goel, intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1538 bytes --]

Hi Akash,

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20160309]
[cannot apply to v4.5-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/akash-goel-intel-com/drm-i915-Support-to-enable-TRTT-on-GEN9/20160309-192019
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-rhel (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_gem_context.c: In function 'intel_context_set_trtt':
>> drivers/gpu/drm/i915/i915_gem_context.c:570:3: error: implicit declaration of function 'i915_dbg' [-Werror=implicit-function-declaration]
      i915_dbg(dev, "segment base address not correctly aligned\n");
      ^
   cc1: some warnings being treated as errors

vim +/i915_dbg +570 drivers/gpu/drm/i915/i915_gem_context.c

   564					to_user_ptr(args->value),
   565					sizeof(trtt_params)))
   566			return -EFAULT;
   567	
   568		/* basic sanity checks for the segment location & l3 table pointer */
   569		if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
 > 570			i915_dbg(dev, "segment base address not correctly aligned\n");
   571			return -EINVAL;
   572		}
   573	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 36094 bytes --]


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4] drm/i915: Support to enable TRTT on GEN9
  2016-03-09 12:04               ` Chris Wilson
@ 2016-03-09 14:50                 ` Goel, Akash
  2016-03-09 15:02                   ` Chris Wilson
  0 siblings, 1 reply; 59+ messages in thread
From: Goel, Akash @ 2016-03-09 14:50 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: akash.goel



On 3/9/2016 5:34 PM, Chris Wilson wrote:
> On Wed, Mar 09, 2016 at 05:00:24PM +0530, akash.goel@intel.com wrote:
>> +static int
>> +intel_context_get_trtt(struct intel_context *ctx,
>> +		       struct drm_i915_gem_context_param *args)
>> +{
>> +	struct drm_i915_gem_context_trtt_param trtt_params;
>> +	struct drm_device *dev = ctx->i915->dev;
>> +
>> +	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) {
>
> Both of these actually inspect dev_priv (and magically convert dev into
> dev_priv).

Sorry, my bad. Missed the __I915__ macro.
>
>> +		return -ENODEV;
>> +	} else if (args->size < sizeof(trtt_params)) {
>> +		args->size = sizeof(trtt_params);
>> +	} else {
>> +		trtt_params.segment_base_addr =
>> +			ctx->trtt_info.segment_base_addr;
>> +		trtt_params.l3_table_address =
>> +			ctx->trtt_info.l3_table_address;
>> +		trtt_params.null_tile_val =
>> +			ctx->trtt_info.null_tile_val;
>> +		trtt_params.invd_tile_val =
>> +			ctx->trtt_info.invd_tile_val;
>> +
>> +		if (__copy_to_user(to_user_ptr(args->value),
>> +				   &trtt_params,
>> +				   sizeof(trtt_params)))
>> +			return -EFAULT;
>
> args->size = sizeof(trtt_params);
>
> In case the user passed in size > sizeof(trtt_params), we want to report
> how many bytes we wrote.

Fine, will add this.
>
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int
>> +intel_context_set_trtt(struct intel_context *ctx,
>> +		       struct drm_i915_gem_context_param *args)
>> +{
>> +	struct drm_i915_gem_context_trtt_param trtt_params;
>> +	struct drm_device *dev = ctx->i915->dev;
>> +
>> +	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
>
> Ditto (dev_priv)
>
>> +		return -ENODEV;
>> +	else if (ctx->flags & CONTEXT_USE_TRTT)
>> +		return -EEXIST;
>
> What locks are we holding here?
>
>> +	else if (args->size < sizeof(trtt_params))
>> +		return -EINVAL;
>> +	else if (copy_from_user(&trtt_params,
>> +				to_user_ptr(args->value),
>> +				sizeof(trtt_params)))
>
> Because whatever they are, we can't hold them here!
>
The struct_mutex lock was taken in the caller, ioctl function.
Ok, so need to release that before invoking copy_from_user.

> (Imagine/write a test that passes in the trtt_params inside a GTT mmapping.)

This could cause a recursive locking of struct_mutex from the gem_fault() ?

>
>> @@ -923,7 +1015,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>>   		return PTR_ERR(ctx);
>>   	}
>>
>> -	args->size = 0;
>
> Awooga. Does every path then set it?
>

It is being set only for the TRTT case. For the other existing cases, 
should it be explicitly set to 0, is that really needed ?

>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index 7b8de85..8de0319 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
>>   {
>>   	gtt_write_workarounds(dev);
>>
>> +	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
>> +		struct drm_i915_private *dev_priv = dev->dev_private;
>> +		/*
>> +		 * Globally enable TR-TT support in Hw.
>> +		 * Still TR-TT enabling on per context basis is required.
>> +		 * Non-trtt contexts are not affected by this setting.
>> +		 */
>> +		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
>> +			   GEN9_TRTT_BYPASS_DISABLE);
>> +	}
>> +
>>   	/* In the case of execlists, PPGTT is enabled by the context descriptor
>>   	 * and the PDPs are contained within the context itself.  We don't
>>   	 * need to do anything here. */
>> @@ -3368,6 +3379,57 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
>>
>>   }
>>
>> +void intel_trtt_context_destroy_vma(struct i915_vma *vma)
>> +{
>> +	struct i915_address_space *vm = vma->vm;
>> +
>> +	WARN_ON(!list_empty(&vma->obj_link));
>> +	WARN_ON(!list_empty(&vma->vm_link));
>> +	WARN_ON(!list_empty(&vma->exec_list));
>
> WARN_ON(!vma->pin_count);

Thanks, will add.

>
>> +
>> +	drm_mm_remove_node(&vma->node);
>> +	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
>> +	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
>> +}
>> +
>> +struct i915_vma *
>> +intel_trtt_context_allocate_vma(struct i915_address_space *vm,
>> +				uint64_t segment_base_addr)
>> +{
>> +	struct i915_vma *vma;
>> +	int ret;
>> +
>> +	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
>> +	if (!vma)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	INIT_LIST_HEAD(&vma->obj_link);
>> +	INIT_LIST_HEAD(&vma->vm_link);
>> +	INIT_LIST_HEAD(&vma->exec_list);
>> +	vma->vm = vm;
>> +	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
>> +
>> +	/* Mark the vma as permanently pinned */
>> +	vma->pin_count = 1;
>> +
>> +	/* Reserve from the 48 bit PPGTT space */
>> +	vma->node.start = segment_base_addr;
>> +	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
>> +	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
>> +	if (ret) {
>> +		ret = i915_gem_evict_for_vma(vma);
>
> Given that this has a known GPF, you need a test case that tries to
> evict an active/hanging object in order to make room for the trtt.
>
In the new test case, will soft pin objects in TR-TT segment first. Then 
later on enabling TR-TT, those objects should get evicted.

>> +static int gen9_init_context_trtt(struct drm_i915_gem_request *req)
>
> Since TRTT is render only, call this gen9_init_rcs_context_trtt()
>
Thanks, will change.

>>   static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
>>   {
>>   	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
>> @@ -1693,6 +1757,20 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
>>   		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
>>   	}
>>
>> +	/*
>> +	 * Emitting LRIs to update the TRTT registers is most reliable, instead
>> +	 * of directly updating the context image, as this will ensure that
>> +	 * update happens in a serialized manner for the context and also
>> +	 * lite-restore scenario will get handled.
>> +	 */
>> +	if ((req->ring->id == RCS) && req->ctx->trtt_info.update_trtt_params) {
>> +		ret = gen9_emit_trtt_regs(req);
>> +		if (ret)
>> +			return ret;
>> +
>> +		req->ctx->trtt_info.update_trtt_params = false;
>
> Bah. Since we can only update the params once (EEXIST otherwise),
> we emit the change when the user sets the new params.

Sorry couldn't get this point. We can't emit the params right away when 
User sets them (only once). We need to emit/apply the params (onetime) 
in a deferred manner on the next submission.

Best regards
Akash

> -Chris
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v4] drm/i915: Support to enable TRTT on GEN9
  2016-03-09 14:50                 ` Goel, Akash
@ 2016-03-09 15:02                   ` Chris Wilson
  2016-03-09 15:56                     ` Goel, Akash
  0 siblings, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2016-03-09 15:02 UTC (permalink / raw)
  To: Goel, Akash; +Cc: intel-gfx

On Wed, Mar 09, 2016 at 08:20:07PM +0530, Goel, Akash wrote:
> >What locks are we holding here?
> >
> >>+	else if (args->size < sizeof(trtt_params))
> >>+		return -EINVAL;
> >>+	else if (copy_from_user(&trtt_params,
> >>+				to_user_ptr(args->value),
> >>+				sizeof(trtt_params)))
> >
> >Because whatever they are, we can't hold them here!
> >
> The struct_mutex lock was taken in the caller, ioctl function.
> Ok, so need to release that before invoking copy_from_user.
> 
> >(Imagine/write a test that passes in the trtt_params inside a GTT mmapping.)
> 
> This could cause a recursive locking of struct_mutex from the gem_fault() ?

Exactly. At the least lockdep should warn if we hit a fault along this
path (due to the illegal nesting of mmap_sem inside struct_mutex).

> 
> >
> >>@@ -923,7 +1015,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
> >>  		return PTR_ERR(ctx);
> >>  	}
> >>
> >>-	args->size = 0;
> >
> >Awooga. Does every path then set it?
> >
> 
> It is being set only for the TRTT case. For the other existing
> cases, should it be explicitly set to 0, is that really needed ?

Yes. All other paths need to report .size = 0 (as they don't write
through a pointer).
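
The size-reporting convention being asked for can be sketched in plain
userspace C. This is an illustration only, not the i915 code: struct
param_args, struct blob_param, the PARAM_* IDs and get_param_sketch() are
hypothetical stand-ins, and memcpy() stands in for the copy to user space.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the ioctl argument block. */
struct param_args {
	uint32_t size;   /* in: buffer size; out: bytes written (or needed) */
	uint32_t param;  /* which parameter is being queried */
	uint64_t value;  /* scalar result for non-pointer parameters */
	void *buf;       /* destination buffer for pointer parameters */
};

/* Hypothetical payload returned through the pointer. */
struct blob_param {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
};

enum { PARAM_SCALAR = 1, PARAM_BLOB = 2 };

static int get_param_sketch(struct param_args *args,
			    const struct blob_param *state)
{
	assert(args && state);

	switch (args->param) {
	case PARAM_SCALAR:
		/* Nothing is written through a pointer, so report size 0. */
		args->size = 0;
		args->value = 42;
		return 0;
	case PARAM_BLOB:
		if (args->size < sizeof(*state)) {
			/* Buffer too small: report how much is needed. */
			args->size = sizeof(*state);
			return 0;
		}
		memcpy(args->buf, state, sizeof(*state));
		/* Report the bytes written, even if the buffer was larger. */
		args->size = sizeof(*state);
		return 0;
	default:
		return -EINVAL;
	}
}
```

With this shape a caller can always tell from .size whether anything was
written through the pointer, whichever parameter it asked for.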

> >>+struct i915_vma *
> >>+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
> >>+				uint64_t segment_base_addr)
> >>+{
> >>+	struct i915_vma *vma;
> >>+	int ret;
> >>+
> >>+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
> >>+	if (!vma)
> >>+		return ERR_PTR(-ENOMEM);
> >>+
> >>+	INIT_LIST_HEAD(&vma->obj_link);
> >>+	INIT_LIST_HEAD(&vma->vm_link);
> >>+	INIT_LIST_HEAD(&vma->exec_list);
> >>+	vma->vm = vm;
> >>+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
> >>+
> >>+	/* Mark the vma as permanently pinned */
> >>+	vma->pin_count = 1;
> >>+
> >>+	/* Reserve from the 48 bit PPGTT space */
> >>+	vma->node.start = segment_base_addr;
> >>+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
> >>+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> >>+	if (ret) {
> >>+		ret = i915_gem_evict_for_vma(vma);
> >
> >Given that this has a known GPF, you need a test case that tries to
> >evict an active/hanging object in order to make room for the trtt.
> >
> In the new test case, will soft pin objects in TR-TT segment first.
> Then later on enabling TR-TT, those objects should get evicted.

Yes. But make sure you have combinations of inactive, active, and
hanging objects inside the to-be-evicted segment. Those cover the most
frequent errors we have to handle (and easiest to reproduce).
 
> >>+static int gen9_init_context_trtt(struct drm_i915_gem_request *req)
> >
> >Since TRTT is render only, call this gen9_init_rcs_context_trtt()
> >
> Thanks, will change.
> 
> >>  static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
> >>  {
> >>  	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
> >>@@ -1693,6 +1757,20 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
> >>  		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
> >>  	}
> >>
> >>+	/*
> >>+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
> >>+	 * of directly updating the context image, as this will ensure that
> >>+	 * update happens in a serialized manner for the context and also
> >>+	 * lite-restore scenario will get handled.
> >>+	 */
> >>+	if ((req->ring->id == RCS) && req->ctx->trtt_info.update_trtt_params) {
> >>+		ret = gen9_emit_trtt_regs(req);
> >>+		if (ret)
> >>+			return ret;
> >>+
> >>+		req->ctx->trtt_info.update_trtt_params = false;
> >
> >Bah. Since we can only update the params once (EEXIST otherwise),
> >we emit the change when the user sets the new params.
> 
> Sorry couldn't get this point. We can't emit the params right away
> when User sets them (only once). We need to emit/apply the params
> (onetime) in a deferred manner on the next submission.

Why can't we? We can construct and submit a request setting the
registers inside the right context image at that point, and they never
change after that point.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH v4] drm/i915: Support to enable TRTT on GEN9
  2016-03-09 15:02                   ` Chris Wilson
@ 2016-03-09 15:56                     ` Goel, Akash
  2016-03-09 16:21                       ` Chris Wilson
  0 siblings, 1 reply; 59+ messages in thread
From: Goel, Akash @ 2016-03-09 15:56 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Michel Thierry, akash.goel



On 3/9/2016 8:32 PM, Chris Wilson wrote:
> On Wed, Mar 09, 2016 at 08:20:07PM +0530, Goel, Akash wrote:
>>> What locks are we holding here?
>>>
>>>> +	else if (args->size < sizeof(trtt_params))
>>>> +		return -EINVAL;
>>>> +	else if (copy_from_user(&trtt_params,
>>>> +				to_user_ptr(args->value),
>>>> +				sizeof(trtt_params)))
>>>
>>> Because whatever they are, we can't hold them here!
>>>
>> The struct_mutex lock was taken in the caller, ioctl function.
>> Ok, so need to release that before invoking copy_from_user.
>>
>>> (Imagine/write a test that passes in the trtt_params inside a GTT mmapping.)
>>
>> This could cause a recursive locking of struct_mutex from the gem_fault() ?
>
> Exactly. At the least lockdep should warn if we hit a fault along this
> path (due to the illegal nesting of mmap_sem inside struct_mutex).
>

I hope it won't look ungainly to unlock the struct_mutex before 
copy_from_user and lock it back right after that.

>>
>>>
>>>> @@ -923,7 +1015,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>>>>   		return PTR_ERR(ctx);
>>>>   	}
>>>>
>>>> -	args->size = 0;
>>>
>>> Awooga. Does every path then set it?
>>>
>>
>> It is being set only for the TRTT case. For the other existing
>> cases, should it be explicitly set to 0, is that really needed ?
>
> Yes. All other paths need to report .size = 0 (as they don't write
> through a pointer).
>

Fine will add the args->size = 0 for all the other cases.

>>>> +	/* Mark the vma as permanently pinned */
>>>> +	vma->pin_count = 1;
>>>> +
>>>> +	/* Reserve from the 48 bit PPGTT space */
>>>> +	vma->node.start = segment_base_addr;
>>>> +	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
>>>> +	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
>>>> +	if (ret) {
>>>> +		ret = i915_gem_evict_for_vma(vma);
>>>
>>> Given that this has a known GPF, you need a test case that tries to
>>> evict an active/hanging object in order to make room for the trtt.
>>>
>> In the new test case, will soft pin objects in TR-TT segment first.
>> Then later on enabling TR-TT, those objects should get evicted.
>
> Yes. But make sure you have combinations of inactive, active, and
> hanging objects inside the to-be-evicted segment. Those cover the most
> frequent errors we have to handle (and easiest to reproduce).
>
Fine, will refer to the logic of other tests to see how to ensure that a
previously soft pinned object is still marked as active when the eviction
happens on enabling TR-TT.

Sorry, what is the hanging object type?

>>>> +static int gen9_init_context_trtt(struct drm_i915_gem_request *req)
>>>
>>> Since TRTT is render only, call this gen9_init_rcs_context_trtt()
>>>
>> Thanks, will change.
>>
>>>>   static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
>>>>   {
>>>>   	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
>>>> @@ -1693,6 +1757,20 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
>>>>   		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
>>>>   	}
>>>>
>>>> +	/*
>>>> +	 * Emitting LRIs to update the TRTT registers is most reliable, instead
>>>> +	 * of directly updating the context image, as this will ensure that
>>>> +	 * update happens in a serialized manner for the context and also
>>>> +	 * lite-restore scenario will get handled.
>>>> +	 */
>>>> +	if ((req->ring->id == RCS) && req->ctx->trtt_info.update_trtt_params) {
>>>> +		ret = gen9_emit_trtt_regs(req);
>>>> +		if (ret)
>>>> +			return ret;
>>>> +
>>>> +		req->ctx->trtt_info.update_trtt_params = false;
>>>
>>> Bah. Since we can only update the params once (EEXIST otherwise),
>>> we emit the change when the user sets the new params.
>>
>> Sorry couldn't get this point. We can't emit the params right away
>> when User sets them (only once). We need to emit/apply the params
>> (onetime) in a deferred manner on the next submission.
>
> Why can't we? We can construct and submit a request setting the
> registers inside the right context image at that point, and they never
> change after that point.

Ok yes a new request can be constructed & submitted for the Context, 
emitting the LRIs to update the TRTT params in the Context image.
But won't that be relatively cumbersome considering that we are able to 
easily defer & conflate that with next batch submission, through an 
extra flag trtt_info.update_trtt_params.

Best regards
Akash


> -Chris
>

* Re: [PATCH v4] drm/i915: Support to enable TRTT on GEN9
  2016-03-09 15:56                     ` Goel, Akash
@ 2016-03-09 16:21                       ` Chris Wilson
  2016-03-09 16:38                         ` Goel, Akash
  0 siblings, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2016-03-09 16:21 UTC (permalink / raw)
  To: Goel, Akash; +Cc: intel-gfx

On Wed, Mar 09, 2016 at 09:26:08PM +0530, Goel, Akash wrote:
> 
> 
> On 3/9/2016 8:32 PM, Chris Wilson wrote:
> >On Wed, Mar 09, 2016 at 08:20:07PM +0530, Goel, Akash wrote:
> >>>What locks are we holding here?
> >>>
> >>>>+	else if (args->size < sizeof(trtt_params))
> >>>>+		return -EINVAL;
> >>>>+	else if (copy_from_user(&trtt_params,
> >>>>+				to_user_ptr(args->value),
> >>>>+				sizeof(trtt_params)))
> >>>
> >>>Because whatever they are, we can't hold them here!
> >>>
> >>The struct_mutex lock was taken in the caller, ioctl function.
> >>Ok, so need to release that before invoking copy_from_user.
> >>
> >>>(Imagine/write a test that passes in the trtt_params inside a GTT mmapping.)
> >>
> >>This could cause a recursive locking of struct_mutex from the gem_fault() ?
> >
> >Exactly. At the least lockdep should warn if we hit a fault along this
> >path (due to the illegal nesting of mmap_sem inside struct_mutex).
> >
> 
> I hope it won't look ungainly to unlock the struct_mutex before
> copy_from_user and lock it back right after that.

It's what we have to do. However, we have to make sure that we do not
lose state, and that the user doesn't interfere, across the unlock, i.e. make
sure we have a reference on the context, double check that the state is
still valid (so do the EEXIST check after the copy) etc.
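
A userspace analogue of that pattern, as a rough sketch only: struct
ctx_sketch, struct trtt_args_sketch and set_trtt_sketch() are hypothetical
names, a pthread mutex plays the role of struct_mutex, and memcpy() stands in
for copy_from_user(). It is not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parameter block supplied by the caller. */
struct trtt_args_sketch {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
};

/* Hypothetical context; 'lock' plays the role of struct_mutex. */
struct ctx_sketch {
	pthread_mutex_t lock;
	int refcount;
	bool trtt_enabled;
	struct trtt_args_sketch trtt;
};

/* The caller holds ctx->lock on entry; it is held again on return. */
static int set_trtt_sketch(struct ctx_sketch *ctx,
			   const struct trtt_args_sketch *user_src)
{
	struct trtt_args_sketch tmp;
	int ret = 0;

	assert(ctx->refcount > 0);

	/* Hold a reference so the context survives while unlocked. */
	ctx->refcount++;
	pthread_mutex_unlock(&ctx->lock);

	/* The copy may fault, so it must run without the lock held. */
	memcpy(&tmp, user_src, sizeof(tmp));

	pthread_mutex_lock(&ctx->lock);

	/* State may have changed across the unlock: check only now. */
	if (ctx->trtt_enabled) {
		ret = -EEXIST;
	} else {
		ctx->trtt = tmp;
		ctx->trtt_enabled = true;
	}

	ctx->refcount--;
	return ret;
}
```

The point of the ordering is that the "already enabled" check happens after
the lock is retaken, so a racing caller that won while the lock was dropped is
detected rather than overwritten.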

> >>In the new test case, will soft pin objects in TR-TT segment first.
> >>Then later on enabling TR-TT, those objects should get evicted.
> >
> >Yes. But make sure you have combinations of inactive, active, and
> >hanging objects inside the to-be-evicted segment. Those cover the most
> >frequent errors we have to handle (and easiest to reproduce).
> >
> Fine, will refer other tests logic to see how to ensure that
> previously soft pinned object is still marked as active, when the
> eviction happens on enabling TR-TT.
> 
> Sorry what is the hanging object type ?

Submit a recursive batch using the vma inside your trtt region.
See igt_hang_ctx() if you are free to select the trtt region using the
offset generated by igt_hang_ctx() (and for this test you are), then it
is very simple. See gem_softpin, test_evict_hang() and
test_evict_active().

> >>>>+static int gen9_init_context_trtt(struct drm_i915_gem_request *req)
> >>>
> >>>Since TRTT is render only, call this gen9_init_rcs_context_trtt()
> >>>
> >>Thanks, will change.
> >>
> >>>>  static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
> >>>>  {
> >>>>  	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
> >>>>@@ -1693,6 +1757,20 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
> >>>>  		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
> >>>>  	}
> >>>>
> >>>>+	/*
> >>>>+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
> >>>>+	 * of directly updating the context image, as this will ensure that
> >>>>+	 * update happens in a serialized manner for the context and also
> >>>>+	 * lite-restore scenario will get handled.
> >>>>+	 */
> >>>>+	if ((req->ring->id == RCS) && req->ctx->trtt_info.update_trtt_params) {
> >>>>+		ret = gen9_emit_trtt_regs(req);
> >>>>+		if (ret)
> >>>>+			return ret;
> >>>>+
> >>>>+		req->ctx->trtt_info.update_trtt_params = false;
> >>>
> >>>Bah. Since we can only update the params once (EEXIST otherwise),
> >>>we emit the change when the user sets the new params.
> >>
> >>Sorry couldn't get this point. We can't emit the params right away
> >>when User sets them (only once). We need to emit/apply the params
> >>(onetime) in a deferred manner on the next submission.
> >
> >Why can't we? We can construct and submit a request setting the
> >registers inside the right context image at that point, and they never
> >change after that point.
> 
> Ok yes a new request can be constructed & submitted for the Context,
> emitting the LRIs to update the TRTT params in the Context image.
> But won't that be relatively cumbersome considering that we are able
> to easily defer & conflate that with next batch submission, through
> an extra flag trtt_info.update_trtt_params.

A conditional on every batch vs a one-off ?

request = i915_gem_request_alloc(&dev_priv->ring[RCS], ctx);
if (IS_ERR(request))
	return PTR_ERR(request);

ret = gen9_emit_trtt_regs(request);
if (ret) {
	i915_gem_request_cancel(request);
	return ret;
}

i915_add_request(request);
return 0;

Complain to whoever sold you your kernel if it is not that simple. (And
that is quite byzantine compared to how it should be!)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH v4] drm/i915: Support to enable TRTT on GEN9
  2016-03-09 16:21                       ` Chris Wilson
@ 2016-03-09 16:38                         ` Goel, Akash
  2016-03-10  7:06                           ` [PATCH v5] " akash.goel
  0 siblings, 1 reply; 59+ messages in thread
From: Goel, Akash @ 2016-03-09 16:38 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Michel Thierry, akash.goel



On 3/9/2016 9:51 PM, Chris Wilson wrote:
> On Wed, Mar 09, 2016 at 09:26:08PM +0530, Goel, Akash wrote:
>>
>>
>> On 3/9/2016 8:32 PM, Chris Wilson wrote:
>>> On Wed, Mar 09, 2016 at 08:20:07PM +0530, Goel, Akash wrote:
>>>>> What locks are we holding here?
>>>>>
>>>>>> +	else if (args->size < sizeof(trtt_params))
>>>>>> +		return -EINVAL;
>>>>>> +	else if (copy_from_user(&trtt_params,
>>>>>> +				to_user_ptr(args->value),
>>>>>> +				sizeof(trtt_params)))
>>>>>
>>>>> Because whatever they are, we can't hold them here!
>>>>>
>>>> The struct_mutex lock was taken in the caller, ioctl function.
>>>> Ok, so need to release that before invoking copy_from_user.
>>>>
>>>>> (Imagine/write a test that passes in the trtt_params inside a GTT mmapping.)
>>>>
>>>> This could cause a recursive locking of struct_mutex from the gem_fault() ?
>>>
>>> Exactly. At the least lockdep should warn if we hit a fault along this
>>> path (due to the illegal nesting of mmap_sem inside struct_mutex).
>>>
>>
>> I hope it won't look ungainly to unlock the struct_mutex before
>> copy_from_user and lock it back right after that.
>
> It's what we have to do. However, we have to make sure that we do not
> lose state, and that the user doesn't interfere, across the unlock, i.e. make
> sure we have a reference on the context, double check that the state is
> still valid (so do the EEXIST check after the copy) etc.
>

Thanks for the inputs, will keep them in mind.

>>>> In the new test case, will soft pin objects in TR-TT segment first.
>>>> Then later on enabling TR-TT, those objects should get evicted.
>>>
>>> Yes. But make sure you have combinations of inactive, active, and
>>> hanging objects inside the to-be-evicted segment. Those cover the most
>>> frequent errors we have to handle (and easiest to reproduce).
>>>
>> Fine, will refer other tests logic to see how to ensure that
>> previously soft pinned object is still marked as active, when the
>> eviction happens on enabling TR-TT.
>>
>> Sorry what is the hanging object type ?
>
> Submit a recursive batch using the vma inside your trtt region.
> See igt_hang_ctx() if you are free to select the trtt region using the
> offset generated by igt_hang_ctx() (and for this test you are), then it
> is very simple. See gem_softpin, test_evict_hang() and
> test_evict_active().
>

Thanks for suggesting these tests, will refer to them.

>>>>>> +static int gen9_init_context_trtt(struct drm_i915_gem_request *req)
>>>>>
>>>>> Since TRTT is render only, call this gen9_init_rcs_context_trtt()
>>>>>
>>>> Thanks, will change.
>>>>
>>>>>>   static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
>>>>>>   {
>>>>>>   	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
>>>>>> @@ -1693,6 +1757,20 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
>>>>>>   		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
>>>>>>   	}
>>>>>>
>>>>>> +	/*
>>>>>> +	 * Emitting LRIs to update the TRTT registers is most reliable, instead
>>>>>> +	 * of directly updating the context image, as this will ensure that
>>>>>> +	 * update happens in a serialized manner for the context and also
>>>>>> +	 * lite-restore scenario will get handled.
>>>>>> +	 */
>>>>>> +	if ((req->ring->id == RCS) && req->ctx->trtt_info.update_trtt_params) {
>>>>>> +		ret = gen9_emit_trtt_regs(req);
>>>>>> +		if (ret)
>>>>>> +			return ret;
>>>>>> +
>>>>>> +		req->ctx->trtt_info.update_trtt_params = false;
>>>>>
>>>>> Bah. Since we can only update the params once (EEXIST otherwise),
>>>>> we emit the change when the user sets the new params.
>>>>
>>>> Sorry couldn't get this point. We can't emit the params right away
>>>> when User sets them (only once). We need to emit/apply the params
>>>> (onetime) in a deferred manner on the next submission.
>>>
>>> Why can't we? We can construct and submit a request setting the
>>> registers inside the right context image at that point, and they never
>>> change after that point.
>>
>> Ok yes a new request can be constructed & submitted for the Context,
>> emitting the LRIs to update the TRTT params in the Context image.
>> But won't that be relatively cumbersome considering that we are able
>> to easily defer & conflate that with next batch submission, through
>> an extra flag trtt_info.update_trtt_params.
>
> A conditional on every batch vs a one-off ?
>
> request = i915_gem_request_alloc(&dev_priv->ring[RCS], ctx);
> if (IS_ERR(request))
> 	return PTR_ERR(request);
>
> ret = gen9_emit_trtt_regs(request);
> if (ret) {
> 	i915_gem_request_cancel(request);
> 	return ret;
> }
>
> i915_add_request(request);
> return 0;
>
> Complain to whoever sold you your kernel if it is not that simple. (And
> that is quite byzantine compared to how it should be!)

Fine, thanks much for the required code snippet, will update the patch.

Sorry, I was actually a bit skeptical about introducing a new non-execbuffer
path from where the request allocation & submission happens.

Best regards
Akash

> -Chris
>

* [PATCH v5] drm/i915: Support to enable TRTT on GEN9
  2016-03-09 16:38                         ` Goel, Akash
@ 2016-03-10  7:06                           ` akash.goel
  2016-03-10 16:09                             ` kbuild test robot
  2016-03-11 11:50                             ` [PATCH v6] " akash.goel
  0 siblings, 2 replies; 59+ messages in thread
From: akash.goel @ 2016-03-10  7:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Gen9 has additional address translation hardware in the form of the
Tiled Resource Translation Table (TR-TT), which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.
Sparse resources are created as virtual-only allocations. Regions of the
resource that the application intends to use are bound to physical memory
on the fly and can be re-bound to different memory allocations over the
lifetime of the resource.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT will be required
for a new PPGTT instance, but TR-TT may not be enabled for every context.
1/16th of the 48-bit PPGTT space is earmarked for translation by TR-TT;
which such chunk to use is conveyed to HW through a register.
Any GFX address that lies in that reserved 44-bit range will be translated
through TR-TT first and then through PPGTT to get the actual physical address,
so the output of translation from TR-TT will be a PPGTT offset.

TR-TT is constructed as a 3-level tile table. Each tile is 64KB in size, which
leaves behind 44-16=28 address bits. The 28 bits are partitioned as 9+9+10, and
each level is contained within a 4KB page, hence L3 and L2 are each composed of
512 64-bit entries and L1 is composed of 1024 32-bit entries.
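
As a worked illustration of the 9+9+10 split (a userspace sketch, not part of
the patch): the macro names, struct trtt_index and trtt_split() are
hypothetical, and placing the L3 index in the topmost bits with L1 just above
the 64KB tile offset is an assumption based on L3 being the root table.

```c
#include <assert.h>
#include <stdint.h>

/* Bit widths taken from the description above. */
#define TRTT_SEGMENT_BITS 44
#define TRTT_TILE_BITS    16	/* 64KB tile */
#define TRTT_L1_BITS      10
#define TRTT_L2_BITS       9
#define TRTT_L3_BITS       9

_Static_assert(TRTT_TILE_BITS + TRTT_L1_BITS + TRTT_L2_BITS + TRTT_L3_BITS ==
	       TRTT_SEGMENT_BITS, "index bits must cover the segment");
/* Each table level fits exactly in one 4KB page. */
_Static_assert((1u << TRTT_L3_BITS) * sizeof(uint64_t) == 4096, "L3/L2 page");
_Static_assert((1u << TRTT_L1_BITS) * sizeof(uint32_t) == 4096, "L1 page");

struct trtt_index {
	uint32_t l3;          /* index into the L3 table (assumed top bits) */
	uint32_t l2;          /* index into the L2 table */
	uint32_t l1;          /* index into the L1 table */
	uint32_t tile_offset; /* byte offset inside the 64KB tile */
};

/* Split an offset within the 44-bit TR-TT segment into table indices. */
static struct trtt_index trtt_split(uint64_t segment_offset)
{
	struct trtt_index idx;
	unsigned int shift = 0;

	assert(segment_offset < (1ULL << TRTT_SEGMENT_BITS));

	idx.tile_offset = segment_offset & ((1u << TRTT_TILE_BITS) - 1);
	shift += TRTT_TILE_BITS;
	idx.l1 = (segment_offset >> shift) & ((1u << TRTT_L1_BITS) - 1);
	shift += TRTT_L1_BITS;
	idx.l2 = (segment_offset >> shift) & ((1u << TRTT_L2_BITS) - 1);
	shift += TRTT_L2_BITS;
	idx.l3 = (segment_offset >> shift) & ((1u << TRTT_L3_BITS) - 1);

	return idx;
}
```

Under these assumptions the L1 index starts at bit 16, the L2 index at bit 26
and the L3 index at bit 35 of the segment offset.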

There is a provision to keep the TR-TT tables in virtual space, where the pages
of the TR-TT tables are mapped into PPGTT.
Currently this is the supported mode; in this mode UMD has full control over
TR-TT management, with bare minimum support from KMD.
So the entries of the L3 table will contain the PPGTT offsets of L2 table pages,
and similarly the entries of an L2 table will contain the PPGTT offsets of L1
table pages. The entries of an L1 table will contain the PPGTT offsets of the
BOs actually backing the Sparse resources.
UMD will have to allocate the L3/L2/L1 table pages as regular BOs and assign
them a PPGTT address through the Soft Pin API (for example, use soft pin to
assign l3_table_address to the L3 table BO, when used).
UMD will also program the entries in the TR-TT page tables using regular batch
commands (MI_STORE_DATA_IMM), or via mmapping of the page table BOs.
UMD may do the complete PPGTT address space management, on the grounds that it
could help minimize conflicts.

Any space in the TR-TT segment not bound to any Sparse texture will be handled
through the Invalid tile; the user is expected to initialize the entries of a
new L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding
to the holes in the Sparse texture resource will be set to the Null tile pattern.
Improper programming of TR-TT should only lead to a recoverable GPU hang,
eventually leading to banning of the culprit context without victimizing others.

The association of any Sparse resource with the BOs will be known only to UMD,
and only the Sparse resources shall be assigned an offset from the TR-TT segment
by UMD. The use of the TR-TT segment and the mapping of Sparse resources will be
transparent to KMD; UMD will do the address assignment from the TR-TT segment
autonomously and KMD will be oblivious to it.
No other object may be assigned an address from the TR-TT segment; such objects
will be mapped to PPGTT in the regular way by KMD.

This patch provides an interface through which UMD can ask KMD to enable
TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param has been
added to the I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
UMD will have to pass the GFX address of the L3 table page and the start
location of the TR-TT segment, along with the pattern values for the Null &
Invalid tile registers.

v2:
 - Support context_getparam for TRTT also and dispense with a separate
   GETPARAM case for TRTT (Chris).
 - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
   from user space (Chris).
 - Move all the argument checking for TRTT in context_setparam to the
   set_trtt function (Chris).
 - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
 - Rename certain functions to rightly reflect their purpose, rename
   the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
   rephrase few lines in the commit message body, add more comments (Chris).
 - Extend ABI to allow User to specify TRTT segment location also.
 - Fix for selective enabling of TRTT on per context basis, explicitly
   disable TR-TT at the start of a new context.

v3:
 - Check the return value of gen9_emit_trtt_regs (Chris)
 - Update the kernel doc for intel_context structure.
 - Rebased.

v4:
 - Fix the warnings reported by 'checkpatch.pl --strict' (Michel)
 - Fix the context_getparam implementation avoiding the reset of size field,
   affecting the TRTT case.

v5:
 - Update the TR-TT params right away in context_setparam, by constructing
   & submitting a request emitting LRIs, instead of deferring it and
   conflating with the next batch submission (Chris)
 - Follow the struct_mutex handling related prescribed rules, while accessing
   User space buffer, both in context_setparam & getparam functions (Chris).

Testcase: igt/gem_trtt

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  16 +++-
 drivers/gpu/drm/i915/i915_gem_context.c | 139 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c     |  64 +++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 ++
 drivers/gpu/drm/i915/i915_reg.h         |  19 +++++
 drivers/gpu/drm/i915/intel_lrc.c        | 124 +++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h        |   1 +
 include/uapi/drm/i915_drm.h             |   8 ++
 8 files changed, 374 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f7b6caf..1303d47 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -856,6 +856,7 @@ struct i915_ctx_hang_stats {
 #define DEFAULT_CONTEXT_HANDLE 0
 
 #define CONTEXT_NO_ZEROMAP (1<<0)
+#define CONTEXT_USE_TRTT   (1 << 1)
 /**
  * struct intel_context - as the name implies, represents a context.
  * @ref: reference count.
@@ -870,6 +871,8 @@ struct i915_ctx_hang_stats {
  * @ppgtt: virtual memory space used by this context.
  * @legacy_hw_ctx: render context backing object and whether it is correctly
  *                initialized (legacy ring submission mechanism only).
+ * @trtt_info: Programming parameters for tr-tt (redirection tables for
+ *             userspace, for sparse resource management)
  * @link: link in the global list of contexts.
  *
  * Contexts are memory images used by the hardware to store copies of their
@@ -880,7 +883,7 @@ struct intel_context {
 	int user_handle;
 	uint8_t remap_slice;
 	struct drm_i915_private *i915;
-	int flags;
+	unsigned int flags;
 	struct drm_i915_file_private *file_priv;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_hw_ppgtt *ppgtt;
@@ -901,6 +904,15 @@ struct intel_context {
 		uint32_t *lrc_reg_state;
 	} engine[I915_NUM_RINGS];
 
+	/* TRTT info */
+	struct intel_context_trtt {
+		u32 invd_tile_val;
+		u32 null_tile_val;
+		u64 l3_table_address;
+		u64 segment_base_addr;
+		struct i915_vma *vma;
+	} trtt_info;
+
 	struct list_head link;
 };
 
@@ -2703,6 +2715,8 @@ struct drm_i915_cmd_table {
 				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
 				 !IS_BROXTON(dev))
 
+#define HAS_TRTT(dev)		(IS_GEN9(dev))
+
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5dd84e1..ac8fd99 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev)
 	return ret;
 }
 
+static void intel_context_free_trtt(struct intel_context *ctx)
+{
+	if (!ctx->trtt_info.vma)
+		return;
+
+	intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+}
+
 static void i915_gem_context_clean(struct intel_context *ctx)
 {
 	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
@@ -164,6 +172,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	 */
 	i915_gem_context_clean(ctx);
 
+	intel_context_free_trtt(ctx);
+
 	i915_ppgtt_put(ctx->ppgtt);
 
 	if (ctx->legacy_hw_ctx.rcs_state)
@@ -507,6 +517,127 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static int
+intel_context_get_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) {
+		return -ENODEV;
+	} else if (args->size < sizeof(trtt_params)) {
+		args->size = sizeof(trtt_params);
+	} else {
+		trtt_params.segment_base_addr =
+			ctx->trtt_info.segment_base_addr;
+		trtt_params.l3_table_address =
+			ctx->trtt_info.l3_table_address;
+		trtt_params.null_tile_val =
+			ctx->trtt_info.null_tile_val;
+		trtt_params.invd_tile_val =
+			ctx->trtt_info.invd_tile_val;
+
+		i915_gem_context_reference(ctx);
+		mutex_unlock(&dev->struct_mutex);
+
+		if (__copy_to_user(to_user_ptr(args->value),
+				   &trtt_params,
+				   sizeof(trtt_params))) {
+			mutex_lock(&dev->struct_mutex);
+			i915_gem_context_unreference(ctx);
+			return -EFAULT;
+		}
+
+		args->size = sizeof(trtt_params);
+		mutex_lock(&dev->struct_mutex);
+		i915_gem_context_unreference(ctx);
+	}
+
+	return 0;
+}
+
+static int
+intel_context_set_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+	int ret;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (ctx->flags & CONTEXT_USE_TRTT)
+		return -EEXIST;
+	else if (args->size < sizeof(trtt_params))
+		return -EINVAL;
+
+	i915_gem_context_reference(ctx);
+	mutex_unlock(&dev->struct_mutex);
+
+	if (copy_from_user(&trtt_params,
+			   to_user_ptr(args->value),
+			   sizeof(trtt_params))) {
+		mutex_lock(&dev->struct_mutex);
+		ret = -EFAULT;
+		goto exit;
+	}
+
+	mutex_lock(&dev->struct_mutex);
+
+	/* Check if the setup happened from another path */
+	if (ctx->flags & CONTEXT_USE_TRTT) {
+		ret = -EEXIST;
+		goto exit;
+	}
+
+	/* basic sanity checks for the segment location & l3 table pointer */
+	if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
+		i915_dbg(dev, "segment base address not correctly aligned\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (((trtt_params.l3_table_address + PAGE_SIZE) >=
+	     trtt_params.segment_base_addr) &&
+	    (trtt_params.l3_table_address <
+		    (trtt_params.segment_base_addr + GEN9_TRTT_SEGMENT_SIZE))) {
+		i915_dbg(dev, "l3 table address conflicts with trtt segment\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (trtt_params.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) {
+		i915_dbg(dev, "invalid l3 table address\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	ctx->trtt_info.vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
+						trtt_params.segment_base_addr);
+	if (IS_ERR(ctx->trtt_info.vma)) {
+		ret = PTR_ERR(ctx->trtt_info.vma);
+		goto exit;
+	}
+
+	ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
+	ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
+	ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
+	ctx->trtt_info.segment_base_addr = trtt_params.segment_base_addr;
+
+	ret = intel_lr_rcs_context_setup_trtt(ctx);
+	if (ret) {
+		intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+		goto exit;
+	}
+
+	ctx->flags |= CONTEXT_USE_TRTT;
+
+exit:
+	i915_gem_context_unreference(ctx);
+	return ret;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
@@ -923,7 +1054,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		return PTR_ERR(ctx);
 	}
 
-	args->size = 0;
+	args->size = (args->param != I915_CONTEXT_PARAM_TRTT) ? 0 : args->size;
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		args->value = ctx->hang_stats.ban_period_seconds;
@@ -939,6 +1070,9 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		else
 			args->value = to_i915(dev)->gtt.base.total;
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_get_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -984,6 +1118,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
 		}
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_set_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7b8de85..d880472 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 {
 	gtt_write_workarounds(dev);
 
+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		/*
+		 * Globally enable TR-TT support in Hw.
+		 * Still TR-TT enabling on per context basis is required.
+		 * Non-trtt contexts are not affected by this setting.
+		 */
+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+			   GEN9_TRTT_BYPASS_DISABLE);
+	}
+
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -3368,6 +3379,59 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
 
 }
 
+void intel_trtt_context_destroy_vma(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+
+	WARN_ON(!list_empty(&vma->obj_link));
+	WARN_ON(!list_empty(&vma->vm_link));
+	WARN_ON(!list_empty(&vma->exec_list));
+
+	WARN_ON(!vma->pin_count);
+
+	drm_mm_remove_node(&vma->node);
+	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
+	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
+}
+
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr)
+{
+	struct i915_vma *vma;
+	int ret;
+
+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
+	if (!vma)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->obj_link);
+	INIT_LIST_HEAD(&vma->vm_link);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+	/* Mark the vma as permanently pinned */
+	vma->pin_count = 1;
+
+	/* Reserve from the 48 bit PPGTT space */
+	vma->node.start = segment_base_addr;
+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	if (ret) {
+		ret = i915_gem_evict_for_vma(vma);
+		if (ret == 0)
+			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	}
+	if (ret) {
+		DRM_ERROR("Reservation for TRTT segment failed: %i\n", ret);
+		intel_trtt_context_destroy_vma(vma);
+		return ERR_PTR(ret);
+	}
+
+	return vma;
+}
+
 static struct scatterlist *
 rotate_pages(const dma_addr_t *in, unsigned int offset,
 	     unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index dc208c0..2374cb1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+/* Fixed size segment */
+#define GEN9_TRTT_SEG_SIZE_SHIFT	44
+#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
+
 enum i915_ggtt_view_type {
 	I915_GGTT_VIEW_NORMAL = 0,
 	I915_GGTT_VIEW_ROTATED,
@@ -562,4 +566,8 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr);
+void intel_trtt_context_destroy_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 71abf57..0f32021 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE	(1 << 0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE		0xF0
+#define   GEN9_TRVA_DATA_MASK		0xF
+
+#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1 << 1)
+#define   GEN9_TRTT_ENABLE		(1 << 0)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 27c9ee3..f60d5eb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1640,6 +1640,76 @@ static int gen9_init_render_ring(struct intel_engine_cs *ring)
 	return init_workarounds_ring(ring);
 }
 
+static int gen9_init_rcs_context_trtt(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf, 0);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
+static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
+{
+	struct intel_context *ctx = req->ctx;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	u64 masked_l3_gfx_address =
+		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+	u32 trva_data_value =
+		(ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
+		GEN9_TRVA_DATA_MASK;
+	const int num_lri_cmds = 6;
+	int ret;
+
+	/*
+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
+	 * of directly updating the context image, as this will ensure that
+	 * update happens in a serialized manner for the context and also
+	 * lite-restore scenario will get handled.
+	 */
+	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
+	intel_logical_ring_emit(ringbuf, upper_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRVA_MASK_VALUE | trva_data_value);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
 {
 	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
@@ -1994,6 +2064,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 	return intel_lr_context_render_state_init(req);
 }
 
+static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
+{
+	int ret;
+
+	/*
+	 * Explicitly disable TR-TT at the start of a new context.
+	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
+	 * context the TR-TT settings of the outgoing context could get
+	 * spilled on to the new incoming context as only the Ring Context
+	 * part is loaded on the first submission of a new context, due to
+	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
+	 */
+	ret = gen9_init_rcs_context_trtt(req);
+	if (ret)
+		return ret;
+
+	return gen8_init_rcs_context(req);
+}
+
 /**
  * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
  *
@@ -2125,11 +2214,14 @@ static int logical_render_ring_init(struct drm_device *dev)
 	logical_ring_default_vfuncs(dev, ring);
 
 	/* Override some for render ring. */
-	if (INTEL_INFO(dev)->gen >= 9)
+	if (INTEL_INFO(dev)->gen >= 9) {
 		ring->init_hw = gen9_init_render_ring;
-	else
+		ring->init_context = gen9_init_rcs_context;
+	} else {
 		ring->init_hw = gen8_init_render_ring;
-	ring->init_context = gen8_init_rcs_context;
+		ring->init_context = gen8_init_rcs_context;
+	}
+
 	ring->cleanup = intel_fini_pipe_control;
 	ring->emit_flush = gen8_emit_flush_render;
 	ring->emit_request = gen8_emit_request_render;
@@ -2669,3 +2761,29 @@ void intel_lr_context_reset(struct drm_device *dev,
 		ringbuf->tail = 0;
 	}
 }
+
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx)
+{
+	struct intel_engine_cs *ring = &(ctx->i915->ring[RCS]);
+	struct drm_i915_gem_request *req;
+	int ret;
+
+	if (!ctx->engine[RCS].state) {
+		ret = intel_lr_context_deferred_alloc(ctx, ring);
+		if (ret)
+			return ret;
+	}
+
+	req = i915_gem_request_alloc(ring, ctx);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = gen9_emit_trtt_regs(req);
+	if (ret) {
+		i915_gem_request_cancel(req);
+		return ret;
+	}
+
+	i915_add_request(req);
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index e6cda3e..914d2a1b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -107,6 +107,7 @@ void intel_lr_context_reset(struct drm_device *dev,
 			struct intel_context *ctx);
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *ring);
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx);
 
 u32 intel_execlists_ctx_id(struct intel_context *ctx,
 			   struct intel_engine_cs *ring);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a5524cc..604da23 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
+#define I915_CONTEXT_PARAM_TRTT		0x4
 	__u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+	__u64 segment_base_addr;
+	__u64 l3_table_address;
+	__u32 invd_tile_val;
+	__u32 null_tile_val;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev5)
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
                   ` (6 preceding siblings ...)
  2016-03-09 11:10 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev4) Patchwork
@ 2016-03-10  7:10 ` Patchwork
  2016-03-11 11:41 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev6) Patchwork
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-03-10  7:10 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Support to enable TRTT on GEN9 (rev5)
URL   : https://patchwork.freedesktop.org/series/2321/
State : failure

== Summary ==

  CC      drivers/usb/storage/usual-tables.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mbx.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_i210.o
  LD      drivers/usb/host/xhci-hcd.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_ptp.o
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_hw.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_hwmon.o
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_ethtool.o
  LD      drivers/usb/host/built-in.o
  CC [M]  drivers/net/ethernet/intel/e1000e/80003es2lan.o
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_param.o
  CC [M]  drivers/net/ethernet/intel/e1000e/mac.o
  CC [M]  drivers/net/ethernet/intel/e1000e/manage.o
  LD      drivers/usb/storage/usb-storage.o
  LD      drivers/usb/storage/built-in.o
  LD      drivers/usb/built-in.o
  CC [M]  drivers/net/ethernet/intel/e1000e/nvm.o
  CC [M]  drivers/net/ethernet/intel/e1000e/phy.o
  CC [M]  drivers/net/ethernet/intel/e1000e/param.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ethtool.o
  CC [M]  drivers/net/ethernet/intel/e1000e/netdev.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ptp.o
  LD [M]  drivers/net/ethernet/intel/igbvf/igbvf.o
  LD [M]  drivers/net/ethernet/intel/e1000/e1000.o
  LD [M]  drivers/net/ethernet/intel/igb/igb.o
  LD [M]  drivers/net/ethernet/intel/e1000e/e1000e.o
  LD      drivers/net/ethernet/built-in.o
  LD      drivers/net/built-in.o
Makefile:950: recipe for target 'drivers' failed
make: *** [drivers] Error 2

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v5] drm/i915: Support to enable TRTT on GEN9
  2016-03-10  7:06                           ` [PATCH v5] " akash.goel
@ 2016-03-10 16:09                             ` kbuild test robot
  2016-03-11 11:50                             ` [PATCH v6] " akash.goel
  1 sibling, 0 replies; 59+ messages in thread
From: kbuild test robot @ 2016-03-10 16:09 UTC (permalink / raw)
  Cc: Akash Goel, intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1508 bytes --]

Hi Akash,

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20160310]
[cannot apply to v4.5-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/akash-goel-intel-com/drm-i915-Support-to-enable-TRTT-on-GEN9/20160310-145901
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-rhel (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_gem_context.c: In function 'intel_context_set_trtt':
>> drivers/gpu/drm/i915/i915_gem_context.c:596:3: error: implicit declaration of function 'i915_dbg' [-Werror=implicit-function-declaration]
      i915_dbg(dev, "segment base address not correctly aligned\n");
      ^
   cc1: some warnings being treated as errors

vim +/i915_dbg +596 drivers/gpu/drm/i915/i915_gem_context.c

   590			ret = -EEXIST;
   591			goto exit;
   592		}
   593	
   594		/* basic sanity checks for the segment location & l3 table pointer */
   595		if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
 > 596			i915_dbg(dev, "segment base address not correctly aligned\n");
   597			ret = -EINVAL;
   598			goto exit;
   599		}

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 36094 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev6)
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
                   ` (7 preceding siblings ...)
  2016-03-10  7:10 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev5) Patchwork
@ 2016-03-11 11:41 ` Patchwork
  2016-03-18 12:45 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev8) Patchwork
  2016-03-22 10:45 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev9) Patchwork
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-03-11 11:41 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Support to enable TRTT on GEN9 (rev6)
URL   : https://patchwork.freedesktop.org/series/2321/
State : failure

== Summary ==

  CC [M]  drivers/net/ethernet/intel/igb/e1000_mac.o
  CC [M]  drivers/net/ethernet/intel/e1000e/mac.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_nvm.o
  CC [M]  drivers/net/ethernet/intel/e1000e/manage.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_phy.o
  CC [M]  drivers/net/ethernet/intel/e1000e/nvm.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mbx.o
  LD      drivers/tty/vt/built-in.o
  LD      drivers/tty/built-in.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_i210.o
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_ethtool.o
  LD      drivers/scsi/sd_mod.o
  LD      drivers/scsi/built-in.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_ptp.o
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_param.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_hwmon.o
  CC [M]  drivers/net/ethernet/intel/e1000e/phy.o
  CC [M]  drivers/net/ethernet/intel/e1000e/param.o
  CC [M]  drivers/net/ethernet/realtek/r8169.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ethtool.o
  CC [M]  drivers/net/ethernet/intel/e1000e/netdev.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ptp.o
  LD [M]  drivers/net/ethernet/intel/igbvf/igbvf.o
  LD [M]  drivers/net/ethernet/intel/e1000/e1000.o
  LD [M]  drivers/net/ethernet/intel/igb/igb.o
  LD [M]  drivers/net/ethernet/intel/e1000e/e1000e.o
  LD      drivers/net/ethernet/built-in.o
  LD      drivers/net/built-in.o
Makefile:950: recipe for target 'drivers' failed
make: *** [drivers] Error 2

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v6] drm/i915: Support to enable TRTT on GEN9
  2016-03-10  7:06                           ` [PATCH v5] " akash.goel
  2016-03-10 16:09                             ` kbuild test robot
@ 2016-03-11 11:50                             ` akash.goel
  2016-03-11 15:57                               ` kbuild test robot
  2016-03-18  8:35                               ` [PATCH v7] " akash.goel
  1 sibling, 2 replies; 59+ messages in thread
From: akash.goel @ 2016-03-11 11:50 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Gen9 has additional address translation hardware in the form of the
Tiled Resource Translation Table (TR-TT), which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.
Sparse resources are created as virtual-only allocations. Regions of the
resource that the application intends to use are bound to physical memory
on the fly and can be re-bound to different memory allocations over the
lifetime of the resource.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT is required
for each new PPGTT instance, but TR-TT need not be enabled for every context.
1/16th of the 48-bit PPGTT space is earmarked for translation by TR-TT;
which such chunk to use is conveyed to HW through a register.
Any GFX address that lies in that reserved 44-bit range is translated
through TR-TT first and then through PPGTT to get the actual physical address,
so the output of the TR-TT translation is a PPGTT offset.
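
As an editorial illustration (not part of the patch), the segment selection
described above can be sketched in plain C. The 44-bit shift matches
GEN9_TRTT_SEG_SIZE_SHIFT in the diff; the helper names and the 48-bit range
check are assumptions made for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch: one segment covers 2^44 bytes. */
#define TRTT_SEG_SIZE_SHIFT 44
#define TRTT_SEGMENT_SIZE   (1ULL << TRTT_SEG_SIZE_SHIFT)
#define PPGTT_ADDR_BITS     48

/*
 * Hypothetical helper: return the 4-bit segment number (0..15) selected by
 * a segment base address, or -1 if the base is misaligned or lies beyond
 * the 48-bit PPGTT range.
 */
static int trtt_segment_number(uint64_t segment_base)
{
	if (segment_base & (TRTT_SEGMENT_SIZE - 1))
		return -1;
	if (segment_base >> PPGTT_ADDR_BITS)
		return -1;
	return (int)(segment_base >> TRTT_SEG_SIZE_SHIFT);
}

/* Hypothetical helper: does a GFX address fall inside the chosen segment? */
static bool trtt_addr_in_segment(uint64_t gfx_addr, uint64_t segment_base)
{
	assert(trtt_segment_number(segment_base) >= 0);
	return gfx_addr >= segment_base &&
	       gfx_addr - segment_base < TRTT_SEGMENT_SIZE;
}
```

The segment number computed here corresponds to the value the patch derives
as (segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) & GEN9_TRVA_DATA_MASK
before writing GEN9_TRTT_VA_MASKDATA.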

TR-TT is constructed as a 3-level tile table. Each tile is 64KB in size, which
leaves 44-16=28 address bits. These 28 bits are partitioned as 9+9+10, and
each level is contained within a 4KB page; hence L3 and L2 are each composed of
512 64-bit entries and L1 is composed of 1024 32-bit entries.
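
As an editorial illustration (not part of the patch), the 9+9+10 split can be
sketched in C. The sketch assumes that L3 is indexed by the most significant
9 bits and L1 by the 10 bits just above the 64KB tile offset; the commit
message gives only the partition widths, so this bit ordering and all names
below are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define TRTT_TILE_SHIFT  16  /* 64KB tile */
#define TRTT_L1_BITS     10  /* 1024 32-bit entries per 4KB page */
#define TRTT_L2_BITS      9  /* 512 64-bit entries per 4KB page */
#define TRTT_L3_BITS      9  /* 512 64-bit entries per 4KB page */
#define TRTT_SEG_BITS    44  /* 16 + 10 + 9 + 9 */

/* Hypothetical result type for the decomposition. */
struct trtt_indices {
	unsigned int l3;
	unsigned int l2;
	unsigned int l1;
	uint32_t tile_offset;
};

/* Split an offset within the 44-bit TR-TT segment into table indices. */
static struct trtt_indices trtt_split_offset(uint64_t seg_offset)
{
	struct trtt_indices idx;

	assert(seg_offset < (1ULL << TRTT_SEG_BITS));

	idx.tile_offset = (uint32_t)(seg_offset & ((1ULL << TRTT_TILE_SHIFT) - 1));
	seg_offset >>= TRTT_TILE_SHIFT;
	idx.l1 = (unsigned int)(seg_offset & ((1U << TRTT_L1_BITS) - 1));
	seg_offset >>= TRTT_L1_BITS;
	idx.l2 = (unsigned int)(seg_offset & ((1U << TRTT_L2_BITS) - 1));
	seg_offset >>= TRTT_L2_BITS;
	idx.l3 = (unsigned int)(seg_offset & ((1U << TRTT_L3_BITS) - 1));

	return idx;
}
```

Both table shapes fill exactly one 4KB page: 512 entries of 8 bytes for L3 and
L2, and 1024 entries of 4 bytes for L1.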

There is a provision to keep the TR-TT tables in virtual space, where the pages
of the TR-TT tables are mapped to PPGTT.
Currently this is the supported mode; in this mode UMD has full control
over TR-TT management, with bare minimum support from KMD.
So the entries of the L3 table contain the PPGTT offsets of L2 table pages;
similarly, entries of the L2 table contain the PPGTT offsets of L1 table pages.
The entries of the L1 table contain the PPGTT offsets of the BOs actually
backing the Sparse resources.
UMD has to allocate the L3/L2/L1 table pages as regular BOs &
assign them a PPGTT address through the Soft Pin API (for example, use soft pin
to assign l3_table_address to the L3 table BO, when used).
UMD also programs the entries in the TR-TT page tables using regular batch
commands (MI_STORE_DATA_IMM), or via mmapping of the page table BOs.
UMD may do the complete PPGTT address space management, on the grounds that it
could help minimize conflicts.

Any space in the TR-TT segment not bound to any Sparse texture is handled
through the Invalid tile. The user is expected to initialize the entries of a
new L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding
to the holes in the Sparse texture resource are set with the Null tile pattern.
Improper programming of TR-TT should only lead to a recoverable GPU hang,
eventually leading to banning of the culprit context without victimizing others.

The association of any Sparse resource with the BOs is known only to UMD,
and only the Sparse resources shall be assigned an offset from the TR-TT segment
by UMD. The use of the TR-TT segment or the mapping of Sparse resources is
transparent to KMD; UMD does the address assignment from the TR-TT segment
autonomously and KMD is oblivious to it.
No other object may be assigned an address from the TR-TT segment; such objects
are mapped to PPGTT in the regular way by KMD.

This patch provides an interface through which UMD can ask KMD to enable
TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param has been
added to the I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
UMD has to pass the GFX address of the L3 table page and the start location of
the TR-TT segment, along with the pattern values for the Null & Invalid tile
registers.
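
As an editorial illustration (not part of the patch), the sanity checks that
intel_context_set_trtt() applies to these parameters can be mirrored in
userspace before the ioctl is issued. The struct below is a local copy of the
drm_i915_gem_context_trtt_param layout proposed in this patch; the struct and
function names are hypothetical, and the three conditions are transcribed from
the diff, not from any released kernel header.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TRTT_SEGMENT_SIZE     (1ULL << 44)        /* GEN9_TRTT_SEGMENT_SIZE */
#define TRTT_L3_GFXADDR_MASK  0xFFFFFFFF0000ULL   /* GEN9_TRTT_L3_GFXADDR_MASK */
#define TRTT_PAGE_SIZE        4096ULL             /* assumes 4KB PAGE_SIZE */

/* Local copy of the parameter layout proposed in this patch. */
struct trtt_param {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

/* Returns 0 if the parameters pass the patch's checks, -EINVAL otherwise. */
static int trtt_param_check(const struct trtt_param *p)
{
	assert(p);

	/* The segment base must be aligned to the fixed segment size. */
	if (p->segment_base_addr & (TRTT_SEGMENT_SIZE - 1))
		return -EINVAL;

	/* The L3 table page must not collide with the TR-TT segment. */
	if (p->l3_table_address + TRTT_PAGE_SIZE >= p->segment_base_addr &&
	    p->l3_table_address < p->segment_base_addr + TRTT_SEGMENT_SIZE)
		return -EINVAL;

	/* Only bits 47:16 of the L3 table address may be set. */
	if (p->l3_table_address & ~TRTT_L3_GFXADDR_MASK)
		return -EINVAL;

	return 0;
}
```

A caller would fill the struct, run the check, and then pass a pointer to it
in the value field of drm_i915_gem_context_param with param set to
I915_CONTEXT_PARAM_TRTT and size set to the struct size, as the diff expects.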

v2:
 - Support context_getparam for TRTT also and dispense with a separate
   GETPARAM case for TRTT (Chris).
 - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
   from user space (Chris).
 - Move all the argument checking for TRTT in context_setparam to the
   set_trtt function (Chris).
 - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
 - Rename certain functions to rightly reflect their purpose, rename
   the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
   rephrase a few lines in the commit message body, add more comments (Chris).
 - Extend ABI to allow the user to specify the TRTT segment location also.
 - Fix for selective enabling of TRTT on per context basis, explicitly
   disable TR-TT at the start of a new context.

v3:
 - Check the return value of gen9_emit_trtt_regs (Chris)
 - Update the kernel doc for intel_context structure.
 - Rebased.

v4:
 - Fix the warnings reported by 'checkpatch.pl --strict' (Michel)
 - Fix the context_getparam implementation avoiding the reset of size field,
   affecting the TRTT case.

v5:
 - Update the TR-TT params right away in context_setparam, by constructing
   & submitting a request emitting LRIs, instead of deferring it and
   conflating with the next batch submission (Chris)
 - Follow the struct_mutex handling related prescribed rules, while accessing
   User space buffer, both in context_setparam & getparam functions (Chris).

v6:
 - Fix the warning caused due to removal of un-allocated trtt vma node.

Testcase: igt/gem_trtt

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  16 +++-
 drivers/gpu/drm/i915/i915_gem_context.c | 139 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c     |  66 +++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 ++
 drivers/gpu/drm/i915/i915_reg.h         |  19 +++++
 drivers/gpu/drm/i915/intel_lrc.c        | 124 +++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h        |   1 +
 include/uapi/drm/i915_drm.h             |   8 ++
 8 files changed, 376 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f7b6caf..1303d47 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -856,6 +856,7 @@ struct i915_ctx_hang_stats {
 #define DEFAULT_CONTEXT_HANDLE 0
 
 #define CONTEXT_NO_ZEROMAP (1<<0)
+#define CONTEXT_USE_TRTT   (1 << 1)
 /**
  * struct intel_context - as the name implies, represents a context.
  * @ref: reference count.
@@ -870,6 +871,8 @@ struct i915_ctx_hang_stats {
  * @ppgtt: virtual memory space used by this context.
  * @legacy_hw_ctx: render context backing object and whether it is correctly
  *                initialized (legacy ring submission mechanism only).
+ * @trtt_info: Programming parameters for tr-tt (redirection tables for
+ *             userspace, for sparse resource management)
  * @link: link in the global list of contexts.
  *
  * Contexts are memory images used by the hardware to store copies of their
@@ -880,7 +883,7 @@ struct intel_context {
 	int user_handle;
 	uint8_t remap_slice;
 	struct drm_i915_private *i915;
-	int flags;
+	unsigned int flags;
 	struct drm_i915_file_private *file_priv;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_hw_ppgtt *ppgtt;
@@ -901,6 +904,15 @@ struct intel_context {
 		uint32_t *lrc_reg_state;
 	} engine[I915_NUM_RINGS];
 
+	/* TRTT info */
+	struct intel_context_trtt {
+		u32 invd_tile_val;
+		u32 null_tile_val;
+		u64 l3_table_address;
+		u64 segment_base_addr;
+		struct i915_vma *vma;
+	} trtt_info;
+
 	struct list_head link;
 };
 
@@ -2703,6 +2715,8 @@ struct drm_i915_cmd_table {
 				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
 				 !IS_BROXTON(dev))
 
+#define HAS_TRTT(dev)		(IS_GEN9(dev))
+
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5dd84e1..ac8fd99 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev)
 	return ret;
 }
 
+static void intel_context_free_trtt(struct intel_context *ctx)
+{
+	if (!ctx->trtt_info.vma)
+		return;
+
+	intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+}
+
 static void i915_gem_context_clean(struct intel_context *ctx)
 {
 	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
@@ -164,6 +172,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	 */
 	i915_gem_context_clean(ctx);
 
+	intel_context_free_trtt(ctx);
+
 	i915_ppgtt_put(ctx->ppgtt);
 
 	if (ctx->legacy_hw_ctx.rcs_state)
@@ -507,6 +517,127 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static int
+intel_context_get_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) {
+		return -ENODEV;
+	} else if (args->size < sizeof(trtt_params)) {
+		args->size = sizeof(trtt_params);
+	} else {
+		trtt_params.segment_base_addr =
+			ctx->trtt_info.segment_base_addr;
+		trtt_params.l3_table_address =
+			ctx->trtt_info.l3_table_address;
+		trtt_params.null_tile_val =
+			ctx->trtt_info.null_tile_val;
+		trtt_params.invd_tile_val =
+			ctx->trtt_info.invd_tile_val;
+
+		i915_gem_context_reference(ctx);
+		mutex_unlock(&dev->struct_mutex);
+
+		if (__copy_to_user(to_user_ptr(args->value),
+				   &trtt_params,
+				   sizeof(trtt_params))) {
+			mutex_lock(&dev->struct_mutex);
+			i915_gem_context_unreference(ctx);
+			return -EFAULT;
+		}
+
+		args->size = sizeof(trtt_params);
+		mutex_lock(&dev->struct_mutex);
+		i915_gem_context_unreference(ctx);
+	}
+
+	return 0;
+}
+
+static int
+intel_context_set_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+	int ret;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (ctx->flags & CONTEXT_USE_TRTT)
+		return -EEXIST;
+	else if (args->size < sizeof(trtt_params))
+		return -EINVAL;
+
+	i915_gem_context_reference(ctx);
+	mutex_unlock(&dev->struct_mutex);
+
+	if (copy_from_user(&trtt_params,
+			   to_user_ptr(args->value),
+			   sizeof(trtt_params))) {
+		mutex_lock(&dev->struct_mutex);
+		ret = -EFAULT;
+		goto exit;
+	}
+
+	mutex_lock(&dev->struct_mutex);
+
+	/* Check if the setup happened from another path */
+	if (ctx->flags & CONTEXT_USE_TRTT) {
+		ret = -EEXIST;
+		goto exit;
+	}
+
+	/* basic sanity checks for the segment location & l3 table pointer */
+	if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
+		i915_dbg(dev, "segment base address not correctly aligned\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (((trtt_params.l3_table_address + PAGE_SIZE) >=
+	     trtt_params.segment_base_addr) &&
+	    (trtt_params.l3_table_address <
+		    (trtt_params.segment_base_addr + GEN9_TRTT_SEGMENT_SIZE))) {
+		i915_dbg(dev, "l3 table address conflicts with trtt segment\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (trtt_params.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) {
+		i915_dbg(dev, "invalid l3 table address\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	ctx->trtt_info.vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
+						trtt_params.segment_base_addr);
+	if (IS_ERR(ctx->trtt_info.vma)) {
+		ret = PTR_ERR(ctx->trtt_info.vma);
+		goto exit;
+	}
+
+	ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
+	ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
+	ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
+	ctx->trtt_info.segment_base_addr = trtt_params.segment_base_addr;
+
+	ret = intel_lr_rcs_context_setup_trtt(ctx);
+	if (ret) {
+		intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+		goto exit;
+	}
+
+	ctx->flags |= CONTEXT_USE_TRTT;
+
+exit:
+	i915_gem_context_unreference(ctx);
+	return ret;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
@@ -923,7 +1054,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		return PTR_ERR(ctx);
 	}
 
-	args->size = 0;
+	args->size = (args->param != I915_CONTEXT_PARAM_TRTT) ? 0 : args->size;
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		args->value = ctx->hang_stats.ban_period_seconds;
@@ -939,6 +1070,9 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		else
 			args->value = to_i915(dev)->gtt.base.total;
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_get_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -984,6 +1118,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
 		}
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_set_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7b8de85..666a576 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 {
 	gtt_write_workarounds(dev);
 
+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		/*
+		 * Globally enable TR-TT support in Hw.
+		 * Still TR-TT enabling on per context basis is required.
+		 * Non-trtt contexts are not affected by this setting.
+		 */
+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+			   GEN9_TRTT_BYPASS_DISABLE);
+	}
+
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -3368,6 +3379,61 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
 
 }
 
+void intel_trtt_context_destroy_vma(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+
+	WARN_ON(!list_empty(&vma->obj_link));
+	WARN_ON(!list_empty(&vma->vm_link));
+	WARN_ON(!list_empty(&vma->exec_list));
+
+	WARN_ON(!vma->pin_count);
+
+	if (drm_mm_node_allocated(&vma->node))
+		drm_mm_remove_node(&vma->node);
+
+	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
+	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
+}
+
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr)
+{
+	struct i915_vma *vma;
+	int ret;
+
+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
+	if (!vma)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->obj_link);
+	INIT_LIST_HEAD(&vma->vm_link);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+	/* Mark the vma as permanently pinned */
+	vma->pin_count = 1;
+
+	/* Reserve from the 48 bit PPGTT space */
+	vma->node.start = segment_base_addr;
+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	if (ret) {
+		ret = i915_gem_evict_for_vma(vma);
+		if (ret == 0)
+			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	}
+	if (ret) {
+		DRM_ERROR("Reservation for TRTT segment failed: %i\n", ret);
+		intel_trtt_context_destroy_vma(vma);
+		return ERR_PTR(ret);
+	}
+
+	return vma;
+}
+
 static struct scatterlist *
 rotate_pages(const dma_addr_t *in, unsigned int offset,
 	     unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index dc208c0..2374cb1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+/* Fixed size segment */
+#define GEN9_TRTT_SEG_SIZE_SHIFT	44
+#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
+
 enum i915_ggtt_view_type {
 	I915_GGTT_VIEW_NORMAL = 0,
 	I915_GGTT_VIEW_ROTATED,
@@ -562,4 +566,8 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr);
+void intel_trtt_context_destroy_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 71abf57..0f32021 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE	(1 << 0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE		0xF0
+#define   GEN9_TRVA_DATA_MASK		0xF
+
+#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1 << 1)
+#define   GEN9_TRTT_ENABLE		(1 << 0)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 27c9ee3..f60d5eb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1640,6 +1640,76 @@ static int gen9_init_render_ring(struct intel_engine_cs *ring)
 	return init_workarounds_ring(ring);
 }
 
+static int gen9_init_rcs_context_trtt(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf, 0);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
+static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
+{
+	struct intel_context *ctx = req->ctx;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	u64 masked_l3_gfx_address =
+		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+	u32 trva_data_value =
+		(ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
+		GEN9_TRVA_DATA_MASK;
+	const int num_lri_cmds = 6;
+	int ret;
+
+	/*
+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
+	 * of directly updating the context image, as this will ensure that
+	 * update happens in a serialized manner for the context and also
+	 * lite-restore scenario will get handled.
+	 */
+	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
+	intel_logical_ring_emit(ringbuf, upper_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRVA_MASK_VALUE | trva_data_value);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
 {
 	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
@@ -1994,6 +2064,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 	return intel_lr_context_render_state_init(req);
 }
 
+static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
+{
+	int ret;
+
+	/*
+	 * Explicitly disable TR-TT at the start of a new context.
+	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
+	 * context the TR-TT settings of the outgoing context could get
+	 * spilled on to the new incoming context as only the Ring Context
+	 * part is loaded on the first submission of a new context, due to
+	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
+	 */
+	ret = gen9_init_rcs_context_trtt(req);
+	if (ret)
+		return ret;
+
+	return gen8_init_rcs_context(req);
+}
+
 /**
  * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
  *
@@ -2125,11 +2214,14 @@ static int logical_render_ring_init(struct drm_device *dev)
 	logical_ring_default_vfuncs(dev, ring);
 
 	/* Override some for render ring. */
-	if (INTEL_INFO(dev)->gen >= 9)
+	if (INTEL_INFO(dev)->gen >= 9) {
 		ring->init_hw = gen9_init_render_ring;
-	else
+		ring->init_context = gen9_init_rcs_context;
+	} else {
 		ring->init_hw = gen8_init_render_ring;
-	ring->init_context = gen8_init_rcs_context;
+		ring->init_context = gen8_init_rcs_context;
+	}
+
 	ring->cleanup = intel_fini_pipe_control;
 	ring->emit_flush = gen8_emit_flush_render;
 	ring->emit_request = gen8_emit_request_render;
@@ -2669,3 +2761,29 @@ void intel_lr_context_reset(struct drm_device *dev,
 		ringbuf->tail = 0;
 	}
 }
+
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx)
+{
+	struct intel_engine_cs *ring = &(ctx->i915->ring[RCS]);
+	struct drm_i915_gem_request *req;
+	int ret;
+
+	if (!ctx->engine[RCS].state) {
+		ret = intel_lr_context_deferred_alloc(ctx, ring);
+		if (ret)
+			return ret;
+	}
+
+	req = i915_gem_request_alloc(ring, ctx);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = gen9_emit_trtt_regs(req);
+	if (ret) {
+		i915_gem_request_cancel(req);
+		return ret;
+	}
+
+	i915_add_request(req);
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index e6cda3e..914d2a1b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -107,6 +107,7 @@ void intel_lr_context_reset(struct drm_device *dev,
 			struct intel_context *ctx);
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *ring);
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx);
 
 u32 intel_execlists_ctx_id(struct intel_context *ctx,
 			   struct intel_engine_cs *ring);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a5524cc..604da23 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
+#define I915_CONTEXT_PARAM_TRTT		0x4
 	__u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+	__u64 segment_base_addr;
+	__u64 l3_table_address;
+	__u32 invd_tile_val;
+	__u32 null_tile_val;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v6] drm/i915: Support to enable TRTT on GEN9
  2016-03-11 11:50                             ` [PATCH v6] " akash.goel
@ 2016-03-11 15:57                               ` kbuild test robot
  2016-03-18  8:35                               ` [PATCH v7] " akash.goel
  1 sibling, 0 replies; 59+ messages in thread
From: kbuild test robot @ 2016-03-11 15:57 UTC (permalink / raw)
  Cc: Akash Goel, intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1508 bytes --]

Hi Akash,

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20160311]
[cannot apply to v4.5-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/akash-goel-intel-com/drm-i915-Support-to-enable-TRTT-on-GEN9/20160311-194039
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-rhel (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_gem_context.c: In function 'intel_context_set_trtt':
>> drivers/gpu/drm/i915/i915_gem_context.c:596:3: error: implicit declaration of function 'i915_dbg' [-Werror=implicit-function-declaration]
      i915_dbg(dev, "segment base address not correctly aligned\n");
      ^
   cc1: some warnings being treated as errors

vim +/i915_dbg +596 drivers/gpu/drm/i915/i915_gem_context.c

   590			ret = -EEXIST;
   591			goto exit;
   592		}
   593	
   594		/* basic sanity checks for the segment location & l3 table pointer */
   595		if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
 > 596			i915_dbg(dev, "segment base address not correctly aligned\n");
   597			ret = -EINVAL;
   598			goto exit;
   599		}

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 36094 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v7] drm/i915: Support to enable TRTT on GEN9
  2016-03-11 11:50                             ` [PATCH v6] " akash.goel
  2016-03-11 15:57                               ` kbuild test robot
@ 2016-03-18  8:35                               ` akash.goel
  2016-03-18  9:35                                 ` Chris Wilson
  1 sibling, 1 reply; 59+ messages in thread
From: akash.goel @ 2016-03-18  8:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Gen9 has additional address translation hardware support in the form of
a Tiled Resource Translation Table (TR-TT), which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.
Sparse resources are created as virtual-only allocations. Regions of the
resource that the application intends to use are bound to physical memory
on the fly and can be re-bound to different memory allocations over the
lifetime of the resource.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT is required
for each new PPGTT instance, but TR-TT may not be enabled for every context.
1/16th of the 48bit PPGTT space is earmarked for translation by TR-TT;
which chunk to use is conveyed to HW through a register.
Any GFX address that lies in that reserved 44 bit range will be translated
through TR-TT first and then through PPGTT to get the actual physical address,
so the output of translation from TR-TT will be a PPGTT offset.
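
As an illustration only (not part of the patch), the segment selection
described above can be sketched in C. The helper names are hypothetical;
the shift of 44 mirrors GEN9_TRTT_SEG_SIZE_SHIFT from this patch, and the
4-bit index corresponds to what the patch programs via GEN9_TRVA_DATA_MASK.

```c
#include <stdbool.h>
#include <stdint.h>

/* 48-bit PPGTT space split into 16 chunks of 2^44 bytes each. */
#define TRTT_SEG_SIZE_SHIFT 44
#define TRTT_SEGMENT_SIZE   (1ULL << TRTT_SEG_SIZE_SHIFT)

/* Index (0..15) of the 44-bit chunk that contains a 48-bit GFX address. */
static inline unsigned int trtt_segment_index(uint64_t gfx_addr)
{
	return (unsigned int)((gfx_addr >> TRTT_SEG_SIZE_SHIFT) & 0xF);
}

/* True if gfx_addr lies inside the segment that starts at segment_base. */
static inline bool trtt_addr_in_segment(uint64_t gfx_addr,
					uint64_t segment_base)
{
	return gfx_addr >= segment_base &&
	       gfx_addr - segment_base < TRTT_SEGMENT_SIZE;
}
```

Only addresses for which trtt_addr_in_segment() holds would go through
TR-TT first; everything else is translated by PPGTT alone.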

TRTT is constructed as a 3 level tile table. Each tile is 64KB in size, which
leaves behind 44-16=28 address bits. The 28 bits are partitioned as 9+9+10, and
each level is contained within a 4KB page; hence L3 and L2 are each composed of
512 64b entries and L1 is composed of 1024 32b entries.
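
The 9+9+10 split can be made concrete with a small C sketch. This is an
illustration under an assumption: that the L3 index occupies the topmost
9 bits of the 28-bit tile number, followed by L2 and then L1. The commit
text gives the widths but not the exact bit positions, and the names
below are hypothetical.

```c
#include <stdint.h>

/* Each table level fits in one 4KB page, as stated above. */
_Static_assert(512 * sizeof(uint64_t) == 4096, "L3/L2: 512 64b entries");
_Static_assert(1024 * sizeof(uint32_t) == 4096, "L1: 1024 32b entries");

struct trtt_indices {
	unsigned int l3;	/* 9 bits, 0..511 */
	unsigned int l2;	/* 9 bits, 0..511 */
	unsigned int l1;	/* 10 bits, 0..1023 */
};

/* Split an offset within the 44-bit segment into per-level indices. */
static inline struct trtt_indices trtt_split(uint64_t seg_offset)
{
	/* Drop the 16 bits that address bytes inside a 64KB tile. */
	uint64_t tile = (seg_offset & ((1ULL << 44) - 1)) >> 16;
	struct trtt_indices idx = {
		.l3 = (unsigned int)((tile >> 19) & 0x1FF),
		.l2 = (unsigned int)((tile >> 10) & 0x1FF),
		.l1 = (unsigned int)(tile & 0x3FF),
	};

	return idx;
}
```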

There is a provision to keep TR-TT Tables in virtual space, where the pages of
TRTT tables will be mapped to PPGTT.
Currently this is the supported mode. In this mode UMD will have full control
over TR-TT management, with bare minimum support from KMD.
So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
The entries of L1 table will contain the PPGTT offset of BOs actually backing
the Sparse resources.
UMD will have to allocate the L3/L2/L1 table pages as a regular BO only &
assign them a PPGTT address through the Soft Pin API (for example, use soft pin
to assign l3_table_address to the L3 table BO, when used).
UMD will also program the entries in the TR-TT page tables using regular batch
commands (MI_STORE_DATA_IMM), or via mmapping of the page table BOs.
UMD may do the complete PPGTT address space management, on the grounds that it
could help minimize conflicts.

Any space in the TR-TT segment not bound to any Sparse texture will be handled
through the Invalid tile. The user is expected to initialize the entries of a
new L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding
to the holes in the Sparse texture resource will be set with the Null tile
pattern. Improper programming of TRTT should only lead to a recoverable GPU hang,
eventually leading to banning of the culprit context without victimizing others.
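
A minimal userspace sketch of that initialization for an L1 table page,
assuming an L1 entry is marked invalid by storing the same 32-bit value
that is programmed as the Invalid tile pattern. The function name is
hypothetical and the page is assumed to be CPU-mapped already (for example
via mmap of the page table BO); the encoding of the 64b L3/L2 entries is
not covered here.

```c
#include <stddef.h>
#include <stdint.h>

#define TRTT_L1_ENTRIES 1024	/* 1024 32b entries fill one 4KB page */

/* Fill a freshly allocated L1 table page with the Invalid tile pattern. */
static inline void trtt_l1_page_init(uint32_t *l1_page, uint32_t invd_tile_val)
{
	for (size_t i = 0; i < TRTT_L1_ENTRIES; i++)
		l1_page[i] = invd_tile_val;
}
```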

The association of any Sparse resource with the BOs will be known only to UMD,
and only the Sparse resources shall be assigned an offset from the TR-TT segment
by UMD. The use of the TR-TT segment or mapping of Sparse resources will be
transparent to the KMD; UMD will do the address assignment from the TR-TT
segment autonomously and KMD will be oblivious to it.
No other object may be assigned an address from the TR-TT segment; all other
objects will be mapped to PPGTT in the regular way by KMD.

This patch provides an interface through which UMD can ask KMD to enable
TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param has been
added to the I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
UMD will have to pass the GFX address of the L3 table page and the start
location of the TR-TT segment, along with the pattern values for the Null &
Invalid Tile registers.
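
For illustration, a userspace pre-check that mirrors the sanity checks this
patch performs in intel_context_set_trtt() could look like the sketch
below. The struct layout copies drm_i915_gem_context_trtt_param from this
patch; the local names and the helper itself are hypothetical and are not
part of the proposed ABI.

```c
#include <stdbool.h>
#include <stdint.h>

#define TRTT_SEGMENT_SIZE	(1ULL << 44)		/* GEN9_TRTT_SEGMENT_SIZE */
#define TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000ULL	/* GEN9_TRTT_L3_GFXADDR_MASK */
#define TRTT_PAGE_SIZE		4096ULL

/* Same field order as struct drm_i915_gem_context_trtt_param. */
struct trtt_param {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

/* Returns true if the kernel-side checks in this patch would likely pass. */
static bool trtt_param_is_plausible(const struct trtt_param *p)
{
	/* The segment must start on a 2^44 boundary. */
	if (p->segment_base_addr & (TRTT_SEGMENT_SIZE - 1))
		return false;

	/* Only bits 47:16 of the L3 table address may be set. */
	if (p->l3_table_address & ~TRTT_L3_GFXADDR_MASK)
		return false;

	/* The L3 page must stay clear of the segment (same test as the patch). */
	if (p->l3_table_address + TRTT_PAGE_SIZE >= p->segment_base_addr &&
	    p->l3_table_address < p->segment_base_addr + TRTT_SEGMENT_SIZE)
		return false;

	/* Null and Invalid tile patterns must differ. */
	return p->null_tile_val != p->invd_tile_val;
}
```

A caller would then point the value field of drm_i915_gem_context_param at
such a struct, set size to its sizeof and param to I915_CONTEXT_PARAM_TRTT
before issuing the SETPARAM ioctl.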

v2:
 - Support context_getparam for TRTT also and dispense with a separate
   GETPARAM case for TRTT (Chris).
 - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
   from user space (Chris).
 - Move all the argument checking for TRTT in context_setparam to the
   set_trtt function (Chris).
 - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
 - Rename certain functions to rightly reflect their purpose, rename
   the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
   rephrase few lines in the commit message body, add more comments (Chris).
 - Extend ABI to allow User specify TRTT segment location also.
 - Fix for selective enabling of TRTT on per context basis, explicitly
   disable TR-TT at the start of a new context.

v3:
 - Check the return value of gen9_emit_trtt_regs (Chris)
 - Update the kernel doc for intel_context structure.
 - Rebased.

v4:
 - Fix the warnings reported by 'checkpatch.pl --strict' (Michel)
 - Fix the context_getparam implementation avoiding the reset of size field,
   affecting the TRTT case.

v5:
 - Update the TR-TT params right away in context_setparam, by constructing
   & submitting a request emitting LRIs, instead of deferring it and
   conflating with the next batch submission (Chris)
 - Follow the struct_mutex handling related prescribed rules, while accessing
   User space buffer, both in context_setparam & getparam functions (Chris).

v6:
 - Fix the warning caused due to removal of un-allocated trtt vma node.

v7:
 - Move context ref/unref to context_setparam_ioctl from set_trtt() & remove
   that from get_trtt() as not really needed there (Chris).
 - Add a check for improper values for Null & Invalid Tiles.
 - Remove superfluous DRM_ERROR from trtt_context_allocate_vma (Chris).
 - Rebased.

Testcase: igt/gem_trtt

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  16 +++-
 drivers/gpu/drm/i915/i915_gem_context.c | 147 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c     |  65 ++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 ++
 drivers/gpu/drm/i915/i915_reg.h         |  19 +++++
 drivers/gpu/drm/i915/intel_lrc.c        | 124 ++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h        |   1 +
 include/uapi/drm/i915_drm.h             |   8 ++
 8 files changed, 383 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8a8cb7e..6f06d72 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -800,6 +800,7 @@ struct i915_ctx_hang_stats {
 #define DEFAULT_CONTEXT_HANDLE 0
 
 #define CONTEXT_NO_ZEROMAP (1<<0)
+#define CONTEXT_USE_TRTT   (1 << 1)
 /**
  * struct intel_context - as the name implies, represents a context.
  * @ref: reference count.
@@ -814,6 +815,8 @@ struct i915_ctx_hang_stats {
  * @ppgtt: virtual memory space used by this context.
  * @legacy_hw_ctx: render context backing object and whether it is correctly
  *                initialized (legacy ring submission mechanism only).
+ * @trtt_info: Programming parameters for tr-tt (redirection tables for
+ *             userspace, for sparse resource management)
  * @link: link in the global list of contexts.
  *
  * Contexts are memory images used by the hardware to store copies of their
@@ -824,7 +827,7 @@ struct intel_context {
 	int user_handle;
 	uint8_t remap_slice;
 	struct drm_i915_private *i915;
-	int flags;
+	unsigned int flags;
 	struct drm_i915_file_private *file_priv;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_hw_ppgtt *ppgtt;
@@ -845,6 +848,15 @@ struct intel_context {
 		uint32_t *lrc_reg_state;
 	} engine[I915_NUM_ENGINES];
 
+	/* TRTT info */
+	struct intel_context_trtt {
+		u32 invd_tile_val;
+		u32 null_tile_val;
+		u64 l3_table_address;
+		u64 segment_base_addr;
+		struct i915_vma *vma;
+	} trtt_info;
+
 	struct list_head link;
 };
 
@@ -2647,6 +2659,8 @@ struct drm_i915_cmd_table {
 				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
 				 !IS_BROXTON(dev))
 
+#define HAS_TRTT(dev)		(IS_GEN9(dev))
+
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 1993449..ed71135 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev)
 	return ret;
 }
 
+static void intel_context_free_trtt(struct intel_context *ctx)
+{
+	if (!ctx->trtt_info.vma)
+		return;
+
+	intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+}
+
 static void i915_gem_context_clean(struct intel_context *ctx)
 {
 	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
@@ -164,6 +172,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	 */
 	i915_gem_context_clean(ctx);
 
+	intel_context_free_trtt(ctx);
+
 	i915_ppgtt_put(ctx->ppgtt);
 
 	if (ctx->legacy_hw_ctx.rcs_state)
@@ -507,6 +517,127 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static int
+intel_context_get_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) {
+		return -ENODEV;
+	} else if (args->size < sizeof(trtt_params)) {
+		args->size = sizeof(trtt_params);
+	} else {
+		trtt_params.segment_base_addr =
+			ctx->trtt_info.segment_base_addr;
+		trtt_params.l3_table_address =
+			ctx->trtt_info.l3_table_address;
+		trtt_params.null_tile_val =
+			ctx->trtt_info.null_tile_val;
+		trtt_params.invd_tile_val =
+			ctx->trtt_info.invd_tile_val;
+
+		mutex_unlock(&dev->struct_mutex);
+
+		if (__copy_to_user(to_user_ptr(args->value),
+				   &trtt_params,
+				   sizeof(trtt_params))) {
+			mutex_lock(&dev->struct_mutex);
+			return -EFAULT;
+		}
+
+		args->size = sizeof(trtt_params);
+		mutex_lock(&dev->struct_mutex);
+	}
+
+	return 0;
+}
+
+static int
+intel_context_set_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+	int ret;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (ctx->flags & CONTEXT_USE_TRTT)
+		return -EEXIST;
+	else if (args->size < sizeof(trtt_params))
+		return -EINVAL;
+
+	mutex_unlock(&dev->struct_mutex);
+
+	if (copy_from_user(&trtt_params,
+			   to_user_ptr(args->value),
+			   sizeof(trtt_params))) {
+		mutex_lock(&dev->struct_mutex);
+		ret = -EFAULT;
+		goto exit;
+	}
+
+	mutex_lock(&dev->struct_mutex);
+
+	/* Check if the setup happened from another path */
+	if (ctx->flags & CONTEXT_USE_TRTT) {
+		ret = -EEXIST;
+		goto exit;
+	}
+
+	/* basic sanity checks for the segment location & l3 table pointer */
+	if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
+		i915_dbg(dev, "segment base address not correctly aligned\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (((trtt_params.l3_table_address + PAGE_SIZE) >=
+	     trtt_params.segment_base_addr) &&
+	    (trtt_params.l3_table_address <
+		    (trtt_params.segment_base_addr + GEN9_TRTT_SEGMENT_SIZE))) {
+		i915_dbg(dev, "l3 table address conflicts with trtt segment\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (trtt_params.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) {
+		i915_dbg(dev, "invalid l3 table address\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (trtt_params.null_tile_val == trtt_params.invd_tile_val) {
+		i915_dbg(dev, "incorrect values for null & invalid tiles\n");
+		return -EINVAL;
+	}
+
+	ctx->trtt_info.vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
+						trtt_params.segment_base_addr);
+	if (IS_ERR(ctx->trtt_info.vma)) {
+		ret = PTR_ERR(ctx->trtt_info.vma);
+		goto exit;
+	}
+
+	ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
+	ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
+	ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
+	ctx->trtt_info.segment_base_addr = trtt_params.segment_base_addr;
+
+	ret = intel_lr_rcs_context_setup_trtt(ctx);
+	if (ret) {
+		intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+		goto exit;
+	}
+
+	ctx->flags |= CONTEXT_USE_TRTT;
+
+exit:
+	return ret;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
@@ -931,7 +1062,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		return PTR_ERR(ctx);
 	}
 
-	args->size = 0;
+	args->size = (args->param != I915_CONTEXT_PARAM_TRTT) ? 0 : args->size;
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		args->value = ctx->hang_stats.ban_period_seconds;
@@ -947,6 +1078,9 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		else
 			args->value = to_i915(dev)->gtt.base.total;
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_get_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -974,6 +1108,13 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		return PTR_ERR(ctx);
 	}
 
+	/*
+	 * Take a reference also, as in certain cases we have to release &
+	 * reacquire the struct_mutex and we don't want the context to
+	 * go away.
+	 */
+	i915_gem_context_reference(ctx);
+
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		if (args->size)
@@ -992,10 +1133,14 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
 		}
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_set_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
 	}
+	i915_gem_context_unreference(ctx);
 	mutex_unlock(&dev->struct_mutex);
 
 	return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9c752fe..c99bebf 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 {
 	gtt_write_workarounds(dev);
 
+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		/*
+		 * Globally enable TR-TT support in Hw.
+		 * Still TR-TT enabling on per context basis is required.
+		 * Non-trtt contexts are not affected by this setting.
+		 */
+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+			   GEN9_TRTT_BYPASS_DISABLE);
+	}
+
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -3368,6 +3379,60 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
 
 }
 
+void intel_trtt_context_destroy_vma(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+
+	WARN_ON(!list_empty(&vma->obj_link));
+	WARN_ON(!list_empty(&vma->vm_link));
+	WARN_ON(!list_empty(&vma->exec_list));
+
+	WARN_ON(!vma->pin_count);
+
+	if (drm_mm_node_allocated(&vma->node))
+		drm_mm_remove_node(&vma->node);
+
+	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
+	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
+}
+
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr)
+{
+	struct i915_vma *vma;
+	int ret;
+
+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
+	if (!vma)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->obj_link);
+	INIT_LIST_HEAD(&vma->vm_link);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+	/* Mark the vma as permanently pinned */
+	vma->pin_count = 1;
+
+	/* Reserve from the 48 bit PPGTT space */
+	vma->node.start = segment_base_addr;
+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	if (ret) {
+		ret = i915_gem_evict_for_vma(vma);
+		if (ret == 0)
+			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	}
+	if (ret) {
+		intel_trtt_context_destroy_vma(vma);
+		return ERR_PTR(ret);
+	}
+
+	return vma;
+}
+
 static struct scatterlist *
 rotate_pages(const dma_addr_t *in, unsigned int offset,
 	     unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index dc208c0..2374cb1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+/* Fixed size segment */
+#define GEN9_TRTT_SEG_SIZE_SHIFT	44
+#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
+
 enum i915_ggtt_view_type {
 	I915_GGTT_VIEW_NORMAL = 0,
 	I915_GGTT_VIEW_ROTATED,
@@ -562,4 +566,8 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr);
+void intel_trtt_context_destroy_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d4a298f..ead9dc5 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE	(1 << 0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE		0xF0
+#define   GEN9_TRVA_DATA_MASK		0xF
+
+#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1 << 1)
+#define   GEN9_TRTT_ENABLE		(1 << 0)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f727822..0dba2fe 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1644,6 +1644,76 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 	return init_workarounds_ring(engine);
 }
 
+static int gen9_init_rcs_context_trtt(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf, 0);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
+static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
+{
+	struct intel_context *ctx = req->ctx;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	u64 masked_l3_gfx_address =
+		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+	u32 trva_data_value =
+		(ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
+		GEN9_TRVA_DATA_MASK;
+	const int num_lri_cmds = 6;
+	int ret;
+
+	/*
+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
+	 * of directly updating the context image, as this will ensure that
+	 * update happens in a serialized manner for the context and also
+	 * lite-restore scenario will get handled.
+	 */
+	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
+	intel_logical_ring_emit(ringbuf, upper_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRVA_MASK_VALUE | trva_data_value);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
 {
 	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
@@ -2002,6 +2072,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 	return intel_lr_context_render_state_init(req);
 }
 
+static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
+{
+	int ret;
+
+	/*
+	 * Explicitly disable TR-TT at the start of a new context.
+	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
+	 * context the TR-TT settings of the outgoing context could get
+	 * spilled on to the new incoming context as only the Ring Context
+	 * part is loaded on the first submission of a new context, due to
+	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
+	 */
+	ret = gen9_init_rcs_context_trtt(req);
+	if (ret)
+		return ret;
+
+	return gen8_init_rcs_context(req);
+}
+
 /**
  * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
  *
@@ -2133,11 +2222,14 @@ static int logical_render_ring_init(struct drm_device *dev)
 	logical_ring_default_vfuncs(dev, engine);
 
 	/* Override some for render ring. */
-	if (INTEL_INFO(dev)->gen >= 9)
+	if (INTEL_INFO(dev)->gen >= 9) {
 		engine->init_hw = gen9_init_render_ring;
-	else
+		engine->init_context = gen9_init_rcs_context;
+	} else {
 		engine->init_hw = gen8_init_render_ring;
-	engine->init_context = gen8_init_rcs_context;
+		engine->init_context = gen8_init_rcs_context;
+	}
+
 	engine->cleanup = intel_fini_pipe_control;
 	engine->emit_flush = gen8_emit_flush_render;
 	engine->emit_request = gen8_emit_request_render;
@@ -2701,3 +2793,29 @@ void intel_lr_context_reset(struct drm_device *dev,
 		ringbuf->tail = 0;
 	}
 }
+
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx)
+{
+	struct intel_engine_cs *engine = &(ctx->i915->engine[RCS]);
+	struct drm_i915_gem_request *req;
+	int ret;
+
+	if (!ctx->engine[RCS].state) {
+		ret = intel_lr_context_deferred_alloc(ctx, engine);
+		if (ret)
+			return ret;
+	}
+
+	req = i915_gem_request_alloc(engine, ctx);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = gen9_emit_trtt_regs(req);
+	if (ret) {
+		i915_gem_request_cancel(req);
+		return ret;
+	}
+
+	i915_add_request(req);
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index a17cb12..f3600b2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -107,6 +107,7 @@ void intel_lr_context_reset(struct drm_device *dev,
 			struct intel_context *ctx);
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *engine);
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx);
 
 u32 intel_execlists_ctx_id(struct intel_context *ctx,
 			   struct intel_engine_cs *engine);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a5524cc..604da23 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
+#define I915_CONTEXT_PARAM_TRTT		0x4
 	__u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+	__u64 segment_base_addr;
+	__u64 l3_table_address;
+	__u32 invd_tile_val;
+	__u32 null_tile_val;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v7] drm/i915: Support to enable TRTT on GEN9
  2016-03-18  8:35                               ` [PATCH v7] " akash.goel
@ 2016-03-18  9:35                                 ` Chris Wilson
  2016-03-18 10:23                                   ` [PATCH v8] " akash.goel
  0 siblings, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2016-03-18  9:35 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Fri, Mar 18, 2016 at 02:05:26PM +0530, akash.goel@intel.com wrote:
> @@ -974,6 +1108,13 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  		return PTR_ERR(ctx);
>  	}
>  
> +	/*
> +	 * Take a reference also, as in certain cases we have to release &
> +	 * reacquire the struct_mutex and we don't want the context to
> +	 * go away.
> +	 */
> +	i915_gem_context_reference(ctx);
> +
>  	switch (args->param) {
>  	case I915_CONTEXT_PARAM_BAN_PERIOD:
>  		if (args->size)
> @@ -992,10 +1133,14 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
>  		}
>  		break;
> +	case I915_CONTEXT_PARAM_TRTT:
> +		ret = intel_context_set_trtt(ctx, args);
> +		break;
>  	default:
>  		ret = -EINVAL;
>  		break;
>  	}
> +	i915_gem_context_unreference(ctx);
>  	mutex_unlock(&dev->struct_mutex);

Having applied the safety net with the nice comment to setparam, we
should do the same for getparam for consistency. It just makes it easier
for us to keep extending the ioctls in future.

With that,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* [PATCH v8] drm/i915: Support to enable TRTT on GEN9
  2016-03-18  9:35                                 ` Chris Wilson
@ 2016-03-18 10:23                                   ` akash.goel
  2016-03-22  8:42                                     ` [PATCH v9] " akash.goel
  0 siblings, 1 reply; 59+ messages in thread
From: akash.goel @ 2016-03-18 10:23 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Gen9 has additional address translation hardware in the form of the
Tiled Resource Translation Table (TR-TT), which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.
Sparse resources are created as virtual-only allocations. Regions of the
resource that the application intends to use are bound to physical memory
on the fly and can be re-bound to different memory allocations over the
lifetime of the resource.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT is required
for each new PPGTT instance, but TR-TT need not be enabled for every context.
1/16th of the 48-bit PPGTT space is earmarked for translation by TR-TT;
which of the 16 chunks to use is conveyed to HW through a register.
Any GFX address that lies in that reserved 44-bit range is translated
through TR-TT first and then through PPGTT to get the actual physical address,
so the output of the TR-TT translation is a PPGTT offset.
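
As a rough illustration of the segment selection described above, the sketch
below derives the 4-bit segment number from a segment base address and builds
the value the patch writes to GEN9_TRTT_VA_MASKDATA. The constants mirror
GEN9_TRTT_SEG_SIZE_SHIFT, GEN9_TRVA_MASK_VALUE and GEN9_TRVA_DATA_MASK from
the diff; the helper names are hypothetical, and reading the 0xF0 mask as
"compare all four top address bits" is an assumption, not something the patch
states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TRTT_SEG_SIZE_SHIFT 44                          /* 2^44 bytes per segment */
#define TRTT_SEGMENT_SIZE   (1ULL << TRTT_SEG_SIZE_SHIFT)
#define TRVA_MASK_VALUE     0xF0u   /* assumed: enables compare on the 4 top bits */
#define TRVA_DATA_MASK      0xFu

/* Segment number (0..15) selected by a 48-bit segment base address. */
static inline uint32_t trtt_segment_index(uint64_t segment_base_addr)
{
	assert(segment_base_addr < (1ULL << 48));
	return (uint32_t)(segment_base_addr >> TRTT_SEG_SIZE_SHIFT) & TRVA_DATA_MASK;
}

/* Value emitted for GEN9_TRTT_VA_MASKDATA, as in gen9_emit_trtt_regs(). */
static inline uint32_t trtt_va_maskdata(uint64_t segment_base_addr)
{
	return TRVA_MASK_VALUE | trtt_segment_index(segment_base_addr);
}

/* True if a GFX address falls inside the 2^44-byte TR-TT segment. */
static inline bool trtt_addr_in_segment(uint64_t gfx_addr,
					uint64_t segment_base_addr)
{
	return gfx_addr >= segment_base_addr &&
	       gfx_addr - segment_base_addr < TRTT_SEGMENT_SIZE;
}
```

With the topmost segment (base 0xF00000000000) the segment number is 15, so
only addresses whose bits 47:44 are all set would be routed through TR-TT.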

TR-TT is constructed as a 3-level tile table. Each tile is 64KB in size, which
leaves 44-16=28 address bits. These 28 bits are partitioned as 9+9+10, and
each level is contained within a 4KB page; hence L3 and L2 are each composed
of 512 64b entries and L1 is composed of 1024 32b entries.
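
The 9+9+10 split can be sketched as below. This is only an illustration: the
struct and function names are hypothetical, and the bit ordering (L3 index in
the most significant 9 bits, then L2, then the 10-bit L1 index, then the
16-bit offset within the 64KB tile) is an assumption inferred from the table
sizes, not taken from the patch or from hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Indices obtained from an offset within the 2^44-byte TR-TT segment. */
struct trtt_indices {
	uint32_t l3;          /*  9 bits: entry in the L3 table (512 x 64b)  */
	uint32_t l2;          /*  9 bits: entry in an L2 table (512 x 64b)   */
	uint32_t l1;          /* 10 bits: entry in an L1 table (1024 x 32b)  */
	uint32_t tile_offset; /* 16 bits: byte offset inside the 64KB tile   */
};

static inline struct trtt_indices trtt_split_offset(uint64_t seg_offset)
{
	struct trtt_indices idx;

	/* Only the low 44 bits are meaningful inside the segment. */
	assert(seg_offset < (1ULL << 44));

	idx.tile_offset = (uint32_t)(seg_offset & 0xFFFF);
	idx.l1 = (uint32_t)((seg_offset >> 16) & 0x3FF);
	idx.l2 = (uint32_t)((seg_offset >> 26) & 0x1FF);
	idx.l3 = (uint32_t)((seg_offset >> 35) & 0x1FF);
	return idx;
}
```

The field widths add up to 9 + 9 + 10 + 16 = 44 bits, matching the segment
size, and each table fits in one 4KB page (512 * 8 bytes or 1024 * 4 bytes).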

There is a provision to keep the TR-TT tables in virtual space, where the
pages of the TR-TT tables are mapped into PPGTT.
Currently this is the supported mode. In this mode UMD has full control
over TR-TT management, with bare minimum support from KMD.
So the entries of the L3 table contain the PPGTT offsets of L2 table pages;
similarly, entries of an L2 table contain the PPGTT offsets of L1 table pages.
The entries of an L1 table contain the PPGTT offsets of the BOs actually
backing the Sparse resources.
UMD has to allocate the L3/L2/L1 table pages as regular BOs &
assign them a PPGTT address through the Soft Pin API (for example, use soft pin
to assign l3_table_address to the L3 table BO, when used).
UMD also programs the entries in the TR-TT page tables using regular batch
commands (MI_STORE_DATA_IMM), or via mmapping of the page table BOs.
UMD may do the complete PPGTT address space management, on the grounds that it
could help minimize conflicts.

Any space in the TR-TT segment not bound to a Sparse texture is handled
through the Invalid tile: the user is expected to initialize the entries of a
new L3/L2/L1 table page with the Invalid tile pattern. The entries
corresponding to holes in the Sparse texture resource are set to the Null tile
pattern.
Improper programming of TR-TT should only lead to a recoverable GPU hang,
eventually leading to banning of the culprit context without victimizing others.
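
A minimal userspace-side sketch of that initialization, for an L1 page only,
might look like the following. It assumes the page is CPU-mapped (e.g. via
mmap of the table BO) and that an L1 entry holds the 32-bit Invalid or Null
tile value directly; the function names are hypothetical, and the encoding of
Invalid entries in the 64b L3/L2 tables is deliberately left out because the
patch does not describe it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRTT_L1_ENTRIES 1024	/* 1024 x 32b entries fill one 4KB page */

/* Fill a freshly allocated L1 page with the Invalid tile pattern. */
static inline void trtt_l1_page_init(uint32_t *l1_page, uint32_t invd_tile_val)
{
	assert(l1_page != NULL);
	for (size_t i = 0; i < TRTT_L1_ENTRIES; i++)
		l1_page[i] = invd_tile_val;
}

/*
 * Mark 'count' entries starting at 'first' as holes in the sparse resource
 * by writing the Null tile pattern. Returns 0 on success, -1 if the range
 * does not fit inside the page.
 */
static inline int trtt_l1_mark_holes(uint32_t *l1_page, size_t first,
				     size_t count, uint32_t null_tile_val)
{
	assert(l1_page != NULL);
	if (first > TRTT_L1_ENTRIES || count > TRTT_L1_ENTRIES - first)
		return -1;
	for (size_t i = 0; i < count; i++)
		l1_page[first + i] = null_tile_val;
	return 0;
}
```

The two pattern values must differ, since the v7 revision of the patch rejects
a request in which null_tile_val equals invd_tile_val.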

The association of any Sparse resource with the BOs is known only to UMD,
and only Sparse resources shall be assigned an offset from the TR-TT segment
by UMD. The use of the TR-TT segment and the mapping of Sparse resources are
transparent to KMD; UMD does the address assignment from the TR-TT segment
autonomously and KMD is oblivious of it.
No other object may be assigned an address from the TR-TT segment; such
objects are mapped to PPGTT in the regular way by KMD.

This patch provides an interface through which UMD can ask KMD to enable
TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param has been
added to the I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
UMD has to pass the GFX address of the L3 table page and the start location of
the TR-TT segment, along with the pattern values for the Null & Invalid Tile
registers.
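
The parameter checks that intel_context_set_trtt() applies in the diff below
can be mirrored in a standalone helper, for instance to pre-validate values in
UMD before issuing the ioctl. This is a sketch only: the struct is a local
copy of the proposed drm_i915_gem_context_trtt_param, the function name is
hypothetical, and a 4KB PAGE_SIZE is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TRTT_SEGMENT_SIZE    (1ULL << 44)       /* GEN9_TRTT_SEGMENT_SIZE     */
#define TRTT_L3_GFXADDR_MASK 0xFFFFFFFF0000ULL  /* GEN9_TRTT_L3_GFXADDR_MASK  */
#define TRTT_PAGE_SIZE       4096ULL            /* assumed kernel PAGE_SIZE   */

/* Local mirror of the proposed struct drm_i915_gem_context_trtt_param. */
struct trtt_param {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

/* Returns 0 if the parameters pass the same checks as the patch, else -EINVAL. */
static inline int trtt_param_check(const struct trtt_param *p)
{
	assert(p != NULL);

	/* Segment base must be aligned to the 2^44-byte segment size. */
	if (p->segment_base_addr & (TRTT_SEGMENT_SIZE - 1))
		return -EINVAL;

	/* The L3 table page must not lie in (or touch) the TR-TT segment. */
	if (p->l3_table_address + TRTT_PAGE_SIZE >= p->segment_base_addr &&
	    p->l3_table_address < p->segment_base_addr + TRTT_SEGMENT_SIZE)
		return -EINVAL;

	/* L3 address must be 64KB aligned and fit in 48 bits. */
	if (p->l3_table_address & ~TRTT_L3_GFXADDR_MASK)
		return -EINVAL;

	/* Null and Invalid tile patterns must be distinguishable. */
	if (p->null_tile_val == p->invd_tile_val)
		return -EINVAL;

	return 0;
}
```

For example, a segment base of 0xF00000000000 with the L3 table soft-pinned at
0x10000 and distinct tile patterns passes all four checks.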

v2:
 - Support context_getparam for TRTT also and dispense with a separate
   GETPARAM case for TRTT (Chris).
 - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
   from user space (Chris).
 - Move all the argument checking for TRTT in context_setparam to the
   set_trtt function (Chris).
 - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
 - Rename certain functions to rightly reflect their purpose, rename
   the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
   rephrase a few lines in the commit message body, add more comments (Chris).
 - Extend ABI to allow the user to specify the TRTT segment location also.
 - Fix for selective enabling of TRTT on a per-context basis: explicitly
   disable TR-TT at the start of a new context.

v3:
 - Check the return value of gen9_emit_trtt_regs (Chris)
 - Update the kernel doc for intel_context structure.
 - Rebased.

v4:
 - Fix the warnings reported by 'checkpatch.pl --strict' (Michel)
 - Fix the context_getparam implementation avoiding the reset of size field,
   affecting the TRTT case.

v5:
 - Update the TR-TT params right away in context_setparam, by constructing
   & submitting a request emitting LRIs, instead of deferring it and
   conflating with the next batch submission (Chris)
 - Follow the struct_mutex handling related prescribed rules, while accessing
   User space buffer, both in context_setparam & getparam functions (Chris).

v6:
 - Fix the warning caused by removal of an unallocated trtt vma node.

v7:
 - Move context ref/unref to context_setparam_ioctl from set_trtt() & remove
   that from get_trtt() as not really needed there (Chris).
 - Add a check for improper values for Null & Invalid Tiles.
 - Remove superfluous DRM_ERROR from trtt_context_allocate_vma (Chris).
 - Rebased.

v8:
 - Add context ref/unref to context_getparam_ioctl also so as to be consistent
   and ease the extension of ioctl in future (Chris)

Testcase: igt/gem_trtt

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h         |  16 +++-
 drivers/gpu/drm/i915/i915_gem_context.c | 155 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c     |  65 ++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 ++
 drivers/gpu/drm/i915/i915_reg.h         |  19 ++++
 drivers/gpu/drm/i915/intel_lrc.c        | 124 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h        |   1 +
 include/uapi/drm/i915_drm.h             |   8 ++
 8 files changed, 391 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8a8cb7e..6f06d72 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -800,6 +800,7 @@ struct i915_ctx_hang_stats {
 #define DEFAULT_CONTEXT_HANDLE 0
 
 #define CONTEXT_NO_ZEROMAP (1<<0)
+#define CONTEXT_USE_TRTT   (1 << 1)
 /**
  * struct intel_context - as the name implies, represents a context.
  * @ref: reference count.
@@ -814,6 +815,8 @@ struct i915_ctx_hang_stats {
  * @ppgtt: virtual memory space used by this context.
  * @legacy_hw_ctx: render context backing object and whether it is correctly
  *                initialized (legacy ring submission mechanism only).
+ * @trtt_info: Programming parameters for tr-tt (redirection tables for
+ *             userspace, for sparse resource management)
  * @link: link in the global list of contexts.
  *
  * Contexts are memory images used by the hardware to store copies of their
@@ -824,7 +827,7 @@ struct intel_context {
 	int user_handle;
 	uint8_t remap_slice;
 	struct drm_i915_private *i915;
-	int flags;
+	unsigned int flags;
 	struct drm_i915_file_private *file_priv;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_hw_ppgtt *ppgtt;
@@ -845,6 +848,15 @@ struct intel_context {
 		uint32_t *lrc_reg_state;
 	} engine[I915_NUM_ENGINES];
 
+	/* TRTT info */
+	struct intel_context_trtt {
+		u32 invd_tile_val;
+		u32 null_tile_val;
+		u64 l3_table_address;
+		u64 segment_base_addr;
+		struct i915_vma *vma;
+	} trtt_info;
+
 	struct list_head link;
 };
 
@@ -2647,6 +2659,8 @@ struct drm_i915_cmd_table {
 				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
 				 !IS_BROXTON(dev))
 
+#define HAS_TRTT(dev)		(IS_GEN9(dev))
+
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 1993449..3f6d87c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev)
 	return ret;
 }
 
+static void intel_context_free_trtt(struct intel_context *ctx)
+{
+	if (!ctx->trtt_info.vma)
+		return;
+
+	intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+}
+
 static void i915_gem_context_clean(struct intel_context *ctx)
 {
 	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
@@ -164,6 +172,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	 */
 	i915_gem_context_clean(ctx);
 
+	intel_context_free_trtt(ctx);
+
 	i915_ppgtt_put(ctx->ppgtt);
 
 	if (ctx->legacy_hw_ctx.rcs_state)
@@ -507,6 +517,127 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static int
+intel_context_get_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) {
+		return -ENODEV;
+	} else if (args->size < sizeof(trtt_params)) {
+		args->size = sizeof(trtt_params);
+	} else {
+		trtt_params.segment_base_addr =
+			ctx->trtt_info.segment_base_addr;
+		trtt_params.l3_table_address =
+			ctx->trtt_info.l3_table_address;
+		trtt_params.null_tile_val =
+			ctx->trtt_info.null_tile_val;
+		trtt_params.invd_tile_val =
+			ctx->trtt_info.invd_tile_val;
+
+		mutex_unlock(&dev->struct_mutex);
+
+		if (__copy_to_user(to_user_ptr(args->value),
+				   &trtt_params,
+				   sizeof(trtt_params))) {
+			mutex_lock(&dev->struct_mutex);
+			return -EFAULT;
+		}
+
+		args->size = sizeof(trtt_params);
+		mutex_lock(&dev->struct_mutex);
+	}
+
+	return 0;
+}
+
+static int
+intel_context_set_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+	int ret;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (ctx->flags & CONTEXT_USE_TRTT)
+		return -EEXIST;
+	else if (args->size < sizeof(trtt_params))
+		return -EINVAL;
+
+	mutex_unlock(&dev->struct_mutex);
+
+	if (copy_from_user(&trtt_params,
+			   to_user_ptr(args->value),
+			   sizeof(trtt_params))) {
+		mutex_lock(&dev->struct_mutex);
+		ret = -EFAULT;
+		goto exit;
+	}
+
+	mutex_lock(&dev->struct_mutex);
+
+	/* Check if the setup happened from another path */
+	if (ctx->flags & CONTEXT_USE_TRTT) {
+		ret = -EEXIST;
+		goto exit;
+	}
+
+	/* basic sanity checks for the segment location & l3 table pointer */
+	if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
+		i915_dbg(dev, "segment base address not correctly aligned\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (((trtt_params.l3_table_address + PAGE_SIZE) >=
+	     trtt_params.segment_base_addr) &&
+	    (trtt_params.l3_table_address <
+		    (trtt_params.segment_base_addr + GEN9_TRTT_SEGMENT_SIZE))) {
+		i915_dbg(dev, "l3 table address conflicts with trtt segment\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (trtt_params.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) {
+		i915_dbg(dev, "invalid l3 table address\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (trtt_params.null_tile_val == trtt_params.invd_tile_val) {
+		i915_dbg(dev, "incorrect values for null & invalid tiles\n");
+		return -EINVAL;
+	}
+
+	ctx->trtt_info.vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
+						trtt_params.segment_base_addr);
+	if (IS_ERR(ctx->trtt_info.vma)) {
+		ret = PTR_ERR(ctx->trtt_info.vma);
+		goto exit;
+	}
+
+	ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
+	ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
+	ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
+	ctx->trtt_info.segment_base_addr = trtt_params.segment_base_addr;
+
+	ret = intel_lr_rcs_context_setup_trtt(ctx);
+	if (ret) {
+		intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+		goto exit;
+	}
+
+	ctx->flags |= CONTEXT_USE_TRTT;
+
+exit:
+	return ret;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
@@ -931,7 +1062,14 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		return PTR_ERR(ctx);
 	}
 
-	args->size = 0;
+	/*
+	 * Take a reference also, as in certain cases we have to release &
+	 * reacquire the struct_mutex and we don't want the context to
+	 * go away.
+	 */
+	i915_gem_context_reference(ctx);
+
+	args->size = (args->param != I915_CONTEXT_PARAM_TRTT) ? 0 : args->size;
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		args->value = ctx->hang_stats.ban_period_seconds;
@@ -947,10 +1085,14 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		else
 			args->value = to_i915(dev)->gtt.base.total;
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_get_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
 	}
+	i915_gem_context_unreference(ctx);
 	mutex_unlock(&dev->struct_mutex);
 
 	return ret;
@@ -974,6 +1116,13 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		return PTR_ERR(ctx);
 	}
 
+	/*
+	 * Take a reference also, as in certain cases we have to release &
+	 * reacquire the struct_mutex and we don't want the context to
+	 * go away.
+	 */
+	i915_gem_context_reference(ctx);
+
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		if (args->size)
@@ -992,10 +1141,14 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
 		}
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_set_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
 	}
+	i915_gem_context_unreference(ctx);
 	mutex_unlock(&dev->struct_mutex);
 
 	return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9c752fe..c99bebf 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 {
 	gtt_write_workarounds(dev);
 
+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		/*
+		 * Globally enable TR-TT support in Hw.
+		 * Still TR-TT enabling on per context basis is required.
+		 * Non-trtt contexts are not affected by this setting.
+		 */
+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+			   GEN9_TRTT_BYPASS_DISABLE);
+	}
+
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -3368,6 +3379,60 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
 
 }
 
+void intel_trtt_context_destroy_vma(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+
+	WARN_ON(!list_empty(&vma->obj_link));
+	WARN_ON(!list_empty(&vma->vm_link));
+	WARN_ON(!list_empty(&vma->exec_list));
+
+	WARN_ON(!vma->pin_count);
+
+	if (drm_mm_node_allocated(&vma->node))
+		drm_mm_remove_node(&vma->node);
+
+	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
+	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
+}
+
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr)
+{
+	struct i915_vma *vma;
+	int ret;
+
+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
+	if (!vma)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->obj_link);
+	INIT_LIST_HEAD(&vma->vm_link);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+	/* Mark the vma as permanently pinned */
+	vma->pin_count = 1;
+
+	/* Reserve from the 48 bit PPGTT space */
+	vma->node.start = segment_base_addr;
+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	if (ret) {
+		ret = i915_gem_evict_for_vma(vma);
+		if (ret == 0)
+			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	}
+	if (ret) {
+		intel_trtt_context_destroy_vma(vma);
+		return ERR_PTR(ret);
+	}
+
+	return vma;
+}
+
 static struct scatterlist *
 rotate_pages(const dma_addr_t *in, unsigned int offset,
 	     unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index dc208c0..2374cb1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+/* Fixed size segment */
+#define GEN9_TRTT_SEG_SIZE_SHIFT	44
+#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
+
 enum i915_ggtt_view_type {
 	I915_GGTT_VIEW_NORMAL = 0,
 	I915_GGTT_VIEW_ROTATED,
@@ -562,4 +566,8 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr);
+void intel_trtt_context_destroy_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d4a298f..ead9dc5 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE	(1 << 0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE		0xF0
+#define   GEN9_TRVA_DATA_MASK		0xF
+
+#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1 << 1)
+#define   GEN9_TRTT_ENABLE		(1 << 0)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f727822..0dba2fe 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1644,6 +1644,76 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 	return init_workarounds_ring(engine);
 }
 
+static int gen9_init_rcs_context_trtt(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf, 0);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
+static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
+{
+	struct intel_context *ctx = req->ctx;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	u64 masked_l3_gfx_address =
+		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+	u32 trva_data_value =
+		(ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
+		GEN9_TRVA_DATA_MASK;
+	const int num_lri_cmds = 6;
+	int ret;
+
+	/*
+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
+	 * of directly updating the context image, as this will ensure that
+	 * update happens in a serialized manner for the context and also
+	 * lite-restore scenario will get handled.
+	 */
+	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
+	intel_logical_ring_emit(ringbuf, upper_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRVA_MASK_VALUE | trva_data_value);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
 {
 	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
@@ -2002,6 +2072,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 	return intel_lr_context_render_state_init(req);
 }
 
+static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
+{
+	int ret;
+
+	/*
+	 * Explicitly disable TR-TT at the start of a new context.
+	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
+	 * context the TR-TT settings of the outgoing context could get
+	 * spilled on to the new incoming context as only the Ring Context
+	 * part is loaded on the first submission of a new context, due to
+	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
+	 */
+	ret = gen9_init_rcs_context_trtt(req);
+	if (ret)
+		return ret;
+
+	return gen8_init_rcs_context(req);
+}
+
 /**
  * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
  *
@@ -2133,11 +2222,14 @@ static int logical_render_ring_init(struct drm_device *dev)
 	logical_ring_default_vfuncs(dev, engine);
 
 	/* Override some for render ring. */
-	if (INTEL_INFO(dev)->gen >= 9)
+	if (INTEL_INFO(dev)->gen >= 9) {
 		engine->init_hw = gen9_init_render_ring;
-	else
+		engine->init_context = gen9_init_rcs_context;
+	} else {
 		engine->init_hw = gen8_init_render_ring;
-	engine->init_context = gen8_init_rcs_context;
+		engine->init_context = gen8_init_rcs_context;
+	}
+
 	engine->cleanup = intel_fini_pipe_control;
 	engine->emit_flush = gen8_emit_flush_render;
 	engine->emit_request = gen8_emit_request_render;
@@ -2701,3 +2793,29 @@ void intel_lr_context_reset(struct drm_device *dev,
 		ringbuf->tail = 0;
 	}
 }
+
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx)
+{
+	struct intel_engine_cs *engine = &(ctx->i915->engine[RCS]);
+	struct drm_i915_gem_request *req;
+	int ret;
+
+	if (!ctx->engine[RCS].state) {
+		ret = intel_lr_context_deferred_alloc(ctx, engine);
+		if (ret)
+			return ret;
+	}
+
+	req = i915_gem_request_alloc(engine, ctx);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = gen9_emit_trtt_regs(req);
+	if (ret) {
+		i915_gem_request_cancel(req);
+		return ret;
+	}
+
+	i915_add_request(req);
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index a17cb12..f3600b2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -107,6 +107,7 @@ void intel_lr_context_reset(struct drm_device *dev,
 			struct intel_context *ctx);
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *engine);
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx);
 
 u32 intel_execlists_ctx_id(struct intel_context *ctx,
 			   struct intel_engine_cs *engine);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a5524cc..604da23 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
+#define I915_CONTEXT_PARAM_TRTT		0x4
 	__u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+	__u64 segment_base_addr;
+	__u64 l3_table_address;
+	__u32 invd_tile_val;
+	__u32 null_tile_val;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev8)
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
                   ` (8 preceding siblings ...)
  2016-03-11 11:41 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev6) Patchwork
@ 2016-03-18 12:45 ` Patchwork
  2016-03-22 10:45 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev9) Patchwork
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-03-18 12:45 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Support to enable TRTT on GEN9 (rev8)
URL   : https://patchwork.freedesktop.org/series/2321/
State : failure

== Summary ==

  CC      drivers/usb/host/xhci-pci.o
  CC [M]  drivers/net/usb/ax88179_178a.o
  LD      drivers/usb/storage/usb-storage.o
  LD      drivers/net/ethernet/synopsys/built-in.o
  CC [M]  drivers/net/usb/cdc_ether.o
  LD      drivers/usb/storage/built-in.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mac.o
  CC [M]  drivers/net/usb/cdc_eem.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_nvm.o
  CC [M]  drivers/net/usb/smsc75xx.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_phy.o
  CC [M]  drivers/net/usb/smsc95xx.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mbx.o
  CC [M]  drivers/net/usb/mcs7830.o
  CC [M]  drivers/net/usb/usbnet.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_i210.o
  CC [M]  drivers/net/usb/cdc_ncm.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_ptp.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_hwmon.o
  LD      drivers/usb/host/xhci-hcd.o
  LD      drivers/usb/host/built-in.o
  LD      drivers/usb/built-in.o
  LD [M]  drivers/net/ethernet/intel/igb/igb.o
  LD      net/xfrm/built-in.o
  LD      net/built-in.o
  LD [M]  drivers/net/ethernet/intel/e1000e/e1000e.o
  LD      drivers/net/ethernet/built-in.o
  LD      drivers/net/built-in.o
Makefile:950: recipe for target 'drivers' failed
make: *** [drivers] Error 2

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v9] drm/i915: Support to enable TRTT on GEN9
  2016-03-18 10:23                                   ` [PATCH v8] " akash.goel
@ 2016-03-22  8:42                                     ` akash.goel
  2016-03-24 16:29                                       ` Gore, Tim
  0 siblings, 1 reply; 59+ messages in thread
From: akash.goel @ 2016-03-22  8:42 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Gen9 has additional address translation hardware in the form of the
Tiled Resource Translation Table (TR-TT), which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.
Sparse resources are created as virtual-only allocations. Regions of the
resource that the application intends to use are bound to physical memory
on the fly and can be re-bound to different memory allocations over the
lifetime of the resource.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT is required
for each new PPGTT instance, but TR-TT need not be enabled for every context.
1/16th of the 48-bit PPGTT space is earmarked for translation by TR-TT;
which of the 16 chunks to use is conveyed to HW through a register.
Any GFX address that lies in that reserved 44-bit range is translated
through TR-TT first and then through PPGTT to get the actual physical address,
so the output of the TR-TT translation is a PPGTT offset.
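
To illustrate the segment selection, here is a minimal sketch in plain C of
how the value written to GEN9_TRTT_VA_MASKDATA follows from the segment base.
The constants mirror the ones added to i915_reg.h and i915_gem_gtt.h by this
patch. The helper names and the address-match model (bits [47:44] compared
against the 4-bit data field) are assumptions made for illustration, not
something the hardware documentation is quoted on here.

```c
#include <stdbool.h>
#include <stdint.h>

#define TRTT_SEG_SIZE_SHIFT 44   /* GEN9_TRTT_SEG_SIZE_SHIFT in the patch */
#define TRVA_MASK_VALUE     0xF0 /* GEN9_TRVA_MASK_VALUE in the patch */
#define TRVA_DATA_MASK      0xF  /* GEN9_TRVA_DATA_MASK in the patch */

/* Same computation as gen9_emit_trtt_regs(): mask bits OR segment select. */
static uint32_t trtt_va_maskdata(uint64_t segment_base_addr)
{
	uint32_t data = (uint32_t)(segment_base_addr >> TRTT_SEG_SIZE_SHIFT) &
			TRVA_DATA_MASK;

	return TRVA_MASK_VALUE | data;
}

/*
 * Assumed model (hypothetical helper): an address is routed through TR-TT
 * when its top four bits of the 48-bit space equal the segment select.
 */
static bool trtt_addr_in_segment(uint64_t gfx_addr, uint64_t segment_base_addr)
{
	uint64_t sel = (segment_base_addr >> TRTT_SEG_SIZE_SHIFT) & TRVA_DATA_MASK;

	return ((gfx_addr >> TRTT_SEG_SIZE_SHIFT) & TRVA_DATA_MASK) == sel;
}
```

For a segment placed at the top of the 48-bit space (base 0xF00000000000),
the register value under this sketch is 0xFF.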

TR-TT is constructed as a 3-level tile table. Each tile is 64KB in size, which
leaves 44-16=28 address bits. These 28 bits are partitioned as 9+9+10, and
each level is contained within a 4KB page; hence L3 and L2 are each composed
of 512 64-bit entries and L1 is composed of 1024 32-bit entries.
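
The 9+9+10 split can be sketched as follows. This is an illustration only:
the struct and helper names are hypothetical, and the bit ordering (L3 index
in the most significant bits, then L2, then L1, then the 16-bit offset within
the tile) is an assumption inferred from the entry counts above.

```c
#include <stdint.h>

/* Field widths from the description: 64KB tiles, 9+9+10 index bits. */
#define TRTT_TILE_SHIFT 16
#define TRTT_L1_BITS    10
#define TRTT_L2_BITS    9
#define TRTT_L3_BITS    9

struct trtt_walk {
	uint32_t l3_index;    /* 0..511, selects a 64-bit L3 entry */
	uint32_t l2_index;    /* 0..511, selects a 64-bit L2 entry */
	uint32_t l1_index;    /* 0..1023, selects a 32-bit L1 entry */
	uint32_t tile_offset; /* 0..65535, byte offset inside the tile */
};

/* Hypothetical helper: split a 44-bit offset within the TR-TT segment. */
static struct trtt_walk trtt_split_offset(uint64_t seg_offset)
{
	struct trtt_walk w;

	w.tile_offset = (uint32_t)(seg_offset & ((1u << TRTT_TILE_SHIFT) - 1));
	seg_offset >>= TRTT_TILE_SHIFT;
	w.l1_index = (uint32_t)(seg_offset & ((1u << TRTT_L1_BITS) - 1));
	seg_offset >>= TRTT_L1_BITS;
	w.l2_index = (uint32_t)(seg_offset & ((1u << TRTT_L2_BITS) - 1));
	seg_offset >>= TRTT_L2_BITS;
	w.l3_index = (uint32_t)(seg_offset & ((1u << TRTT_L3_BITS) - 1));

	return w;
}
```

The four fields add up to 9+9+10+16 = 44 bits, matching the segment size.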

There is a provision to keep the TR-TT tables in virtual space, where the
pages of the TR-TT tables are mapped to PPGTT.
Currently this is the supported mode. In this mode UMD has full control
over TR-TT management, with bare minimum support from KMD.
So the entries of the L3 table contain the PPGTT offsets of L2 table pages;
similarly, the entries of the L2 table contain the PPGTT offsets of L1 table
pages. The entries of the L1 table contain the PPGTT offsets of the BOs
actually backing the Sparse resources.
UMD has to allocate the L3/L2/L1 table pages as regular BOs and assign them
a PPGTT address through the Soft Pin API (for example, use soft pin to assign
l3_table_address to the L3 table BO, when used).
UMD also programs the entries in the TR-TT page tables, either using regular
batch commands (MI_STORE_DATA_IMM) or via mmapping of the page table BOs.
UMD may do the complete PPGTT address space management, on the pretext that it
could help minimize the conflicts.

Any space in the TR-TT segment not bound to any Sparse texture is handled
through the Invalid tile; the user is expected to initialize the entries of a
new L3/L2/L1 table page with the Invalid tile pattern. The entries
corresponding to the holes in the Sparse texture resource are set with the
Null tile pattern.
Improper programming of TR-TT should only lead to a recoverable GPU hang,
eventually leading to banning of the culprit context without victimizing others.
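
As a rough sketch of what this initialization could look like on the UMD
side for an L1 table page accessed through a CPU mmap: every entry starts
out as the Invalid tile pattern, and entries covering holes in the resource
are then overwritten with the Null tile pattern. The helper names are
hypothetical, and it is an assumption here that a 32-bit L1 entry simply
holds the same value that is programmed in the corresponding tile register.
The 64-bit L3/L2 entries are not covered.

```c
#include <stddef.h>
#include <stdint.h>

#define TRTT_L1_ENTRIES 1024 /* one 4KB page of 32-bit entries */

/* Fill a freshly allocated L1 page with the Invalid tile pattern. */
static void trtt_l1_page_init(uint32_t *l1, uint32_t invd_tile_val)
{
	for (size_t i = 0; i < TRTT_L1_ENTRIES; i++)
		l1[i] = invd_tile_val;
}

/* Mark `count` tiles starting at `first` as holes (Null tile pattern). */
static void trtt_l1_mark_hole(uint32_t *l1, size_t first, size_t count,
			      uint32_t null_tile_val)
{
	for (size_t i = first; i < TRTT_L1_ENTRIES && i - first < count; i++)
		l1[i] = null_tile_val;
}
```

The two patterns have to differ, which is also what the check added in v7
enforces on the kernel side.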

The association of any Sparse resource with the BOs is known only to UMD,
and only Sparse resources shall be assigned an offset from the TR-TT segment
by UMD. The use of the TR-TT segment and the mapping of Sparse resources are
transparent to KMD: UMD does the address assignment from the TR-TT segment
autonomously and KMD is oblivious of it.
No other object may be assigned an address from the TR-TT segment; such
objects are mapped to PPGTT in the regular way by KMD.

This patch provides an interface through which UMD can ask KMD to enable
TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param has been
added to the I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
UMD has to pass the GFX address of the L3 table page and the start location
of the TR-TT segment, along with the pattern values for the Null & Invalid
tile registers.
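
For reference, the sanity checks that intel_context_set_trtt() applies to
these parameters can be mirrored in userspace before issuing the ioctl. The
sketch below is a standalone approximation: the struct is a local copy of
drm_i915_gem_context_trtt_param from this patch, trtt_param_check() is a
hypothetical helper, and PAGE_SIZE is assumed to be 4096.

```c
#include <errno.h>
#include <stdint.h>

#define TRTT_SEGMENT_SIZE    (1ULL << 44)       /* GEN9_TRTT_SEGMENT_SIZE */
#define TRTT_L3_GFXADDR_MASK 0xFFFFFFFF0000ULL  /* GEN9_TRTT_L3_GFXADDR_MASK */
#define TRTT_PAGE_SIZE       4096ULL            /* assumed PAGE_SIZE */

/* Local mirror of struct drm_i915_gem_context_trtt_param. */
struct trtt_param {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

/* Returns 0 if the parameters pass the same checks as the kernel path. */
static int trtt_param_check(const struct trtt_param *p)
{
	/* Segment base must be aligned to the 44-bit segment size. */
	if (p->segment_base_addr & (TRTT_SEGMENT_SIZE - 1))
		return -EINVAL;

	/* L3 table page must not overlap the TR-TT segment. */
	if (p->l3_table_address + TRTT_PAGE_SIZE >= p->segment_base_addr &&
	    p->l3_table_address < p->segment_base_addr + TRTT_SEGMENT_SIZE)
		return -EINVAL;

	/* L3 table address must be 64KB aligned and fit in 48 bits. */
	if (p->l3_table_address & ~TRTT_L3_GFXADDR_MASK)
		return -EINVAL;

	/* Null and Invalid tile patterns must be distinguishable. */
	if (p->null_tile_val == p->invd_tile_val)
		return -EINVAL;

	return 0;
}
```

A parameter block that passes would then be handed to the SETPARAM ioctl
with param set to I915_CONTEXT_PARAM_TRTT, size set to the struct size and
value pointing at the struct.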

v2:
 - Support context_getparam for TRTT also and dispense with a separate
   GETPARAM case for TRTT (Chris).
 - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
   from user space (Chris).
 - Move all the argument checking for TRTT in context_setparam to the
   set_trtt function (Chris).
 - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
 - Rename certain functions to rightly reflect their purpose, rename
   the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
   rephrase a few lines in the commit message body, add more comments (Chris).
 - Extend ABI to allow the user to specify the TRTT segment location also.
 - Fix for selective enabling of TRTT on per context basis, explicitly
   disable TR-TT at the start of a new context.

v3:
 - Check the return value of gen9_emit_trtt_regs (Chris)
 - Update the kernel doc for intel_context structure.
 - Rebased.

v4:
 - Fix the warnings reported by 'checkpatch.pl --strict' (Michel)
 - Fix the context_getparam implementation avoiding the reset of size field,
   affecting the TRTT case.

v5:
 - Update the TR-TT params right away in context_setparam, by constructing
   & submitting a request emitting LRIs, instead of deferring it and
   conflating with the next batch submission (Chris)
 - Follow the struct_mutex handling related prescribed rules, while accessing
   User space buffer, both in context_setparam & getparam functions (Chris).

v6:
 - Fix the warning caused due to removal of un-allocated trtt vma node.

v7:
 - Move context ref/unref to context_setparam_ioctl from set_trtt() & remove
   that from get_trtt() as not really needed there (Chris).
 - Add a check for improper values for Null & Invalid Tiles.
 - Remove superfluous DRM_ERROR from trtt_context_allocate_vma (Chris).
 - Rebased.

v8:
 - Add context ref/unref to context_getparam_ioctl also so as to be consistent
   and ease the extension of ioctl in future (Chris)

v9:
 - Fix the handling of return value from trtt_context_allocate_vma() function,
   causing kernel panic at the time of destroying context, in case of
   unsuccessful allocation of trtt vma.
 - Rebased.

Testcase: igt/gem_trtt

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h         |  16 +++-
 drivers/gpu/drm/i915/i915_gem_context.c | 157 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c     |  65 +++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 ++
 drivers/gpu/drm/i915/i915_reg.h         |  19 ++++
 drivers/gpu/drm/i915/intel_lrc.c        | 124 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h        |   1 +
 include/uapi/drm/i915_drm.h             |   8 ++
 8 files changed, 393 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ecbd418..272d1f8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -804,6 +804,7 @@ struct i915_ctx_hang_stats {
 #define DEFAULT_CONTEXT_HANDLE 0
 
 #define CONTEXT_NO_ZEROMAP (1<<0)
+#define CONTEXT_USE_TRTT   (1 << 1)
 /**
  * struct intel_context - as the name implies, represents a context.
  * @ref: reference count.
@@ -818,6 +819,8 @@ struct i915_ctx_hang_stats {
  * @ppgtt: virtual memory space used by this context.
  * @legacy_hw_ctx: render context backing object and whether it is correctly
  *                initialized (legacy ring submission mechanism only).
+ * @trtt_info: Programming parameters for tr-tt (redirection tables for
+ *             userspace, for sparse resource management)
  * @link: link in the global list of contexts.
  *
  * Contexts are memory images used by the hardware to store copies of their
@@ -828,7 +831,7 @@ struct intel_context {
 	int user_handle;
 	uint8_t remap_slice;
 	struct drm_i915_private *i915;
-	int flags;
+	unsigned int flags;
 	struct drm_i915_file_private *file_priv;
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_hw_ppgtt *ppgtt;
@@ -849,6 +852,15 @@ struct intel_context {
 		uint32_t *lrc_reg_state;
 	} engine[I915_NUM_ENGINES];
 
+	/* TRTT info */
+	struct intel_context_trtt {
+		u32 invd_tile_val;
+		u32 null_tile_val;
+		u64 l3_table_address;
+		u64 segment_base_addr;
+		struct i915_vma *vma;
+	} trtt_info;
+
 	struct list_head link;
 };
 
@@ -2657,6 +2669,8 @@ struct drm_i915_cmd_table {
 				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
 				 !IS_BROXTON(dev))
 
+#define HAS_TRTT(dev)		(IS_GEN9(dev))
+
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 394e525..5f28c23 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev)
 	return ret;
 }
 
+static void intel_context_free_trtt(struct intel_context *ctx)
+{
+	if (!ctx->trtt_info.vma)
+		return;
+
+	intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+}
+
 static void i915_gem_context_clean(struct intel_context *ctx)
 {
 	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
@@ -164,6 +172,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	 */
 	i915_gem_context_clean(ctx);
 
+	intel_context_free_trtt(ctx);
+
 	i915_ppgtt_put(ctx->ppgtt);
 
 	if (ctx->legacy_hw_ctx.rcs_state)
@@ -507,6 +517,129 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static int
+intel_context_get_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct drm_device *dev = ctx->i915->dev;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) {
+		return -ENODEV;
+	} else if (args->size < sizeof(trtt_params)) {
+		args->size = sizeof(trtt_params);
+	} else {
+		trtt_params.segment_base_addr =
+			ctx->trtt_info.segment_base_addr;
+		trtt_params.l3_table_address =
+			ctx->trtt_info.l3_table_address;
+		trtt_params.null_tile_val =
+			ctx->trtt_info.null_tile_val;
+		trtt_params.invd_tile_val =
+			ctx->trtt_info.invd_tile_val;
+
+		mutex_unlock(&dev->struct_mutex);
+
+		if (__copy_to_user(to_user_ptr(args->value),
+				   &trtt_params,
+				   sizeof(trtt_params))) {
+			mutex_lock(&dev->struct_mutex);
+			return -EFAULT;
+		}
+
+		args->size = sizeof(trtt_params);
+		mutex_lock(&dev->struct_mutex);
+	}
+
+	return 0;
+}
+
+static int
+intel_context_set_trtt(struct intel_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_trtt_param trtt_params;
+	struct i915_vma *vma;
+	struct drm_device *dev = ctx->i915->dev;
+	int ret;
+
+	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
+		return -ENODEV;
+	else if (ctx->flags & CONTEXT_USE_TRTT)
+		return -EEXIST;
+	else if (args->size < sizeof(trtt_params))
+		return -EINVAL;
+
+	mutex_unlock(&dev->struct_mutex);
+
+	if (copy_from_user(&trtt_params,
+			   to_user_ptr(args->value),
+			   sizeof(trtt_params))) {
+		mutex_lock(&dev->struct_mutex);
+		ret = -EFAULT;
+		goto exit;
+	}
+
+	mutex_lock(&dev->struct_mutex);
+
+	/* Check if the setup happened from another path */
+	if (ctx->flags & CONTEXT_USE_TRTT) {
+		ret = -EEXIST;
+		goto exit;
+	}
+
+	/* basic sanity checks for the segment location & l3 table pointer */
+	if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
+		i915_dbg(dev, "segment base address not correctly aligned\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (((trtt_params.l3_table_address + PAGE_SIZE) >=
+	     trtt_params.segment_base_addr) &&
+	    (trtt_params.l3_table_address <
+		    (trtt_params.segment_base_addr + GEN9_TRTT_SEGMENT_SIZE))) {
+		i915_dbg(dev, "l3 table address conflicts with trtt segment\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (trtt_params.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) {
+		i915_dbg(dev, "invalid l3 table address\n");
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	if (trtt_params.null_tile_val == trtt_params.invd_tile_val) {
+		i915_dbg(dev, "incorrect values for null & invalid tiles\n");
+		return -EINVAL;
+	}
+
+	vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
+					trtt_params.segment_base_addr);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto exit;
+	}
+
+	ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
+	ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
+	ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
+	ctx->trtt_info.segment_base_addr = trtt_params.segment_base_addr;
+
+	ret = intel_lr_rcs_context_setup_trtt(ctx);
+	if (ret) {
+		intel_trtt_context_destroy_vma(vma);
+		goto exit;
+	}
+
+	ctx->trtt_info.vma = vma;
+	ctx->flags |= CONTEXT_USE_TRTT;
+
+exit:
+	return ret;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 {
@@ -931,7 +1064,14 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		return PTR_ERR(ctx);
 	}
 
-	args->size = 0;
+	/*
+	 * Take a reference also, as in certain cases we have to release &
+	 * reacquire the struct_mutex and we don't want the context to
+	 * go away.
+	 */
+	i915_gem_context_reference(ctx);
+
+	args->size = (args->param != I915_CONTEXT_PARAM_TRTT) ? 0 : args->size;
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		args->value = ctx->hang_stats.ban_period_seconds;
@@ -947,10 +1087,14 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		else
 			args->value = to_i915(dev)->ggtt.base.total;
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_get_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
 	}
+	i915_gem_context_unreference(ctx);
 	mutex_unlock(&dev->struct_mutex);
 
 	return ret;
@@ -974,6 +1118,13 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		return PTR_ERR(ctx);
 	}
 
+	/*
+	 * Take a reference also, as in certain cases we have to release &
+	 * reacquire the struct_mutex and we don't want the context to
+	 * go away.
+	 */
+	i915_gem_context_reference(ctx);
+
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		if (args->size)
@@ -992,10 +1143,14 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0;
 		}
 		break;
+	case I915_CONTEXT_PARAM_TRTT:
+		ret = intel_context_set_trtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
 	}
+	i915_gem_context_unreference(ctx);
 	mutex_unlock(&dev->struct_mutex);
 
 	return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0715bb7..cbf8a03 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 {
 	gtt_write_workarounds(dev);
 
+	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		/*
+		 * Globally enable TR-TT support in Hw.
+		 * Still TR-TT enabling on per context basis is required.
+		 * Non-trtt contexts are not affected by this setting.
+		 */
+		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+			   GEN9_TRTT_BYPASS_DISABLE);
+	}
+
 	/* In the case of execlists, PPGTT is enabled by the context descriptor
 	 * and the PDPs are contained within the context itself.  We don't
 	 * need to do anything here. */
@@ -3362,6 +3373,60 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
 
 }
 
+void intel_trtt_context_destroy_vma(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+
+	WARN_ON(!list_empty(&vma->obj_link));
+	WARN_ON(!list_empty(&vma->vm_link));
+	WARN_ON(!list_empty(&vma->exec_list));
+
+	WARN_ON(!vma->pin_count);
+
+	if (drm_mm_node_allocated(&vma->node))
+		drm_mm_remove_node(&vma->node);
+
+	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
+	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
+}
+
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr)
+{
+	struct i915_vma *vma;
+	int ret;
+
+	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
+	if (!vma)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->obj_link);
+	INIT_LIST_HEAD(&vma->vm_link);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+	/* Mark the vma as permanently pinned */
+	vma->pin_count = 1;
+
+	/* Reserve from the 48 bit PPGTT space */
+	vma->node.start = segment_base_addr;
+	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
+	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	if (ret) {
+		ret = i915_gem_evict_for_vma(vma);
+		if (ret == 0)
+			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+	}
+	if (ret) {
+		intel_trtt_context_destroy_vma(vma);
+		return ERR_PTR(ret);
+	}
+
+	return vma;
+}
+
 static struct scatterlist *
 rotate_pages(const dma_addr_t *in, unsigned int offset,
 	     unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d804be0..8cbaca2 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+/* Fixed size segment */
+#define GEN9_TRTT_SEG_SIZE_SHIFT	44
+#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
+
 enum i915_ggtt_view_type {
 	I915_GGTT_VIEW_NORMAL = 0,
 	I915_GGTT_VIEW_ROTATED,
@@ -560,4 +564,8 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+				uint64_t segment_base_addr);
+void intel_trtt_context_destroy_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 264885f..07936b6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -188,6 +188,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE	(1 << 0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE		0xF0
+#define   GEN9_TRVA_DATA_MASK		0xF
+
+#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1 << 1)
+#define   GEN9_TRTT_ENABLE		(1 << 0)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3a23b95..8af480b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1645,6 +1645,76 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 	return init_workarounds_ring(engine);
 }
 
+static int gen9_init_rcs_context_trtt(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	int ret;
+
+	ret = intel_logical_ring_begin(req, 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf, 0);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
+static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
+{
+	struct intel_context *ctx = req->ctx;
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	u64 masked_l3_gfx_address =
+		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+	u32 trva_data_value =
+		(ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
+		GEN9_TRVA_DATA_MASK;
+	const int num_lri_cmds = 6;
+	int ret;
+
+	/*
+	 * Emitting LRIs to update the TRTT registers is most reliable, instead
+	 * of directly updating the context image, as this will ensure that
+	 * update happens in a serialized manner for the context and also
+	 * lite-restore scenario will get handled.
+	 */
+	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
+	intel_logical_ring_emit(ringbuf, upper_32_bits(masked_l3_gfx_address));
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
+	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRVA_MASK_VALUE | trva_data_value);
+
+	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
+	intel_logical_ring_emit(ringbuf,
+				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
 {
 	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
@@ -2003,6 +2073,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 	return intel_lr_context_render_state_init(req);
 }
 
+static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
+{
+	int ret;
+
+	/*
+	 * Explicitly disable TR-TT at the start of a new context.
+	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
+	 * context the TR-TT settings of the outgoing context could get
+	 * spilled on to the new incoming context as only the Ring Context
+	 * part is loaded on the first submission of a new context, due to
+	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
+	 */
+	ret = gen9_init_rcs_context_trtt(req);
+	if (ret)
+		return ret;
+
+	return gen8_init_rcs_context(req);
+}
+
 /**
  * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
  *
@@ -2134,11 +2223,14 @@ static int logical_render_ring_init(struct drm_device *dev)
 	logical_ring_default_vfuncs(dev, engine);
 
 	/* Override some for render ring. */
-	if (INTEL_INFO(dev)->gen >= 9)
+	if (INTEL_INFO(dev)->gen >= 9) {
 		engine->init_hw = gen9_init_render_ring;
-	else
+		engine->init_context = gen9_init_rcs_context;
+	} else {
 		engine->init_hw = gen8_init_render_ring;
-	engine->init_context = gen8_init_rcs_context;
+		engine->init_context = gen8_init_rcs_context;
+	}
+
 	engine->cleanup = intel_fini_pipe_control;
 	engine->emit_flush = gen8_emit_flush_render;
 	engine->emit_request = gen8_emit_request_render;
@@ -2702,3 +2794,29 @@ void intel_lr_context_reset(struct drm_device *dev,
 		ringbuf->tail = 0;
 	}
 }
+
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx)
+{
+	struct intel_engine_cs *engine = &(ctx->i915->engine[RCS]);
+	struct drm_i915_gem_request *req;
+	int ret;
+
+	if (!ctx->engine[RCS].state) {
+		ret = intel_lr_context_deferred_alloc(ctx, engine);
+		if (ret)
+			return ret;
+	}
+
+	req = i915_gem_request_alloc(engine, ctx);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = gen9_emit_trtt_regs(req);
+	if (ret) {
+		i915_gem_request_cancel(req);
+		return ret;
+	}
+
+	i915_add_request(req);
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index a17cb12..f3600b2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -107,6 +107,7 @@ void intel_lr_context_reset(struct drm_device *dev,
 			struct intel_context *ctx);
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *engine);
+int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx);
 
 u32 intel_execlists_ctx_id(struct intel_context *ctx,
 			   struct intel_engine_cs *engine);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a5524cc..604da23 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
+#define I915_CONTEXT_PARAM_TRTT		0x4
 	__u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+	__u64 segment_base_addr;
+	__u64 l3_table_address;
+	__u32 invd_tile_val;
+	__u32 null_tile_val;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev9)
  2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
                   ` (9 preceding siblings ...)
  2016-03-18 12:45 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev8) Patchwork
@ 2016-03-22 10:45 ` Patchwork
  10 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-03-22 10:45 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Support to enable TRTT on GEN9 (rev9)
URL   : https://patchwork.freedesktop.org/series/2321/
State : failure

== Summary ==

  CC [M]  drivers/net/ethernet/intel/e1000e/ethtool.o
  LD      drivers/pnp/pnpacpi/pnp.o
  LD      drivers/pnp/pnpacpi/built-in.o
  LD      drivers/pnp/built-in.o
  CC [M]  drivers/net/usb/smsc75xx.o
  CC [M]  drivers/net/ethernet/intel/e1000e/netdev.o
  CC [M]  drivers/net/usb/smsc95xx.o
  LD      drivers/md/dm-mod.o
  LD      drivers/md/built-in.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ptp.o
  CC [M]  drivers/net/usb/mcs7830.o
  CC [M]  drivers/net/usb/usbnet.o
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_param.o
  CC [M]  drivers/net/usb/cdc_ncm.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_82575.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mac.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_nvm.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_phy.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mbx.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_i210.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_ptp.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_hwmon.o
  LD [M]  drivers/net/ethernet/intel/e1000/e1000.o
  LD [M]  drivers/net/ethernet/intel/igbvf/igbvf.o
  LD [M]  drivers/net/ethernet/intel/igb/igb.o
  LD [M]  drivers/net/ethernet/intel/e1000e/e1000e.o
  LD      drivers/net/ethernet/built-in.o
  LD      drivers/net/built-in.o
Makefile:950: recipe for target 'drivers' failed
make: *** [drivers] Error 2


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v9] drm/i915: Support to enable TRTT on GEN9
  2016-03-22  8:42                                     ` [PATCH v9] " akash.goel
@ 2016-03-24 16:29                                       ` Gore, Tim
  2016-03-24 16:36                                         ` Goel, Akash
  0 siblings, 1 reply; 59+ messages in thread
From: Gore, Tim @ 2016-03-24 16:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Goel, Akash


Tim Gore 
Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ

> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
> Of akash.goel@intel.com
> Sent: Tuesday, March 22, 2016 8:43 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Goel, Akash
> Subject: [Intel-gfx] [PATCH v9] drm/i915: Support to enable TRTT on GEN9
> 
> From: Akash Goel <akash.goel@intel.com>
> 
> Gen9 has an additional address translation hardware support in form of Tiled
> Resource Translation Table (TR-TT) which provides an extra level of
> abstraction over PPGTT.
> This is useful for mapping Sparse/Tiled texture resources.
> Sparse resources are created as virtual-only allocations. Regions of the
> resource that the application intends to use are bound to the physical memory
> on the fly and can be re-bound to different memory allocations over the
> lifetime of the resource.
> 
> TR-TT is tightly coupled with PPGTT: a new instance of TR-TT will be required
> for a new PPGTT instance, but TR-TT may not be enabled for every context.
> 1/16th of the 48bit PPGTT space is earmarked for the translation by TR-TT;
> which such chunk to use is conveyed to HW through a register.
> Any GFX address, which lies in that reserved 44 bit range will be translated
> through TR-TT first and then through PPGTT to get the actual physical
> address, so the output of translation from TR-TT will be a PPGTT offset.
> 
> TRTT is constructed as a 3 level tile Table. Each tile is 64KB in size, which
> leaves behind 44-16=28 address bits. The 28 bits are partitioned as 9+9+10, and
> each level is contained within a 4KB page; hence L3 and L2 are each composed of
> 512 64b entries and L1 is composed of 1024 32b entries.
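
The split described above can be sketched in standalone C. This is only an illustration, not code from the patch: trtt_split() and struct trtt_indices are hypothetical names, and the sketch assumes that L3 indexes the top 9 bits of the 44-bit segment offset, L2 the next 9 bits and L1 the following 10 bits, with the low 16 bits addressing a byte inside the 64KB tile.

```c
#include <stdint.h>

/* Illustrative constants derived from the description, not driver macros. */
#define TRTT_SEGMENT_BITS 44	/* 1/16th of the 48-bit PPGTT space */
#define TRTT_TILE_BITS    16	/* 64KB tiles */
#define TRTT_L1_BITS      10	/* 1024 32-bit entries in a 4KB page */
#define TRTT_L2_BITS       9	/* 512 64-bit entries in a 4KB page */
#define TRTT_L3_BITS       9	/* 512 64-bit entries in a 4KB page */

struct trtt_indices {
	uint32_t l3;		/* index into the L3 table */
	uint32_t l2;		/* index into the selected L2 table */
	uint32_t l1;		/* index into the selected L1 table */
	uint32_t tile_offset;	/* byte offset inside the 64KB tile */
};

/* Split a GFX address lying in the TR-TT segment into table indices. */
static struct trtt_indices trtt_split(uint64_t gfx_addr)
{
	/* Drop the upper 4 bits that select the segment itself. */
	uint64_t off = gfx_addr & ((1ULL << TRTT_SEGMENT_BITS) - 1);
	struct trtt_indices idx;

	idx.tile_offset = (uint32_t)(off & ((1ULL << TRTT_TILE_BITS) - 1));
	off >>= TRTT_TILE_BITS;
	idx.l1 = (uint32_t)(off & ((1ULL << TRTT_L1_BITS) - 1));
	off >>= TRTT_L1_BITS;
	idx.l2 = (uint32_t)(off & ((1ULL << TRTT_L2_BITS) - 1));
	off >>= TRTT_L2_BITS;
	idx.l3 = (uint32_t)(off & ((1ULL << TRTT_L3_BITS) - 1));

	return idx;
}
```

Under these assumptions, an offset of (3 << 35) | (5 << 26) | (7 << 16) | 0x1234 selects L3 entry 3, L2 entry 5, L1 entry 7 and byte 0x1234 within the tile.
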
> 
> There is a provision to keep TR-TT Tables in virtual space, where the pages of
> TRTT tables will be mapped to PPGTT.
> Currently this is the supported mode; in this mode UMD will have full
> control over TR-TT management, with bare minimum support from KMD.
> So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
> similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
> The entries of L1 table will contain the PPGTT offset of BOs actually backing
> the Sparse resources.
> UMD will have to allocate the L3/L2/L1 table pages as a regular BO only &
> assign them a PPGTT address through the Soft Pin API (for example, use soft
> pin to assign l3_table_address to the L3 table BO, when used).
> UMD will also program the entries in the TR-TT page tables using regular
> batch commands (MI_STORE_DATA_IMM), or via mmapping of the page
> table BOs.
> UMD may do the complete PPGTT address space management, on the
> pretext that it could help minimize the conflicts.
> 
> Any space in the TR-TT segment not bound to any Sparse texture will be handled
> through the Invalid tile. The user is expected to initialize the entries of a new
> L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding to
> the holes in the Sparse texture resource will be set with the Null tile pattern.
> Improper programming of TRTT should only lead to a recoverable GPU
> hang, eventually leading to banning of the culprit context without victimizing
> others.
> 
> The association of any Sparse resource with the BOs will be known only to
> UMD, and only the Sparse resources shall be assigned an offset from the TR-
> TT segment by UMD. The use of TR-TT segment or mapping of Sparse
> resources will be transparent to the KMD, UMD will do the address
> assignment from TR-TT segment autonomously and KMD will be oblivious of
> it.
> Any object must not be assigned an address from TR-TT segment, they will
> be mapped to PPGTT in a regular way by KMD.
> 
> This patch provides an interface through which UMD can ask KMD to
> enable TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param
> has been added to I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
> UMD will have to pass the GFX address of the L3 table page and the start
> location of the TR-TT segment, along with the pattern values for the Null &
> Invalid Tile registers.
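
As a rough illustration of what this interface expects from user space, the parameter sanity checks and the TR-VA register value might be modelled in standalone C as below. This is a sketch under stated assumptions, not the patch's code: struct trtt_params, trtt_params_valid() and trtt_va_maskdata() are hypothetical names, the constants mirror the GEN9_TRTT_* definitions quoted further down, and the overlap test is written as a plain interval check rather than the exact expression used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative copies of the values used by the patch. */
#define TRTT_SEGMENT_SHIFT 44
#define TRTT_SEGMENT_SIZE  (1ULL << TRTT_SEGMENT_SHIFT)
#define TRTT_L3_ADDR_MASK  0xFFFFFFFF0000ULL	/* bits 47:16 */
#define TRTT_TABLE_PAGE    4096ULL
#define TRVA_MASK_VALUE    0xF0u
#define TRVA_DATA_MASK     0xFu

/* Same field layout as the proposed drm_i915_gem_context_trtt_param. */
struct trtt_params {
	uint64_t segment_base_addr;
	uint64_t l3_table_address;
	uint32_t invd_tile_val;
	uint32_t null_tile_val;
};

static bool trtt_params_valid(const struct trtt_params *p)
{
	uint64_t seg_start = p->segment_base_addr;
	uint64_t seg_end = seg_start + TRTT_SEGMENT_SIZE;

	/* The segment must start on a 2^44 boundary. */
	if (seg_start & (TRTT_SEGMENT_SIZE - 1))
		return false;

	/* The L3 pointer may only use the address bits the register holds. */
	if (p->l3_table_address & ~TRTT_L3_ADDR_MASK)
		return false;

	/* The L3 table page must not lie inside the TR-TT segment. */
	if (p->l3_table_address + TRTT_TABLE_PAGE > seg_start &&
	    p->l3_table_address < seg_end)
		return false;

	/* Null and Invalid tile patterns must be distinguishable. */
	if (p->null_tile_val == p->invd_tile_val)
		return false;

	return true;
}

/* Value for the VA mask/data register: which 1/16th chunk is the segment. */
static uint32_t trtt_va_maskdata(uint64_t segment_base_addr)
{
	uint32_t data = (uint32_t)(segment_base_addr >> TRTT_SEGMENT_SHIFT) &
			TRVA_DATA_MASK;

	return TRVA_MASK_VALUE | data;
}
```

With a segment base of 0xE00000000000 (the 15th of the 16 chunks) and an L3 table at 0x10000, the checks pass and the register value works out to 0xFE.
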
> 
> v2:
>  - Support context_getparam for TRTT also and dispense with a separate
>    GETPARAM case for TRTT (Chris).
>  - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
>    from user space (Chris).
>  - Move all the argument checking for TRTT in context_setparam to the
>    set_trtt function (Chris).
>  - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
>  - Rename certain functions to rightly reflect their purpose, rename
>    the new param for TRTT in gem_context_param to
> I915_CONTEXT_PARAM_TRTT,
>    rephrase few lines in the commit message body, add more comments
> (Chris).
>  - Extend ABI to allow User specify TRTT segment location also.
>  - Fix for selective enabling of TRTT on per context basis, explicitly
>    disable TR-TT at the start of a new context.
> 
> v3:
>  - Check the return value of gen9_emit_trtt_regs (Chris)
>  - Update the kernel doc for intel_context structure.
>  - Rebased.
> 
> v4:
>  - Fix the warnings reported by 'checkpatch.pl --strict' (Michel)
>  - Fix the context_getparam implementation avoiding the reset of size field,
>    affecting the TRTT case.
> 
> v5:
>  - Update the TR-TT params right away in context_setparam, by constructing
>    & submitting a request emitting LRIs, instead of deferring it and
>    conflating with the next batch submission (Chris)
>  - Follow the struct_mutex handling related prescribed rules, while accessing
>    User space buffer, both in context_setparam & getparam functions (Chris).
> 
> v6:
>  - Fix the warning caused due to removal of un-allocated trtt vma node.
> 
> v7:
>  - Move context ref/unref to context_setparam_ioctl from set_trtt() &
> remove
>    that from get_trtt() as not really needed there (Chris).
>  - Add a check for improper values for Null & Invalid Tiles.
>  - Remove superfluous DRM_ERROR from trtt_context_allocate_vma (Chris).
>  - Rebased.
> 
> v8:
>  - Add context ref/unref to context_getparam_ioctl also so as to be
> consistent
>    and ease the extension of ioctl in future (Chris)
> 
> v9:
>  - Fix the handling of return value from trtt_context_allocate_vma() function,
>    causing kernel panic at the time of destroying context, in case of
>    unsuccessful allocation of trtt vma.
>  - Rebased.
> 
> Testcase: igt/gem_trtt
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  16 +++-
>  drivers/gpu/drm/i915/i915_gem_context.c | 157
> +++++++++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_gem_gtt.c     |  65 +++++++++++++
>  drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 ++
>  drivers/gpu/drm/i915/i915_reg.h         |  19 ++++
>  drivers/gpu/drm/i915/intel_lrc.c        | 124 ++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/intel_lrc.h        |   1 +
>  include/uapi/drm/i915_drm.h             |   8 ++
>  8 files changed, 393 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h index ecbd418..272d1f8 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -804,6 +804,7 @@ struct i915_ctx_hang_stats {  #define
> DEFAULT_CONTEXT_HANDLE 0
> 
>  #define CONTEXT_NO_ZEROMAP (1<<0)
> +#define CONTEXT_USE_TRTT   (1 << 1)
>  /**
>   * struct intel_context - as the name implies, represents a context.
>   * @ref: reference count.
> @@ -818,6 +819,8 @@ struct i915_ctx_hang_stats {
>   * @ppgtt: virtual memory space used by this context.
>   * @legacy_hw_ctx: render context backing object and whether it is
> correctly
>   *                initialized (legacy ring submission mechanism only).
> + * @trtt_info: Programming parameters for tr-tt (redirection tables for
> + *             userspace, for sparse resource management)
>   * @link: link in the global list of contexts.
>   *
>   * Contexts are memory images used by the hardware to store copies of
> their @@ -828,7 +831,7 @@ struct intel_context {
>  	int user_handle;
>  	uint8_t remap_slice;
>  	struct drm_i915_private *i915;
> -	int flags;
> +	unsigned int flags;
>  	struct drm_i915_file_private *file_priv;
>  	struct i915_ctx_hang_stats hang_stats;
>  	struct i915_hw_ppgtt *ppgtt;
> @@ -849,6 +852,15 @@ struct intel_context {
>  		uint32_t *lrc_reg_state;
>  	} engine[I915_NUM_ENGINES];
> 
> +	/* TRTT info */
> +	struct intel_context_trtt {
> +		u32 invd_tile_val;
> +		u32 null_tile_val;
> +		u64 l3_table_address;
> +		u64 segment_base_addr;
> +		struct i915_vma *vma;
> +	} trtt_info;
> +
>  	struct list_head link;
>  };
> 
> @@ -2657,6 +2669,8 @@ struct drm_i915_cmd_table {
>  				 !IS_VALLEYVIEW(dev) &&
> !IS_CHERRYVIEW(dev) && \
>  				 !IS_BROXTON(dev))
> 
> +#define HAS_TRTT(dev)		(IS_GEN9(dev))
> +

A very minor point, but there is a workaround (w/a) to disable TRTT for
BXT_REVID_A0/1. I realise that stepping is basically obsolete now, but I'm
still using one!
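
One way such a check could be folded in is sketched below as standalone C rather than against the real driver macros. Everything here is an assumption for illustration: struct dev_info, the platform enum and has_trtt() are hypothetical stand-ins for the driver's device info and HAS_TRTT(), and the revision values are placeholders, not copied from i915_drv.h.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver's device info. */
enum sketch_platform {
	SKETCH_PLATFORM_OTHER,
	SKETCH_PLATFORM_BROXTON,
};

/* Placeholder revision ids for the early steppings (assumed values). */
#define SKETCH_BXT_REVID_A0 0x0
#define SKETCH_BXT_REVID_A1 0x1

struct dev_info {
	int gen;
	enum sketch_platform platform;
	unsigned int revid;
};

/* Gen9 only, minus the early Broxton steppings that need TR-TT disabled. */
static bool has_trtt(const struct dev_info *info)
{
	if (info->gen != 9)
		return false;

	if (info->platform == SKETCH_PLATFORM_BROXTON &&
	    info->revid <= SKETCH_BXT_REVID_A1)
		return false;

	return true;
}
```

With this shape, a Broxton part at either A stepping would report no TR-TT support, so both the getparam and setparam paths would return -ENODEV through the existing checks, while later steppings and other Gen9 parts are unaffected.
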

>  #define INTEL_PCH_DEVICE_ID_MASK		0xff00
>  #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
>  #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> b/drivers/gpu/drm/i915/i915_gem_context.c
> index 394e525..5f28c23 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev)
>  	return ret;
>  }
> 
> +static void intel_context_free_trtt(struct intel_context *ctx) {
> +	if (!ctx->trtt_info.vma)
> +		return;
> +
> +	intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
> +}
> +
>  static void i915_gem_context_clean(struct intel_context *ctx)  {
>  	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt; @@ -164,6 +172,8 @@
> void i915_gem_context_free(struct kref *ctx_ref)
>  	 */
>  	i915_gem_context_clean(ctx);
> 
> +	intel_context_free_trtt(ctx);
> +
>  	i915_ppgtt_put(ctx->ppgtt);
> 
>  	if (ctx->legacy_hw_ctx.rcs_state)
> @@ -507,6 +517,129 @@ i915_gem_context_get(struct
> drm_i915_file_private *file_priv, u32 id)
>  	return ctx;
>  }
> 
> +static int
> +intel_context_get_trtt(struct intel_context *ctx,
> +		       struct drm_i915_gem_context_param *args) {
> +	struct drm_i915_gem_context_trtt_param trtt_params;
> +	struct drm_device *dev = ctx->i915->dev;
> +
> +	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) {
> +		return -ENODEV;
> +	} else if (args->size < sizeof(trtt_params)) {
> +		args->size = sizeof(trtt_params);
> +	} else {
> +		trtt_params.segment_base_addr =
> +			ctx->trtt_info.segment_base_addr;
> +		trtt_params.l3_table_address =
> +			ctx->trtt_info.l3_table_address;
> +		trtt_params.null_tile_val =
> +			ctx->trtt_info.null_tile_val;
> +		trtt_params.invd_tile_val =
> +			ctx->trtt_info.invd_tile_val;
> +
> +		mutex_unlock(&dev->struct_mutex);
> +
> +		if (__copy_to_user(to_user_ptr(args->value),
> +				   &trtt_params,
> +				   sizeof(trtt_params))) {
> +			mutex_lock(&dev->struct_mutex);
> +			return -EFAULT;
> +		}
> +
> +		args->size = sizeof(trtt_params);
> +		mutex_lock(&dev->struct_mutex);
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +intel_context_set_trtt(struct intel_context *ctx,
> +		       struct drm_i915_gem_context_param *args) {
> +	struct drm_i915_gem_context_trtt_param trtt_params;
> +	struct i915_vma *vma;
> +	struct drm_device *dev = ctx->i915->dev;
> +	int ret;
> +
> +	if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev))
> +		return -ENODEV;
> +	else if (ctx->flags & CONTEXT_USE_TRTT)
> +		return -EEXIST;
> +	else if (args->size < sizeof(trtt_params))
> +		return -EINVAL;
> +
> +	mutex_unlock(&dev->struct_mutex);
> +
> +	if (copy_from_user(&trtt_params,
> +			   to_user_ptr(args->value),
> +			   sizeof(trtt_params))) {
> +		mutex_lock(&dev->struct_mutex);
> +		ret = -EFAULT;
> +		goto exit;
> +	}
> +
> +	mutex_lock(&dev->struct_mutex);
> +
> +	/* Check if the setup happened from another path */
> +	if (ctx->flags & CONTEXT_USE_TRTT) {
> +		ret = -EEXIST;
> +		goto exit;
> +	}
> +
> +	/* basic sanity checks for the segment location & l3 table pointer */
> +	if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE -
> 1)) {
> +		i915_dbg(dev, "segment base address not correctly
> aligned\n");
> +		ret = -EINVAL;
> +		goto exit;
> +	}
> +
> +	if (((trtt_params.l3_table_address + PAGE_SIZE) >=
> +	     trtt_params.segment_base_addr) &&
> +	    (trtt_params.l3_table_address <
> +		    (trtt_params.segment_base_addr +
> GEN9_TRTT_SEGMENT_SIZE))) {
> +		i915_dbg(dev, "l3 table address conflicts with trtt
> segment\n");
> +		ret = -EINVAL;
> +		goto exit;
> +	}
> +
> +	if (trtt_params.l3_table_address &
> ~GEN9_TRTT_L3_GFXADDR_MASK) {
> +		i915_dbg(dev, "invalid l3 table address\n");
> +		ret = -EINVAL;
> +		goto exit;
> +	}
> +
> +	if (trtt_params.null_tile_val == trtt_params.invd_tile_val) {
> +		i915_dbg(dev, "incorrect values for null & invalid tiles\n");
> +		return -EINVAL;
> +	}
> +
> +	vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
> +					trtt_params.segment_base_addr);
> +	if (IS_ERR(vma)) {
> +		ret = PTR_ERR(vma);
> +		goto exit;
> +	}
> +
> +	ctx->trtt_info.vma = vma;
> +	ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
> +	ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
> +	ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
> +	ctx->trtt_info.segment_base_addr =
> trtt_params.segment_base_addr;
> +
> +	ret = intel_lr_rcs_context_setup_trtt(ctx);
> +	if (ret) {
> +		intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
> +		goto exit;
> +	}
> +
> +	ctx->flags |= CONTEXT_USE_TRTT;
> +
> +exit:
> +	return ret;
> +}
> +
>  static inline int
>  mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)  { @@ -
> 931,7 +1064,14 @@ int i915_gem_context_getparam_ioctl(struct drm_device
> *dev, void *data,
>  		return PTR_ERR(ctx);
>  	}
> 
> -	args->size = 0;
> +	/*
> +	 * Take a reference also, as in certain cases we have to release &
> +	 * reacquire the struct_mutex and we don't want the context to
> +	 * go away.
> +	 */
> +	i915_gem_context_reference(ctx);
> +
> +	args->size = (args->param != I915_CONTEXT_PARAM_TRTT) ? 0 :
> +args->size;
>  	switch (args->param) {
>  	case I915_CONTEXT_PARAM_BAN_PERIOD:
>  		args->value = ctx->hang_stats.ban_period_seconds;
> @@ -947,10 +1087,14 @@ int i915_gem_context_getparam_ioctl(struct
> drm_device *dev, void *data,
>  		else
>  			args->value = to_i915(dev)->ggtt.base.total;
>  		break;
> +	case I915_CONTEXT_PARAM_TRTT:
> +		ret = intel_context_get_trtt(ctx, args);
> +		break;
>  	default:
>  		ret = -EINVAL;
>  		break;
>  	}
> +	i915_gem_context_unreference(ctx);
>  	mutex_unlock(&dev->struct_mutex);
> 
>  	return ret;
> @@ -974,6 +1118,13 @@ int i915_gem_context_setparam_ioctl(struct
> drm_device *dev, void *data,
>  		return PTR_ERR(ctx);
>  	}
> 
> +	/*
> +	 * Take a reference also, as in certain cases we have to release &
> +	 * reacquire the struct_mutex and we don't want the context to
> +	 * go away.
> +	 */
> +	i915_gem_context_reference(ctx);
> +
>  	switch (args->param) {
>  	case I915_CONTEXT_PARAM_BAN_PERIOD:
>  		if (args->size)
> @@ -992,10 +1143,14 @@ int i915_gem_context_setparam_ioctl(struct
> drm_device *dev, void *data,
>  			ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP :
> 0;
>  		}
>  		break;
> +	case I915_CONTEXT_PARAM_TRTT:
> +		ret = intel_context_set_trtt(ctx, args);
> +		break;
>  	default:
>  		ret = -EINVAL;
>  		break;
>  	}
> +	i915_gem_context_unreference(ctx);
>  	mutex_unlock(&dev->struct_mutex);
> 
>  	return ret;
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c
> b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 0715bb7..cbf8a03 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)  {
>  	gtt_write_workarounds(dev);
> 
> +	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
> +		struct drm_i915_private *dev_priv = dev->dev_private;
> +		/*
> +		 * Globally enable TR-TT support in Hw.
> +		 * Still TR-TT enabling on per context basis is required.
> +		 * Non-trtt contexts are not affected by this setting.
> +		 */
> +		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
> +			   GEN9_TRTT_BYPASS_DISABLE);
> +	}
> +
>  	/* In the case of execlists, PPGTT is enabled by the context
> descriptor
>  	 * and the PDPs are contained within the context itself.  We don't
>  	 * need to do anything here. */
> @@ -3362,6 +3373,60 @@
> i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object
> *obj,
> 
>  }
> 
> +void intel_trtt_context_destroy_vma(struct i915_vma *vma) {
> +	struct i915_address_space *vm = vma->vm;
> +
> +	WARN_ON(!list_empty(&vma->obj_link));
> +	WARN_ON(!list_empty(&vma->vm_link));
> +	WARN_ON(!list_empty(&vma->exec_list));
> +
> +	WARN_ON(!vma->pin_count);
> +
> +	if (drm_mm_node_allocated(&vma->node))
> +		drm_mm_remove_node(&vma->node);
> +
> +	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
> +	kmem_cache_free(to_i915(vm->dev)->vmas, vma); }
> +
> +struct i915_vma *
> +intel_trtt_context_allocate_vma(struct i915_address_space *vm,
> +				uint64_t segment_base_addr)
> +{
> +	struct i915_vma *vma;
> +	int ret;
> +
> +	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
> +	if (!vma)
> +		return ERR_PTR(-ENOMEM);
> +
> +	INIT_LIST_HEAD(&vma->obj_link);
> +	INIT_LIST_HEAD(&vma->vm_link);
> +	INIT_LIST_HEAD(&vma->exec_list);
> +	vma->vm = vm;
> +	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
> +
> +	/* Mark the vma as permanently pinned */
> +	vma->pin_count = 1;
> +
> +	/* Reserve from the 48 bit PPGTT space */
> +	vma->node.start = segment_base_addr;
> +	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
> +	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> +	if (ret) {
> +		ret = i915_gem_evict_for_vma(vma);
> +		if (ret == 0)
> +			ret = drm_mm_reserve_node(&vm->mm, &vma-
> >node);
> +	}
> +	if (ret) {
> +		intel_trtt_context_destroy_vma(vma);
> +		return ERR_PTR(ret);
> +	}
> +
> +	return vma;
> +}
> +
>  static struct scatterlist *
>  rotate_pages(const dma_addr_t *in, unsigned int offset,
>  	     unsigned int width, unsigned int height, diff --git
> a/drivers/gpu/drm/i915/i915_gem_gtt.h
> b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index d804be0..8cbaca2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
>  #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
>  #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
> 
> +/* Fixed size segment */
> +#define GEN9_TRTT_SEG_SIZE_SHIFT	44
> +#define GEN9_TRTT_SEGMENT_SIZE		(1ULL <<
> GEN9_TRTT_SEG_SIZE_SHIFT)
> +
>  enum i915_ggtt_view_type {
>  	I915_GGTT_VIEW_NORMAL = 0,
>  	I915_GGTT_VIEW_ROTATED,
> @@ -560,4 +564,8 @@ size_t
>  i915_ggtt_view_size(struct drm_i915_gem_object *obj,
>  		    const struct i915_ggtt_view *view);
> 
> +struct i915_vma *
> +intel_trtt_context_allocate_vma(struct i915_address_space *vm,
> +				uint64_t segment_base_addr);
> +void intel_trtt_context_destroy_vma(struct i915_vma *vma);
>  #endif
> diff --git a/drivers/gpu/drm/i915/i915_reg.h
> b/drivers/gpu/drm/i915/i915_reg.h index 264885f..07936b6 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -188,6 +188,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t
> reg)
>  #define   GEN8_RPCS_EU_MIN_SHIFT	0
>  #define   GEN8_RPCS_EU_MIN_MASK		(0xf <<
> GEN8_RPCS_EU_MIN_SHIFT)
> 
> +#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
> +#define   GEN9_TRTT_BYPASS_DISABLE	(1 << 0)
> +
> +/* TRTT registers in the H/W Context */
> +#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
> +#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
> +#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
> +
> +#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
> +#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
> +
> +#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
> +#define   GEN9_TRVA_MASK_VALUE		0xF0
> +#define   GEN9_TRVA_DATA_MASK		0xF
> +
> +#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
> +#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1 << 1)
> +#define   GEN9_TRTT_ENABLE		(1 << 0)
> +
>  #define GAM_ECOCHK			_MMIO(0x4090)
>  #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
>  #define   ECOCHK_SNB_BIT		(1<<10)
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> b/drivers/gpu/drm/i915/intel_lrc.c
> index 3a23b95..8af480b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1645,6 +1645,76 @@ static int gen9_init_render_ring(struct
> intel_engine_cs *engine)
>  	return init_workarounds_ring(engine);
>  }
> 
> +static int gen9_init_rcs_context_trtt(struct drm_i915_gem_request *req)
> +{
> +	struct intel_ringbuffer *ringbuf = req->ringbuf;
> +	int ret;
> +
> +	ret = intel_logical_ring_begin(req, 2 + 2);
> +	if (ret)
> +		return ret;
> +
> +	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
> +	intel_logical_ring_emit(ringbuf, 0);
> +
> +	intel_logical_ring_emit(ringbuf, MI_NOOP);
> +	intel_logical_ring_advance(ringbuf);
> +
> +	return 0;
> +}
> +
> +static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req) {
> +	struct intel_context *ctx = req->ctx;
> +	struct intel_ringbuffer *ringbuf = req->ringbuf;
> +	u64 masked_l3_gfx_address =
> +		ctx->trtt_info.l3_table_address &
> GEN9_TRTT_L3_GFXADDR_MASK;
> +	u32 trva_data_value =
> +		(ctx->trtt_info.segment_base_addr >>
> GEN9_TRTT_SEG_SIZE_SHIFT) &
> +		GEN9_TRVA_DATA_MASK;
> +	const int num_lri_cmds = 6;
> +	int ret;
> +
> +	/*
> +	 * Emitting LRIs to update the TRTT registers is most reliable, instead
> +	 * of directly updating the context image, as this will ensure that
> +	 * update happens in a serialized manner for the context and also
> +	 * lite-restore scenario will get handled.
> +	 */
> +	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
> +	if (ret)
> +		return ret;
> +
> +	intel_logical_ring_emit(ringbuf,
> MI_LOAD_REGISTER_IMM(num_lri_cmds));
> +
> +	intel_logical_ring_emit_reg(ringbuf,
> GEN9_TRTT_L3_POINTER_DW0);
> +	intel_logical_ring_emit(ringbuf,
> +lower_32_bits(masked_l3_gfx_address));
> +
> +	intel_logical_ring_emit_reg(ringbuf,
> GEN9_TRTT_L3_POINTER_DW1);
> +	intel_logical_ring_emit(ringbuf,
> +upper_32_bits(masked_l3_gfx_address));
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
> +	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
> +	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
> +	intel_logical_ring_emit(ringbuf,
> +				GEN9_TRVA_MASK_VALUE |
> trva_data_value);
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
> +	intel_logical_ring_emit(ringbuf,
> +				GEN9_TRTT_IN_GFX_VA_SPACE |
> GEN9_TRTT_ENABLE);
> +
> +	intel_logical_ring_emit(ringbuf, MI_NOOP);
> +	intel_logical_ring_advance(ringbuf);
> +
> +	return 0;
> +}
> +
>  static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
> {
>  	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt; @@ -2003,6
> +2073,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request
> *req)
>  	return intel_lr_context_render_state_init(req);
>  }
> 
> +static int gen9_init_rcs_context(struct drm_i915_gem_request *req) {
> +	int ret;
> +
> +	/*
> +	 * Explicitly disable TR-TT at the start of a new context.
> +	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
> +	 * context the TR-TT settings of the outgoing context could get
> +	 * spilled on to the new incoming context as only the Ring Context
> +	 * part is loaded on the first submission of a new context, due to
> +	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
> +	 */
> +	ret = gen9_init_rcs_context_trtt(req);
> +	if (ret)
> +		return ret;
> +
> +	return gen8_init_rcs_context(req);
> +}
> +
>  /**
>   * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
>   *
> @@ -2134,11 +2223,14 @@ static int logical_render_ring_init(struct
> drm_device *dev)
>  	logical_ring_default_vfuncs(dev, engine);
> 
>  	/* Override some for render ring. */
> -	if (INTEL_INFO(dev)->gen >= 9)
> +	if (INTEL_INFO(dev)->gen >= 9) {
>  		engine->init_hw = gen9_init_render_ring;
> -	else
> +		engine->init_context = gen9_init_rcs_context;
> +	} else {
>  		engine->init_hw = gen8_init_render_ring;
> -	engine->init_context = gen8_init_rcs_context;
> +		engine->init_context = gen8_init_rcs_context;
> +	}
> +
>  	engine->cleanup = intel_fini_pipe_control;
>  	engine->emit_flush = gen8_emit_flush_render;
>  	engine->emit_request = gen8_emit_request_render; @@ -2702,3
> +2794,29 @@ void intel_lr_context_reset(struct drm_device *dev,
>  		ringbuf->tail = 0;
>  	}
>  }
> +
> +int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx) {
> +	struct intel_engine_cs *engine = &(ctx->i915->engine[RCS]);
> +	struct drm_i915_gem_request *req;
> +	int ret;
> +
> +	if (!ctx->engine[RCS].state) {
> +		ret = intel_lr_context_deferred_alloc(ctx, engine);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	req = i915_gem_request_alloc(engine, ctx);
> +	if (IS_ERR(req))
> +		return PTR_ERR(req);
> +
> +	ret = gen9_emit_trtt_regs(req);
> +	if (ret) {
> +		i915_gem_request_cancel(req);
> +		return ret;
> +	}
> +
> +	i915_add_request(req);
> +	return 0;
> +}
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h
> b/drivers/gpu/drm/i915/intel_lrc.h
> index a17cb12..f3600b2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -107,6 +107,7 @@ void intel_lr_context_reset(struct drm_device *dev,
>  			struct intel_context *ctx);
>  uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>  				     struct intel_engine_cs *engine);
> +int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx);
> 
>  u32 intel_execlists_ctx_id(struct intel_context *ctx,
>  			   struct intel_engine_cs *engine);
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index a5524cc..604da23 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
>  #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
>  #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
>  #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
> +#define I915_CONTEXT_PARAM_TRTT		0x4
>  	__u64 value;
>  };
> 
> +struct drm_i915_gem_context_trtt_param {
> +	__u64 segment_base_addr;
> +	__u64 l3_table_address;
> +	__u32 invd_tile_val;
> +	__u32 null_tile_val;
> +};
> +
>  #endif /* _UAPI_I915_DRM_H_ */
> --
> 1.9.2
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v9] drm/i915: Support to enable TRTT on GEN9
  2016-03-24 16:29                                       ` Gore, Tim
@ 2016-03-24 16:36                                         ` Goel, Akash
  0 siblings, 0 replies; 59+ messages in thread
From: Goel, Akash @ 2016-03-24 16:36 UTC (permalink / raw)
  To: Gore, Tim, intel-gfx; +Cc: akash.goel



On 3/24/2016 9:59 PM, Gore, Tim wrote:
>
> Tim Gore
> Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ
>
>> -----Original Message-----
>> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
>> Of akash.goel@intel.com
>> Sent: Tuesday, March 22, 2016 8:43 AM
>> To: intel-gfx@lists.freedesktop.org
>> Cc: Goel, Akash
>> Subject: [Intel-gfx] [PATCH v9] drm/i915: Support to enable TRTT on GEN9
>>
>> From: Akash Goel <akash.goel@intel.com>
>>
>> Gen9 has an additional address translation hardware support in form of Tiled
>> Resource Translation Table (TR-TT) which provides an extra level of
>> abstraction over PPGTT.
>> This is useful for mapping Sparse/Tiled texture resources.
>> Sparse resources are created as virtual-only allocations. Regions of the
>> resource that the application intends to use is bound to the physical memory
>> on the fly and can be re-bound to different memory allocations over the
>> lifetime of the resource.
>>
>> TR-TT is tightly coupled with PPGTT, a new instance of TR-TT will be required
>> for a new PPGTT instance, but TR-TT may not be enabled for every context.
>> 1/16th of the 48bit PPGTT space is earmarked for translation by TR-TT;
>> which chunk to use is conveyed to HW through a register.
>> Any GFX address, which lies in that reserved 44 bit range will be translated
>> through TR-TT first and then through PPGTT to get the actual physical
>> address, so the output of translation from TR-TT will be a PPGTT offset.
>>
>> TRTT is constructed as a 3 level tile Table. Each tile is 64KB in size, which
>> leaves behind 44-16=28 address bits. The 28 bits are partitioned as 9+9+10, and
>> each level is contained within a 4KB page, hence L3 and L2 are composed of
>> 512 64b entries and L1 is composed of 1024 32b entries.
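
The 9+9+10 split described above can be illustrated with a small standalone sketch. It assumes that L3 takes the top 9 bits of the 44-bit segment offset, L2 the next 9 and L1 the 10 bits just above the 64KB tile offset; the names trtt_index and trtt_split are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout of an offset inside the 44-bit TR-TT segment (assumed order). */
#define TRTT_TILE_SHIFT		16	/* 64KB tile */
#define TRTT_L1_BITS		10	/* 1024 x 32b entries per 4KB page */
#define TRTT_L2_BITS		9	/* 512 x 64b entries per 4KB page */
#define TRTT_L3_BITS		9	/* 512 x 64b entries per 4KB page */
#define TRTT_SEGMENT_BITS	44

/* Hypothetical helper type: indices into each table level. */
struct trtt_index {
	unsigned int l3;
	unsigned int l2;
	unsigned int l1;
	uint32_t tile_offset;
};

/* Split an offset relative to the segment base into per-level indices. */
static struct trtt_index trtt_split(uint64_t seg_offset)
{
	struct trtt_index idx;

	assert(seg_offset < (1ULL << TRTT_SEGMENT_BITS));

	idx.tile_offset = (uint32_t)(seg_offset & ((1ULL << TRTT_TILE_SHIFT) - 1));
	seg_offset >>= TRTT_TILE_SHIFT;
	idx.l1 = (unsigned int)(seg_offset & ((1ULL << TRTT_L1_BITS) - 1));
	seg_offset >>= TRTT_L1_BITS;
	idx.l2 = (unsigned int)(seg_offset & ((1ULL << TRTT_L2_BITS) - 1));
	seg_offset >>= TRTT_L2_BITS;
	idx.l3 = (unsigned int)(seg_offset & ((1ULL << TRTT_L3_BITS) - 1));

	return idx;
}
```

The bit counts add up to 16 + 10 + 9 + 9 = 44, matching the segment size quoted in the commit message.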
>>
>> There is a provision to keep TR-TT Tables in virtual space, where the pages of
>> TRTT tables will be mapped to PPGTT.
>> Currently this is the supported mode, in this mode UMD will have a full
>> control on TR-TT management, with bare minimum support from KMD.
>> So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
>> similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
>> The entries of L1 table will contain the PPGTT offset of BOs actually backing
>> the Sparse resources.
>> UMD will have to allocate the L3/L2/L1 table pages as a regular BO only &
>> assign them a PPGTT address through the Soft Pin API (for example, use soft
>> pin to assign l3_table_address to the L3 table BO, when used).
>> UMD will also program the entries in the TR-TT page tables using regular
>> batch commands (MI_STORE_DATA_IMM), or via mmapping of the page
>> table BOs.
>> UMD may do the complete PPGTT address space management, on the
>> pretext that it could help minimize the conflicts.
>>
>> Any space in TR-TT segment not bound to any Sparse texture will be handled
>> through the Invalid tile. User is expected to initialize the entries of a new
>> L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding to
>> the holes in the Sparse texture resource will be set with the Null tile pattern.
>> Improper programming of TRTT should only lead to a recoverable GPU
>> hang, eventually leading to banning of the culprit context without victimizing
>> others.
>>
>> The association of any Sparse resource with the BOs will be known only to
>> UMD, and only the Sparse resources shall be assigned an offset from the TR-
>> TT segment by UMD. The use of TR-TT segment or mapping of Sparse
>> resources will be transparent to the KMD, UMD will do the address
>> assignment from TR-TT segment autonomously and KMD will be oblivious of
>> it.
>> Regular objects must not be assigned an address from the TR-TT segment; they
>> will be mapped to PPGTT in the regular way by KMD.
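
As a rough illustration of the rule above, a UMD doing its own address assignment could reject placements of regular objects that touch the reserved segment. This is only a sketch: range_overlaps_trtt_segment is a hypothetical helper, and it assumes the segment base is aligned to the 44-bit segment size (one of the 16 chunks of the 48bit space).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TRTT_SEG_SIZE_SHIFT	44
#define TRTT_SEGMENT_SIZE	(1ULL << TRTT_SEG_SIZE_SHIFT)

/*
 * Hypothetical check: does [start, start + size) intersect the TR-TT
 * segment that begins at segment_base?
 */
static bool range_overlaps_trtt_segment(uint64_t segment_base,
					uint64_t start, uint64_t size)
{
	uint64_t seg_end = segment_base + TRTT_SEGMENT_SIZE;

	/* Assumed: the segment is one of 16 aligned chunks of the 48bit space. */
	assert((segment_base & (TRTT_SEGMENT_SIZE - 1)) == 0);
	/* Reject empty ranges and ranges that wrap around. */
	assert(size > 0 && start + size > start);

	return start < seg_end && start + size > segment_base;
}
```

A regular BO would only be soft-pinned at an offset for which this returns false.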
>>
>> This patch provides an interface through which UMD can ask KMD to
>> enable TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param
>> has been added to the I915_GEM_CONTEXT_SETPARAM ioctl for that purpose.
>> UMD will have to pass the GFX address of the L3 table page and the start
>> location of the TR-TT segment, along with the pattern values for the Null &
>> Invalid Tile registers.
>>
>> v2:
>>   - Support context_getparam for TRTT also and dispense with a separate
>>     GETPARAM case for TRTT (Chris).
>>   - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
>>     from user space (Chris).
>>   - Move all the argument checking for TRTT in context_setparam to the
>>     set_trtt function (Chris).
>>   - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris)
>>   - Rename certain functions to rightly reflect their purpose, rename
>>     the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
>>     rephrase few lines in the commit message body, add more comments (Chris).
>>   - Extend ABI to allow User specify TRTT segment location also.
>>   - Fix for selective enabling of TRTT on per context basis, explicitly
>>     disable TR-TT at the start of a new context.
>>
>> v3:
>>   - Check the return value of gen9_emit_trtt_regs (Chris)
>>   - Update the kernel doc for intel_context structure.
>>   - Rebased.
>>
>> v4:
>>   - Fix the warnings reported by 'checkpatch.pl --strict' (Michel)
>>   - Fix the context_getparam implementation avoiding the reset of size field,
>>     affecting the TRTT case.
>>
>> v5:
>>   - Update the TR-TT params right away in context_setparam, by constructing
>>     & submitting a request emitting LRIs, instead of deferring it and
>>     conflating with the next batch submission (Chris)
>>   - Follow the struct_mutex handling related prescribed rules, while accessing
>>     User space buffer, both in context_setparam & getparam functions (Chris).
>>
>> v6:
>>   - Fix the warning caused due to removal of un-allocated trtt vma node.
>>
>> v7:
>>   - Move context ref/unref to context_setparam_ioctl from set_trtt() & remove
>>     that from get_trtt() as not really needed there (Chris).
>>   - Add a check for improper values for Null & Invalid Tiles.
>>   - Remove superfluous DRM_ERROR from trtt_context_allocate_vma (Chris).
>>   - Rebased.
>>
>> v8:
>>   - Add context ref/unref to context_getparam_ioctl also so as to be consistent
>>     and ease the extension of ioctl in future (Chris)
>>
>> v9:
>>   - Fix the handling of return value from trtt_context_allocate_vma() function,
>>     causing kernel panic at the time of destroying context, in case of
>>     unsuccessful allocation of trtt vma.
>>   - Rebased.
>>
>> Testcase: igt/gem_trtt
>>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Michel Thierry <michel.thierry@intel.com>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h         |  16 +++-
>>   drivers/gpu/drm/i915/i915_gem_context.c | 157 +++++++++++++++++++++++++++++++-
>>   drivers/gpu/drm/i915/i915_gem_gtt.c     |  65 +++++++++++++
>>   drivers/gpu/drm/i915/i915_gem_gtt.h     |   8 ++
>>   drivers/gpu/drm/i915/i915_reg.h         |  19 ++++
>>   drivers/gpu/drm/i915/intel_lrc.c        | 124 ++++++++++++++++++++++++-
>>   drivers/gpu/drm/i915/intel_lrc.h        |   1 +
>>   include/uapi/drm/i915_drm.h             |   8 ++
>>   8 files changed, 393 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index ecbd418..272d1f8 100644

>>
>> @@ -2657,6 +2669,8 @@ struct drm_i915_cmd_table {
>>   				 !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
>>   				 !IS_BROXTON(dev))
>>
>> +#define HAS_TRTT(dev)		(IS_GEN9(dev))
>> +
>
> A very minor point, but there is a w/a to disable TRTT for BXT_REVID_A0/1. I realise this
> is basically obsolete now, but I'm still using one!
>
Thanks for raising this.
Michel & Thomas also apprised me about a similar WA for KBL.
Was thinking to submit that as a follow up patch.

Best regards
Akash
>>   	return ret;
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index 0715bb7..cbf8a03 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
>>   {
>>   	gtt_write_workarounds(dev);
>>
>> +	if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) {
>> +		struct drm_i915_private *dev_priv = dev->dev_private;
>> +		/*
>> +		 * Globally enable TR-TT support in Hw.
>> +		 * Still TR-TT enabling on per context basis is required.
>> +		 * Non-trtt contexts are not affected by this setting.
>> +		 */
>> +		I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
>> +			   GEN9_TRTT_BYPASS_DISABLE);
>> +	}
>> +
>>   	/* In the case of execlists, PPGTT is enabled by the context descriptor
>>   	 * and the PDPs are contained within the context itself.  We don't
>>   	 * need to do anything here. */
>> @@ -3362,6 +3373,60 @@ i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object *obj,
>>
>>   }
>>
>> +void intel_trtt_context_destroy_vma(struct i915_vma *vma)
>> +{
>> +	struct i915_address_space *vm = vma->vm;
>> +
>> +	WARN_ON(!list_empty(&vma->obj_link));
>> +	WARN_ON(!list_empty(&vma->vm_link));
>> +	WARN_ON(!list_empty(&vma->exec_list));
>> +
>> +	WARN_ON(!vma->pin_count);
>> +
>> +	if (drm_mm_node_allocated(&vma->node))
>> +		drm_mm_remove_node(&vma->node);
>> +
>> +	i915_ppgtt_put(i915_vm_to_ppgtt(vm));
>> +	kmem_cache_free(to_i915(vm->dev)->vmas, vma);
>> +}
>> +
>> +struct i915_vma *
>> +intel_trtt_context_allocate_vma(struct i915_address_space *vm,
>> +				uint64_t segment_base_addr)
>> +{
>> +	struct i915_vma *vma;
>> +	int ret;
>> +
>> +	vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL);
>> +	if (!vma)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	INIT_LIST_HEAD(&vma->obj_link);
>> +	INIT_LIST_HEAD(&vma->vm_link);
>> +	INIT_LIST_HEAD(&vma->exec_list);
>> +	vma->vm = vm;
>> +	i915_ppgtt_get(i915_vm_to_ppgtt(vm));
>> +
>> +	/* Mark the vma as permanently pinned */
>> +	vma->pin_count = 1;
>> +
>> +	/* Reserve from the 48 bit PPGTT space */
>> +	vma->node.start = segment_base_addr;
>> +	vma->node.size = GEN9_TRTT_SEGMENT_SIZE;
>> +	ret = drm_mm_reserve_node(&vm->mm, &vma->node);
>> +	if (ret) {
>> +		ret = i915_gem_evict_for_vma(vma);
>> +		if (ret == 0)
>> +			ret = drm_mm_reserve_node(&vm->mm, &vma->node);
>> +	}
>> +	if (ret) {
>> +		intel_trtt_context_destroy_vma(vma);
>> +		return ERR_PTR(ret);
>> +	}
>> +
>> +	return vma;
>> +}
>> +
>>   static struct scatterlist *
>>   rotate_pages(const dma_addr_t *in, unsigned int offset,
>>   	     unsigned int width, unsigned int height,
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> index d804be0..8cbaca2 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> @@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
>>   #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
>>   #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
>>
>> +/* Fixed size segment */
>> +#define GEN9_TRTT_SEG_SIZE_SHIFT	44
>> +#define GEN9_TRTT_SEGMENT_SIZE		(1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
>> +
>>   enum i915_ggtt_view_type {
>>   	I915_GGTT_VIEW_NORMAL = 0,
>>   	I915_GGTT_VIEW_ROTATED,
>> @@ -560,4 +564,8 @@ size_t
>>   i915_ggtt_view_size(struct drm_i915_gem_object *obj,
>>   		    const struct i915_ggtt_view *view);
>>
>> +struct i915_vma *
>> +intel_trtt_context_allocate_vma(struct i915_address_space *vm,
>> +				uint64_t segment_base_addr);
>> +void intel_trtt_context_destroy_vma(struct i915_vma *vma);
>>   #endif
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 264885f..07936b6 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -188,6 +188,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>>   #define   GEN8_RPCS_EU_MIN_SHIFT	0
>>   #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
>>
>> +#define GEN9_TR_CHICKEN_BIT_VECTOR	_MMIO(0x4DFC)
>> +#define   GEN9_TRTT_BYPASS_DISABLE	(1 << 0)
>> +
>> +/* TRTT registers in the H/W Context */
>> +#define GEN9_TRTT_L3_POINTER_DW0	_MMIO(0x4DE0)
>> +#define GEN9_TRTT_L3_POINTER_DW1	_MMIO(0x4DE4)
>> +#define   GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000
>> +
>> +#define GEN9_TRTT_NULL_TILE_REG		_MMIO(0x4DE8)
>> +#define GEN9_TRTT_INVD_TILE_REG		_MMIO(0x4DEC)
>> +
>> +#define GEN9_TRTT_VA_MASKDATA		_MMIO(0x4DF0)
>> +#define   GEN9_TRVA_MASK_VALUE		0xF0
>> +#define   GEN9_TRVA_DATA_MASK		0xF
>> +
>> +#define GEN9_TRTT_TABLE_CONTROL		_MMIO(0x4DF4)
>> +#define   GEN9_TRTT_IN_GFX_VA_SPACE	(1 << 1)
>> +#define   GEN9_TRTT_ENABLE		(1 << 0)
>> +
>>   #define GAM_ECOCHK			_MMIO(0x4090)
>>   #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
>>   #define   ECOCHK_SNB_BIT		(1<<10)
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 3a23b95..8af480b 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1645,6 +1645,76 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
>>   	return init_workarounds_ring(engine);
>>   }
>>
>> +static int gen9_init_rcs_context_trtt(struct drm_i915_gem_request *req)
>> +{
>> +	struct intel_ringbuffer *ringbuf = req->ringbuf;
>> +	int ret;
>> +
>> +	ret = intel_logical_ring_begin(req, 2 + 2);
>> +	if (ret)
>> +		return ret;
>> +
>> +	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
>> +
>> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
>> +	intel_logical_ring_emit(ringbuf, 0);
>> +
>> +	intel_logical_ring_emit(ringbuf, MI_NOOP);
>> +	intel_logical_ring_advance(ringbuf);
>> +
>> +	return 0;
>> +}
>> +
>> +static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
>> +{
>> +	struct intel_context *ctx = req->ctx;
>> +	struct intel_ringbuffer *ringbuf = req->ringbuf;
>> +	u64 masked_l3_gfx_address =
>> +		ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
>> +	u32 trva_data_value =
>> +		(ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
>> +		GEN9_TRVA_DATA_MASK;
>> +	const int num_lri_cmds = 6;
>> +	int ret;
>> +
>> +	/*
>> +	 * Emitting LRIs to update the TRTT registers is most reliable, instead
>> +	 * of directly updating the context image, as this will ensure that
>> +	 * update happens in a serialized manner for the context and also
>> +	 * lite-restore scenario will get handled.
>> +	 */
>> +	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
>> +	if (ret)
>> +		return ret;
>> +
>> +	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
>> +
>> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
>> +	intel_logical_ring_emit(ringbuf,
>> +				lower_32_bits(masked_l3_gfx_address));
>> +
>> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
>> +	intel_logical_ring_emit(ringbuf,
>> +				upper_32_bits(masked_l3_gfx_address));
>> +
>> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
>> +	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
>> +
>> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
>> +	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
>> +
>> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
>> +	intel_logical_ring_emit(ringbuf,
>> +				GEN9_TRVA_MASK_VALUE | trva_data_value);
>> +
>> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
>> +	intel_logical_ring_emit(ringbuf,
>> +				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
>> +
>> +	intel_logical_ring_emit(ringbuf, MI_NOOP);
>> +	intel_logical_ring_advance(ringbuf);
>> +
>> +	return 0;
>> +}
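
For reference, the register values that the LRIs above would carry can be worked out in a standalone sketch. The masks and shifts are copied from the quoted i915_reg.h and i915_gem_gtt.h hunks; struct trtt_reg_values and trtt_compute_regs are hypothetical names used only for illustration, and the 48bit bound check is an assumption based on the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the quoted patch. */
#define GEN9_TRTT_SEG_SIZE_SHIFT	44
#define GEN9_TRTT_L3_GFXADDR_MASK	0xFFFFFFFF0000ULL
#define GEN9_TRVA_MASK_VALUE		0xF0
#define GEN9_TRVA_DATA_MASK		0xF
#define GEN9_TRTT_IN_GFX_VA_SPACE	(1 << 1)
#define GEN9_TRTT_ENABLE		(1 << 0)

/* Hypothetical container for the computed register payloads. */
struct trtt_reg_values {
	uint32_t l3_pointer_dw0;
	uint32_t l3_pointer_dw1;
	uint32_t va_maskdata;
	uint32_t table_control;
};

static struct trtt_reg_values trtt_compute_regs(uint64_t segment_base_addr,
						uint64_t l3_table_address)
{
	struct trtt_reg_values v;
	uint64_t l3 = l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;

	/* Assumed: the segment lies within the 48bit PPGTT space. */
	assert(segment_base_addr < (1ULL << 48));

	v.l3_pointer_dw0 = (uint32_t)l3;		/* lower_32_bits() */
	v.l3_pointer_dw1 = (uint32_t)(l3 >> 32);	/* upper_32_bits() */
	/* Bits 47:44 of the segment base select 1 of the 16 chunks. */
	v.va_maskdata = GEN9_TRVA_MASK_VALUE |
		(uint32_t)((segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
			   GEN9_TRVA_DATA_MASK);
	v.table_control = GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE;

	return v;
}
```

With a segment base of 0xE << 44, for example, the VA mask/data payload works out to 0xFE.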
>> +
>>   static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
>>   {
>>   	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
>> @@ -2003,6 +2073,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
>>   	return intel_lr_context_render_state_init(req);
>>   }
>>
>> +static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
>> +{
>> +	int ret;
>> +
>> +	/*
>> +	 * Explicitly disable TR-TT at the start of a new context.
>> +	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
>> +	 * context the TR-TT settings of the outgoing context could get
>> +	 * spilled on to the new incoming context as only the Ring Context
>> +	 * part is loaded on the first submission of a new context, due to
>> +	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
>> +	 */
>> +	ret = gen9_init_rcs_context_trtt(req);
>> +	if (ret)
>> +		return ret;
>> +
>> +	return gen8_init_rcs_context(req);
>> +}
>> +
>>   /**
>>    * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
>>    *
>> @@ -2134,11 +2223,14 @@ static int logical_render_ring_init(struct drm_device *dev)
>>   	logical_ring_default_vfuncs(dev, engine);
>>
>>   	/* Override some for render ring. */
>> -	if (INTEL_INFO(dev)->gen >= 9)
>> +	if (INTEL_INFO(dev)->gen >= 9) {
>>   		engine->init_hw = gen9_init_render_ring;
>> -	else
>> +		engine->init_context = gen9_init_rcs_context;
>> +	} else {
>>   		engine->init_hw = gen8_init_render_ring;
>> -	engine->init_context = gen8_init_rcs_context;
>> +		engine->init_context = gen8_init_rcs_context;
>> +	}
>> +
>>   	engine->cleanup = intel_fini_pipe_control;
>>   	engine->emit_flush = gen8_emit_flush_render;
>>   	engine->emit_request = gen8_emit_request_render;
>> @@ -2702,3 +2794,29 @@ void intel_lr_context_reset(struct drm_device *dev,
>>   		ringbuf->tail = 0;
>>   	}
>>   }
>> +
>> +int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx)
>> +{
>> +	struct intel_engine_cs *engine = &(ctx->i915->engine[RCS]);
>> +	struct drm_i915_gem_request *req;
>> +	int ret;
>> +
>> +	if (!ctx->engine[RCS].state) {
>> +		ret = intel_lr_context_deferred_alloc(ctx, engine);
>> +		if (ret)
>> +			return ret;
>> +	}
>> +
>> +	req = i915_gem_request_alloc(engine, ctx);
>> +	if (IS_ERR(req))
>> +		return PTR_ERR(req);
>> +
>> +	ret = gen9_emit_trtt_regs(req);
>> +	if (ret) {
>> +		i915_gem_request_cancel(req);
>> +		return ret;
>> +	}
>> +
>> +	i915_add_request(req);
>> +	return 0;
>> +}
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
>> index a17cb12..f3600b2 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>> @@ -107,6 +107,7 @@ void intel_lr_context_reset(struct drm_device *dev,
>>   			struct intel_context *ctx);
>>   uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>>   				     struct intel_engine_cs *engine);
>> +int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx);
>>
>>   u32 intel_execlists_ctx_id(struct intel_context *ctx,
>>   			   struct intel_engine_cs *engine);
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index a5524cc..604da23 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
>>   #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
>>   #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
>>   #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
>> +#define I915_CONTEXT_PARAM_TRTT		0x4
>>   	__u64 value;
>>   };
>>
>> +struct drm_i915_gem_context_trtt_param {
>> +	__u64 segment_base_addr;
>> +	__u64 l3_table_address;
>> +	__u32 invd_tile_val;
>> +	__u32 null_tile_val;
>> +};
>> +
>>   #endif /* _UAPI_I915_DRM_H_ */
>> --
>> 1.9.2
>>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ success: Fi.CI.BAT
  2016-01-14 10:53 [PATCH] drm/i915: add onoff utility function Jani Nikula
@ 2016-01-14 12:49 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-14 12:49 UTC (permalink / raw)
  To: Jani Nikula; +Cc: intel-gfx

== Summary ==

Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest


bdw-nuci7        total:138  pass:128  dwarn:1   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:100  dwarn:4   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 

Results at /archive/results/CI_IGT_test/Patchwork_1185/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ success: Fi.CI.BAT
  2016-01-14  2:59 [PATCH] drm/i915: Make sure DC writes are coherent on flush Francisco Jerez
@ 2016-01-14 10:49 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-14 10:49 UTC (permalink / raw)
  To: Francisco Jerez; +Cc: intel-gfx

== Summary ==

Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (skl-i7k-2) UNSTABLE
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                dmesg-warn -> PASS       (ilk-hp8440p)

bdw-nuci7        total:138  pass:128  dwarn:1   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1181/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ success: Fi.CI.BAT
  2016-01-13 17:38 [PATCH] drm/i915: Demote user facing DMC firmware load failure message Chris Wilson
@ 2016-01-14  9:20 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-14  9:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Summary ==

Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest

Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                dmesg-warn -> PASS       (ilk-hp8440p)

bdw-nuci7        total:138  pass:128  dwarn:1   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1178/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ success: Fi.CI.BAT
  2016-01-13 17:04 [PATCH] drm/i915: Force ordering on request submission and hangcheck Mika Kuoppala
@ 2016-01-14  8:20 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-14  8:20 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

== Summary ==

Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (bdw-nuci7)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                dmesg-warn -> PASS       (ilk-hp8440p)

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1176/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ success: Fi.CI.BAT
  2016-01-13 16:33 [PATCH] drm/i915: Dump power well states on unclaimed trace Mika Kuoppala
@ 2016-01-14  7:49 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-14  7:49 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

== Summary ==

Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (bdw-nuci7)
                dmesg-warn -> PASS       (skl-i7k-2) UNSTABLE

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
ilk-hp8440p      total:141  pass:100  dwarn:4   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1175/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ success: Fi.CI.BAT
  2015-12-21 11:57 [PATCH] drm/i915: Handle PipeC fused off on HSW Gabriel Feceoru
@ 2016-01-13 15:41 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-13 15:41 UTC (permalink / raw)
  To: Feceoru, Gabriel; +Cc: intel-gfx

== Summary ==

Built on aa7ddea990dfc10c7e90ad10820e0121a9667453 drm-intel-nightly: 2016y-01m-13d-15h-05m-13s UTC integration manifest

Test kms_pipe_crc_basic:
        Subgroup nonblocking-crc-pipe-b:
                skip       -> PASS       (bdw-nuci7)
        Subgroup nonblocking-crc-pipe-c:
                pass       -> SKIP       (bdw-nuci7)

bdw-nuci7        total:138  pass:128  dwarn:0   dfail:0   fail:0   skip:10 
bdw-ultra        total:138  pass:131  dwarn:1   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 

Results at /archive/results/CI_IGT_test/Patchwork_1171/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ success: Fi.CI.BAT
  2016-01-13 10:44 [PATCH] drm/i915: Expose exec parameter to force non IA-Coherent for Gen9+ Artur Harasimiuk
@ 2016-01-13 11:49 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-13 11:49 UTC (permalink / raw)
  To: Artur Harasimiuk; +Cc: intel-gfx

== Summary ==

Built on 8da57dfe6c675c35109dac986e3f8b627cffab49 drm-intel-nightly: 2016y-01m-13d-10h-33m-04s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                pass       -> DMESG-WARN (skl-i5k-2) UNSTABLE
                dmesg-warn -> PASS       (bdw-nuci7)
                dmesg-warn -> PASS       (bdw-ultra)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                dmesg-warn -> PASS       (ilk-hp8440p)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b:
                dmesg-warn -> PASS       (bdw-ultra)
        Subgroup suspend-read-crc-pipe-a:
                dmesg-warn -> PASS       (snb-x220t)

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1162/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* ✓ success: Fi.CI.BAT
  2016-01-13 10:06 [PATCH 0/8] Gen9 HW whitelist and Preemption WA patches Arun Siluvery
@ 2016-01-13 10:50 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-13 10:50 UTC (permalink / raw)
  To: arun.siluvery; +Cc: intel-gfx

== Summary ==

Built on dd4a7926b4118f72b7ae0f7b97e9644172df472c drm-intel-nightly: 2016y-01m-13d-09h-05m-34s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                pass       -> DMESG-WARN (skl-i5k-2) UNSTABLE

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
hsw-xps12        total:138  pass:133  dwarn:1   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1161/

* ✓ success: Fi.CI.BAT
  2016-01-12 23:40 [PATCH] drm/i915: Allow i915_gem_object_get_page() on userptr as well Chris Wilson
@ 2016-01-13  9:20 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-13  9:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Summary ==

Built on 06d0112e293dfdea7f796d4085f755898850947b drm-intel-nightly: 2016y-01m-12d-21h-16m-40s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (bdw-ultra)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                dmesg-warn -> PASS       (skl-i7k-2)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-a-frame-sequence:
                fail       -> PASS       (snb-x220t)

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc          total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:100  dwarn:4   dfail:0   fail:0   skip:37 
skl-i5k-2        total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1159/

* ✓ success: Fi.CI.BAT
  2016-01-12 17:32 [PATCH 1/3] drm/i915: Extract vfunc setup from logical ring initializers Tvrtko Ursulin
@ 2016-01-13  8:11 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-13  8:11 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Summary ==

Built on 06d0112e293dfdea7f796d4085f755898850947b drm-intel-nightly: 2016y-01m-12d-21h-16m-40s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (bdw-ultra)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                dmesg-warn -> PASS       (skl-i7k-2)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-a-frame-sequence:
                fail       -> PASS       (snb-x220t)

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
byt-nuc          total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
hsw-xps12        total:138  pass:133  dwarn:1   dfail:0   fail:0   skip:4  
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1156/

* Re: ✓ success:  Fi.CI.BAT
  2016-01-11 11:53 ` ✓ success: Fi.CI.BAT Patchwork
@ 2016-01-12 16:50   ` Daniel Vetter
  0 siblings, 0 replies; 59+ messages in thread
From: Daniel Vetter @ 2016-01-12 16:50 UTC (permalink / raw)
  To: Patchwork; +Cc: intel-gfx

On Mon, Jan 11, 2016 at 11:53:53AM -0000, Patchwork wrote:
> == Summary ==
> 
> Built on ff88655b3a5467bbc3be8c67d3e05ebf182557d3 drm-intel-nightly: 2016y-01m-11d-07h-30m-16s UTC integration manifest
> 
> Test gem_storedw_loop:
>         Subgroup basic-render:
>                 dmesg-warn -> PASS       (bdw-ultra)
> Test kms_flip:
>         Subgroup basic-flip-vs-dpms:
>                 dmesg-warn -> PASS       (ilk-hp8440p)
> Test kms_pipe_crc_basic:
>         Subgroup read-crc-pipe-b:
>                 dmesg-warn -> PASS       (byt-nuc)
> 
> bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
> bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
> bsw-nuc-2        total:141  pass:114  dwarn:3   dfail:0   fail:0   skip:24 
> byt-nuc          total:141  pass:119  dwarn:7   dfail:0   fail:0   skip:15 
> hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
> hsw-gt2          total:141  pass:136  dwarn:0   dfail:0   fail:1   skip:4  
> hsw-xps12        total:138  pass:133  dwarn:1   dfail:0   fail:0   skip:4  
> ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
> ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
> skl-i5k-2        total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
> skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
> snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
> snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 
> 
> Results at /archive/results/CI_IGT_test/Patchwork_1125/

Yay, a lucky patch that passed bat, so merged it to dinq!
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* ✓ success: Fi.CI.BAT
  2016-01-12 15:28 [PATCH] drm/i915: Only complain about n_edp_entries with eDP ports ville.syrjala
@ 2016-01-12 16:49 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-12 16:49 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

== Summary ==

Built on 37f6c2ae666fbba9eff4355115252b8b0fd43050 drm-intel-nightly: 2016y-01m-12d-14h-25m-44s UTC integration manifest

Test drv_module_reload_basic:
        Subgroup none:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
Test gem_storedw_loop:
        Subgroup basic-render:
                pass       -> DMESG-WARN (skl-i5k-2) UNSTABLE
                dmesg-warn -> PASS       (bdw-nuci7)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup basic-flip-vs-modeset:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup basic-flip-vs-wf_vblank:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup basic-plain-flip:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
Test kms_pipe_crc_basic:
        Subgroup hang-read-crc-pipe-a:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup hang-read-crc-pipe-b:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup hang-read-crc-pipe-c:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup nonblocking-crc-pipe-a:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup nonblocking-crc-pipe-a-frame-sequence:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup nonblocking-crc-pipe-b:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup nonblocking-crc-pipe-b-frame-sequence:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup nonblocking-crc-pipe-c:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup nonblocking-crc-pipe-c-frame-sequence:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup read-crc-pipe-a:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup read-crc-pipe-a-frame-sequence:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
                dmesg-warn -> PASS       (byt-nuc)
        Subgroup read-crc-pipe-b:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup read-crc-pipe-b-frame-sequence:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup read-crc-pipe-c:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (bdw-ultra)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup read-crc-pipe-c-frame-sequence:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup suspend-read-crc-pipe-a:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup suspend-read-crc-pipe-c:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
Test pm_rpm:
        Subgroup basic-pci-d3-state:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)
        Subgroup basic-rte:
                dmesg-warn -> PASS       (skl-i5k-2)
                dmesg-warn -> PASS       (skl-i7k-2)

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc          total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
hsw-xps12        total:138  pass:133  dwarn:1   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1153/

* ✓ success: Fi.CI.BAT
  2016-01-12 15:00 [PATCH 0/3] LPSS PWM support for devices that support it Shobhit Kumar
@ 2016-01-12 15:20 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-12 15:20 UTC (permalink / raw)
  To: Shobhit Kumar; +Cc: intel-gfx

== Summary ==

Built on 37f6c2ae666fbba9eff4355115252b8b0fd43050 drm-intel-nightly: 2016y-01m-12d-14h-25m-44s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (bdw-nuci7)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-a-frame-sequence:
                dmesg-warn -> PASS       (byt-nuc)
        Subgroup read-crc-pipe-c:
                dmesg-warn -> PASS       (bdw-ultra)

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc          total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:108  dwarn:25  dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:107  dwarn:26  dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1150/

* ✓ success: Fi.CI.BAT
  2016-01-11 19:54 [PATCH v2 0/6] drm/i915: start hiding away vbt structure from the driver Jani Nikula
@ 2016-01-12  8:20 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-12  8:20 UTC (permalink / raw)
  To: Jani Nikula; +Cc: intel-gfx

== Summary ==

Built on a90796840c30dac6d9907439bf98d1d08046c49d drm-intel-nightly: 2016y-01m-11d-17h-22m-54s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                pass       -> DMESG-WARN (skl-i5k-2) UNSTABLE
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                dmesg-warn -> PASS       (ilk-hp8440p)

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc          total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:102  dwarn:2   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1138/

* ✓ success: Fi.CI.BAT
  2016-01-11 11:39 [PATCH] drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page Michel Thierry
@ 2016-01-11 11:53 ` Patchwork
  2016-01-12 16:50   ` Daniel Vetter
  0 siblings, 1 reply; 59+ messages in thread
From: Patchwork @ 2016-01-11 11:53 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

== Summary ==

Built on ff88655b3a5467bbc3be8c67d3e05ebf182557d3 drm-intel-nightly: 2016y-01m-11d-07h-30m-16s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (bdw-ultra)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                dmesg-warn -> PASS       (ilk-hp8440p)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b:
                dmesg-warn -> PASS       (byt-nuc)

bdw-nuci7        total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:141  pass:114  dwarn:3   dfail:0   fail:0   skip:24 
byt-nuc          total:141  pass:119  dwarn:7   dfail:0   fail:0   skip:15 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:136  dwarn:0   dfail:0   fail:1   skip:4  
hsw-xps12        total:138  pass:133  dwarn:1   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430s        total:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2        total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1125/

* ✓ success: Fi.CI.BAT
  2016-01-07 16:36 [PATCH 0/6] Misc cleanups Tvrtko Ursulin
@ 2016-01-11  9:27 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-11  9:27 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Summary ==

Built on ff88655b3a5467bbc3be8c67d3e05ebf182557d3 drm-intel-nightly: 2016y-01m-11d-07h-30m-16s UTC integration manifest

Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b:
                dmesg-warn -> PASS       (byt-nuc)

bdw-ultra        total:138  pass:130  dwarn:1   dfail:0   fail:1   skip:6  
bsw-nuc-2        total:141  pass:114  dwarn:3   dfail:0   fail:0   skip:24 
byt-nuc          total:141  pass:119  dwarn:7   dfail:0   fail:0   skip:15 
hsw-brixbox      total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:141  pass:100  dwarn:4   dfail:0   fail:0   skip:37 
skl-i5k-2        total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
skl-i7k-2        total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220t        total:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1113/

* ✓ success: Fi.CI.BAT
  2016-01-06 20:53 [PATCH] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed yu.dai
@ 2016-01-07  7:49 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-07  7:49 UTC (permalink / raw)
  To: yu.dai; +Cc: intel-gfx

== Summary ==

Built on 532a438d16e609a4b8f161c0a18b34f24001ed8f drm-intel-nightly: 2016y-01m-06d-15h-38m-17s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-vebox:
                skip       -> PASS       (bdw-nuci7)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-c:
                pass       -> SKIP       (bdw-nuci7)

bdw-nuci7        total:132  pass:1    dwarn:0   dfail:0   fail:0   skip:131
bsw-nuc-2        total:135  pass:115  dwarn:0   dfail:0   fail:0   skip:20 
byt-nuc          total:135  pass:121  dwarn:1   dfail:0   fail:0   skip:13 
skl-i5k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:135  pass:123  dwarn:0   dfail:0   fail:0   skip:12 

Results at /archive/results/CI_IGT_test/Patchwork_1107/

* ✓ success: Fi.CI.BAT
  2016-01-06  1:44 [PATCH] drm/i915/bxt: Don't save/restore eDP panel power during suspend Matt Roper
  2016-01-06 12:20 ` ✓ success: Fi.CI.BAT Patchwork
@ 2016-01-07  7:20 ` Patchwork
  1 sibling, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-07  7:20 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

== Summary ==

Built on 532a438d16e609a4b8f161c0a18b34f24001ed8f drm-intel-nightly: 2016y-01m-06d-15h-38m-17s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (skl-i5k-2) UNSTABLE
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-a-frame-sequence:
                pass       -> DMESG-WARN (byt-nuc) UNSTABLE
        Subgroup read-crc-pipe-c:
                pass       -> SKIP       (bdw-nuci7)

bdw-nuci7        total:132  pass:1    dwarn:0   dfail:0   fail:0   skip:131
bsw-nuc-2        total:135  pass:115  dwarn:0   dfail:0   fail:0   skip:20 
byt-nuc          total:135  pass:120  dwarn:2   dfail:0   fail:0   skip:13 
skl-i5k-2        total:135  pass:126  dwarn:1   dfail:0   fail:0   skip:8  
skl-i7k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:135  pass:123  dwarn:0   dfail:0   fail:0   skip:12 

Results at /archive/results/CI_IGT_test/Patchwork_1106/

* ✓ success: Fi.CI.BAT
  2016-01-06 12:08 [PATCH] drm/i915/kbl: Enable PW1 and Misc I/O power wells Michel Thierry
@ 2016-01-06 14:20 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-06 14:20 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

== Summary ==

Built on 142b83d5713d07d01f6a0a1993761651459c2e66 drm-intel-nightly: 2016y-01m-06d-13h-21m-32s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                pass       -> DMESG-WARN (skl-i7k-2) UNSTABLE
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-a-frame-sequence:
                pass       -> DMESG-WARN (byt-nuc) UNSTABLE

bdw-nuci7        total:132  pass:0    dwarn:0   dfail:0   fail:0   skip:132
bsw-nuc-2        total:135  pass:115  dwarn:0   dfail:0   fail:0   skip:20 
byt-nuc          total:135  pass:120  dwarn:2   dfail:0   fail:0   skip:13 
skl-i5k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:135  pass:123  dwarn:0   dfail:0   fail:0   skip:12 

Results at /archive/results/CI_IGT_test/Patchwork_1099/

* ✓ success: Fi.CI.BAT
  2016-01-06  1:44 [PATCH] drm/i915/bxt: Don't save/restore eDP panel power during suspend Matt Roper
@ 2016-01-06 12:20 ` Patchwork
  2016-01-07  7:20 ` Patchwork
  1 sibling, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-06 12:20 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

== Summary ==

Built on 89d0d1b6f0e9c3a6b90476bd115cfe1881646fd6 drm-intel-nightly: 2016y-01m-06d-10h-37m-17s UTC integration manifest

Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-a-frame-sequence:
                pass       -> DMESG-WARN (byt-nuc) UNSTABLE

bdw-nuci7        total:132  pass:0    dwarn:0   dfail:0   fail:0   skip:132
bsw-nuc-2        total:135  pass:115  dwarn:0   dfail:0   fail:0   skip:20 
byt-nuc          total:135  pass:121  dwarn:1   dfail:0   fail:0   skip:13 
skl-i5k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:135  pass:123  dwarn:0   dfail:0   fail:0   skip:12 

Results at /archive/results/CI_IGT_test/Patchwork_1096/

* ✓ success: Fi.CI.BAT
  2016-01-05 15:32 [PATCH] drm/i915: Update Skylake DDI translation table for HDMI Rodrigo Vivi
@ 2016-01-06  9:49 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-06  9:49 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

== Summary ==

Built on 24b053acb16b4b3b021575e4ee30ffedd3ab2920 drm-intel-nightly: 2016y-01m-06d-08h-16m-11s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (skl-i5k-2) UNSTABLE
                dmesg-warn -> PASS       (bdw-ultra) UNSTABLE
                pass       -> DMESG-WARN (skl-i7k-2) UNSTABLE

bdw-nuci7        total:132  pass:123  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:132  pass:126  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:135  pass:115  dwarn:0   dfail:0   fail:0   skip:20 
byt-nuc          total:135  pass:121  dwarn:1   dfail:0   fail:0   skip:13 
hsw-brixbox      total:135  pass:128  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:135  pass:131  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:135  pass:100  dwarn:0   dfail:0   fail:0   skip:35 
ivb-t430s        total:135  pass:129  dwarn:0   dfail:0   fail:0   skip:6  
skl-i5k-2        total:135  pass:126  dwarn:1   dfail:0   fail:0   skip:8  
skl-i7k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:135  pass:123  dwarn:0   dfail:0   fail:0   skip:12 
snb-x220t        total:135  pass:123  dwarn:0   dfail:0   fail:1   skip:11 

Results at /archive/results/CI_IGT_test/Patchwork_1093/

* ✓ success: Fi.CI.BAT
  2016-01-05 13:49 [PATCH v6 0/3] drm/i915: Disable link training optimization if DP config has changed Mika Kahola
@ 2016-01-05 14:27 ` Patchwork
  0 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2016-01-05 14:27 UTC (permalink / raw)
  To: Mika Kahola; +Cc: intel-gfx

== Summary ==

Built on 05ade905f2fda5416476677509e016ef830d181a drm-intel-nightly: 2016y-01m-05d-13h-00m-24s UTC integration manifest

Test gem_storedw_loop:
        Subgroup basic-render:
                dmesg-warn -> PASS       (bdw-nuci7)
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                dmesg-warn -> PASS       (bsw-nuc-2) UNSTABLE
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-a:
                dmesg-warn -> PASS       (skl-i7k-2) UNSTABLE
        Subgroup read-crc-pipe-a-frame-sequence:
                pass       -> DMESG-WARN (byt-nuc) UNSTABLE
        Subgroup read-crc-pipe-b:
                dmesg-warn -> PASS       (skl-i5k-2) UNSTABLE
                dmesg-warn -> PASS       (snb-dellxps) UNSTABLE
        Subgroup suspend-read-crc-pipe-a:
                dmesg-warn -> PASS       (snb-x220t) UNSTABLE
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (snb-x220t) UNSTABLE
                dmesg-warn -> PASS       (skl-i7k-2)
Test pm_rpm:
        Subgroup basic-rte:
                dmesg-warn -> PASS       (byt-nuc) UNSTABLE

bdw-nuci7        total:132  pass:122  dwarn:1   dfail:0   fail:0   skip:9  
bdw-ultra        total:132  pass:124  dwarn:2   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:135  pass:114  dwarn:1   dfail:0   fail:0   skip:20 
byt-nuc          total:135  pass:120  dwarn:2   dfail:0   fail:0   skip:13 
hsw-brixbox      total:135  pass:126  dwarn:2   dfail:0   fail:0   skip:7  
hsw-gt2          total:135  pass:130  dwarn:1   dfail:0   fail:0   skip:4  
hsw-xps12        total:132  pass:125  dwarn:3   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:135  pass:100  dwarn:0   dfail:0   fail:0   skip:35 
ivb-t430s        total:135  pass:127  dwarn:2   dfail:0   fail:0   skip:6  
skl-i5k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2        total:135  pass:125  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps      total:135  pass:122  dwarn:1   dfail:0   fail:0   skip:12 
snb-x220t        total:135  pass:121  dwarn:2   dfail:0   fail:1   skip:11 

Results at /archive/results/CI_IGT_test/Patchwork_1084/

Thread overview: 59+ messages
2016-01-09 11:30 [PATCH] drm/i915: Support to enable TRTT on GEN9 akash.goel
2016-01-10 17:39 ` Chris Wilson
2016-01-11  7:39   ` Goel, Akash
2016-01-11  8:49     ` Chris Wilson
2016-01-11 12:29       ` Goel, Akash
2016-01-22 15:34         ` [PATCH v2] " akash.goel
2016-01-22 15:33           ` kbuild test robot
2016-03-03  4:54           ` [PATCH v3] " akash.goel
2016-03-03  4:54             ` kbuild test robot
2016-03-09 11:30             ` [PATCH v4] " akash.goel
2016-03-09 12:04               ` Chris Wilson
2016-03-09 14:50                 ` Goel, Akash
2016-03-09 15:02                   ` Chris Wilson
2016-03-09 15:56                     ` Goel, Akash
2016-03-09 16:21                       ` Chris Wilson
2016-03-09 16:38                         ` Goel, Akash
2016-03-10  7:06                           ` [PATCH v5] " akash.goel
2016-03-10 16:09                             ` kbuild test robot
2016-03-11 11:50                             ` [PATCH v6] " akash.goel
2016-03-11 15:57                               ` kbuild test robot
2016-03-18  8:35                               ` [PATCH v7] " akash.goel
2016-03-18  9:35                                 ` Chris Wilson
2016-03-18 10:23                                   ` [PATCH v8] " akash.goel
2016-03-22  8:42                                     ` [PATCH v9] " akash.goel
2016-03-24 16:29                                       ` Gore, Tim
2016-03-24 16:36                                         ` Goel, Akash
2016-03-09 14:18               ` [PATCH v4] " kbuild test robot
2016-01-11 11:19 ` ✓ success: Fi.CI.BAT Patchwork
2016-01-22 15:44 ` ✗ Fi.CI.BAT: warning for drm/i915: Support to enable TRTT on GEN9 (rev2) Patchwork
2016-01-25 12:12 ` Patchwork
2016-01-25 12:14 ` ✓ Fi.CI.BAT: success " Patchwork
2016-03-03  6:42 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev3) Patchwork
2016-03-09 11:10 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev4) Patchwork
2016-03-10  7:10 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev5) Patchwork
2016-03-11 11:41 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev6) Patchwork
2016-03-18 12:45 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev8) Patchwork
2016-03-22 10:45 ` ✗ Fi.CI.BAT: failure for drm/i915: Support to enable TRTT on GEN9 (rev9) Patchwork
