* [PATCH 1/2] tools/null_state_gen: Automatically generate the copyright header
@ 2017-04-28  9:10 Oscar Mateo
  2017-04-28  9:10 ` [PATCH 2/2] tools/null_state_gen: Add GEN10 golden context batch buffer creation Oscar Mateo
  0 siblings, 1 reply; 7+ messages in thread
From: Oscar Mateo @ 2017-04-28  9:10 UTC
  To: intel-gfx; +Cc: Mika Kuoppala

Last bit to make the generated files directly usable in i915.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 tools/null_state_gen/intel_null_state_gen.c | 41 +++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c
index 7d5887e..06eb954 100644
--- a/tools/null_state_gen/intel_null_state_gen.c
+++ b/tools/null_state_gen/intel_null_state_gen.c
@@ -29,9 +29,12 @@
 #include <stdlib.h>
 #include <errno.h>
 #include <assert.h>
+#include <time.h>
 
 #include "intel_renderstate.h"
 #include "intel_batchbuffer.h"
+#include "lib/version.h"
+#include "config.h"
 
 static int debug = 0;
 
@@ -42,6 +45,42 @@ static void print_usage(char *s)
 		s);
 }
 
+static void print_copyright(void)
+{
+	time_t time_raw;
+	struct tm *time_local;
+	char year[6]; // avoid the year 10000 effect!
+
+	time(&time_raw);
+	time_local = localtime(&time_raw);
+	strftime(year, sizeof(year), "%Y", time_local);
+
+	printf("/*\n");
+	printf(" * Copyright © %s Intel Corporation\n", year);
+	printf(" *\n");
+	printf(" * Permission is hereby granted, free of charge, to any person obtaining a\n");
+	printf(" * copy of this software and associated documentation files (the \"Software\"),\n");
+	printf(" * to deal in the Software without restriction, including without limitation\n");
+	printf(" * the rights to use, copy, modify, merge, publish, distribute, sublicense,\n");
+	printf(" * and/or sell copies of the Software, and to permit persons to whom the\n");
+	printf(" * Software is furnished to do so, subject to the following conditions:\n");
+	printf(" *\n");
+	printf(" * The above copyright notice and this permission notice (including the next\n");
+	printf(" * paragraph) shall be included in all copies or substantial portions of the\n");
+	printf(" * Software.\n");
+	printf(" *\n");
+	printf(" * THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n");
+	printf(" * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n");
+	printf(" * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL\n");
+	printf(" * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n");
+	printf(" * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING\n");
+	printf(" * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\n");
+	printf(" * DEALINGS IN THE SOFTWARE.\n");
+	printf(" *\n");
+	printf(" * Generated by: intel-gpu-tools-%s-%s\n", PACKAGE_VERSION, IGT_GIT_SHA1);
+	printf(" */\n\n");
+}
+
 /* Creates the intel_renderstate_genX.c file for the particular
  * GEN product
  */
@@ -52,6 +91,8 @@ static int print_state(int gen, struct intel_batchbuffer *batch)
 
 	fprintf(stderr, "Generating for gen%d\n", gen);
 
+	print_copyright();
+
 	printf("#include \"intel_renderstate.h\"\n\n");
 
 	/* Relocation offsets.  These are byte offsets in the golden context
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
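
For readers following along: the year handling in print_copyright() above can be
exercised on its own. The following is only a minimal sketch, not part of the
patch; format_year() is a hypothetical helper name, and unlike the patch it
checks the return values of localtime() and strftime(). The 6-byte buffer
matches the patch: room for a four- or five-digit year plus the terminating NUL.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper (not in the patch): format the current local year
 * into buf. Returns 0 on success, -1 if the time cannot be converted or
 * the result does not fit in len bytes.
 */
static int format_year(char *buf, size_t len)
{
	time_t now = time(NULL);
	struct tm *tm_local = localtime(&now);

	if (tm_local == NULL)
		return -1;

	/* strftime() returns 0 when the string plus NUL does not fit. */
	if (strftime(buf, len, "%Y", tm_local) == 0)
		return -1;

	return 0;
}

/* Print the first lines of the header the way print_copyright() does. */
static int print_copyright_year_line(void)
{
	char year[6]; /* up to five digits plus the terminating NUL */

	if (format_year(year, sizeof(year)) != 0)
		return -1;

	printf("/*\n");
	printf(" * Copyright © %s Intel Corporation\n", year);
	return 0;
}
```

Calling print_copyright_year_line() from main() prints the opening of the
generated header; a failure return lets the caller bail out instead of printing
an uninitialized buffer.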


* [PATCH 2/2] tools/null_state_gen: Add GEN10 golden context batch buffer creation
  2017-04-28  9:10 [PATCH 1/2] tools/null_state_gen: Automatically generate the copyright header Oscar Mateo
@ 2017-04-28  9:10 ` Oscar Mateo
  2017-04-28 14:36   ` [PATCH] " Oscar Mateo
  0 siblings, 1 reply; 7+ messages in thread
From: Oscar Mateo @ 2017-04-28  9:10 UTC
  To: intel-gfx; +Cc: Ben Widawsky, Mika Kuoppala

This batchbuffer is over 4096 bytes, so we need to increase the size of the
array (and the KMD has to be modified to deal with more than one page).
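
As a rough illustration of the size arithmetic (a sketch only, assuming each
batch item is one 32-bit dword and a 4096-byte page; the helper names and
PAGE_SIZE_BYTES are hypothetical and not taken from i-g-t or i915):

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumption: 4 KiB pages */

/* Bytes occupied by a batch made of num_dwords 32-bit items. */
static size_t batch_size_bytes(size_t num_dwords)
{
	return num_dwords * sizeof(uint32_t);
}

/* Number of pages needed to hold size bytes, rounding up. */
static size_t pages_needed(size_t size)
{
	return (size + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}
```

Under those assumptions, 1024 items fill exactly one page, so any batch longer
than that needs both a larger array and a consumer that copes with two or more
pages.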

Notice that there are two workarounds embedded here:
- WaRsGatherPoolEnable is to be applied to all CNL steppings, so it belongs
  here.
- WaPSRandomCSNotDone is A0 only, but since the golden context batch buffer
  is created offline in i-g-t (as opposed to dynamically in i915) we cannot
  really make it dependent on the stepping (there is a mechanism in i915 to
  *add* extra stuff to the golden context, via an additional auxiliary bb,
  but nothing to modify things *inside* the offline-created bb). So maybe
  apply the WA for now and remove it once production chips are the norm?

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 lib/gen10_render.h                             |  63 +++
 tools/null_state_gen/Makefile.am               |   3 +-
 tools/null_state_gen/intel_batchbuffer.h       |   2 +-
 tools/null_state_gen/intel_null_state_gen.c    |   5 +-
 tools/null_state_gen/intel_renderstate.h       |   1 +
 tools/null_state_gen/intel_renderstate_gen10.c | 538 +++++++++++++++++++++++++
 6 files changed, 609 insertions(+), 3 deletions(-)
 create mode 100644 lib/gen10_render.h
 create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c

diff --git a/lib/gen10_render.h b/lib/gen10_render.h
new file mode 100644
index 0000000..f4a7dff
--- /dev/null
+++ b/lib/gen10_render.h
@@ -0,0 +1,63 @@
+#ifndef GEN10_RENDER_H
+#define GEN10_RENDER_H
+
+#include "gen9_render.h"
+
+#define GEN7_MI_RS_CONTROL			(0x6 << 23)
+# define GEN7_MI_RS_CONTROL_ENABLE		(1 << 0)
+
+#define GEN10_3DSTATE_GATHER_POOL_ALLOC		GEN6_3D(3, 1, 0x1a)
+# define GEN10_3DSTATE_GATHER_POOL_ENABLE	(1 << 11)
+
+#define GEN10_3DSTATE_GATHER_CONSTANT_VS	GEN6_3D(3, 0, 0x34)
+#define GEN10_3DSTATE_GATHER_CONSTANT_HS	GEN6_3D(3, 0, 0x36)
+#define GEN10_3DSTATE_GATHER_CONSTANT_DS	GEN6_3D(3, 0, 0x37)
+#define GEN10_3DSTATE_GATHER_CONSTANT_GS	GEN6_3D(3, 0, 0x35)
+#define GEN10_3DSTATE_GATHER_CONSTANT_PS	GEN6_3D(3, 0, 0x38)
+
+#define GEN10_3DSTATE_WM_DEPTH_STENCIL		GEN6_3D(3, 0, 0x4e)
+#define GEN10_3DSTATE_WM_CHROMAKEY		GEN6_3D(3, 0, 0x4c)
+
+#define GEN8_REG_L3_CACHE_CONFIG	0x7034
+
+/*
+ * Programming for L3 cache allocations can be made per bank. Based on the
+ * programmed value HW will apply same allocations on other available banks.
+ * Total L3 Cache size per bank = 256 KB.
+ * {SLM,    URB,     DC,      RO(I/S, C, T),   L3 Client Pool}
+ * {  0,    96,      32,      128,                 0      }
+ */
+#define GEN10_L3_CACHE_CONFIG_VALUE	0x00420060
+
+#define URB_ALIGN(val, align)	((val % align) ? (val - (val % align)) : val)
+
+#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES		64
+#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES		2752
+
+#define GEN10_KB_PER_URB_INDEX			8
+#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB	96
+
+#define GEN10_URB_RESERVED_SIZE_KB		32
+#define GEN10_URB_RESERVED_END_SIZE_KB		8
+
+#define GEN10_VS_NUM_BITS_PER_URB_UNIT		512
+#define GEN10_VS_NUM_OF_URB_UNITS		1 // zero based
+#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS		(GEN10_VS_NUM_BITS_PER_URB_UNIT * \
+						(GEN10_VS_NUM_OF_URB_UNITS + 1))
+
+#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX)
+
+#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count)		\
+	URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX)
+
+#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice)	\
+	(total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB)
+
+#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice)	\
+	((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) *	\
+	1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS)
+
+#define GEN10_VS_END_URB_INDEX(urb_size_per_slice)			\
+	((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX)
+
+#endif
diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am
index 24884a7..2f90990 100644
--- a/tools/null_state_gen/Makefile.am
+++ b/tools/null_state_gen/Makefile.am
@@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES = 	\
 	intel_renderstate_gen7.c \
 	intel_renderstate_gen8.c \
 	intel_renderstate_gen9.c \
+	intel_renderstate_gen10.c \
 	intel_null_state_gen.c
 
-gens := 6 7 8 9
+gens := 6 7 8 9 10
 
 h = /tmp/intel_renderstate_gen$$gen.c
 states: intel_null_state_gen
diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h
index 771d1c8..e40e01b 100644
--- a/tools/null_state_gen/intel_batchbuffer.h
+++ b/tools/null_state_gen/intel_batchbuffer.h
@@ -34,7 +34,7 @@
 #include <stdint.h>
 
 #define MAX_RELOCS 64
-#define MAX_ITEMS 1024
+#define MAX_ITEMS 2048
 #define MAX_STRLEN 256
 
 #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1))
diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c
index 06eb954..4f12f5f 100644
--- a/tools/null_state_gen/intel_null_state_gen.c
+++ b/tools/null_state_gen/intel_null_state_gen.c
@@ -41,7 +41,7 @@ static int debug = 0;
 static void print_usage(char *s)
 {
 	fprintf(stderr, "%s: <gen>\n"
-		"     gen:     gen to generate for (6,7,8,9)\n",
+		"     gen:     gen to generate for (6,7,8,9,10)\n",
 		s);
 }
 
@@ -173,6 +173,9 @@ static int do_generate(int gen)
 	case 9:
 		null_state_gen = gen9_setup_null_render_state;
 		break;
+	case 10:
+		null_state_gen = gen10_setup_null_render_state;
+		break;
 	}
 
 	if (null_state_gen == NULL) {
diff --git a/tools/null_state_gen/intel_renderstate.h b/tools/null_state_gen/intel_renderstate.h
index b27b434..b3c8c2b 100644
--- a/tools/null_state_gen/intel_renderstate.h
+++ b/tools/null_state_gen/intel_renderstate.h
@@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct intel_batchbuffer *batch);
 void gen7_setup_null_render_state(struct intel_batchbuffer *batch);
 void gen8_setup_null_render_state(struct intel_batchbuffer *batch);
 void gen9_setup_null_render_state(struct intel_batchbuffer *batch);
+void gen10_setup_null_render_state(struct intel_batchbuffer *batch);
 
 #endif /* __INTEL_RENDERSTATE_H__ */
diff --git a/tools/null_state_gen/intel_renderstate_gen10.c b/tools/null_state_gen/intel_renderstate_gen10.c
new file mode 100644
index 0000000..905c6c7
--- /dev/null
+++ b/tools/null_state_gen/intel_renderstate_gen10.c
@@ -0,0 +1,538 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ *	Oscar Mateo <oscar.mateo@intel.com>
+ */
+
+#include "intel_renderstate.h"
+#include <lib/gen10_render.h>
+#include <lib/intel_reg.h>
+
+static void gen8_emit_wm(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
+	OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION);
+}
+
+static void gen8_emit_ps(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0); /* kernel hi */
+	OUT_BATCH(GEN7_PS_SPF_MODE);
+	OUT_BATCH(0); /* scratch space stuff */
+	OUT_BATCH(0); /* scratch hi */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0); // kernel 1
+	OUT_BATCH(0); /* kernel 1 hi */
+	OUT_BATCH(0); // kernel 2
+	OUT_BATCH(0); /* kernel 2 hi */
+}
+
+static void gen8_emit_sf(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT |
+		  1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT |
+		  GEN7_SF_POINT_WIDTH_FROM_SOURCE |
+		  8);
+}
+
+static void gen8_emit_vs(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_hs(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_raster(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
+	OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW);
+	OUT_BATCH(0.0);
+	OUT_BATCH(0.0);
+	OUT_BATCH(0.0);
+}
+
+static void gen10_emit_urb(struct intel_batchbuffer *batch)
+{
+	/* Smallest SKU: 3x8*/
+	int l3_bank_count = 3;
+	int slice_count = 1;
+	int urb_size_per_slice = GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count);
+	int other_urb_start_addr = GEN10_VS_END_URB_INDEX(urb_size_per_slice);
+	const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX;
+	const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS;
+	int vs_urb_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice);
+
+	if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES)
+		vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES;
+	if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES)
+		vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES;
+
+	OUT_BATCH(GEN7_3DSTATE_URB_VS);
+	OUT_BATCH(vs_urb_entries |
+		 (vs_urb_alloc_size << 16) |
+		 (vs_urb_start_addr << 25));
+
+	OUT_BATCH(GEN7_3DSTATE_URB_HS);
+	OUT_BATCH(other_urb_start_addr << 25);
+
+	OUT_BATCH(GEN7_3DSTATE_URB_DS);
+	OUT_BATCH(other_urb_start_addr << 25);
+
+	OUT_BATCH(GEN7_3DSTATE_URB_GS);
+	OUT_BATCH(other_urb_start_addr << 25);
+}
+
+static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
+	OUT_BATCH(_3DPRIM_TRILIST);
+}
+
+static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch)
+{
+	const int num_decls = 128;
+	int i;
+
+	OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST |
+		(((2 * num_decls) + 3) - 2) /* DWORD count - 2 */);
+	OUT_BATCH(0);
+	OUT_BATCH(num_decls);
+
+	for (i = 0; i < num_decls; i++) {
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
+
+static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index)
+{
+	OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2));
+	OUT_BATCH(index << 29);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index)
+{
+	OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2));
+	OUT_BATCH(index << 30);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch)
+{
+	const int buffers = 33;
+	int i;
+
+	OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS |
+		(((4 * buffers) + 1)- 2) /* DWORD count - 2 */);
+
+	for (i = 0; i < buffers; i++) {
+		OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT |
+			  GEN7_VB0_BUFFER_ADDR_MOD_EN);
+		OUT_BATCH(0); /* Address */
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
+
+static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch)
+{
+	const int elements = 34;
+	int i;
+
+	OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS |
+		(((2 * elements) + 1) - 2) /* DWORD count - 2 */);
+
+	/* Element 0 */
+	OUT_BATCH(VE0_VALID);
+	OUT_BATCH(
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+	/* Elements 1 -> 33 */
+	for (i = 1; i < elements; i++) {
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
+
+static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch)
+{
+	union {
+		float fval;
+		uint32_t uval;
+	} u;
+
+	unsigned offset;
+
+	u.fval = 1.0f;
+
+	offset = intel_batch_state_offset(batch, 64);
+	OUT_STATE(0);
+	OUT_STATE(0);      /* Alpha reference value */
+	OUT_STATE(u.uval); /* Blend constant color RED */
+	OUT_STATE(u.uval); /* Blend constant color BLUE */
+	OUT_STATE(u.uval); /* Blend constant color GREEN */
+	OUT_STATE(u.uval); /* Blend constant color ALPHA */
+
+	OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
+	OUT_BATCH_STATE_OFFSET(offset | 1);
+}
+
+static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
+
+	offset = intel_batch_state_offset(batch, 64);
+
+	for (i = 0; i < 17; i++)
+		OUT_STATE(0);
+
+	OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset | 1);
+}
+
+static void gen8_emit_ps_extra(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
+	OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID |
+		  GEN8_PSX_ATTRIBUTE_ENABLE);
+
+}
+
+static void gen8_emit_ps_blend(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
+	OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
+}
+
+static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+
+	offset = intel_batch_state_offset(batch, 32);
+
+	OUT_STATE((uint32_t)0.0f); /* Minimum depth */
+	OUT_STATE((uint32_t)0.0f); /* Maximum depth */
+
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
+
+static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
+
+	offset = intel_batch_state_offset(batch, 64);
+
+	for (i = 0; i < 16; i++)
+		OUT_STATE(0);
+
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
+
+static void gen8_emit_primitive(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DPRIMITIVE | (10-2));
+	OUT_BATCH(4);   /* gen8+ ignore the topology type field */
+	OUT_BATCH(1);   /* vertex count */
+	OUT_BATCH(0);
+	OUT_BATCH(1);   /* single instance */
+	OUT_BATCH(0);   /* start instance location */
+	OUT_BATCH(0);   /* index buffer offset, ignored */
+	OUT_BATCH(0);   /* extended parameter 0 */
+	OUT_BATCH(0);   /* extended parameter 1 */
+	OUT_BATCH(0);   /* extended parameter 2 */
+}
+
+static void gen9_emit_state_base_address(struct intel_batchbuffer *batch) {
+	const unsigned offset = 0;
+	OUT_BATCH(GEN6_STATE_BASE_ADDRESS |
+		(22 - 2) /* DWORD count - 2 */);
+
+	/* general state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0);
+
+	/* surface state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* dynamic state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* indirect state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* instruction state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* general state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* dynamic state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* indirect object buffer size */
+	OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY);
+	/* instruction buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+
+	/* bindless surface state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	/* bindless surface state size */
+	OUT_BATCH(0);
+
+	/* bindless sampler state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	/* bindless sampler state size */
+	OUT_BATCH(0);
+}
+
+/*
+ * Generate the batch buffer commands needed to initialize the 3D engine
+ * to its "golden state".
+ */
+void gen10_setup_null_render_state(struct intel_batchbuffer *batch)
+{
+	int i;
+
+	/* WaRsGatherPoolEnable: cnl */
+	OUT_BATCH(GEN7_MI_RS_CONTROL);
+
+#define GEN8_PIPE_CONTROL_GLOBAL_GTT   (1 << 24)
+	/* PIPE_CONTROL */
+	OUT_BATCH(GEN6_PIPE_CONTROL |
+	         (6 - 2));	/* DWORD count - 2 */
+	OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* PIPELINE_SELECT */
+	OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+
+	OUT_BATCH(MI_LOAD_REGISTER_IMM);
+	OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG);
+	OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE);
+
+	gen8_emit_wm(batch);
+	gen8_emit_ps(batch);
+	gen8_emit_sf(batch);
+
+	OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */
+	OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11);
+
+	gen8_emit_vs(batch);
+	gen8_emit_hs(batch);
+
+	OUT_CMD(GEN7_3DSTATE_GS, 10);
+	OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5);
+	OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */
+	OUT_CMD(GEN6_3DSTATE_CLIP, 4);
+	OUT_CMD(GEN7_3DSTATE_TE, 4);
+	OUT_CMD(GEN8_3DSTATE_VF, 2);
+	OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5);
+
+	/* URB States */
+	gen10_emit_urb(batch);
+
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130);
+
+	OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4);
+
+	/* Push Constants */
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2);
+
+	/* Constants */
+	OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11);
+
+	OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3);
+	OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2);
+	gen8_emit_vf_topology(batch);
+
+	/* Streamer out declaration list */
+	gen8_emit_so_decl_list(batch);
+
+	/* Streamer out buffers */
+	for (i = 0; i < 4; i++) {
+		gen8_emit_so_buffer(batch, i);
+	}
+
+	/* State base addresses */
+	gen9_emit_state_base_address(batch);
+
+	OUT_CMD(GEN6_STATE_SIP, 3);
+	OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4);
+	OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8);
+
+	/* Chroma key */
+	for (i = 0; i < 4; i++) {
+		gen8_emit_chroma_key(batch, i);
+	}
+
+	OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3);
+	OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3);
+	OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3);
+	OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2);
+
+	/* WaPSRandomCSNotDone:cnl (pre-production) */
+#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
+	OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
+	OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16);
+	OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5);
+
+	/* Vertex buffers */
+	gen8_emit_vertex_buffers(batch);
+	gen8_emit_vertex_elements(batch);
+
+	OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */);
+
+	/* 3D state binding table pointers */
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2);
+
+	gen8_emit_cc_state_pointers(batch);
+	gen8_emit_blend_state_pointers(batch);
+	gen8_emit_ps_extra(batch);
+	gen8_emit_ps_blend(batch);
+
+	/* 3D state sampler state pointers */
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2);
+
+	OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2);
+
+	gen8_emit_viewport_state_pointers_cc(batch);
+	gen8_emit_viewport_state_pointers_sf_clip(batch);
+
+	/* WaPSRandomCSNotDone:cnl (pre-production) */
+#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
+	OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
+	OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	gen8_emit_raster(batch);
+
+	OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4);
+	OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2);
+
+	/* Launch 3D operation */
+	gen8_emit_primitive(batch);
+
+	/* WaRsGatherPoolEnable: cnl */
+	OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE);
+	OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2));
+	OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE);
+	OUT_BATCH(0);
+	OUT_BATCH(0xfffff << 12);
+	OUT_BATCH(GEN7_MI_RS_CONTROL);
+	OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+}
-- 
1.9.1
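
To make the URB arithmetic in gen10_emit_urb() easier to check by hand, here is
a small standalone sketch. The constants are copied from lib/gen10_render.h in
the patch; struct urb_layout and gen10_urb_layout() are hypothetical names
introduced only for illustration, and the macros are written out as plain
integer arithmetic.

```c
#include <stdint.h>

/* Constants copied from lib/gen10_render.h in the patch. */
#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES		64
#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES		2752
#define GEN10_KB_PER_URB_INDEX			8
#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB	96
#define GEN10_URB_RESERVED_SIZE_KB		32
#define GEN10_URB_RESERVED_END_SIZE_KB		8
#define GEN10_VS_NUM_BITS_PER_URB_UNIT		512
#define GEN10_VS_NUM_OF_URB_UNITS		1 /* zero based */

/* Hypothetical container for the intermediate values. */
struct urb_layout {
	int size_per_slice_kb;	/* URB size per slice, rounded down to 8 KB */
	int vs_start_index;	/* first URB index usable by the VS */
	int other_start_index;	/* start index programmed for HS/DS/GS */
	int vs_entries;		/* clamped number of VS URB entries */
	uint32_t vs_dword;	/* second dword of 3DSTATE_URB_VS */
};

static struct urb_layout gen10_urb_layout(int l3_bank_count, int slice_count)
{
	struct urb_layout u;
	int raw_kb = GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count /
		     slice_count;
	int entry_bits = GEN10_VS_NUM_BITS_PER_URB_UNIT *
			 (GEN10_VS_NUM_OF_URB_UNITS + 1);
	int vs_kb;

	/* URB_ALIGN: round down to a multiple of the URB index size. */
	u.size_per_slice_kb = raw_kb - (raw_kb % GEN10_KB_PER_URB_INDEX);

	u.vs_start_index = GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX;
	u.other_start_index =
		(u.size_per_slice_kb - GEN10_URB_RESERVED_END_SIZE_KB) /
		GEN10_KB_PER_URB_INDEX;

	/* Space left for the VS once both reserved regions are removed. */
	vs_kb = u.size_per_slice_kb - GEN10_URB_RESERVED_SIZE_KB -
		GEN10_URB_RESERVED_END_SIZE_KB;
	u.vs_entries = (vs_kb * 1024 * 8) / entry_bits;

	/* Same clamping as gen10_emit_urb(). */
	if (u.vs_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES)
		u.vs_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES;
	if (u.vs_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES)
		u.vs_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES;

	u.vs_dword = (uint32_t)u.vs_entries |
		     ((uint32_t)GEN10_VS_NUM_OF_URB_UNITS << 16) |
		     ((uint32_t)u.vs_start_index << 25);
	return u;
}
```

With the values hardcoded in the patch (3 L3 banks, 1 slice) this gives 288 KB
of URB per slice, a VS start index of 4, a start index of 35 for HS/DS/GS and
1984 VS entries, which sits inside the 64..2752 clamp range.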


* [PATCH] tools/null_state_gen: Add GEN10 golden context batch buffer creation
  2017-04-28  9:10 ` [PATCH 2/2] tools/null_state_gen: Add GEN10 golden context batch buffer creation Oscar Mateo
@ 2017-04-28 14:36   ` Oscar Mateo
  2017-07-06  0:50     ` Rodrigo Vivi
  0 siblings, 1 reply; 7+ messages in thread
From: Oscar Mateo @ 2017-04-28 14:36 UTC
  To: intel-gfx; +Cc: Ben Widawsky, Mika Kuoppala

This batchbuffer is over 4096 bytes, so we need to increase the size of the
array (and the KMD has to be modified to deal with more than one page).

Notice that there are two workarounds embedded here, both applicable to all CNL
steppings.

v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so update
    the comment in the code and in the commit message.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 lib/gen10_render.h                             |  63 +++
 tools/null_state_gen/Makefile.am               |   3 +-
 tools/null_state_gen/intel_batchbuffer.h       |   2 +-
 tools/null_state_gen/intel_null_state_gen.c    |   5 +-
 tools/null_state_gen/intel_renderstate.h       |   1 +
 tools/null_state_gen/intel_renderstate_gen10.c | 538 +++++++++++++++++++++++++
 6 files changed, 609 insertions(+), 3 deletions(-)
 create mode 100644 lib/gen10_render.h
 create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c

diff --git a/lib/gen10_render.h b/lib/gen10_render.h
new file mode 100644
index 0000000..f4a7dff
--- /dev/null
+++ b/lib/gen10_render.h
@@ -0,0 +1,63 @@
+#ifndef GEN10_RENDER_H
+#define GEN10_RENDER_H
+
+#include "gen9_render.h"
+
+#define GEN7_MI_RS_CONTROL			(0x6 << 23)
+# define GEN7_MI_RS_CONTROL_ENABLE		(1 << 0)
+
+#define GEN10_3DSTATE_GATHER_POOL_ALLOC		GEN6_3D(3, 1, 0x1a)
+# define GEN10_3DSTATE_GATHER_POOL_ENABLE	(1 << 11)
+
+#define GEN10_3DSTATE_GATHER_CONSTANT_VS	GEN6_3D(3, 0, 0x34)
+#define GEN10_3DSTATE_GATHER_CONSTANT_HS	GEN6_3D(3, 0, 0x36)
+#define GEN10_3DSTATE_GATHER_CONSTANT_DS	GEN6_3D(3, 0, 0x37)
+#define GEN10_3DSTATE_GATHER_CONSTANT_GS	GEN6_3D(3, 0, 0x35)
+#define GEN10_3DSTATE_GATHER_CONSTANT_PS	GEN6_3D(3, 0, 0x38)
+
+#define GEN10_3DSTATE_WM_DEPTH_STENCIL		GEN6_3D(3, 0, 0x4e)
+#define GEN10_3DSTATE_WM_CHROMAKEY		GEN6_3D(3, 0, 0x4c)
+
+#define GEN8_REG_L3_CACHE_CONFIG	0x7034
+
+/*
+ * Programming for L3 cache allocations can be made per bank. Based on the
+ * programmed value HW will apply same allocations on other available banks.
+ * Total L3 Cache size per bank = 256 KB.
+ * {SLM,    URB,     DC,      RO(I/S, C, T),   L3 Client Pool}
+ * {  0,    96,      32,      128,                 0      }
+ */
+#define GEN10_L3_CACHE_CONFIG_VALUE	0x00420060
+
+#define URB_ALIGN(val, align)	((val % align) ? (val - (val % align)) : val)
+
+#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES		64
+#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES		2752
+
+#define GEN10_KB_PER_URB_INDEX			8
+#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB	96
+
+#define GEN10_URB_RESERVED_SIZE_KB		32
+#define GEN10_URB_RESERVED_END_SIZE_KB		8
+
+#define GEN10_VS_NUM_BITS_PER_URB_UNIT		512
+#define GEN10_VS_NUM_OF_URB_UNITS		1 // zero based
+#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS		(GEN10_VS_NUM_BITS_PER_URB_UNIT * \
+						(GEN10_VS_NUM_OF_URB_UNITS + 1))
+
+#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX)
+
+#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count)		\
+	URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX)
+
+#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice)	\
+	(total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB)
+
+#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice)	\
+	((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) *	\
+	1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS)
+
+#define GEN10_VS_END_URB_INDEX(urb_size_per_slice)			\
+	((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX)
+
+#endif
diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am
index 24884a7..2f90990 100644
--- a/tools/null_state_gen/Makefile.am
+++ b/tools/null_state_gen/Makefile.am
@@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES = 	\
 	intel_renderstate_gen7.c \
 	intel_renderstate_gen8.c \
 	intel_renderstate_gen9.c \
+	intel_renderstate_gen10.c \
 	intel_null_state_gen.c
 
-gens := 6 7 8 9
+gens := 6 7 8 9 10
 
 h = /tmp/intel_renderstate_gen$$gen.c
 states: intel_null_state_gen
diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h
index 771d1c8..e40e01b 100644
--- a/tools/null_state_gen/intel_batchbuffer.h
+++ b/tools/null_state_gen/intel_batchbuffer.h
@@ -34,7 +34,7 @@
 #include <stdint.h>
 
 #define MAX_RELOCS 64
-#define MAX_ITEMS 1024
+#define MAX_ITEMS 2048
 #define MAX_STRLEN 256
 
 #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1))
diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c
index 06eb954..4f12f5f 100644
--- a/tools/null_state_gen/intel_null_state_gen.c
+++ b/tools/null_state_gen/intel_null_state_gen.c
@@ -41,7 +41,7 @@ static int debug = 0;
 static void print_usage(char *s)
 {
 	fprintf(stderr, "%s: <gen>\n"
-		"     gen:     gen to generate for (6,7,8,9)\n",
+		"     gen:     gen to generate for (6,7,8,9,10)\n",
 		s);
 }
 
@@ -173,6 +173,9 @@ static int do_generate(int gen)
 	case 9:
 		null_state_gen = gen9_setup_null_render_state;
 		break;
+	case 10:
+		null_state_gen = gen10_setup_null_render_state;
+		break;
 	}
 
 	if (null_state_gen == NULL) {
diff --git a/tools/null_state_gen/intel_renderstate.h b/tools/null_state_gen/intel_renderstate.h
index b27b434..b3c8c2b 100644
--- a/tools/null_state_gen/intel_renderstate.h
+++ b/tools/null_state_gen/intel_renderstate.h
@@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct intel_batchbuffer *batch);
 void gen7_setup_null_render_state(struct intel_batchbuffer *batch);
 void gen8_setup_null_render_state(struct intel_batchbuffer *batch);
 void gen9_setup_null_render_state(struct intel_batchbuffer *batch);
+void gen10_setup_null_render_state(struct intel_batchbuffer *batch);
 
 #endif /* __INTEL_RENDERSTATE_H__ */
diff --git a/tools/null_state_gen/intel_renderstate_gen10.c b/tools/null_state_gen/intel_renderstate_gen10.c
new file mode 100644
index 0000000..f5678c3
--- /dev/null
+++ b/tools/null_state_gen/intel_renderstate_gen10.c
@@ -0,0 +1,538 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ *	Oscar Mateo <oscar.mateo@intel.com>
+ */
+
+#include "intel_renderstate.h"
+#include <lib/gen10_render.h>
+#include <lib/intel_reg.h>
+
+static void gen8_emit_wm(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
+	OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION);
+}
+
+static void gen8_emit_ps(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0); /* kernel hi */
+	OUT_BATCH(GEN7_PS_SPF_MODE);
+	OUT_BATCH(0); /* scratch space stuff */
+	OUT_BATCH(0); /* scratch hi */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0); // kernel 1
+	OUT_BATCH(0); /* kernel 1 hi */
+	OUT_BATCH(0); // kernel 2
+	OUT_BATCH(0); /* kernel 2 hi */
+}
+
+static void gen8_emit_sf(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT |
+		  1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT |
+		  GEN7_SF_POINT_WIDTH_FROM_SOURCE |
+		  8);
+}
+
+static void gen8_emit_vs(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_hs(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_raster(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
+	OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW);
+	OUT_BATCH(0.0);
+	OUT_BATCH(0.0);
+	OUT_BATCH(0.0);
+}
+
+static void gen10_emit_urb(struct intel_batchbuffer *batch)
+{
+	/* Smallest SKU: 3x8*/
+	int l3_bank_count = 3;
+	int slice_count = 1;
+	int urb_size_per_slice = GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count);
+	int other_urb_start_addr = GEN10_VS_END_URB_INDEX(urb_size_per_slice);
+	const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX;
+	const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS;
+	int vs_urb_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice);
+
+	if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES)
+		vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES;
+	if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES)
+		vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES;
+
+	OUT_BATCH(GEN7_3DSTATE_URB_VS);
+	OUT_BATCH(vs_urb_entries |
+		 (vs_urb_alloc_size << 16) |
+		 (vs_urb_start_addr << 25));
+
+	OUT_BATCH(GEN7_3DSTATE_URB_HS);
+	OUT_BATCH(other_urb_start_addr << 25);
+
+	OUT_BATCH(GEN7_3DSTATE_URB_DS);
+	OUT_BATCH(other_urb_start_addr << 25);
+
+	OUT_BATCH(GEN7_3DSTATE_URB_GS);
+	OUT_BATCH(other_urb_start_addr << 25);
+}
+
+static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
+	OUT_BATCH(_3DPRIM_TRILIST);
+}
+
+static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch)
+{
+	const int num_decls = 128;
+	int i;
+
+	OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST |
+		(((2 * num_decls) + 3) - 2) /* DWORD count - 2 */);
+	OUT_BATCH(0);
+	OUT_BATCH(num_decls);
+
+	for (i = 0; i < num_decls; i++) {
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
+
+static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index)
+{
+	OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2));
+	OUT_BATCH(index << 29);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index)
+{
+	OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2));
+	OUT_BATCH(index << 30);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch)
+{
+	const int buffers = 33;
+	int i;
+
+	OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS |
+		(((4 * buffers) + 1)- 2) /* DWORD count - 2 */);
+
+	for (i = 0; i < buffers; i++) {
+		OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT |
+			  GEN7_VB0_BUFFER_ADDR_MOD_EN);
+		OUT_BATCH(0); /* Address */
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
+
+static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch)
+{
+	const int elements = 34;
+	int i;
+
+	OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS |
+		(((2 * elements) + 1) - 2) /* DWORD count - 2 */);
+
+	/* Element 0 */
+	OUT_BATCH(VE0_VALID);
+	OUT_BATCH(
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+	/* Elements 1 -> 33 */
+	for (i = 1; i < elements; i++) {
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
+
+static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch)
+{
+	union {
+		float fval;
+		uint32_t uval;
+	} u;
+
+	unsigned offset;
+
+	u.fval = 1.0f;
+
+	offset = intel_batch_state_offset(batch, 64);
+	OUT_STATE(0);
+	OUT_STATE(0);      /* Alpha reference value */
+	OUT_STATE(u.uval); /* Blend constant color RED */
+	OUT_STATE(u.uval); /* Blend constant color GREEN */
+	OUT_STATE(u.uval); /* Blend constant color BLUE */
+	OUT_STATE(u.uval); /* Blend constant color ALPHA */
+
+	OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
+	OUT_BATCH_STATE_OFFSET(offset | 1);
+}
+
+static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
+
+	offset = intel_batch_state_offset(batch, 64);
+
+	for (i = 0; i < 17; i++)
+		OUT_STATE(0);
+
+	OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset | 1);
+}
+
+static void gen8_emit_ps_extra(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
+	OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID |
+		  GEN8_PSX_ATTRIBUTE_ENABLE);
+
+}
+
+static void gen8_emit_ps_blend(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
+	OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
+}
+
+static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+
+	offset = intel_batch_state_offset(batch, 32);
+
+	OUT_STATE((uint32_t)0.0f); /* Minimum depth */
+	OUT_STATE((uint32_t)0.0f); /* Maximum depth */
+
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
+
+static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
+
+	offset = intel_batch_state_offset(batch, 64);
+
+	for (i = 0; i < 16; i++)
+		OUT_STATE(0);
+
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
+
+static void gen8_emit_primitive(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DPRIMITIVE | (10-2));
+	OUT_BATCH(4);   /* gen8+ ignore the topology type field */
+	OUT_BATCH(1);   /* vertex count */
+	OUT_BATCH(0);
+	OUT_BATCH(1);   /* single instance */
+	OUT_BATCH(0);   /* start instance location */
+	OUT_BATCH(0);   /* index buffer offset, ignored */
+	OUT_BATCH(0);   /* extended parameter 0 */
+	OUT_BATCH(0);   /* extended parameter 1 */
+	OUT_BATCH(0);   /* extended parameter 2 */
+}
+
+static void gen9_emit_state_base_address(struct intel_batchbuffer *batch) {
+	const unsigned offset = 0;
+	OUT_BATCH(GEN6_STATE_BASE_ADDRESS |
+		(22 - 2) /* DWORD count - 2 */);
+
+	/* general state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0);
+
+	/* surface state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* dynamic state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* indirect state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* instruction state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* general state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* dynamic state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* indirect object buffer size */
+	OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY);
+	/* instruction buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+
+	/* bindless surface state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	/* bindless surface state size */
+	OUT_BATCH(0);
+
+	/* bindless sampler state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	/* bindless sampler state size */
+	OUT_BATCH(0);
+}
+
+/*
+ * Generate the batch buffer commands needed to initialize the 3D engine
+ * to its "golden state".
+ */
+void gen10_setup_null_render_state(struct intel_batchbuffer *batch)
+{
+	int i;
+
+	/* WaRsGatherPoolEnable: cnl */
+	OUT_BATCH(GEN7_MI_RS_CONTROL);
+
+#define GEN8_PIPE_CONTROL_GLOBAL_GTT   (1 << 24)
+	/* PIPE_CONTROL */
+	OUT_BATCH(GEN6_PIPE_CONTROL |
+	         (6 - 2));	/* DWORD count - 2 */
+	OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* PIPELINE_SELECT */
+	OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+
+	OUT_BATCH(MI_LOAD_REGISTER_IMM);
+	OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG);
+	OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE);
+
+	gen8_emit_wm(batch);
+	gen8_emit_ps(batch);
+	gen8_emit_sf(batch);
+
+	OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */
+	OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11);
+
+	gen8_emit_vs(batch);
+	gen8_emit_hs(batch);
+
+	OUT_CMD(GEN7_3DSTATE_GS, 10);
+	OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5);
+	OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */
+	OUT_CMD(GEN6_3DSTATE_CLIP, 4);
+	OUT_CMD(GEN7_3DSTATE_TE, 4);
+	OUT_CMD(GEN8_3DSTATE_VF, 2);
+	OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5);
+
+	/* URB States */
+	gen10_emit_urb(batch);
+
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130);
+
+	OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4);
+
+	/* Push Constants */
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2);
+
+	/* Constants */
+	OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11);
+
+	OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3);
+	OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2);
+	gen8_emit_vf_topology(batch);
+
+	/* Streamer out declaration list */
+	gen8_emit_so_decl_list(batch);
+
+	/* Streamer out buffers */
+	for (i = 0; i < 4; i++) {
+		gen8_emit_so_buffer(batch, i);
+	}
+
+	/* State base addresses */
+	gen9_emit_state_base_address(batch);
+
+	OUT_CMD(GEN6_STATE_SIP, 3);
+	OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4);
+	OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8);
+
+	/* Chroma key */
+	for (i = 0; i < 4; i++) {
+		gen8_emit_chroma_key(batch, i);
+	}
+
+	OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3);
+	OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3);
+	OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3);
+	OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2);
+
+	/* WaPSRandomCSNotDone:cnl */
+#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
+	OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
+	OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16);
+	OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5);
+
+	/* Vertex buffers */
+	gen8_emit_vertex_buffers(batch);
+	gen8_emit_vertex_elements(batch);
+
+	OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */);
+
+	/* 3D state binding table pointers */
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2);
+
+	gen8_emit_cc_state_pointers(batch);
+	gen8_emit_blend_state_pointers(batch);
+	gen8_emit_ps_extra(batch);
+	gen8_emit_ps_blend(batch);
+
+	/* 3D state sampler state pointers */
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2);
+
+	OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2);
+
+	gen8_emit_viewport_state_pointers_cc(batch);
+	gen8_emit_viewport_state_pointers_sf_clip(batch);
+
+	/* WaPSRandomCSNotDone:cnl */
+#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
+	OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
+	OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	gen8_emit_raster(batch);
+
+	OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4);
+	OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2);
+
+	/* Launch 3D operation */
+	gen8_emit_primitive(batch);
+
+	/* WaRsGatherPoolEnable: cnl */
+	OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE);
+	OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2));
+	OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE);
+	OUT_BATCH(0);
+	OUT_BATCH(0xfffff << 12);
+	OUT_BATCH(GEN7_MI_RS_CONTROL);
+	OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+}
-- 
1.9.1
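
For reference, the URB arithmetic in gen10_emit_urb() can be checked by
hand. The sketch below copies the macros from lib/gen10_render.h as
posted in this patch and evaluates them for the "smallest SKU" values
hard-coded in the function (3 L3 banks, 1 slice). The struct urb_layout
type and the gen10_urb_layout() helper are hypothetical names used only
for this illustration; they are not part of the patch.

```c
#include <stdint.h>

/* Macros copied from lib/gen10_render.h as posted in this patch. */
#define URB_ALIGN(val, align)  ((val % align) ? (val - (val % align)) : val)

#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES		64
#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES		2752
#define GEN10_KB_PER_URB_INDEX			8
#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB	96
#define GEN10_URB_RESERVED_SIZE_KB		32
#define GEN10_URB_RESERVED_END_SIZE_KB		8
#define GEN10_VS_NUM_BITS_PER_URB_UNIT		512
#define GEN10_VS_NUM_OF_URB_UNITS		1 /* zero based */
#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS		(GEN10_VS_NUM_BITS_PER_URB_UNIT * \
						(GEN10_VS_NUM_OF_URB_UNITS + 1))
#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX)
#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count)		\
	URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX)
#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice)	\
	(total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB)
#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice)	\
	((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) *	\
	1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS)
#define GEN10_VS_END_URB_INDEX(urb_size_per_slice)			\
	((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX)

/* Hypothetical container for the values gen10_emit_urb() derives. */
struct urb_layout {
	int urb_size_per_slice_kb;
	int vs_entries;
	uint32_t urb_vs_dw1;	/* payload DWORD after 3DSTATE_URB_VS */
	uint32_t urb_other_dw1;	/* payload DWORD after URB_HS/DS/GS */
};

/* Hypothetical helper mirroring the steps of gen10_emit_urb(). */
static struct urb_layout gen10_urb_layout(int l3_bank_count, int slice_count)
{
	struct urb_layout l;
	int other_urb_start_addr;

	l.urb_size_per_slice_kb =
		GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count);
	other_urb_start_addr = GEN10_VS_END_URB_INDEX(l.urb_size_per_slice_kb);
	l.vs_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(l.urb_size_per_slice_kb);

	/* Same clamping as in the patch. */
	if (l.vs_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES)
		l.vs_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES;
	if (l.vs_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES)
		l.vs_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES;

	l.urb_vs_dw1 = (uint32_t)l.vs_entries |
		       ((uint32_t)GEN10_VS_NUM_OF_URB_UNITS << 16) |
		       ((uint32_t)GEN10_VS_URB_START_INDEX << 25);
	l.urb_other_dw1 = (uint32_t)other_urb_start_addr << 25;

	return l;
}
```

With 3 banks and 1 slice this gives 96 * 3 = 288 KB of URB per slice,
288 - 32 - 8 = 248 KB for the VS, and 248 * 1024 * 8 / 1024 = 1984 VS
entries, which lies inside the [64, 2752] clamp. The VS region starts
at index 32 / 8 = 4 and the HS/DS/GS start index is (288 - 8) / 8 = 35.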

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] tools/null_state_gen: Add GEN10 golden context batch buffer creation
  2017-04-28 14:36   ` [PATCH] " Oscar Mateo
@ 2017-07-06  0:50     ` Rodrigo Vivi
  2017-07-12 20:42       ` Oscar Mateo
  0 siblings, 1 reply; 7+ messages in thread
From: Rodrigo Vivi @ 2017-07-06  0:50 UTC (permalink / raw)
  To: Oscar Mateo; +Cc: intel-gfx, Ben Widawsky, Mika Kuoppala

Hi Oscar,

I had missed this patch here, but noticed now that I was refreshing
and testing more cnl tests before re-submitting them.

First of all, I believe we need to remove the A0 w/a. I don't believe
we will ever see an A0 stepping, so I'm removing all A0-exclusive w/a
from the patches as well.

I also gave your null state a try here. However, if I use the golden
state generated by this version, I get a blank screen because driver
load fails with some strange faults (see the log below).

Any idea?

[    4.115243] Memory manager not clean during takedown.
[    4.120389] ------------[ cut here ]------------
[    4.125068] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mm.c:892
drm_mm_takedown+0x25/0x30
[    4.133574] Modules linked in:
[    4.136707] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
4.12.0-eywa-46011-g9a19faf #360
[    4.144650] Hardware name: Intel Corporation Cannonlake Client
platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
CNLSFWR1.R00.X075.D01.1703021113 03/02
[    4.158500] task: ffff880264ab8000 task.stack: ffffc90000038000
[    4.164506] RIP: 0010:drm_mm_takedown+0x25/0x30
[    4.169104] RSP: 0000:ffffc9000003bc28 EFLAGS: 00010292
[    4.174409] RAX: 0000000000000029 RBX: ffff880260a54170 RCX:
ffffffff82468740
[    4.181654] RDX: 0000000000000001 RSI: 0000000000000082 RDI:
00000000ffffffff
[    4.188839] RBP: ffffc9000003bc28 R08: 00000000fffffffe R09:
000000000000035a
[    4.196028] R10: 0000000000000005 R11: 0000000000000000 R12:
ffff880260a50000
[    4.203215] R13: ffff880260a54348 R14: ffff880260a50070 R15:
ffff880262844a00
[    4.210402] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
knlGS:0000000000000000
[    4.218541] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.224344] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
00000000007406f0
[    4.231529] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[    4.238716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[    4.245900] PKRU: 00000000
[    4.248673] Call Trace:
[    4.251193]  i915_gem_cleanup_stolen+0x1f/0x30
[    4.255703]  i915_ggtt_cleanup_hw+0xa4/0x170
[    4.260035]  i915_driver_cleanup_hw+0x36/0x40
[    4.264455]  i915_driver_load+0x6a0/0xe70
[    4.268535]  ? _raw_spin_unlock_irqrestore+0x26/0x50
[    4.273560]  i915_pci_probe+0x2c/0x50
[    4.277293]  local_pci_probe+0x45/0xa0
[    4.281106]  ? pci_match_device+0xe0/0x110
[    4.285265]  pci_device_probe+0x135/0x150
[    4.289343]  driver_probe_device+0x288/0x490
[    4.293676]  __driver_attach+0xc9/0xf0
[    4.297490]  ? driver_probe_device+0x490/0x490
[    4.301999]  bus_for_each_dev+0x5d/0x90
[    4.305902]  driver_attach+0x1e/0x20
[    4.309543]  bus_add_driver+0x1d0/0x290
[    4.313442]  driver_register+0x60/0xe0
[    4.317257]  __pci_register_driver+0x5d/0x60
[    4.321652]  i915_init+0x59/0x5c
[    4.324944]  ? mipi_dsi_bus_init+0x17/0x17
[    4.329103]  do_one_initcall+0x42/0x180
[    4.333007]  kernel_init_freeable+0x17c/0x202
[    4.337426]  ? set_debug_rodata+0x17/0x17
[    4.341500]  ? rest_init+0x90/0x90
[    4.344969]  kernel_init+0xe/0x110
[    4.348438]  ret_from_fork+0x25/0x30
[    4.352079] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48
83 c7 38 48 39 c7 75 01 c3 55 48 c7 c7 70 ac 20 82 31 c0 48 89 e5 e8
6b 62 b7 ff <0f> ff 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89
e5 41
[    4.371029] ---[ end trace 7d36c2dd72851315 ]---
[    4.381680] WARN_ON(dev_priv->mm.object_count)
[    4.381698] ------------[ cut here ]------------
[    4.390921] WARNING: CPU: 0 PID: 1 at
drivers/gpu/drm/i915/i915_gem.c:4964 i915_gem_load_cleanup+0x10b/0x120
[    4.400797] Modules linked in:
[    4.403927] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W
4.12.0-eywa-46011-g9a19faf #360
[    4.413021] Hardware name: Intel Corporation Cannonlake Client
platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
CNLSFWR1.R00.X075.D01.1703021113 03/02
[    4.426884] task: ffff880264ab8000 task.stack: ffffc90000038000
[    4.432865] RIP: 0010:i915_gem_load_cleanup+0x10b/0x120
[    4.438157] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
[    4.443450] RAX: 0000000000000022 RBX: ffff880260a50000 RCX:
ffffffff82468740
[    4.450642] RDX: 0000000000000001 RSI: 0000000000000082 RDI:
0000000000000202
[    4.457839] RBP: ffffc9000003bc68 R08: 0000000000000022 R09:
0000000000000389
[    4.465029] R10: 0000000000000000 R11: 0000000000000001 R12:
ffff880260a54678
[    4.472227] R13: ffff88026446c000 R14: ffff88026446c000 R15:
ffff880262844a00
[    4.479420] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
knlGS:0000000000000000
[    4.487564] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.493370] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
00000000007406f0
[    4.500569] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[    4.507763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[    4.514959] PKRU: 00000000
[    4.517737] Call Trace:
[    4.520265]  i915_driver_cleanup_early+0x1a/0x50
[    4.524955]  i915_driver_load+0x6b8/0xe70
[    4.529038]  ? _raw_spin_unlock_irqrestore+0x26/0x50
[    4.534100] clocksource: Switched to clocksource tsc
[    4.534105]  i915_pci_probe+0x2c/0x50
[    4.534113]  local_pci_probe+0x45/0xa0
[    4.534118]  ? pci_match_device+0xe0/0x110
[    4.534124]  pci_device_probe+0x135/0x150
[    4.534131]  driver_probe_device+0x288/0x490
[    4.534137]  __driver_attach+0xc9/0xf0
[    4.534142]  ? driver_probe_device+0x490/0x490
[    4.534146]  bus_for_each_dev+0x5d/0x90
[    4.534152]  driver_attach+0x1e/0x20
[    4.534156]  bus_add_driver+0x1d0/0x290
[    4.534162]  driver_register+0x60/0xe0
[    4.534167]  __pci_register_driver+0x5d/0x60
[    4.534173]  i915_init+0x59/0x5c
[    4.534177]  ? mipi_dsi_bus_init+0x17/0x17
[    4.534181]  do_one_initcall+0x42/0x180
[    4.534187]  kernel_init_freeable+0x17c/0x202
[    4.534191]  ? set_debug_rodata+0x17/0x17
[    4.534196]  ? rest_init+0x90/0x90
[    4.534200]  kernel_init+0xe/0x110
[    4.534204]  ret_from_fork+0x25/0x30
[    4.534208] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 21 4f b1 ff 0f
ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8
05 4f b1 ff <0f> ff e9 33 ff ff ff 66 66 66 66 66 2e 0f 1f 84 00 00 00
00 00
[    4.534272] ---[ end trace 7d36c2dd72851316 ]---
[    4.534277] WARN_ON(!list_empty(&dev_priv->gt.timelines))
[    4.534293] ------------[ cut here ]------------
[    4.534298] WARNING: CPU: 0 PID: 1 at
drivers/gpu/drm/i915/i915_gem.c:4968 i915_gem_load_cleanup+0xef/0x120
[    4.534299] Modules linked in:
[    4.534304] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W
4.12.0-eywa-46011-g9a19faf #360
[    4.534306] Hardware name: Intel Corporation Cannonlake Client
platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
CNLSFWR1.R00.X075.D01.1703021113 03/02
[    4.534308] task: ffff880264ab8000 task.stack: ffffc90000038000
[    4.534312] RIP: 0010:i915_gem_load_cleanup+0xef/0x120
[    4.534314] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
[    4.534317] RAX: 000000000000002d RBX: ffff880260a50000 RCX:
0000000000000000
[    4.534319] RDX: 0000000000000001 RSI: 0000000000000002 RDI:
0000000000000296
[    4.534321] RBP: ffffc9000003bc68 R08: 000000000000002d R09:
000000000000002d
[    4.534322] R10: 0000000000000000 R11: ffff880260a4e000 R12:
ffff880260a50070
[    4.534324] R13: ffff88026446c000 R14: ffff88026446c000 R15:
ffff880262844a00
[    4.534327] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
knlGS:0000000000000000
[    4.534329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.534331] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
00000000007406f0
[    4.534334] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[    4.534335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[    4.534337] PKRU: 00000000
[    4.534338] Call Trace:
[    4.534344]  i915_driver_cleanup_early+0x1a/0x50
[    4.534350]  i915_driver_load+0x6b8/0xe70
[    4.534356]  ? _raw_spin_unlock_irqrestore+0x26/0x50
[    4.534361]  i915_pci_probe+0x2c/0x50
[    4.534366]  local_pci_probe+0x45/0xa0
[    4.534371]  ? pci_match_device+0xe0/0x110
[    4.534376]  pci_device_probe+0x135/0x150
[    4.534382]  driver_probe_device+0x288/0x490
[    4.534388]  __driver_attach+0xc9/0xf0
[    4.534393]  ? driver_probe_device+0x490/0x490
[    4.534398]  bus_for_each_dev+0x5d/0x90
[    4.534403]  driver_attach+0x1e/0x20
[    4.534408]  bus_add_driver+0x1d0/0x290
[    4.534414]  driver_register+0x60/0xe0
[    4.534419]  __pci_register_driver+0x5d/0x60
[    4.534424]  i915_init+0x59/0x5c
[    4.534428]  ? mipi_dsi_bus_init+0x17/0x17
[    4.534431]  do_one_initcall+0x42/0x180
[    4.534437]  kernel_init_freeable+0x17c/0x202
[    4.534440]  ? set_debug_rodata+0x17/0x17
[    4.534444]  ? rest_init+0x90/0x90
[    4.534448]  kernel_init+0xe/0x110
[    4.534451]  ret_from_fork+0x25/0x30
[    4.534455] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 3d 4f b1 ff 0f
ff e9 5d ff ff ff 48 c7 c6 b0 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8
21 4f b1 ff <0f> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98
1a 82
[    4.534519] ---[ end trace 7d36c2dd72851317 ]---
[    4.534605] =============================================================================
[    4.534608] BUG drm_i915_gem_object (Tainted: G        W      ):
Objects remaining in drm_i915_gem_object on __kmem_cache_shutdown()
[    4.534609] -----------------------------------------------------------------------------

[    4.534611] Disabling lock debugging due to kernel taint
[    4.534614] INFO: Slab 0xffffea0009820600 objects=19 used=2
fp=0xffff88026081ba80 flags=0x200000000008100
[    4.534618] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B   W
4.12.0-eywa-46011-g9a19faf #360
[    4.534620] Hardware name: Intel Corporation Cannonlake Client
platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
CNLSFWR1.R00.X075.D01.1703021113 03/02
[    4.534621] Call Trace:
[    4.534626]  dump_stack+0x65/0x89
[    4.534633]  slab_err+0xa1/0xb0
[    4.534640]  ? __kmalloc+0x185/0x270
[    4.534645]  ? kmem_cache_alloc_bulk+0x1f0/0x1f0
[    4.534650]  ? __kmem_cache_shutdown+0x160/0x400
[    4.534655]  __kmem_cache_shutdown+0x180/0x400
[    4.534663]  shutdown_cache+0x18/0x1a0
[    4.534667]  kmem_cache_destroy+0x1c1/0x1f0
[    4.534672]  i915_gem_load_cleanup+0xb4/0x120
[    4.534677]  i915_driver_cleanup_early+0x1a/0x50
[    4.534682]  i915_driver_load+0x6b8/0xe70
[    4.534689]  ? _raw_spin_unlock_irqrestore+0x26/0x50
[    4.534693]  i915_pci_probe+0x2c/0x50
[    4.534698]  local_pci_probe+0x45/0xa0
[    4.534703]  ? pci_match_device+0xe0/0x110
[    4.534708]  pci_device_probe+0x135/0x150
[    4.534714]  driver_probe_device+0x288/0x490
[    4.534721]  __driver_attach+0xc9/0xf0
[    4.534726]  ? driver_probe_device+0x490/0x490
[    4.534730]  bus_for_each_dev+0x5d/0x90
[    4.534736]  driver_attach+0x1e/0x20
[    4.534741]  bus_add_driver+0x1d0/0x290
[    4.534746]  driver_register+0x60/0xe0
[    4.534751]  __pci_register_driver+0x5d/0x60
[    4.534756]  i915_init+0x59/0x5c
[    4.534760]  ? mipi_dsi_bus_init+0x17/0x17
[    4.534763]  do_one_initcall+0x42/0x180
[    4.534769]  kernel_init_freeable+0x17c/0x202
[    4.534773]  ? set_debug_rodata+0x17/0x17
[    4.534777]  ? rest_init+0x90/0x90
[    4.534781]  kernel_init+0xe/0x110
[    4.534784]  ret_from_fork+0x25/0x30
[    4.534791] INFO: Object 0xffff880260818340 @offset=832
[    4.534792] INFO: Object 0xffff880260818680 @offset=1664
[    4.534795] kmem_cache_destroy drm_i915_gem_object: Slab cache
still has objects
[    4.534798] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B   W
4.12.0-eywa-46011-g9a19faf #360
[    4.534800] Hardware name: Intel Corporation Cannonlake Client
platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
CNLSFWR1.R00.X075.D01.1703021113 03/02
[    4.534801] Call Trace:
[    4.534805]  dump_stack+0x65/0x89
[    4.534809]  kmem_cache_destroy+0x1e1/0x1f0
[    4.534814]  i915_gem_load_cleanup+0xb4/0x120
[    4.534819]  i915_driver_cleanup_early+0x1a/0x50
[    4.534824]  i915_driver_load+0x6b8/0xe70
[    4.534830]  ? _raw_spin_unlock_irqrestore+0x26/0x50
[    4.534835]  i915_pci_probe+0x2c/0x50
[    4.534840]  local_pci_probe+0x45/0xa0
[    4.534844]  ? pci_match_device+0xe0/0x110
[    4.534850]  pci_device_probe+0x135/0x150
[    4.534856]  driver_probe_device+0x288/0x490
[    4.534862]  __driver_attach+0xc9/0xf0
[    4.534867]  ? driver_probe_device+0x490/0x490
[    4.534871]  bus_for_each_dev+0x5d/0x90
[    4.534877]  driver_attach+0x1e/0x20
[    4.534882]  bus_add_driver+0x1d0/0x290
[    4.534888]  driver_register+0x60/0xe0
[    4.534893]  __pci_register_driver+0x5d/0x60
[    4.534897]  i915_init+0x59/0x5c
[    4.534901]  ? mipi_dsi_bus_init+0x17/0x17
[    4.534904]  do_one_initcall+0x42/0x180
[    4.534910]  kernel_init_freeable+0x17c/0x202
[    4.534914]  ? set_debug_rodata+0x17/0x17
[    4.534917]  ? rest_init+0x90/0x90
[    4.534922]  kernel_init+0xe/0x110
[    4.534925]  ret_from_fork+0x25/0x30
[    4.535386] i915 0000:00:02.0: [drm:i915_driver_load] Device
initialization failed (-22)
[    4.535390] i915 0000:00:02.0: Please file a bug at
https://bugs.freedesktop.org/enter_bug.cgi?product=DRI against
DRM/Intel providing the dmesg log by booting with drm.debug=0xf
[    4.535450] i915: probe of 0000:00:02.0 failed with error -22
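
The -22 in the last lines of the log is a negated errno value, as
kernel functions conventionally return. On Linux, 22 is EINVAL
("Invalid argument"). The log does not show which check in the load
path returned it. A minimal sketch for decoding such codes from
userspace follows; kernel_err_to_str() is a hypothetical helper name
used only here.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: map a kernel-style return code (negative errno)
 * to the C library's description of that errno value.
 */
static const char *kernel_err_to_str(int ret)
{
	return strerror(ret < 0 ? -ret : ret);
}
```

Calling kernel_err_to_str(-22) returns the C library's text for
EINVAL.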


On Fri, Apr 28, 2017 at 7:36 AM, Oscar Mateo <oscar.mateo@intel.com> wrote:
> This batchbuffer is over 4096 bytes, so we need to increase the size of the
> array (and the KMD has to be modified to deal with more than one page).
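
The "over 4096 bytes" figure can be sanity-checked from the command
lengths in intel_renderstate_gen10.c. The sketch below is only a lower
bound: it sums the four largest contributors and ignores everything
else. It assumes that OUT_CMD(cmd, len) emits len DWORDs in total and
that each of the MAX_ITEMS slots holds one 32-bit DWORD. Neither
assumption is confirmed by the hunks shown here. The helper names are
hypothetical.

```c
#include <stdint.h>

#define OLD_MAX_ITEMS 1024	/* value before this patch */
#define NEW_MAX_ITEMS 2048	/* value after this patch */

/* Length field used in the command headers: "DWORD count - 2". */
static uint32_t dword_length_field(uint32_t total_dwords)
{
	return total_dwords - 2;
}

/* Lower bound on the batch size, counting only the largest commands. */
static uint32_t gen10_large_cmds_dwords(void)
{
	/* Five GATHER_CONSTANT commands (VS/HS/DS/GS/PS), 130 DWORDs each */
	const uint32_t gather_constants = 5 * 130;
	/* 3DSTATE_SO_DECL_LIST: 3 header DWORDs + 2 per declaration */
	const uint32_t so_decl_list = (2 * 128) + 3;
	/* 3DSTATE_VERTEX_BUFFERS: 1 header DWORD + 4 per buffer */
	const uint32_t vertex_buffers = (4 * 33) + 1;
	/* 3DSTATE_VERTEX_ELEMENTS: 1 header DWORD + 2 per element */
	const uint32_t vertex_elements = (2 * 34) + 1;

	return gather_constants + so_decl_list +
	       vertex_buffers + vertex_elements;
}
```

Under those assumptions these commands alone take 650 + 259 + 133 + 69
= 1111 DWORDs, or 4444 bytes. That already exceeds the old limit of
1024 items (4096 bytes) and fits within the new limit of 2048.
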
>
> Notice that there are two workarounds embedded here, both applicable to all
> CNL steppings.
>
> v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so update
>     the comment in the code and in the commit message.
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  lib/gen10_render.h                             |  63 +++
>  tools/null_state_gen/Makefile.am               |   3 +-
>  tools/null_state_gen/intel_batchbuffer.h       |   2 +-
>  tools/null_state_gen/intel_null_state_gen.c    |   5 +-
>  tools/null_state_gen/intel_renderstate.h       |   1 +
>  tools/null_state_gen/intel_renderstate_gen10.c | 538 +++++++++++++++++++++++++
>  6 files changed, 609 insertions(+), 3 deletions(-)
>  create mode 100644 lib/gen10_render.h
>  create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c
>
> diff --git a/lib/gen10_render.h b/lib/gen10_render.h
> new file mode 100644
> index 0000000..f4a7dff
> --- /dev/null
> +++ b/lib/gen10_render.h
> @@ -0,0 +1,63 @@
> +#ifndef GEN10_RENDER_H
> +#define GEN10_RENDER_H
> +
> +#include "gen9_render.h"
> +
> +#define GEN7_MI_RS_CONTROL                     (0x6 << 23)
> +# define GEN7_MI_RS_CONTROL_ENABLE             (1 << 0)
> +
> +#define GEN10_3DSTATE_GATHER_POOL_ALLOC                GEN6_3D(3, 1, 0x1a)
> +# define GEN10_3DSTATE_GATHER_POOL_ENABLE      (1 << 11)
> +
> +#define GEN10_3DSTATE_GATHER_CONSTANT_VS       GEN6_3D(3, 0, 0x34)
> +#define GEN10_3DSTATE_GATHER_CONSTANT_HS       GEN6_3D(3, 0, 0x36)
> +#define GEN10_3DSTATE_GATHER_CONSTANT_DS       GEN6_3D(3, 0, 0x37)
> +#define GEN10_3DSTATE_GATHER_CONSTANT_GS       GEN6_3D(3, 0, 0x35)
> +#define GEN10_3DSTATE_GATHER_CONSTANT_PS       GEN6_3D(3, 0, 0x38)
> +
> +#define GEN10_3DSTATE_WM_DEPTH_STENCIL         GEN6_3D(3, 0, 0x4e)
> +#define GEN10_3DSTATE_WM_CHROMAKEY             GEN6_3D(3, 0, 0x4c)
> +
> +#define GEN8_REG_L3_CACHE_CONFIG       0x7034
> +
> +/*
> + * Programming for L3 cache allocations can be made per bank. Based on the
> + * programmed value HW will apply same allocations on other available banks.
> + * Total L3 Cache size per bank = 256 KB.
> + * {SLM,    URB,     DC,      RO(I/S, C, T),   L3 Client Pool}
> + * {  0,    96,      32,      128,                 0      }
> + */
> +#define GEN10_L3_CACHE_CONFIG_VALUE    0x00420060
> +
> +#define URB_ALIGN(val, align)  ((val % align) ? (val - (val % align)) : val)
> +
> +#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES                64
> +#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES                2752
> +
> +#define GEN10_KB_PER_URB_INDEX                 8
> +#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB       96
> +
> +#define GEN10_URB_RESERVED_SIZE_KB             32
> +#define GEN10_URB_RESERVED_END_SIZE_KB         8
> +
> +#define GEN10_VS_NUM_BITS_PER_URB_UNIT         512
> +#define GEN10_VS_NUM_OF_URB_UNITS              1 // zero based
> +#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS                (GEN10_VS_NUM_BITS_PER_URB_UNIT * \
> +                                               (GEN10_VS_NUM_OF_URB_UNITS + 1))
> +
> +#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX)
> +
> +#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count)                \
> +       URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX)
> +
> +#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice)       \
> +       (total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB)
> +
> +#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice)   \
> +       ((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) *    \
> +       1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS)
> +
> +#define GEN10_VS_END_URB_INDEX(urb_size_per_slice)                     \
> +       ((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX)
> +
> +#endif
> diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am
> index 24884a7..2f90990 100644
> --- a/tools/null_state_gen/Makefile.am
> +++ b/tools/null_state_gen/Makefile.am
> @@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES =       \
>         intel_renderstate_gen7.c \
>         intel_renderstate_gen8.c \
>         intel_renderstate_gen9.c \
> +       intel_renderstate_gen10.c \
>         intel_null_state_gen.c
>
> -gens := 6 7 8 9
> +gens := 6 7 8 9 10
>
>  h = /tmp/intel_renderstate_gen$$gen.c
>  states: intel_null_state_gen
> diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h
> index 771d1c8..e40e01b 100644
> --- a/tools/null_state_gen/intel_batchbuffer.h
> +++ b/tools/null_state_gen/intel_batchbuffer.h
> @@ -34,7 +34,7 @@
>  #include <stdint.h>
>
>  #define MAX_RELOCS 64
> -#define MAX_ITEMS 1024
> +#define MAX_ITEMS 2048
>  #define MAX_STRLEN 256
>
>  #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1))
> diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c
> index 06eb954..4f12f5f 100644
> --- a/tools/null_state_gen/intel_null_state_gen.c
> +++ b/tools/null_state_gen/intel_null_state_gen.c
> @@ -41,7 +41,7 @@ static int debug = 0;
>  static void print_usage(char *s)
>  {
>         fprintf(stderr, "%s: <gen>\n"
> -               "     gen:     gen to generate for (6,7,8,9)\n",
> +               "     gen:     gen to generate for (6,7,8,9,10)\n",
>                 s);
>  }
>
> @@ -173,6 +173,9 @@ static int do_generate(int gen)
>         case 9:
>                 null_state_gen = gen9_setup_null_render_state;
>                 break;
> +       case 10:
> +               null_state_gen = gen10_setup_null_render_state;
> +               break;
>         }
>
>         if (null_state_gen == NULL) {
> diff --git a/tools/null_state_gen/intel_renderstate.h b/tools/null_state_gen/intel_renderstate.h
> index b27b434..b3c8c2b 100644
> --- a/tools/null_state_gen/intel_renderstate.h
> +++ b/tools/null_state_gen/intel_renderstate.h
> @@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct intel_batchbuffer *batch);
>  void gen7_setup_null_render_state(struct intel_batchbuffer *batch);
>  void gen8_setup_null_render_state(struct intel_batchbuffer *batch);
>  void gen9_setup_null_render_state(struct intel_batchbuffer *batch);
> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch);
>
>  #endif /* __INTEL_RENDERSTATE_H__ */
> diff --git a/tools/null_state_gen/intel_renderstate_gen10.c b/tools/null_state_gen/intel_renderstate_gen10.c
> new file mode 100644
> index 0000000..f5678c3
> --- /dev/null
> +++ b/tools/null_state_gen/intel_renderstate_gen10.c
> @@ -0,0 +1,538 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Authors:
> + *     Oscar Mateo <oscar.mateo@intel.com>
> + */
> +
> +#include "intel_renderstate.h"
> +#include <lib/gen10_render.h>
> +#include <lib/intel_reg.h>
> +
> +static void gen8_emit_wm(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
> +       OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION);
> +}
> +
> +static void gen8_emit_ps(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2));
> +       OUT_BATCH(0);
> +       OUT_BATCH(0); /* kernel hi */
> +       OUT_BATCH(GEN7_PS_SPF_MODE);
> +       OUT_BATCH(0); /* scratch space stuff */
> +       OUT_BATCH(0); /* scratch hi */
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0); // kernel 1
> +       OUT_BATCH(0); /* kernel 1 hi */
> +       OUT_BATCH(0); // kernel 2
> +       OUT_BATCH(0); /* kernel 2 hi */
> +}
> +
> +static void gen8_emit_sf(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT |
> +                 1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT |
> +                 GEN7_SF_POINT_WIDTH_FROM_SOURCE |
> +                 8);
> +}
> +
> +static void gen8_emit_vs(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2));
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +}
> +
> +static void gen8_emit_hs(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2));
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT);
> +       OUT_BATCH(0);
> +}
> +
> +static void gen8_emit_raster(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
> +       OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW);
> +       OUT_BATCH(0.0);
> +       OUT_BATCH(0.0);
> +       OUT_BATCH(0.0);
> +}
> +
> +static void gen10_emit_urb(struct intel_batchbuffer *batch)
> +{
> +       /* Smallest SKU: 3x8 */
> +       int l3_bank_count = 3;
> +       int slice_count = 1;
> +       int urb_size_per_slice = GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count);
> +       int other_urb_start_addr = GEN10_VS_END_URB_INDEX(urb_size_per_slice);
> +       const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX;
> +       const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS;
> +       int vs_urb_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice);
> +
> +       if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES)
> +               vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES;
> +       if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES)
> +               vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES;
> +
> +       OUT_BATCH(GEN7_3DSTATE_URB_VS);
> +       OUT_BATCH(vs_urb_entries |
> +                (vs_urb_alloc_size << 16) |
> +                (vs_urb_start_addr << 25));
> +
> +       OUT_BATCH(GEN7_3DSTATE_URB_HS);
> +       OUT_BATCH(other_urb_start_addr << 25);
> +
> +       OUT_BATCH(GEN7_3DSTATE_URB_DS);
> +       OUT_BATCH(other_urb_start_addr << 25);
> +
> +       OUT_BATCH(GEN7_3DSTATE_URB_GS);
> +       OUT_BATCH(other_urb_start_addr << 25);
> +}
> +
> +static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
> +       OUT_BATCH(_3DPRIM_TRILIST);
> +}
> +
> +static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch)
> +{
> +       const int num_decls = 128;
> +       int i;
> +
> +       OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST |
> +               (((2 * num_decls) + 3) - 2) /* DWORD count - 2 */);
> +       OUT_BATCH(0);
> +       OUT_BATCH(num_decls);
> +
> +       for (i = 0; i < num_decls; i++) {
> +               OUT_BATCH(0);
> +               OUT_BATCH(0);
> +       }
> +}
> +
> +static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index)
> +{
> +       OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2));
> +       OUT_BATCH(index << 29);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +}
> +
> +static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index)
> +{
> +       OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2));
> +       OUT_BATCH(index << 30);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +}
> +
> +static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch)
> +{
> +       const int buffers = 33;
> +       int i;
> +
> +       OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS |
> +               (((4 * buffers) + 1)- 2) /* DWORD count - 2 */);
> +
> +       for (i = 0; i < buffers; i++) {
> +               OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT |
> +                         GEN7_VB0_BUFFER_ADDR_MOD_EN);
> +               OUT_BATCH(0); /* Address */
> +               OUT_BATCH(0);
> +               OUT_BATCH(0);
> +       }
> +}
> +
> +static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch)
> +{
> +       const int elements = 34;
> +       int i;
> +
> +       OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS |
> +               (((2 * elements) + 1) - 2) /* DWORD count - 2 */);
> +
> +       /* Element 0 */
> +       OUT_BATCH(VE0_VALID);
> +       OUT_BATCH(
> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
> +       /* Elements 1 -> 33 */
> +       for (i = 1; i < elements; i++) {
> +               OUT_BATCH(0);
> +               OUT_BATCH(0);
> +       }
> +}
> +
> +static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch)
> +{
> +       union {
> +               float fval;
> +               uint32_t uval;
> +       } u;
> +
> +       unsigned offset;
> +
> +       u.fval = 1.0f;
> +
> +       offset = intel_batch_state_offset(batch, 64);
> +       OUT_STATE(0);
> +       OUT_STATE(0);      /* Alpha reference value */
> +       OUT_STATE(u.uval); /* Blend constant color RED */
> +       OUT_STATE(u.uval); /* Blend constant color BLUE */
> +       OUT_STATE(u.uval); /* Blend constant color GREEN */
> +       OUT_STATE(u.uval); /* Blend constant color ALPHA */
> +
> +       OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
> +       OUT_BATCH_STATE_OFFSET(offset | 1);
> +}
> +
> +static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch)
> +{
> +       unsigned offset;
> +       int i;
> +
> +       offset = intel_batch_state_offset(batch, 64);
> +
> +       for (i = 0; i < 17; i++)
> +               OUT_STATE(0);
> +
> +       OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
> +       OUT_BATCH_STATE_OFFSET(offset | 1);
> +}
> +
> +static void gen8_emit_ps_extra(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
> +       OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID |
> +                 GEN8_PSX_ATTRIBUTE_ENABLE);
> +
> +}
> +
> +static void gen8_emit_ps_blend(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
> +       OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
> +}
> +
> +static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch)
> +{
> +       unsigned offset;
> +
> +       offset = intel_batch_state_offset(batch, 32);
> +
> +       OUT_STATE((uint32_t)0.0f); /* Minimum depth */
> +       OUT_STATE((uint32_t)0.0f); /* Maximum depth */
> +
> +       OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
> +       OUT_BATCH_STATE_OFFSET(offset);
> +}
> +
> +static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch)
> +{
> +       unsigned offset;
> +       int i;
> +
> +       offset = intel_batch_state_offset(batch, 64);
> +
> +       for (i = 0; i < 16; i++)
> +               OUT_STATE(0);
> +
> +       OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2));
> +       OUT_BATCH_STATE_OFFSET(offset);
> +}
> +
> +static void gen8_emit_primitive(struct intel_batchbuffer *batch)
> +{
> +       OUT_BATCH(GEN6_3DPRIMITIVE | (10-2));
> +       OUT_BATCH(4);   /* gen8+ ignore the topology type field */
> +       OUT_BATCH(1);   /* vertex count */
> +       OUT_BATCH(0);
> +       OUT_BATCH(1);   /* single instance */
> +       OUT_BATCH(0);   /* start instance location */
> +       OUT_BATCH(0);   /* index buffer offset, ignored */
> +       OUT_BATCH(0);   /* extended parameter 0 */
> +       OUT_BATCH(0);   /* extended parameter 1 */
> +       OUT_BATCH(0);   /* extended parameter 2 */
> +}
> +
> +static void gen9_emit_state_base_address(struct intel_batchbuffer *batch) {
> +       const unsigned offset = 0;
> +       OUT_BATCH(GEN6_STATE_BASE_ADDRESS |
> +               (22 - 2) /* DWORD count - 2 */);
> +
> +       /* general state base address - requires BB address
> +        * added to state offset to be stored in this location
> +        */
> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
> +       OUT_BATCH(0);
> +
> +       /* stateless data port */
> +       OUT_BATCH(0);
> +
> +       /* surface state base address - requires BB address
> +        * added to state offset to be stored in this location
> +        */
> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
> +       OUT_BATCH(0);
> +
> +       /* dynamic state base address - requires BB address
> +        * added to state offset to be stored in this location
> +        */
> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
> +       OUT_BATCH(0);
> +
> +       /* indirect state base address */
> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
> +       OUT_BATCH(0);
> +
> +       /* instruction state base address - requires BB address
> +        * added to state offset to be stored in this location
> +        */
> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
> +       OUT_BATCH(0);
> +
> +       /* general state buffer size */
> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
> +       /* dynamic state buffer size */
> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
> +       /* indirect object buffer size */
> +       OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY);
> +       /* instruction buffer size */
> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
> +
> +       /* bindless surface state base address */
> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
> +       OUT_BATCH(0);
> +       /* bindless surface state size */
> +       OUT_BATCH(0);
> +
> +       /* bindless sampler state base address */
> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
> +       OUT_BATCH(0);
> +       /* bindless sampler state size */
> +       OUT_BATCH(0);
> +}
> +
> +/*
> + * Generate the batch buffer commands needed to initialize the 3D engine
> + * to its "golden state".
> + */
> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch)
> +{
> +       int i;
> +
> +       /* WaRsGatherPoolEnable: cnl */
> +       OUT_BATCH(GEN7_MI_RS_CONTROL);
> +
> +#define GEN8_PIPE_CONTROL_GLOBAL_GTT   (1 << 24)
> +       /* PIPE_CONTROL */
> +       OUT_BATCH(GEN6_PIPE_CONTROL |
> +                (6 - 2));      /* DWORD count - 2 */
> +       OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +
> +       /* PIPELINE_SELECT */
> +       OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D);
> +
> +       OUT_BATCH(MI_LOAD_REGISTER_IMM);
> +       OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG);
> +       OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE);
> +
> +       gen8_emit_wm(batch);
> +       gen8_emit_ps(batch);
> +       gen8_emit_sf(batch);
> +
> +       OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */
> +       OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11);
> +
> +       gen8_emit_vs(batch);
> +       gen8_emit_hs(batch);
> +
> +       OUT_CMD(GEN7_3DSTATE_GS, 10);
> +       OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5);
> +       OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */
> +       OUT_CMD(GEN6_3DSTATE_CLIP, 4);
> +       OUT_CMD(GEN7_3DSTATE_TE, 4);
> +       OUT_CMD(GEN8_3DSTATE_VF, 2);
> +       OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5);
> +
> +       /* URB States */
> +       gen10_emit_urb(batch);
> +
> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130);
> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130);
> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130);
> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130);
> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130);
> +
> +       OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4);
> +       OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4);
> +       OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4);
> +
> +       /* Push Constants */
> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2);
> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2);
> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2);
> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2);
> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2);
> +
> +       /* Constants */
> +       OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11);
> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11);
> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11);
> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11);
> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11);
> +
> +       OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3);
> +       OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2);
> +       gen8_emit_vf_topology(batch);
> +
> +       /* Streamer out declaration list */
> +       gen8_emit_so_decl_list(batch);
> +
> +       /* Streamer out buffers */
> +       for (i = 0; i < 4; i++) {
> +               gen8_emit_so_buffer(batch, i);
> +       }
> +
> +       /* State base addresses */
> +       gen9_emit_state_base_address(batch);
> +
> +       OUT_CMD(GEN6_STATE_SIP, 3);
> +       OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4);
> +       OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8);
> +
> +       /* Chroma key */
> +       for (i = 0; i < 4; i++) {
> +               gen8_emit_chroma_key(batch, i);
> +       }
> +
> +       OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3);
> +       OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3);
> +       OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5);
> +       OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5);
> +       OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3);
> +       OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2);
> +
> +       /* WaPSRandomCSNotDone:cnl */
> +#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
> +       OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
> +       OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +
> +       OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2);
> +       OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2);
> +       OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32);
> +       OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16);
> +       OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16);
> +       OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5);
> +
> +       /* Vertex buffers */
> +       gen8_emit_vertex_buffers(batch);
> +       gen8_emit_vertex_elements(batch);
> +
> +       OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */);
> +
> +       /* 3D state binding table pointers */
> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2);
> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2);
> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2);
> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2);
> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2);
> +
> +       gen8_emit_cc_state_pointers(batch);
> +       gen8_emit_blend_state_pointers(batch);
> +       gen8_emit_ps_extra(batch);
> +       gen8_emit_ps_blend(batch);
> +
> +       /* 3D state sampler state pointers */
> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2);
> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2);
> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2);
> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2);
> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2);
> +
> +       OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2);
> +
> +       gen8_emit_viewport_state_pointers_cc(batch);
> +       gen8_emit_viewport_state_pointers_sf_clip(batch);
> +
> +       /* WaPSRandomCSNotDone:cnl */
> +#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
> +       OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
> +       OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0);
> +
> +       gen8_emit_raster(batch);
> +
> +       OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4);
> +       OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2);
> +
> +       /* Launch 3D operation */
> +       gen8_emit_primitive(batch);
> +
> +       /* WaRsGatherPoolEnable: cnl */
> +       OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE);
> +       OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2));
> +       OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE);
> +       OUT_BATCH(0);
> +       OUT_BATCH(0xfffff << 12);
> +       OUT_BATCH(GEN7_MI_RS_CONTROL);
> +       OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4);
> +
> +       OUT_BATCH(MI_BATCH_BUFFER_END);
> +}
> --
> 1.9.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
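
For reference, the URB arithmetic behind gen10_emit_urb() in the quoted
patch can be worked through by hand. The sketch below is only an
illustration: the constants are copied from the quoted lib/gen10_render.h,
but the function names are hypothetical and are not part of the patch. It
evaluates the same expressions as the macros for the "smallest SKU" case
(3 L3 banks, 1 slice).

```c
#include <stdint.h>

/* Constants copied from the quoted lib/gen10_render.h */
#define KB_PER_URB_INDEX        8
#define L3_URB_SIZE_PER_BANK_KB 96
#define URB_RESERVED_SIZE_KB    32
#define URB_RESERVED_END_KB     8
#define VS_URB_ENTRY_SIZE_BITS  (512 * (1 + 1)) /* 512 bits per unit, 2 units */

/* Hypothetical helper: round val down to a multiple of align (as URB_ALIGN does). */
static uint32_t urb_align_down(uint32_t val, uint32_t align)
{
	return val - (val % align);
}

/* Mirrors GEN10_URB_SIZE_PER_SLICE_KB. */
static uint32_t urb_size_per_slice_kb(uint32_t banks, uint32_t slices)
{
	return urb_align_down(L3_URB_SIZE_PER_BANK_KB * banks / slices,
			      KB_PER_URB_INDEX);
}

/* Mirrors GEN10_VS_NUM_URB_ENTRIES_PER_SLICE (before min/max clamping). */
static uint32_t vs_urb_entries(uint32_t urb_kb)
{
	uint32_t vs_kb = urb_kb - URB_RESERVED_SIZE_KB - URB_RESERVED_END_KB;

	return vs_kb * 1024 * 8 / VS_URB_ENTRY_SIZE_BITS;
}

/* Mirrors GEN10_VS_URB_START_INDEX. */
static uint32_t vs_urb_start_index(void)
{
	return URB_RESERVED_SIZE_KB / KB_PER_URB_INDEX;
}

/* Mirrors GEN10_VS_END_URB_INDEX. */
static uint32_t vs_urb_end_index(uint32_t urb_kb)
{
	return (urb_kb - URB_RESERVED_END_KB) / KB_PER_URB_INDEX;
}
```

With 3 banks and 1 slice this gives 288 KB of URB per slice, of which
248 KB go to the VS, i.e. 1984 entries of 1024 bits each (inside the
64..2752 clamp), with the VS starting at index 4 and the other stages
at index 35.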



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br


* Re: [PATCH] tools/null_state_gen: Add GEN10 golden context batch buffer creation
  2017-07-06  0:50     ` Rodrigo Vivi
@ 2017-07-12 20:42       ` Oscar Mateo
  2017-07-12 21:03         ` Rodrigo Vivi
  0 siblings, 1 reply; 7+ messages in thread
From: Oscar Mateo @ 2017-07-12 20:42 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Ben Widawsky, Mika Kuoppala



On 07/05/2017 05:50 PM, Rodrigo Vivi wrote:
> Hi Oscar,

Hey!

> I had missed this patch here, but noticed now that I was refreshing
> and testing more cnl tests before re-submitting them.
>
> First of all I believe we need to remove the A0 w/a. I don't believe
> we will ever see one. So I'm removing all A0 exclusive W/a from the
> patches as well.

Be careful: I think both WAs in the patch are for all steppings (one was 
incorrectly marked as A0 only in v1 of this patch).

> I also gave a try here on your null state. However if I use the golden
> state generated by this version I get a blank screen because driver
> load fails with some strange faults:

Good. I don't have a CNL so it was only compile-tested.

> any idea?

Did you also include the i915 patch to allow golden BBs over one page in 
size? I sent it separately as "drm/i915: Allow null render state 
batchbuffers bigger than one page". BTW: this patch was given a cold 
shoulder in the mailing list, since I could not re-justify why null 
state was needed in the first place (since UMD needs to configure the 3D 
pipeline first thing anyway). I am still trying to get a better 
explanation from HW people.

-- Oscar
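
To make the "more than one page" point concrete, here is a minimal sketch
of the size calculation. It assumes that each batch item is one 32-bit
dword and that a page is 4096 bytes; the helper names are hypothetical and
do not come from the tool or from i915.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumed page size */

/* Bytes occupied by a batch of num_dwords 32-bit items. */
static size_t batch_bytes(size_t num_dwords)
{
	return num_dwords * sizeof(uint32_t);
}

/* Number of whole pages needed to hold the batch (rounded up). */
static size_t batch_pages(size_t num_dwords)
{
	return (batch_bytes(num_dwords) + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}
```

Under those assumptions the old MAX_ITEMS of 1024 fills exactly one page,
anything larger needs a second one, and the new MAX_ITEMS of 2048
corresponds to two pages.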

> [    4.115243] Memory manager not clean during takedown.
> [    4.120389] ------------[ cut here ]------------
> [    4.125068] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mm.c:892
> drm_mm_takedown+0x25/0x30
> [    4.133574] Modules linked in:
> [    4.136707] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> 4.12.0-eywa-46011-g9a19faf #360
> [    4.144650] Hardware name: Intel Corporation Cannonlake Client
> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
> CNLSFWR1.R00.X075.D01.1703021113 03/02
> [    4.158500] task: ffff880264ab8000 task.stack: ffffc90000038000
> [    4.164506] RIP: 0010:drm_mm_takedown+0x25/0x30
> [    4.169104] RSP: 0000:ffffc9000003bc28 EFLAGS: 00010292
> [    4.174409] RAX: 0000000000000029 RBX: ffff880260a54170 RCX:
> ffffffff82468740
> [    4.181654] RDX: 0000000000000001 RSI: 0000000000000082 RDI:
> 00000000ffffffff
> [    4.188839] RBP: ffffc9000003bc28 R08: 00000000fffffffe R09:
> 000000000000035a
> [    4.196028] R10: 0000000000000005 R11: 0000000000000000 R12:
> ffff880260a50000
> [    4.203215] R13: ffff880260a54348 R14: ffff880260a50070 R15:
> ffff880262844a00
> [    4.210402] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
> knlGS:0000000000000000
> [    4.218541] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    4.224344] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
> 00000000007406f0
> [    4.231529] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [    4.238716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [    4.245900] PKRU: 00000000
> [    4.248673] Call Trace:
> [    4.251193]  i915_gem_cleanup_stolen+0x1f/0x30
> [    4.255703]  i915_ggtt_cleanup_hw+0xa4/0x170
> [    4.260035]  i915_driver_cleanup_hw+0x36/0x40
> [    4.264455]  i915_driver_load+0x6a0/0xe70
> [    4.268535]  ? _raw_spin_unlock_irqrestore+0x26/0x50
> [    4.273560]  i915_pci_probe+0x2c/0x50
> [    4.277293]  local_pci_probe+0x45/0xa0
> [    4.281106]  ? pci_match_device+0xe0/0x110
> [    4.285265]  pci_device_probe+0x135/0x150
> [    4.289343]  driver_probe_device+0x288/0x490
> [    4.293676]  __driver_attach+0xc9/0xf0
> [    4.297490]  ? driver_probe_device+0x490/0x490
> [    4.301999]  bus_for_each_dev+0x5d/0x90
> [    4.305902]  driver_attach+0x1e/0x20
> [    4.309543]  bus_add_driver+0x1d0/0x290
> [    4.313442]  driver_register+0x60/0xe0
> [    4.317257]  __pci_register_driver+0x5d/0x60
> [    4.321652]  i915_init+0x59/0x5c
> [    4.324944]  ? mipi_dsi_bus_init+0x17/0x17
> [    4.329103]  do_one_initcall+0x42/0x180
> [    4.333007]  kernel_init_freeable+0x17c/0x202
> [    4.337426]  ? set_debug_rodata+0x17/0x17
> [    4.341500]  ? rest_init+0x90/0x90
> [    4.344969]  kernel_init+0xe/0x110
> [    4.348438]  ret_from_fork+0x25/0x30
> [    4.352079] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48
> 83 c7 38 48 39 c7 75 01 c3 55 48 c7 c7 70 ac 20 82 31 c0 48 89 e5 e8
> 6b 62 b7 ff <0f> ff 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89
> e5 41
> [    4.371029] ---[ end trace 7d36c2dd72851315 ]---
> [    4.381680] WARN_ON(dev_priv->mm.object_count)
> [    4.381698] ------------[ cut here ]------------
> [    4.390921] WARNING: CPU: 0 PID: 1 at
> drivers/gpu/drm/i915/i915_gem.c:4964 i915_gem_load_cleanup+0x10b/0x120
> [    4.400797] Modules linked in:
> [    4.403927] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W
> 4.12.0-eywa-46011-g9a19faf #360
> [    4.413021] Hardware name: Intel Corporation Cannonlake Client
> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
> CNLSFWR1.R00.X075.D01.1703021113 03/02
> [    4.426884] task: ffff880264ab8000 task.stack: ffffc90000038000
> [    4.432865] RIP: 0010:i915_gem_load_cleanup+0x10b/0x120
> [    4.438157] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
> [    4.443450] RAX: 0000000000000022 RBX: ffff880260a50000 RCX:
> ffffffff82468740
> [    4.450642] RDX: 0000000000000001 RSI: 0000000000000082 RDI:
> 0000000000000202
> [    4.457839] RBP: ffffc9000003bc68 R08: 0000000000000022 R09:
> 0000000000000389
> [    4.465029] R10: 0000000000000000 R11: 0000000000000001 R12:
> ffff880260a54678
> [    4.472227] R13: ffff88026446c000 R14: ffff88026446c000 R15:
> ffff880262844a00
> [    4.479420] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
> knlGS:0000000000000000
> [    4.487564] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    4.493370] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
> 00000000007406f0
> [    4.500569] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [    4.507763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [    4.514959] PKRU: 00000000
> [    4.517737] Call Trace:
> [    4.520265]  i915_driver_cleanup_early+0x1a/0x50
> [    4.524955]  i915_driver_load+0x6b8/0xe70
> [    4.529038]  ? _raw_spin_unlock_irqrestore+0x26/0x50
> [    4.534100] clocksource: Switched to clocksource tsc
> [    4.534105]  i915_pci_probe+0x2c/0x50
> [    4.534113]  local_pci_probe+0x45/0xa0
> [    4.534118]  ? pci_match_device+0xe0/0x110
> [    4.534124]  pci_device_probe+0x135/0x150
> [    4.534131]  driver_probe_device+0x288/0x490
> [    4.534137]  __driver_attach+0xc9/0xf0
> [    4.534142]  ? driver_probe_device+0x490/0x490
> [    4.534146]  bus_for_each_dev+0x5d/0x90
> [    4.534152]  driver_attach+0x1e/0x20
> [    4.534156]  bus_add_driver+0x1d0/0x290
> [    4.534162]  driver_register+0x60/0xe0
> [    4.534167]  __pci_register_driver+0x5d/0x60
> [    4.534173]  i915_init+0x59/0x5c
> [    4.534177]  ? mipi_dsi_bus_init+0x17/0x17
> [    4.534181]  do_one_initcall+0x42/0x180
> [    4.534187]  kernel_init_freeable+0x17c/0x202
> [    4.534191]  ? set_debug_rodata+0x17/0x17
> [    4.534196]  ? rest_init+0x90/0x90
> [    4.534200]  kernel_init+0xe/0x110
> [    4.534204]  ret_from_fork+0x25/0x30
> [    4.534208] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 21 4f b1 ff 0f
> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8
> 05 4f b1 ff <0f> ff e9 33 ff ff ff 66 66 66 66 66 2e 0f 1f 84 00 00 00
> 00 00
> [    4.534272] ---[ end trace 7d36c2dd72851316 ]---
> [    4.534277] WARN_ON(!list_empty(&dev_priv->gt.timelines))
> [    4.534293] ------------[ cut here ]------------
> [    4.534298] WARNING: CPU: 0 PID: 1 at
> drivers/gpu/drm/i915/i915_gem.c:4968 i915_gem_load_cleanup+0xef/0x120
> [    4.534299] Modules linked in:
> [    4.534304] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W
> 4.12.0-eywa-46011-g9a19faf #360
> [    4.534306] Hardware name: Intel Corporation Cannonlake Client
> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
> CNLSFWR1.R00.X075.D01.1703021113 03/02
> [    4.534308] task: ffff880264ab8000 task.stack: ffffc90000038000
> [    4.534312] RIP: 0010:i915_gem_load_cleanup+0xef/0x120
> [    4.534314] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
> [    4.534317] RAX: 000000000000002d RBX: ffff880260a50000 RCX:
> 0000000000000000
> [    4.534319] RDX: 0000000000000001 RSI: 0000000000000002 RDI:
> 0000000000000296
> [    4.534321] RBP: ffffc9000003bc68 R08: 000000000000002d R09:
> 000000000000002d
> [    4.534322] R10: 0000000000000000 R11: ffff880260a4e000 R12:
> ffff880260a50070
> [    4.534324] R13: ffff88026446c000 R14: ffff88026446c000 R15:
> ffff880262844a00
> [    4.534327] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
> knlGS:0000000000000000
> [    4.534329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    4.534331] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
> 00000000007406f0
> [    4.534334] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [    4.534335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [    4.534337] PKRU: 00000000
> [    4.534338] Call Trace:
> [    4.534344]  i915_driver_cleanup_early+0x1a/0x50
> [    4.534350]  i915_driver_load+0x6b8/0xe70
> [    4.534356]  ? _raw_spin_unlock_irqrestore+0x26/0x50
> [    4.534361]  i915_pci_probe+0x2c/0x50
> [    4.534366]  local_pci_probe+0x45/0xa0
> [    4.534371]  ? pci_match_device+0xe0/0x110
> [    4.534376]  pci_device_probe+0x135/0x150
> [    4.534382]  driver_probe_device+0x288/0x490
> [    4.534388]  __driver_attach+0xc9/0xf0
> [    4.534393]  ? driver_probe_device+0x490/0x490
> [    4.534398]  bus_for_each_dev+0x5d/0x90
> [    4.534403]  driver_attach+0x1e/0x20
> [    4.534408]  bus_add_driver+0x1d0/0x290
> [    4.534414]  driver_register+0x60/0xe0
> [    4.534419]  __pci_register_driver+0x5d/0x60
> [    4.534424]  i915_init+0x59/0x5c
> [    4.534428]  ? mipi_dsi_bus_init+0x17/0x17
> [    4.534431]  do_one_initcall+0x42/0x180
> [    4.534437]  kernel_init_freeable+0x17c/0x202
> [    4.534440]  ? set_debug_rodata+0x17/0x17
> [    4.534444]  ? rest_init+0x90/0x90
> [    4.534448]  kernel_init+0xe/0x110
> [    4.534451]  ret_from_fork+0x25/0x30
> [    4.534455] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 3d 4f b1 ff 0f
> ff e9 5d ff ff ff 48 c7 c6 b0 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8
> 21 4f b1 ff <0f> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98
> 1a 82
> [    4.534519] ---[ end trace 7d36c2dd72851317 ]---
> [    4.534605] =============================================================================
> [    4.534608] BUG drm_i915_gem_object (Tainted: G        W      ):
> Objects remaining in drm_i915_gem_object on __kmem_cache_shutdown()
> [    4.534609] -----------------------------------------------------------------------------
>
> [    4.534611] Disabling lock debugging due to kernel taint
> [    4.534614] INFO: Slab 0xffffea0009820600 objects=19 used=2
> fp=0xffff88026081ba80 flags=0x200000000008100
> [    4.534618] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B   W
> 4.12.0-eywa-46011-g9a19faf #360
> [    4.534620] Hardware name: Intel Corporation Cannonlake Client
> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
> CNLSFWR1.R00.X075.D01.1703021113 03/02
> [    4.534621] Call Trace:
> [    4.534626]  dump_stack+0x65/0x89
> [    4.534633]  slab_err+0xa1/0xb0
> [    4.534640]  ? __kmalloc+0x185/0x270
> [    4.534645]  ? kmem_cache_alloc_bulk+0x1f0/0x1f0
> [    4.534650]  ? __kmem_cache_shutdown+0x160/0x400
> [    4.534655]  __kmem_cache_shutdown+0x180/0x400
> [    4.534663]  shutdown_cache+0x18/0x1a0
> [    4.534667]  kmem_cache_destroy+0x1c1/0x1f0
> [    4.534672]  i915_gem_load_cleanup+0xb4/0x120
> [    4.534677]  i915_driver_cleanup_early+0x1a/0x50
> [    4.534682]  i915_driver_load+0x6b8/0xe70
> [    4.534689]  ? _raw_spin_unlock_irqrestore+0x26/0x50
> [    4.534693]  i915_pci_probe+0x2c/0x50
> [    4.534698]  local_pci_probe+0x45/0xa0
> [    4.534703]  ? pci_match_device+0xe0/0x110
> [    4.534708]  pci_device_probe+0x135/0x150
> [    4.534714]  driver_probe_device+0x288/0x490
> [    4.534721]  __driver_attach+0xc9/0xf0
> [    4.534726]  ? driver_probe_device+0x490/0x490
> [    4.534730]  bus_for_each_dev+0x5d/0x90
> [    4.534736]  driver_attach+0x1e/0x20
> [    4.534741]  bus_add_driver+0x1d0/0x290
> [    4.534746]  driver_register+0x60/0xe0
> [    4.534751]  __pci_register_driver+0x5d/0x60
> [    4.534756]  i915_init+0x59/0x5c
> [    4.534760]  ? mipi_dsi_bus_init+0x17/0x17
> [    4.534763]  do_one_initcall+0x42/0x180
> [    4.534769]  kernel_init_freeable+0x17c/0x202
> [    4.534773]  ? set_debug_rodata+0x17/0x17
> [    4.534777]  ? rest_init+0x90/0x90
> [    4.534781]  kernel_init+0xe/0x110
> [    4.534784]  ret_from_fork+0x25/0x30
> [    4.534791] INFO: Object 0xffff880260818340 @offset=832
> [    4.534792] INFO: Object 0xffff880260818680 @offset=1664
> [    4.534795] kmem_cache_destroy drm_i915_gem_object: Slab cache
> still has objects
> [    4.534798] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B   W
> 4.12.0-eywa-46011-g9a19faf #360
> [    4.534800] Hardware name: Intel Corporation Cannonlake Client
> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
> CNLSFWR1.R00.X075.D01.1703021113 03/02
> [    4.534801] Call Trace:
> [    4.534805]  dump_stack+0x65/0x89
> [    4.534809]  kmem_cache_destroy+0x1e1/0x1f0
> [    4.534814]  i915_gem_load_cleanup+0xb4/0x120
> [    4.534819]  i915_driver_cleanup_early+0x1a/0x50
> [    4.534824]  i915_driver_load+0x6b8/0xe70
> [    4.534830]  ? _raw_spin_unlock_irqrestore+0x26/0x50
> [    4.534835]  i915_pci_probe+0x2c/0x50
> [    4.534840]  local_pci_probe+0x45/0xa0
> [    4.534844]  ? pci_match_device+0xe0/0x110
> [    4.534850]  pci_device_probe+0x135/0x150
> [    4.534856]  driver_probe_device+0x288/0x490
> [    4.534862]  __driver_attach+0xc9/0xf0
> [    4.534867]  ? driver_probe_device+0x490/0x490
> [    4.534871]  bus_for_each_dev+0x5d/0x90
> [    4.534877]  driver_attach+0x1e/0x20
> [    4.534882]  bus_add_driver+0x1d0/0x290
> [    4.534888]  driver_register+0x60/0xe0
> [    4.534893]  __pci_register_driver+0x5d/0x60
> [    4.534897]  i915_init+0x59/0x5c
> [    4.534901]  ? mipi_dsi_bus_init+0x17/0x17
> [    4.534904]  do_one_initcall+0x42/0x180
> [    4.534910]  kernel_init_freeable+0x17c/0x202
> [    4.534914]  ? set_debug_rodata+0x17/0x17
> [    4.534917]  ? rest_init+0x90/0x90
> [    4.534922]  kernel_init+0xe/0x110
> [    4.534925]  ret_from_fork+0x25/0x30
> [    4.535386] i915 0000:00:02.0: [drm:i915_driver_load] Device
> initialization failed (-22)
> [    4.535390] i915 0000:00:02.0: Please file a bug at
> https://bugs.freedesktop.org/enter_bug.cgi?product=DRI against
> DRM/Intel providing the dmesg log by booting with drm.debug=0xf
> [    4.535450] i915: probe of 0000:00:02.0 failed with error -22
>
>
> On Fri, Apr 28, 2017 at 7:36 AM, Oscar Mateo <oscar.mateo@intel.com> wrote:
>> This batchbuffer is over 4096 bytes, so we need to increase the size of the
>> array (and the KMD has to be modified to deal with more than one page).
>>
>> Notice that there are two workarounds embedded here, both applicable to all CNL
>> steppings.
>>
>> v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so update
>>      the comment in the code and in the commit message.
>>
>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>> Cc: Ben Widawsky <ben@bwidawsk.net>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> ---
>>   lib/gen10_render.h                             |  63 +++
>>   tools/null_state_gen/Makefile.am               |   3 +-
>>   tools/null_state_gen/intel_batchbuffer.h       |   2 +-
>>   tools/null_state_gen/intel_null_state_gen.c    |   5 +-
>>   tools/null_state_gen/intel_renderstate.h       |   1 +
>>   tools/null_state_gen/intel_renderstate_gen10.c | 538 +++++++++++++++++++++++++
>>   6 files changed, 609 insertions(+), 3 deletions(-)
>>   create mode 100644 lib/gen10_render.h
>>   create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c
>>
>> diff --git a/lib/gen10_render.h b/lib/gen10_render.h
>> new file mode 100644
>> index 0000000..f4a7dff
>> --- /dev/null
>> +++ b/lib/gen10_render.h
>> @@ -0,0 +1,63 @@
>> +#ifndef GEN10_RENDER_H
>> +#define GEN10_RENDER_H
>> +
>> +#include "gen9_render.h"
>> +
>> +#define GEN7_MI_RS_CONTROL                     (0x6 << 23)
>> +# define GEN7_MI_RS_CONTROL_ENABLE             (1 << 0)
>> +
>> +#define GEN10_3DSTATE_GATHER_POOL_ALLOC                GEN6_3D(3, 1, 0x1a)
>> +# define GEN10_3DSTATE_GATHER_POOL_ENABLE      (1 << 11)
>> +
>> +#define GEN10_3DSTATE_GATHER_CONSTANT_VS       GEN6_3D(3, 0, 0x34)
>> +#define GEN10_3DSTATE_GATHER_CONSTANT_HS       GEN6_3D(3, 0, 0x36)
>> +#define GEN10_3DSTATE_GATHER_CONSTANT_DS       GEN6_3D(3, 0, 0x37)
>> +#define GEN10_3DSTATE_GATHER_CONSTANT_GS       GEN6_3D(3, 0, 0x35)
>> +#define GEN10_3DSTATE_GATHER_CONSTANT_PS       GEN6_3D(3, 0, 0x38)
>> +
>> +#define GEN10_3DSTATE_WM_DEPTH_STENCIL         GEN6_3D(3, 0, 0x4e)
>> +#define GEN10_3DSTATE_WM_CHROMAKEY             GEN6_3D(3, 0, 0x4c)
>> +
>> +#define GEN8_REG_L3_CACHE_CONFIG       0x7034
>> +
>> +/*
>> + * Programming for L3 cache allocations can be made per bank. Based on the
>> + * programmed value HW will apply same allocations on other available banks.
>> + * Total L3 Cache size per bank = 256 KB.
>> + * {SLM,    URB,     DC,      RO(I/S, C, T),   L3 Client Pool}
>> + * {  0,    96,      32,      128,                 0      }
>> + */
>> +#define GEN10_L3_CACHE_CONFIG_VALUE    0x00420060
>> +
>> +#define URB_ALIGN(val, align)  ((val % align) ? (val - (val % align)) : val)
>> +
>> +#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES                64
>> +#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES                2752
>> +
>> +#define GEN10_KB_PER_URB_INDEX                 8
>> +#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB       96
>> +
>> +#define GEN10_URB_RESERVED_SIZE_KB             32
>> +#define GEN10_URB_RESERVED_END_SIZE_KB         8
>> +
>> +#define GEN10_VS_NUM_BITS_PER_URB_UNIT         512
>> +#define GEN10_VS_NUM_OF_URB_UNITS              1 // zero based
>> +#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS                (GEN10_VS_NUM_BITS_PER_URB_UNIT * \
>> +                                               (GEN10_VS_NUM_OF_URB_UNITS + 1))
>> +
>> +#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX)
>> +
>> +#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count)                \
>> +       URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX)
>> +
>> +#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice)       \
>> +       (total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB)
>> +
>> +#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice)   \
>> +       ((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) *    \
>> +       1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS)
>> +
>> +#define GEN10_VS_END_URB_INDEX(urb_size_per_slice)                     \
>> +       ((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX)
>> +
>> +#endif
>> diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am
>> index 24884a7..2f90990 100644
>> --- a/tools/null_state_gen/Makefile.am
>> +++ b/tools/null_state_gen/Makefile.am
>> @@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES =       \
>>          intel_renderstate_gen7.c \
>>          intel_renderstate_gen8.c \
>>          intel_renderstate_gen9.c \
>> +       intel_renderstate_gen10.c \
>>          intel_null_state_gen.c
>>
>> -gens := 6 7 8 9
>> +gens := 6 7 8 9 10
>>
>>   h = /tmp/intel_renderstate_gen$$gen.c
>>   states: intel_null_state_gen
>> diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h
>> index 771d1c8..e40e01b 100644
>> --- a/tools/null_state_gen/intel_batchbuffer.h
>> +++ b/tools/null_state_gen/intel_batchbuffer.h
>> @@ -34,7 +34,7 @@
>>   #include <stdint.h>
>>
>>   #define MAX_RELOCS 64
>> -#define MAX_ITEMS 1024
>> +#define MAX_ITEMS 2048
>>   #define MAX_STRLEN 256
>>
>>   #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1))
>> diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c
>> index 06eb954..4f12f5f 100644
>> --- a/tools/null_state_gen/intel_null_state_gen.c
>> +++ b/tools/null_state_gen/intel_null_state_gen.c
>> @@ -41,7 +41,7 @@ static int debug = 0;
>>   static void print_usage(char *s)
>>   {
>>          fprintf(stderr, "%s: <gen>\n"
>> -               "     gen:     gen to generate for (6,7,8,9)\n",
>> +               "     gen:     gen to generate for (6,7,8,9,10)\n",
>>                  s);
>>   }
>>
>> @@ -173,6 +173,9 @@ static int do_generate(int gen)
>>          case 9:
>>                  null_state_gen = gen9_setup_null_render_state;
>>                  break;
>> +       case 10:
>> +               null_state_gen = gen10_setup_null_render_state;
>> +               break;
>>          }
>>
>>          if (null_state_gen == NULL) {
>> diff --git a/tools/null_state_gen/intel_renderstate.h b/tools/null_state_gen/intel_renderstate.h
>> index b27b434..b3c8c2b 100644
>> --- a/tools/null_state_gen/intel_renderstate.h
>> +++ b/tools/null_state_gen/intel_renderstate.h
>> @@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct intel_batchbuffer *batch);
>>   void gen7_setup_null_render_state(struct intel_batchbuffer *batch);
>>   void gen8_setup_null_render_state(struct intel_batchbuffer *batch);
>>   void gen9_setup_null_render_state(struct intel_batchbuffer *batch);
>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch);
>>
>>   #endif /* __INTEL_RENDERSTATE_H__ */
>> diff --git a/tools/null_state_gen/intel_renderstate_gen10.c b/tools/null_state_gen/intel_renderstate_gen10.c
>> new file mode 100644
>> index 0000000..f5678c3
>> --- /dev/null
>> +++ b/tools/null_state_gen/intel_renderstate_gen10.c
>> @@ -0,0 +1,538 @@
>> +/*
>> + * Copyright © 2014 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> + * DEALINGS IN THE SOFTWARE.
>> + *
>> + * Authors:
>> + *     Oscar Mateo <oscar.mateo@intel.com>
>> + */
>> +
>> +#include "intel_renderstate.h"
>> +#include <lib/gen10_render.h>
>> +#include <lib/intel_reg.h>
>> +
>> +static void gen8_emit_wm(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
>> +       OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION);
>> +}
>> +
>> +static void gen8_emit_ps(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2));
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0); /* kernel hi */
>> +       OUT_BATCH(GEN7_PS_SPF_MODE);
>> +       OUT_BATCH(0); /* scratch space stuff */
>> +       OUT_BATCH(0); /* scratch hi */
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0); // kernel 1
>> +       OUT_BATCH(0); /* kernel 1 hi */
>> +       OUT_BATCH(0); // kernel 2
>> +       OUT_BATCH(0); /* kernel 2 hi */
>> +}
>> +
>> +static void gen8_emit_sf(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT |
>> +                 1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT |
>> +                 GEN7_SF_POINT_WIDTH_FROM_SOURCE |
>> +                 8);
>> +}
>> +
>> +static void gen8_emit_vs(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2));
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +}
>> +
>> +static void gen8_emit_hs(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2));
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT);
>> +       OUT_BATCH(0);
>> +}
>> +
>> +static void gen8_emit_raster(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
>> +       OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW);
>> +       OUT_BATCH(0.0);
>> +       OUT_BATCH(0.0);
>> +       OUT_BATCH(0.0);
>> +}
>> +
>> +static void gen10_emit_urb(struct intel_batchbuffer *batch)
>> +{
>> +       /* Smallest SKU: 3x8 */
>> +       int l3_bank_count = 3;
>> +       int slice_count = 1;
>> +       int urb_size_per_slice = GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count);
>> +       int other_urb_start_addr = GEN10_VS_END_URB_INDEX(urb_size_per_slice);
>> +       const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX;
>> +       const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS;
>> +       int vs_urb_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice);
>> +
>> +       if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES)
>> +               vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES;
>> +       if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES)
>> +               vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES;
>> +
>> +       OUT_BATCH(GEN7_3DSTATE_URB_VS);
>> +       OUT_BATCH(vs_urb_entries |
>> +                (vs_urb_alloc_size << 16) |
>> +                (vs_urb_start_addr << 25));
>> +
>> +       OUT_BATCH(GEN7_3DSTATE_URB_HS);
>> +       OUT_BATCH(other_urb_start_addr << 25);
>> +
>> +       OUT_BATCH(GEN7_3DSTATE_URB_DS);
>> +       OUT_BATCH(other_urb_start_addr << 25);
>> +
>> +       OUT_BATCH(GEN7_3DSTATE_URB_GS);
>> +       OUT_BATCH(other_urb_start_addr << 25);
>> +}
>> +
>> +static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
>> +       OUT_BATCH(_3DPRIM_TRILIST);
>> +}
>> +
>> +static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch)
>> +{
>> +       const int num_decls = 128;
>> +       int i;
>> +
>> +       OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST |
>> +               (((2 * num_decls) + 3) - 2) /* DWORD count - 2 */);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(num_decls);
>> +
>> +       for (i = 0; i < num_decls; i++) {
>> +               OUT_BATCH(0);
>> +               OUT_BATCH(0);
>> +       }
>> +}
>> +
>> +static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index)
>> +{
>> +       OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2));
>> +       OUT_BATCH(index << 29);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +}
>> +
>> +static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index)
>> +{
>> +       OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2));
>> +       OUT_BATCH(index << 30);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +}
>> +
>> +static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch)
>> +{
>> +       const int buffers = 33;
>> +       int i;
>> +
>> +       OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS |
>> +               (((4 * buffers) + 1)- 2) /* DWORD count - 2 */);
>> +
>> +       for (i = 0; i < buffers; i++) {
>> +               OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT |
>> +                         GEN7_VB0_BUFFER_ADDR_MOD_EN);
>> +               OUT_BATCH(0); /* Address */
>> +               OUT_BATCH(0);
>> +               OUT_BATCH(0);
>> +       }
>> +}
>> +
>> +static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch)
>> +{
>> +       const int elements = 34;
>> +       int i;
>> +
>> +       OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS |
>> +               (((2 * elements) + 1) - 2) /* DWORD count - 2 */);
>> +
>> +       /* Element 0 */
>> +       OUT_BATCH(VE0_VALID);
>> +       OUT_BATCH(
>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
>> +       /* Elements 1 -> 33 */
>> +       for (i = 1; i < elements; i++) {
>> +               OUT_BATCH(0);
>> +               OUT_BATCH(0);
>> +       }
>> +}
>> +
>> +static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch)
>> +{
>> +       union {
>> +               float fval;
>> +               uint32_t uval;
>> +       } u;
>> +
>> +       unsigned offset;
>> +
>> +       u.fval = 1.0f;
>> +
>> +       offset = intel_batch_state_offset(batch, 64);
>> +       OUT_STATE(0);
>> +       OUT_STATE(0);      /* Alpha reference value */
>> +       OUT_STATE(u.uval); /* Blend constant color RED */
>> +       OUT_STATE(u.uval); /* Blend constant color GREEN */
>> +       OUT_STATE(u.uval); /* Blend constant color BLUE */
>> +       OUT_STATE(u.uval); /* Blend constant color ALPHA */
>> +
>> +       OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
>> +       OUT_BATCH_STATE_OFFSET(offset | 1);
>> +}
>> +
>> +static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch)
>> +{
>> +       unsigned offset;
>> +       int i;
>> +
>> +       offset = intel_batch_state_offset(batch, 64);
>> +
>> +       for (i = 0; i < 17; i++)
>> +               OUT_STATE(0);
>> +
>> +       OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
>> +       OUT_BATCH_STATE_OFFSET(offset | 1);
>> +}
>> +
>> +static void gen8_emit_ps_extra(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
>> +       OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID |
>> +                 GEN8_PSX_ATTRIBUTE_ENABLE);
>> +
>> +}
>> +
>> +static void gen8_emit_ps_blend(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
>> +       OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
>> +}
>> +
>> +static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch)
>> +{
>> +       unsigned offset;
>> +
>> +       offset = intel_batch_state_offset(batch, 32);
>> +
>> +       OUT_STATE((uint32_t)0.0f); /* Minimum depth */
>> +       OUT_STATE((uint32_t)0.0f); /* Maximum depth */
>> +
>> +       OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
>> +       OUT_BATCH_STATE_OFFSET(offset);
>> +}
>> +
>> +static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch)
>> +{
>> +       unsigned offset;
>> +       int i;
>> +
>> +       offset = intel_batch_state_offset(batch, 64);
>> +
>> +       for (i = 0; i < 16; i++)
>> +               OUT_STATE(0);
>> +
>> +       OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2));
>> +       OUT_BATCH_STATE_OFFSET(offset);
>> +}
>> +
>> +static void gen8_emit_primitive(struct intel_batchbuffer *batch)
>> +{
>> +       OUT_BATCH(GEN6_3DPRIMITIVE | (10-2));
>> +       OUT_BATCH(4);   /* gen8+ ignore the topology type field */
>> +       OUT_BATCH(1);   /* vertex count */
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(1);   /* single instance */
>> +       OUT_BATCH(0);   /* start instance location */
>> +       OUT_BATCH(0);   /* index buffer offset, ignored */
>> +       OUT_BATCH(0);   /* extended parameter 0 */
>> +       OUT_BATCH(0);   /* extended parameter 1 */
>> +       OUT_BATCH(0);   /* extended parameter 2 */
>> +}
>> +
>> +static void gen9_emit_state_base_address(struct intel_batchbuffer *batch) {
>> +       const unsigned offset = 0;
>> +       OUT_BATCH(GEN6_STATE_BASE_ADDRESS |
>> +               (22 - 2) /* DWORD count - 2 */);
>> +
>> +       /* general state base address - requires BB address
>> +        * added to state offset to be stored in this location
>> +        */
>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>> +       OUT_BATCH(0);
>> +
>> +       /* stateless data port */
>> +       OUT_BATCH(0);
>> +
>> +       /* surface state base address - requires BB address
>> +        * added to state offset to be stored in this location
>> +        */
>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>> +       OUT_BATCH(0);
>> +
>> +       /* dynamic state base address - requires BB address
>> +        * added to state offset to be stored in this location
>> +        */
>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>> +       OUT_BATCH(0);
>> +
>> +       /* indirect state base address */
>> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
>> +       OUT_BATCH(0);
>> +
>> +       /* instruction state base address - requires BB address
>> +        * added to state offset to be stored in this location
>> +        */
>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>> +       OUT_BATCH(0);
>> +
>> +       /* general state buffer size */
>> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
>> +       /* dynamic state buffer size */
>> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
>> +       /* indirect object buffer size */
>> +       OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY);
>> +       /* instruction buffer size */
>> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
>> +
>> +       /* bindless surface state base address */
>> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
>> +       OUT_BATCH(0);
>> +       /* bindless surface state size */
>> +       OUT_BATCH(0);
>> +
>> +       /* bindless sampler state base address */
>> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
>> +       OUT_BATCH(0);
>> +       /* bindless sampler state size */
>> +       OUT_BATCH(0);
>> +}
>> +
>> +/*
>> + * Generate the batch buffer commands needed to initialize the 3D engine
>> + * to its "golden state".
>> + */
>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch)
>> +{
>> +       int i;
>> +
>> +       /* WaRsGatherPoolEnable: cnl */
>> +       OUT_BATCH(GEN7_MI_RS_CONTROL);
>> +
>> +#define GEN8_PIPE_CONTROL_GLOBAL_GTT   (1 << 24)
>> +       /* PIPE_CONTROL */
>> +       OUT_BATCH(GEN6_PIPE_CONTROL |
>> +                (6 - 2));      /* DWORD count - 2 */
>> +       OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +
>> +       /* PIPELINE_SELECT */
>> +       OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D);
>> +
>> +       OUT_BATCH(MI_LOAD_REGISTER_IMM);
>> +       OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG);
>> +       OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE);
>> +
>> +       gen8_emit_wm(batch);
>> +       gen8_emit_ps(batch);
>> +       gen8_emit_sf(batch);
>> +
>> +       OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */
>> +       OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11);
>> +
>> +       gen8_emit_vs(batch);
>> +       gen8_emit_hs(batch);
>> +
>> +       OUT_CMD(GEN7_3DSTATE_GS, 10);
>> +       OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5);
>> +       OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */
>> +       OUT_CMD(GEN6_3DSTATE_CLIP, 4);
>> +       OUT_CMD(GEN7_3DSTATE_TE, 4);
>> +       OUT_CMD(GEN8_3DSTATE_VF, 2);
>> +       OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5);
>> +
>> +       /* URB States */
>> +       gen10_emit_urb(batch);
>> +
>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130);
>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130);
>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130);
>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130);
>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130);
>> +
>> +       OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4);
>> +       OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4);
>> +       OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4);
>> +
>> +       /* Push Constants */
>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2);
>> +
>> +       /* Constants */
>> +       OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11);
>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11);
>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11);
>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11);
>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11);
>> +
>> +       OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3);
>> +       OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2);
>> +       gen8_emit_vf_topology(batch);
>> +
>> +       /* Streamer out declaration list */
>> +       gen8_emit_so_decl_list(batch);
>> +
>> +       /* Streamer out buffers */
>> +       for (i = 0; i < 4; i++) {
>> +               gen8_emit_so_buffer(batch, i);
>> +       }
>> +
>> +       /* State base addresses */
>> +       gen9_emit_state_base_address(batch);
>> +
>> +       OUT_CMD(GEN6_STATE_SIP, 3);
>> +       OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4);
>> +       OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8);
>> +
>> +       /* Chroma key */
>> +       for (i = 0; i < 4; i++) {
>> +               gen8_emit_chroma_key(batch, i);
>> +       }
>> +
>> +       OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3);
>> +       OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3);
>> +       OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5);
>> +       OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5);
>> +       OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3);
>> +       OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2);
>> +
>> +       /* WaPSRandomCSNotDone:cnl */
>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
>> +       OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
>> +       OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +
>> +       OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2);
>> +       OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2);
>> +       OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32);
>> +       OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16);
>> +       OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16);
>> +       OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5);
>> +
>> +       /* Vertex buffers */
>> +       gen8_emit_vertex_buffers(batch);
>> +       gen8_emit_vertex_elements(batch);
>> +
>> +       OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */);
>> +
>> +       /* 3D state binding table pointers */
>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2);
>> +
>> +       gen8_emit_cc_state_pointers(batch);
>> +       gen8_emit_blend_state_pointers(batch);
>> +       gen8_emit_ps_extra(batch);
>> +       gen8_emit_ps_blend(batch);
>> +
>> +       /* 3D state sampler state pointers */
>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2);
>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2);
>> +
>> +       OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2);
>> +
>> +       gen8_emit_viewport_state_pointers_cc(batch);
>> +       gen8_emit_viewport_state_pointers_sf_clip(batch);
>> +
>> +       /* WaPSRandomCSNotDone:cnl */
>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
>> +       OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
>> +       OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0);
>> +
>> +       gen8_emit_raster(batch);
>> +
>> +       OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4);
>> +       OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2);
>> +
>> +       /* Launch 3D operation */
>> +       gen8_emit_primitive(batch);
>> +
>> +       /* WaRsGatherPoolEnable: cnl */
>> +       OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE);
>> +       OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2));
>> +       OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE);
>> +       OUT_BATCH(0);
>> +       OUT_BATCH(0xfffff << 12);
>> +       OUT_BATCH(GEN7_MI_RS_CONTROL);
>> +       OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4);
>> +
>> +       OUT_BATCH(MI_BATCH_BUFFER_END);
>> +}
>> --
>> 1.9.1
>>
>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH] tools/null_state_gen: Add GEN10 golden context batch buffer creation
  2017-07-12 20:42       ` Oscar Mateo
@ 2017-07-12 21:03         ` Rodrigo Vivi
  2017-07-13 22:30           ` Rodrigo Vivi
  0 siblings, 1 reply; 7+ messages in thread
From: Rodrigo Vivi @ 2017-07-12 21:03 UTC (permalink / raw)
  To: Oscar Mateo; +Cc: intel-gfx, Ben Widawsky, Mika Kuoppala

On Wed, Jul 12, 2017 at 1:42 PM, Oscar Mateo <oscar.mateo@intel.com> wrote:
>
>
> On 07/05/2017 05:50 PM, Rodrigo Vivi wrote:
>>
>> Hi Oscar,
>
>
> Hey!
>
>> I had missed this patch here, but noticed now that I was refreshing
>> and testing more cnl tests before re-submitting them.
>>
>> First of all I believe we need to remove the A0 w/a. I don't believe
>> we will ever see one. So I'm removing all A0 exclusive W/a from the
>> patches as well.
>
>
> Be careful: I think both WAs in the patch are for all steppings (one was
> incorrectly marked as A0 only in v1 of this patch).

ah cool, so v2 is right...

>
>> I also gave your null state a try here. However, if I use the golden
>> state generated by this version I get a blank screen because the driver
>> load fails with some strange faults:
>
>
> Good. I don't have a CNL so it was only compile-tested.
>
>> any idea?
>
>
> Did you also include the i915 patch to allow golden BBs over one page in
> size? I sent it separately as "drm/i915: Allow null render state
> batchbuffers bigger than one page". BTW: this patch was given a cold
> shoulder in the mailing list, since I could not re-justify why null state
> was needed in the first place (since UMD needs to configure the 3D pipeline
> first thing anyway). I am still trying to get a better explanation from HW
> people.

hmmmm no... I missed that patch... sorry...

I'm currently without access to a CNL machine, but as soon as I have one
I will test it, and if that works I will just merge the igt patch, review
your kernel one, etc...
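
For anyone following along, here is a minimal sketch of the size arithmetic behind the "over one page" remark. It assumes a 4 KiB page and 32-bit batch entries; the helper names are hypothetical and are not taken from i915 or igt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Bytes occupied by a batch made of n_dwords 32-bit entries. */
static size_t sketch_batch_bytes(size_t n_dwords)
{
	return n_dwords * sizeof(uint32_t);
}

/* Whole pages needed to hold n_dwords entries, rounded up. */
static size_t sketch_batch_pages(size_t n_dwords)
{
	assert(n_dwords > 0);
	return (sketch_batch_bytes(n_dwords) + SKETCH_PAGE_SIZE - 1) /
	       SKETCH_PAGE_SIZE;
}
```

Under these assumptions 1024 entries fill exactly one page, so any batch that needs the larger MAX_ITEMS of 2048 from the quoted patch can spill into a second page.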

>
> -- Oscar
>
>> [    4.115243] Memory manager not clean during takedown.
>>
>> [    4.120389] ------------[ cut here ]------------
>> [    4.125068] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mm.c:892
>> drm_mm_takedown+0x25/0x30
>> [    4.133574] Modules linked in:
>> [    4.136707] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
>> 4.12.0-eywa-46011-g9a19faf #360
>> [    4.144650] Hardware name: Intel Corporation Cannonlake Client
>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>> [    4.158500] task: ffff880264ab8000 task.stack: ffffc90000038000
>> [    4.164506] RIP: 0010:drm_mm_takedown+0x25/0x30
>> [    4.169104] RSP: 0000:ffffc9000003bc28 EFLAGS: 00010292
>> [    4.174409] RAX: 0000000000000029 RBX: ffff880260a54170 RCX:
>> ffffffff82468740
>> [    4.181654] RDX: 0000000000000001 RSI: 0000000000000082 RDI:
>> 00000000ffffffff
>> [    4.188839] RBP: ffffc9000003bc28 R08: 00000000fffffffe R09:
>> 000000000000035a
>> [    4.196028] R10: 0000000000000005 R11: 0000000000000000 R12:
>> ffff880260a50000
>> [    4.203215] R13: ffff880260a54348 R14: ffff880260a50070 R15:
>> ffff880262844a00
>> [    4.210402] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
>> knlGS:0000000000000000
>> [    4.218541] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [    4.224344] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
>> 00000000007406f0
>> [    4.231529] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [    4.238716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>> [    4.245900] PKRU: 00000000
>> [    4.248673] Call Trace:
>> [    4.251193]  i915_gem_cleanup_stolen+0x1f/0x30
>> [    4.255703]  i915_ggtt_cleanup_hw+0xa4/0x170
>> [    4.260035]  i915_driver_cleanup_hw+0x36/0x40
>> [    4.264455]  i915_driver_load+0x6a0/0xe70
>> [    4.268535]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>> [    4.273560]  i915_pci_probe+0x2c/0x50
>> [    4.277293]  local_pci_probe+0x45/0xa0
>> [    4.281106]  ? pci_match_device+0xe0/0x110
>> [    4.285265]  pci_device_probe+0x135/0x150
>> [    4.289343]  driver_probe_device+0x288/0x490
>> [    4.293676]  __driver_attach+0xc9/0xf0
>> [    4.297490]  ? driver_probe_device+0x490/0x490
>> [    4.301999]  bus_for_each_dev+0x5d/0x90
>> [    4.305902]  driver_attach+0x1e/0x20
>> [    4.309543]  bus_add_driver+0x1d0/0x290
>> [    4.313442]  driver_register+0x60/0xe0
>> [    4.317257]  __pci_register_driver+0x5d/0x60
>> [    4.321652]  i915_init+0x59/0x5c
>> [    4.324944]  ? mipi_dsi_bus_init+0x17/0x17
>> [    4.329103]  do_one_initcall+0x42/0x180
>> [    4.333007]  kernel_init_freeable+0x17c/0x202
>> [    4.337426]  ? set_debug_rodata+0x17/0x17
>> [    4.341500]  ? rest_init+0x90/0x90
>> [    4.344969]  kernel_init+0xe/0x110
>> [    4.348438]  ret_from_fork+0x25/0x30
>> [    4.352079] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48
>> 83 c7 38 48 39 c7 75 01 c3 55 48 c7 c7 70 ac 20 82 31 c0 48 89 e5 e8
>> 6b 62 b7 ff <0f> ff 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89
>> e5 41
>> [    4.371029] ---[ end trace 7d36c2dd72851315 ]---
>> [    4.381680] WARN_ON(dev_priv->mm.object_count)
>> [    4.381698] ------------[ cut here ]------------
>> [    4.390921] WARNING: CPU: 0 PID: 1 at
>> drivers/gpu/drm/i915/i915_gem.c:4964 i915_gem_load_cleanup+0x10b/0x120
>> [    4.400797] Modules linked in:
>> [    4.403927] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W
>> 4.12.0-eywa-46011-g9a19faf #360
>> [    4.413021] Hardware name: Intel Corporation Cannonlake Client
>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>> [    4.426884] task: ffff880264ab8000 task.stack: ffffc90000038000
>> [    4.432865] RIP: 0010:i915_gem_load_cleanup+0x10b/0x120
>> [    4.438157] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
>> [    4.443450] RAX: 0000000000000022 RBX: ffff880260a50000 RCX:
>> ffffffff82468740
>> [    4.450642] RDX: 0000000000000001 RSI: 0000000000000082 RDI:
>> 0000000000000202
>> [    4.457839] RBP: ffffc9000003bc68 R08: 0000000000000022 R09:
>> 0000000000000389
>> [    4.465029] R10: 0000000000000000 R11: 0000000000000001 R12:
>> ffff880260a54678
>> [    4.472227] R13: ffff88026446c000 R14: ffff88026446c000 R15:
>> ffff880262844a00
>> [    4.479420] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
>> knlGS:0000000000000000
>> [    4.487564] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [    4.493370] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
>> 00000000007406f0
>> [    4.500569] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [    4.507763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>> [    4.514959] PKRU: 00000000
>> [    4.517737] Call Trace:
>> [    4.520265]  i915_driver_cleanup_early+0x1a/0x50
>> [    4.524955]  i915_driver_load+0x6b8/0xe70
>> [    4.529038]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>> [    4.534100] clocksource: Switched to clocksource tsc
>> [    4.534105]  i915_pci_probe+0x2c/0x50
>> [    4.534113]  local_pci_probe+0x45/0xa0
>> [    4.534118]  ? pci_match_device+0xe0/0x110
>> [    4.534124]  pci_device_probe+0x135/0x150
>> [    4.534131]  driver_probe_device+0x288/0x490
>> [    4.534137]  __driver_attach+0xc9/0xf0
>> [    4.534142]  ? driver_probe_device+0x490/0x490
>> [    4.534146]  bus_for_each_dev+0x5d/0x90
>> [    4.534152]  driver_attach+0x1e/0x20
>> [    4.534156]  bus_add_driver+0x1d0/0x290
>> [    4.534162]  driver_register+0x60/0xe0
>> [    4.534167]  __pci_register_driver+0x5d/0x60
>> [    4.534173]  i915_init+0x59/0x5c
>> [    4.534177]  ? mipi_dsi_bus_init+0x17/0x17
>> [    4.534181]  do_one_initcall+0x42/0x180
>> [    4.534187]  kernel_init_freeable+0x17c/0x202
>> [    4.534191]  ? set_debug_rodata+0x17/0x17
>> [    4.534196]  ? rest_init+0x90/0x90
>> [    4.534200]  kernel_init+0xe/0x110
>> [    4.534204]  ret_from_fork+0x25/0x30
>> [    4.534208] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 21 4f b1 ff 0f
>> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8
>> 05 4f b1 ff <0f> ff e9 33 ff ff ff 66 66 66 66 66 2e 0f 1f 84 00 00 00
>> 00 00
>> [    4.534272] ---[ end trace 7d36c2dd72851316 ]---
>> [    4.534277] WARN_ON(!list_empty(&dev_priv->gt.timelines))
>> [    4.534293] ------------[ cut here ]------------
>> [    4.534298] WARNING: CPU: 0 PID: 1 at
>> drivers/gpu/drm/i915/i915_gem.c:4968 i915_gem_load_cleanup+0xef/0x120
>> [    4.534299] Modules linked in:
>> [    4.534304] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W
>> 4.12.0-eywa-46011-g9a19faf #360
>> [    4.534306] Hardware name: Intel Corporation Cannonlake Client
>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>> [    4.534308] task: ffff880264ab8000 task.stack: ffffc90000038000
>> [    4.534312] RIP: 0010:i915_gem_load_cleanup+0xef/0x120
>> [    4.534314] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
>> [    4.534317] RAX: 000000000000002d RBX: ffff880260a50000 RCX:
>> 0000000000000000
>> [    4.534319] RDX: 0000000000000001 RSI: 0000000000000002 RDI:
>> 0000000000000296
>> [    4.534321] RBP: ffffc9000003bc68 R08: 000000000000002d R09:
>> 000000000000002d
>> [    4.534322] R10: 0000000000000000 R11: ffff880260a4e000 R12:
>> ffff880260a50070
>> [    4.534324] R13: ffff88026446c000 R14: ffff88026446c000 R15:
>> ffff880262844a00
>> [    4.534327] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
>> knlGS:0000000000000000
>> [    4.534329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [    4.534331] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
>> 00000000007406f0
>> [    4.534334] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [    4.534335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>> [    4.534337] PKRU: 00000000
>> [    4.534338] Call Trace:
>> [    4.534344]  i915_driver_cleanup_early+0x1a/0x50
>> [    4.534350]  i915_driver_load+0x6b8/0xe70
>> [    4.534356]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>> [    4.534361]  i915_pci_probe+0x2c/0x50
>> [    4.534366]  local_pci_probe+0x45/0xa0
>> [    4.534371]  ? pci_match_device+0xe0/0x110
>> [    4.534376]  pci_device_probe+0x135/0x150
>> [    4.534382]  driver_probe_device+0x288/0x490
>> [    4.534388]  __driver_attach+0xc9/0xf0
>> [    4.534393]  ? driver_probe_device+0x490/0x490
>> [    4.534398]  bus_for_each_dev+0x5d/0x90
>> [    4.534403]  driver_attach+0x1e/0x20
>> [    4.534408]  bus_add_driver+0x1d0/0x290
>> [    4.534414]  driver_register+0x60/0xe0
>> [    4.534419]  __pci_register_driver+0x5d/0x60
>> [    4.534424]  i915_init+0x59/0x5c
>> [    4.534428]  ? mipi_dsi_bus_init+0x17/0x17
>> [    4.534431]  do_one_initcall+0x42/0x180
>> [    4.534437]  kernel_init_freeable+0x17c/0x202
>> [    4.534440]  ? set_debug_rodata+0x17/0x17
>> [    4.534444]  ? rest_init+0x90/0x90
>> [    4.534448]  kernel_init+0xe/0x110
>> [    4.534451]  ret_from_fork+0x25/0x30
>> [    4.534455] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 3d 4f b1 ff 0f
>> ff e9 5d ff ff ff 48 c7 c6 b0 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8
>> 21 4f b1 ff <0f> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98
>> 1a 82
>> [    4.534519] ---[ end trace 7d36c2dd72851317 ]---
>> [    4.534605]
>> =============================================================================
>> [    4.534608] BUG drm_i915_gem_object (Tainted: G        W      ):
>> Objects remaining in drm_i915_gem_object on __kmem_cache_shutdown()
>> [    4.534609]
>> -----------------------------------------------------------------------------
>>
>> [    4.534611] Disabling lock debugging due to kernel taint
>> [    4.534614] INFO: Slab 0xffffea0009820600 objects=19 used=2
>> fp=0xffff88026081ba80 flags=0x200000000008100
>> [    4.534618] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B   W
>> 4.12.0-eywa-46011-g9a19faf #360
>> [    4.534620] Hardware name: Intel Corporation Cannonlake Client
>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>> [    4.534621] Call Trace:
>> [    4.534626]  dump_stack+0x65/0x89
>> [    4.534633]  slab_err+0xa1/0xb0
>> [    4.534640]  ? __kmalloc+0x185/0x270
>> [    4.534645]  ? kmem_cache_alloc_bulk+0x1f0/0x1f0
>> [    4.534650]  ? __kmem_cache_shutdown+0x160/0x400
>> [    4.534655]  __kmem_cache_shutdown+0x180/0x400
>> [    4.534663]  shutdown_cache+0x18/0x1a0
>> [    4.534667]  kmem_cache_destroy+0x1c1/0x1f0
>> [    4.534672]  i915_gem_load_cleanup+0xb4/0x120
>> [    4.534677]  i915_driver_cleanup_early+0x1a/0x50
>> [    4.534682]  i915_driver_load+0x6b8/0xe70
>> [    4.534689]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>> [    4.534693]  i915_pci_probe+0x2c/0x50
>> [    4.534698]  local_pci_probe+0x45/0xa0
>> [    4.534703]  ? pci_match_device+0xe0/0x110
>> [    4.534708]  pci_device_probe+0x135/0x150
>> [    4.534714]  driver_probe_device+0x288/0x490
>> [    4.534721]  __driver_attach+0xc9/0xf0
>> [    4.534726]  ? driver_probe_device+0x490/0x490
>> [    4.534730]  bus_for_each_dev+0x5d/0x90
>> [    4.534736]  driver_attach+0x1e/0x20
>> [    4.534741]  bus_add_driver+0x1d0/0x290
>> [    4.534746]  driver_register+0x60/0xe0
>> [    4.534751]  __pci_register_driver+0x5d/0x60
>> [    4.534756]  i915_init+0x59/0x5c
>> [    4.534760]  ? mipi_dsi_bus_init+0x17/0x17
>> [    4.534763]  do_one_initcall+0x42/0x180
>> [    4.534769]  kernel_init_freeable+0x17c/0x202
>> [    4.534773]  ? set_debug_rodata+0x17/0x17
>> [    4.534777]  ? rest_init+0x90/0x90
>> [    4.534781]  kernel_init+0xe/0x110
>> [    4.534784]  ret_from_fork+0x25/0x30
>> [    4.534791] INFO: Object 0xffff880260818340 @offset=832
>> [    4.534792] INFO: Object 0xffff880260818680 @offset=1664
>> [    4.534795] kmem_cache_destroy drm_i915_gem_object: Slab cache
>> still has objects
>> [    4.534798] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B   W
>> 4.12.0-eywa-46011-g9a19faf #360
>> [    4.534800] Hardware name: Intel Corporation Cannonlake Client
>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>> [    4.534801] Call Trace:
>> [    4.534805]  dump_stack+0x65/0x89
>> [    4.534809]  kmem_cache_destroy+0x1e1/0x1f0
>> [    4.534814]  i915_gem_load_cleanup+0xb4/0x120
>> [    4.534819]  i915_driver_cleanup_early+0x1a/0x50
>> [    4.534824]  i915_driver_load+0x6b8/0xe70
>> [    4.534830]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>> [    4.534835]  i915_pci_probe+0x2c/0x50
>> [    4.534840]  local_pci_probe+0x45/0xa0
>> [    4.534844]  ? pci_match_device+0xe0/0x110
>> [    4.534850]  pci_device_probe+0x135/0x150
>> [    4.534856]  driver_probe_device+0x288/0x490
>> [    4.534862]  __driver_attach+0xc9/0xf0
>> [    4.534867]  ? driver_probe_device+0x490/0x490
>> [    4.534871]  bus_for_each_dev+0x5d/0x90
>> [    4.534877]  driver_attach+0x1e/0x20
>> [    4.534882]  bus_add_driver+0x1d0/0x290
>> [    4.534888]  driver_register+0x60/0xe0
>> [    4.534893]  __pci_register_driver+0x5d/0x60
>> [    4.534897]  i915_init+0x59/0x5c
>> [    4.534901]  ? mipi_dsi_bus_init+0x17/0x17
>> [    4.534904]  do_one_initcall+0x42/0x180
>> [    4.534910]  kernel_init_freeable+0x17c/0x202
>> [    4.534914]  ? set_debug_rodata+0x17/0x17
>> [    4.534917]  ? rest_init+0x90/0x90
>> [    4.534922]  kernel_init+0xe/0x110
>> [    4.534925]  ret_from_fork+0x25/0x30
>> [    4.535386] i915 0000:00:02.0: [drm:i915_driver_load] Device
>> initialization failed (-22)
>> [    4.535390] i915 0000:00:02.0: Please file a bug at
>> https://bugs.freedesktop.org/enter_bug.cgi?product=DRI against
>> DRM/Intel providing the dmesg log by booting with drm.debug=0xf
>> [    4.535450] i915: probe of 0000:00:02.0 failed with error -22
>>
>>
>> On Fri, Apr 28, 2017 at 7:36 AM, Oscar Mateo <oscar.mateo@intel.com>
>> wrote:
>>>
>>> This batchbuffer is over 4096 bytes, so we need to increase the size of the
>>> array (and the KMD has to be modified to deal with more than one page).
>>>
>>> Notice that there are two workarounds embedded here, both applicable to all
>>> CNL steppings.
>>>
>>> v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so update
>>>      the comment in the code and in the commit message.
>>>
>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>>> Cc: Ben Widawsky <ben@bwidawsk.net>
>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>>> ---
>>>   lib/gen10_render.h                             |  63 +++
>>>   tools/null_state_gen/Makefile.am               |   3 +-
>>>   tools/null_state_gen/intel_batchbuffer.h       |   2 +-
>>>   tools/null_state_gen/intel_null_state_gen.c    |   5 +-
>>>   tools/null_state_gen/intel_renderstate.h       |   1 +
>>>   tools/null_state_gen/intel_renderstate_gen10.c | 538
>>> +++++++++++++++++++++++++
>>>   6 files changed, 609 insertions(+), 3 deletions(-)
>>>   create mode 100644 lib/gen10_render.h
>>>   create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c
>>>
>>> diff --git a/lib/gen10_render.h b/lib/gen10_render.h
>>> new file mode 100644
>>> index 0000000..f4a7dff
>>> --- /dev/null
>>> +++ b/lib/gen10_render.h
>>> @@ -0,0 +1,63 @@
>>> +#ifndef GEN10_RENDER_H
>>> +#define GEN10_RENDER_H
>>> +
>>> +#include "gen9_render.h"
>>> +
>>> +#define GEN7_MI_RS_CONTROL                     (0x6 << 23)
>>> +# define GEN7_MI_RS_CONTROL_ENABLE             (1 << 0)
>>> +
>>> +#define GEN10_3DSTATE_GATHER_POOL_ALLOC                GEN6_3D(3, 1, 0x1a)
>>> +# define GEN10_3DSTATE_GATHER_POOL_ENABLE      (1 << 11)
>>> +
>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_VS       GEN6_3D(3, 0, 0x34)
>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_HS       GEN6_3D(3, 0, 0x36)
>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_DS       GEN6_3D(3, 0, 0x37)
>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_GS       GEN6_3D(3, 0, 0x35)
>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_PS       GEN6_3D(3, 0, 0x38)
>>> +
>>> +#define GEN10_3DSTATE_WM_DEPTH_STENCIL         GEN6_3D(3, 0, 0x4e)
>>> +#define GEN10_3DSTATE_WM_CHROMAKEY             GEN6_3D(3, 0, 0x4c)
>>> +
>>> +#define GEN8_REG_L3_CACHE_CONFIG       0x7034
>>> +
>>> +/*
>>> + * Programming for L3 cache allocations can be made per bank. Based on the
>>> + * programmed value HW will apply same allocations on other available banks.
>>> + * Total L3 Cache size per bank = 256 KB.
>>> + * {SLM,    URB,     DC,      RO(I/S, C, T),   L3 Client Pool}
>>> + * {  0,    96,      32,      128,                 0      }
>>> + */
>>> +#define GEN10_L3_CACHE_CONFIG_VALUE    0x00420060
>>> +
>>> +#define URB_ALIGN(val, align)  ((val % align) ? (val - (val % align)) : val)
>>> +
>>> +#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES                64
>>> +#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES                2752
>>> +
>>> +#define GEN10_KB_PER_URB_INDEX                 8
>>> +#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB       96
>>> +
>>> +#define GEN10_URB_RESERVED_SIZE_KB             32
>>> +#define GEN10_URB_RESERVED_END_SIZE_KB         8
>>> +
>>> +#define GEN10_VS_NUM_BITS_PER_URB_UNIT         512
>>> +#define GEN10_VS_NUM_OF_URB_UNITS              1 // zero based
>>> +#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS        (GEN10_VS_NUM_BITS_PER_URB_UNIT * \
>>> +                                               (GEN10_VS_NUM_OF_URB_UNITS + 1))
>>> +
>>> +#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX)
>>> +
>>> +#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count)        \
>>> +       URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX)
>>> +
>>> +#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice)       \
>>> +       (total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB)
>>> +
>>> +#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice)   \
>>> +       ((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) *    \
>>> +       1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS)
>>> +
>>> +#define GEN10_VS_END_URB_INDEX(urb_size_per_slice)                     \
>>> +       ((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX)
>>> +
>>> +#endif
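
To make the URB macros above easier to check by hand, here is a rough sketch that evaluates the same arithmetic as plain functions. The constants are copied from the quoted header; the struct and function names are hypothetical and only exist for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted lib/gen10_render.h hunk. */
#define KB_PER_URB_INDEX      8u
#define L3_URB_KB_PER_BANK    96u
#define URB_RESERVED_KB       32u
#define URB_RESERVED_END_KB   8u
#define VS_BITS_PER_URB_UNIT  512u
#define VS_NUM_OF_URB_UNITS   1u    /* zero based: 1 means two units */
#define VS_MIN_URB_ENTRIES    64u
#define VS_MAX_URB_ENTRIES    2752u

/* Hypothetical container for the derived values (not part of the patch). */
struct sketch_urb_layout {
	uint32_t size_per_slice_kb;
	uint32_t vs_start_index;
	uint32_t vs_end_index;
	uint32_t vs_entries;
};

static struct sketch_urb_layout sketch_compute_urb(uint32_t l3_banks,
						   uint32_t slices)
{
	struct sketch_urb_layout l;
	uint32_t raw_kb, vs_kb, entry_bits;

	assert(slices > 0);

	/* URB_ALIGN() rounds down to a multiple of the index granularity. */
	raw_kb = L3_URB_KB_PER_BANK * l3_banks / slices;
	l.size_per_slice_kb = raw_kb - (raw_kb % KB_PER_URB_INDEX);
	assert(l.size_per_slice_kb > URB_RESERVED_KB + URB_RESERVED_END_KB);

	/* The VS gets what is left after the two reserved regions. */
	vs_kb = l.size_per_slice_kb - URB_RESERVED_KB - URB_RESERVED_END_KB;
	entry_bits = VS_BITS_PER_URB_UNIT * (VS_NUM_OF_URB_UNITS + 1);
	l.vs_entries = vs_kb * 1024 * 8 / entry_bits;

	/* Same clamping as gen10_emit_urb() in the quoted patch. */
	if (l.vs_entries < VS_MIN_URB_ENTRIES)
		l.vs_entries = VS_MIN_URB_ENTRIES;
	if (l.vs_entries > VS_MAX_URB_ENTRIES)
		l.vs_entries = VS_MAX_URB_ENTRIES;

	l.vs_start_index = URB_RESERVED_KB / KB_PER_URB_INDEX;
	l.vs_end_index = (l.size_per_slice_kb - URB_RESERVED_END_KB) /
			 KB_PER_URB_INDEX;
	return l;
}
```

With the 3-bank, 1-slice configuration used by gen10_emit_urb() this works out to 288 KB per slice, a VS start index of 4, an end index of 35 and 1984 VS entries, which sits inside the [64, 2752] clamp.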
>>> diff --git a/tools/null_state_gen/Makefile.am
>>> b/tools/null_state_gen/Makefile.am
>>> index 24884a7..2f90990 100644
>>> --- a/tools/null_state_gen/Makefile.am
>>> +++ b/tools/null_state_gen/Makefile.am
>>> @@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES =       \
>>>          intel_renderstate_gen7.c \
>>>          intel_renderstate_gen8.c \
>>>          intel_renderstate_gen9.c \
>>> +       intel_renderstate_gen10.c \
>>>          intel_null_state_gen.c
>>>
>>> -gens := 6 7 8 9
>>> +gens := 6 7 8 9 10
>>>
>>>   h = /tmp/intel_renderstate_gen$$gen.c
>>>   states: intel_null_state_gen
>>> diff --git a/tools/null_state_gen/intel_batchbuffer.h
>>> b/tools/null_state_gen/intel_batchbuffer.h
>>> index 771d1c8..e40e01b 100644
>>> --- a/tools/null_state_gen/intel_batchbuffer.h
>>> +++ b/tools/null_state_gen/intel_batchbuffer.h
>>> @@ -34,7 +34,7 @@
>>>   #include <stdint.h>
>>>
>>>   #define MAX_RELOCS 64
>>> -#define MAX_ITEMS 1024
>>> +#define MAX_ITEMS 2048
>>>   #define MAX_STRLEN 256
>>>
>>>   #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1))
>>> diff --git a/tools/null_state_gen/intel_null_state_gen.c
>>> b/tools/null_state_gen/intel_null_state_gen.c
>>> index 06eb954..4f12f5f 100644
>>> --- a/tools/null_state_gen/intel_null_state_gen.c
>>> +++ b/tools/null_state_gen/intel_null_state_gen.c
>>> @@ -41,7 +41,7 @@ static int debug = 0;
>>>   static void print_usage(char *s)
>>>   {
>>>          fprintf(stderr, "%s: <gen>\n"
>>> -               "     gen:     gen to generate for (6,7,8,9)\n",
>>> +               "     gen:     gen to generate for (6,7,8,9,10)\n",
>>>                  s);
>>>   }
>>>
>>> @@ -173,6 +173,9 @@ static int do_generate(int gen)
>>>          case 9:
>>>                  null_state_gen = gen9_setup_null_render_state;
>>>                  break;
>>> +       case 10:
>>> +               null_state_gen = gen10_setup_null_render_state;
>>> +               break;
>>>          }
>>>
>>>          if (null_state_gen == NULL) {
>>> diff --git a/tools/null_state_gen/intel_renderstate.h
>>> b/tools/null_state_gen/intel_renderstate.h
>>> index b27b434..b3c8c2b 100644
>>> --- a/tools/null_state_gen/intel_renderstate.h
>>> +++ b/tools/null_state_gen/intel_renderstate.h
>>> @@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct
>>> intel_batchbuffer *batch);
>>>   void gen7_setup_null_render_state(struct intel_batchbuffer *batch);
>>>   void gen8_setup_null_render_state(struct intel_batchbuffer *batch);
>>>   void gen9_setup_null_render_state(struct intel_batchbuffer *batch);
>>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch);
>>>
>>>   #endif /* __INTEL_RENDERSTATE_H__ */
>>> diff --git a/tools/null_state_gen/intel_renderstate_gen10.c
>>> b/tools/null_state_gen/intel_renderstate_gen10.c
>>> new file mode 100644
>>> index 0000000..f5678c3
>>> --- /dev/null
>>> +++ b/tools/null_state_gen/intel_renderstate_gen10.c
>>> @@ -0,0 +1,538 @@
>>> +/*
>>> + * Copyright © 2014 Intel Corporation
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>> + * copy of this software and associated documentation files (the "Software"),
>>> + * to deal in the Software without restriction, including without limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>> + * Software is furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice (including the next
>>> + * paragraph) shall be included in all copies or substantial portions of the
>>> + * Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>>> + * DEALINGS IN THE SOFTWARE.
>>> + *
>>> + * Authors:
>>> + *     Oscar Mateo <oscar.mateo@intel.com>
>>> + */
>>> +
>>> +#include "intel_renderstate.h"
>>> +#include <lib/gen10_render.h>
>>> +#include <lib/intel_reg.h>
>>> +
>>> +static void gen8_emit_wm(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
>>> +       OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION);
>>> +}
>>> +
>>> +static void gen8_emit_ps(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2));
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0); /* kernel hi */
>>> +       OUT_BATCH(GEN7_PS_SPF_MODE);
>>> +       OUT_BATCH(0); /* scratch space stuff */
>>> +       OUT_BATCH(0); /* scratch hi */
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0); // kernel 1
>>> +       OUT_BATCH(0); /* kernel 1 hi */
>>> +       OUT_BATCH(0); // kernel 2
>>> +       OUT_BATCH(0); /* kernel 2 hi */
>>> +}
>>> +
>>> +static void gen8_emit_sf(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT |
>>> +                 1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT |
>>> +                 GEN7_SF_POINT_WIDTH_FROM_SOURCE |
>>> +                 8);
>>> +}
>>> +
>>> +static void gen8_emit_vs(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2));
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +}
>>> +
>>> +static void gen8_emit_hs(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2));
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT);
>>> +       OUT_BATCH(0);
>>> +}
>>> +
>>> +static void gen8_emit_raster(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
>>> +       OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW);
>>> +       OUT_BATCH(0.0);
>>> +       OUT_BATCH(0.0);
>>> +       OUT_BATCH(0.0);
>>> +}
>>> +
>>> +static void gen10_emit_urb(struct intel_batchbuffer *batch)
>>> +{
>>> +       /* Smallest SKU: 3x8*/
>>> +       int l3_bank_count = 3;
>>> +       int slice_count = 1;
>>> +       int urb_size_per_slice = GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count);
>>> +       int other_urb_start_addr = GEN10_VS_END_URB_INDEX(urb_size_per_slice);
>>> +       const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX;
>>> +       const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS;
>>> +       int vs_urb_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice);
>>> +
>>> +       if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES)
>>> +               vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES;
>>> +       if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES)
>>> +               vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES;
>>> +
>>> +       OUT_BATCH(GEN7_3DSTATE_URB_VS);
>>> +       OUT_BATCH(vs_urb_entries |
>>> +                (vs_urb_alloc_size << 16) |
>>> +                (vs_urb_start_addr << 25));
>>> +
>>> +       OUT_BATCH(GEN7_3DSTATE_URB_HS);
>>> +       OUT_BATCH(other_urb_start_addr << 25);
>>> +
>>> +       OUT_BATCH(GEN7_3DSTATE_URB_DS);
>>> +       OUT_BATCH(other_urb_start_addr << 25);
>>> +
>>> +       OUT_BATCH(GEN7_3DSTATE_URB_GS);
>>> +       OUT_BATCH(other_urb_start_addr << 25);
>>> +}
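
As a quick sanity check on the URB dword built in gen10_emit_urb(), here is a small sketch of the same packing. The shifts (16 and 25) come from the quoted code; the field widths asserted below are only inferred from those shifts and are an assumption, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the second dword of a 3DSTATE_URB_* command as the patch does. */
static uint32_t sketch_pack_urb_dw1(uint32_t entries, uint32_t alloc_size,
				    uint32_t start_addr)
{
	/* Assumed widths: 16 bits, 9 bits and 7 bits respectively. */
	assert(entries < (1u << 16));
	assert(alloc_size < (1u << 9));
	assert(start_addr < (1u << 7));

	return entries | (alloc_size << 16) | (start_addr << 25);
}
```

For the smallest-SKU values (1984 entries, allocation size 1, start index 4) the VS dword comes out as 0x080107C0, and the HS/DS/GS dwords with start index 35 come out as 0x46000000.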
>>> +
>>> +static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
>>> +       OUT_BATCH(_3DPRIM_TRILIST);
>>> +}
>>> +
>>> +static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch)
>>> +{
>>> +       const int num_decls = 128;
>>> +       int i;
>>> +
>>> +       OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST |
>>> +               (((2 * num_decls) + 3) - 2) /* DWORD count - 2 */);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(num_decls);
>>> +
>>> +       for (i = 0; i < num_decls; i++) {
>>> +               OUT_BATCH(0);
>>> +               OUT_BATCH(0);
>>> +       }
>>> +}
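
The "DWORD count - 2" convention used in these headers can be sketched as below. The helper names are hypothetical; the 3 fixed dwords plus 2 per declaration simply mirror what gen8_emit_so_decl_list() emits in the quoted code.

```c
#include <assert.h>
#include <stdint.h>

/* Length field for a command of total_dwords, biased by 2 as in the patch. */
static uint32_t sketch_len_field(uint32_t total_dwords)
{
	assert(total_dwords >= 2);
	return total_dwords - 2;
}

/* Total dwords of 3DSTATE_SO_DECL_LIST: 3 fixed plus 2 per declaration. */
static uint32_t sketch_so_decl_list_dwords(uint32_t num_decls)
{
	return 2 * num_decls + 3;
}
```

With 128 declarations the command takes 259 dwords and its length field is 257, so this one command already accounts for about a quarter of a 1024-entry batch. The same bias gives 4 for the 6-dword PIPE_CONTROL used for WaPSRandomCSNotDone.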
>>> +
>>> +static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index)
>>> +{
>>> +       OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2));
>>> +       OUT_BATCH(index << 29);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +}
>>> +
>>> +static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index)
>>> +{
>>> +       OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2));
>>> +       OUT_BATCH(index << 30);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +}
>>> +
>>> +static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch)
>>> +{
>>> +       const int buffers = 33;
>>> +       int i;
>>> +
>>> +       OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS |
>>> +               (((4 * buffers) + 1)- 2) /* DWORD count - 2 */);
>>> +
>>> +       for (i = 0; i < buffers; i++) {
>>> +               OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT |
>>> +                         GEN7_VB0_BUFFER_ADDR_MOD_EN);
>>> +               OUT_BATCH(0); /* Address */
>>> +               OUT_BATCH(0);
>>> +               OUT_BATCH(0);
>>> +       }
>>> +}
>>> +
>>> +static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch)
>>> +{
>>> +       const int elements = 34;
>>> +       int i;
>>> +
>>> +       OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS |
>>> +               (((2 * elements) + 1) - 2) /* DWORD count - 2 */);
>>> +
>>> +       /* Element 0 */
>>> +       OUT_BATCH(VE0_VALID);
>>> +       OUT_BATCH(
>>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
>>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
>>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
>>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
>>> +       /* Elements 1 -> 33 */
>>> +       for (i = 1; i < elements; i++) {
>>> +               OUT_BATCH(0);
>>> +               OUT_BATCH(0);
>>> +       }
>>> +}
>>> +
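
A note for anyone reading along: the variable-length commands above all put "total DWORDs minus 2" into the length field of the header. Below is a minimal sketch of that arithmetic, using the counts from the quoted code. The helper names (cmd_len_field and friends) are made up for illustration and are not part of igt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: length field for a command that is total_dwords
 * long, following the "DWORD count - 2" convention in the quoted patch. */
static uint32_t cmd_len_field(uint32_t total_dwords)
{
	assert(total_dwords >= 2);
	return total_dwords - 2;
}

/* 3DSTATE_SO_DECL_LIST as emitted above: 3 leading DWORDs + 2 per decl */
static uint32_t so_decl_list_len(uint32_t num_decls)
{
	return cmd_len_field(2 * num_decls + 3);
}

/* 3DSTATE_VERTEX_BUFFERS as emitted above: 1 header DWORD + 4 per buffer */
static uint32_t vertex_buffers_len(uint32_t buffers)
{
	return cmd_len_field(4 * buffers + 1);
}

/* 3DSTATE_VERTEX_ELEMENTS as emitted above: 1 header DWORD + 2 per element */
static uint32_t vertex_elements_len(uint32_t elements)
{
	return cmd_len_field(2 * elements + 1);
}
```

With the values used in the patch (128 declarations, 33 buffers, 34 elements) the length fields come out as 257, 131 and 67.
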
>>> +static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch)
>>> +{
>>> +       union {
>>> +               float fval;
>>> +               uint32_t uval;
>>> +       } u;
>>> +
>>> +       unsigned offset;
>>> +
>>> +       u.fval = 1.0f;
>>> +
>>> +       offset = intel_batch_state_offset(batch, 64);
>>> +       OUT_STATE(0);
>>> +       OUT_STATE(0);      /* Alpha reference value */
>>> +       OUT_STATE(u.uval); /* Blend constant color RED */
>>> +       OUT_STATE(u.uval); /* Blend constant color GREEN */
>>> +       OUT_STATE(u.uval); /* Blend constant color BLUE */
>>> +       OUT_STATE(u.uval); /* Blend constant color ALPHA */
>>> +
>>> +       OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
>>> +       OUT_BATCH_STATE_OFFSET(offset | 1);
>>> +}
>>> +
>>> +static void gen8_emit_blend_state_pointers(struct intel_batchbuffer
>>> *batch)
>>> +{
>>> +       unsigned offset;
>>> +       int i;
>>> +
>>> +       offset = intel_batch_state_offset(batch, 64);
>>> +
>>> +       for (i = 0; i < 17; i++)
>>> +               OUT_STATE(0);
>>> +
>>> +       OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
>>> +       OUT_BATCH_STATE_OFFSET(offset | 1);
>>> +}
>>> +
>>> +static void gen8_emit_ps_extra(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
>>> +       OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID |
>>> +                 GEN8_PSX_ATTRIBUTE_ENABLE);
>>> +
>>> +}
>>> +
>>> +static void gen8_emit_ps_blend(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
>>> +       OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
>>> +}
>>> +
>>> +static void gen8_emit_viewport_state_pointers_cc(struct
>>> intel_batchbuffer *batch)
>>> +{
>>> +       unsigned offset;
>>> +
>>> +       offset = intel_batch_state_offset(batch, 32);
>>> +
>>> +       OUT_STATE((uint32_t)0.0f); /* Minimum depth */
>>> +       OUT_STATE((uint32_t)0.0f); /* Maximum depth */
>>> +
>>> +       OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
>>> +       OUT_BATCH_STATE_OFFSET(offset);
>>> +}
>>> +
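
One detail worth spelling out here: (uint32_t)0.0f converts the value, it does not reinterpret the bits. That is harmless for 0.0f because both give 0, but it would be wrong for any other depth value. The sketch below shows the difference, assuming IEEE-754 single-precision floats. The function names are hypothetical and only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of f, which is what a float field in the
 * hardware state would need (assuming IEEE-754 single precision). */
static uint32_t float_bits(float f)
{
	uint32_t u;

	assert(sizeof(u) == sizeof(f));
	memcpy(&u, &f, sizeof(u));
	return u;
}

/* A value conversion, like the (uint32_t)0.0f casts in the quoted code. */
static uint32_t float_cast(float f)
{
	return (uint32_t)f;
}
```

For 1.0f the bit pattern is 0x3f800000 while the cast yields 1, so the union used in gen8_emit_cc_state_pointers() (or a memcpy as above) is the safer pattern if the viewport depths ever become non-zero.
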
>>> +static void gen8_emit_viewport_state_pointers_sf_clip(struct
>>> intel_batchbuffer *batch)
>>> +{
>>> +       unsigned offset;
>>> +       int i;
>>> +
>>> +       offset = intel_batch_state_offset(batch, 64);
>>> +
>>> +       for (i = 0; i < 16; i++)
>>> +               OUT_STATE(0);
>>> +
>>> +       OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 -
>>> 2));
>>> +       OUT_BATCH_STATE_OFFSET(offset);
>>> +}
>>> +
>>> +static void gen8_emit_primitive(struct intel_batchbuffer *batch)
>>> +{
>>> +       OUT_BATCH(GEN6_3DPRIMITIVE | (10-2));
>>> +       OUT_BATCH(4);   /* gen8+ ignore the topology type field */
>>> +       OUT_BATCH(1);   /* vertex count */
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(1);   /* single instance */
>>> +       OUT_BATCH(0);   /* start instance location */
>>> +       OUT_BATCH(0);   /* index buffer offset, ignored */
>>> +       OUT_BATCH(0);   /* extended parameter 0 */
>>> +       OUT_BATCH(0);   /* extended parameter 1 */
>>> +       OUT_BATCH(0);   /* extended parameter 2 */
>>> +}
>>> +
>>> +static void gen9_emit_state_base_address(struct intel_batchbuffer
>>> *batch) {
>>> +       const unsigned offset = 0;
>>> +       OUT_BATCH(GEN6_STATE_BASE_ADDRESS |
>>> +               (22 - 2) /* DWORD count - 2 */);
>>> +
>>> +       /* general state base address - requires BB address
>>> +        * added to state offset to be stored in this location
>>> +        */
>>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>>> +       OUT_BATCH(0);
>>> +
>>> +       /* stateless data port */
>>> +       OUT_BATCH(0);
>>> +
>>> +       /* surface state base address - requires BB address
>>> +        * added to state offset to be stored in this location
>>> +        */
>>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>>> +       OUT_BATCH(0);
>>> +
>>> +       /* dynamic state base address - requires BB address
>>> +        * added to state offset to be stored in this location
>>> +        */
>>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>>> +       OUT_BATCH(0);
>>> +
>>> +       /* indirect state base address */
>>> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
>>> +       OUT_BATCH(0);
>>> +
>>> +       /* instruction state base address - requires BB address
>>> +        * added to state offset to be stored in this location
>>> +        */
>>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>>> +       OUT_BATCH(0);
>>> +
>>> +       /* general state buffer size */
>>> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
>>> +       /* dynamic state buffer size */
>>> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
>>> +       /* indirect object buffer size */
>>> +       OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY);
>>> +       /* instruction buffer size */
>>> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
>>> +
>>> +       /* bindless surface state base address */
>>> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
>>> +       OUT_BATCH(0);
>>> +       /* bindless surface state size */
>>> +       OUT_BATCH(0);
>>> +
>>> +       /* bindless sampler state base address */
>>> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
>>> +       OUT_BATCH(0);
>>> +       /* bindless sampler state size */
>>> +       OUT_BATCH(0);
>>> +}
>>> +
>>> +/*
>>> + * Generate the batch buffer commands needed to initialize the 3D engine
>>> + * to its "golden state".
>>> + */
>>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch)
>>> +{
>>> +       int i;
>>> +
>>> +       /* WaRsGatherPoolEnable: cnl */
>>> +       OUT_BATCH(GEN7_MI_RS_CONTROL);
>>> +
>>> +#define GEN8_PIPE_CONTROL_GLOBAL_GTT   (1 << 24)
>>> +       /* PIPE_CONTROL */
>>> +       OUT_BATCH(GEN6_PIPE_CONTROL |
>>> +                (6 - 2));      /* DWORD count - 2 */
>>> +       OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +
>>> +       /* PIPELINE_SELECT */
>>> +       OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D);
>>> +
>>> +       OUT_BATCH(MI_LOAD_REGISTER_IMM);
>>> +       OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG);
>>> +       OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE);
>>> +
>>> +       gen8_emit_wm(batch);
>>> +       gen8_emit_ps(batch);
>>> +       gen8_emit_sf(batch);
>>> +
>>> +       OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */
>>> +       OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11);
>>> +
>>> +       gen8_emit_vs(batch);
>>> +       gen8_emit_hs(batch);
>>> +
>>> +       OUT_CMD(GEN7_3DSTATE_GS, 10);
>>> +       OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5);
>>> +       OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */
>>> +       OUT_CMD(GEN6_3DSTATE_CLIP, 4);
>>> +       OUT_CMD(GEN7_3DSTATE_TE, 4);
>>> +       OUT_CMD(GEN8_3DSTATE_VF, 2);
>>> +       OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5);
>>> +
>>> +       /* URB States */
>>> +       gen10_emit_urb(batch);
>>> +
>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130);
>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130);
>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130);
>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130);
>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130);
>>> +
>>> +       OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4);
>>> +       OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4);
>>> +       OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4);
>>> +
>>> +       /* Push Constants */
>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2);
>>> +
>>> +       /* Constants */
>>> +       OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11);
>>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11);
>>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11);
>>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11);
>>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11);
>>> +
>>> +       OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3);
>>> +       OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2);
>>> +       gen8_emit_vf_topology(batch);
>>> +
>>> +       /* Streamer out declaration list */
>>> +       gen8_emit_so_decl_list(batch);
>>> +
>>> +       /* Streamer out buffers */
>>> +       for (i = 0; i < 4; i++) {
>>> +               gen8_emit_so_buffer(batch, i);
>>> +       }
>>> +
>>> +       /* State base addresses */
>>> +       gen9_emit_state_base_address(batch);
>>> +
>>> +       OUT_CMD(GEN6_STATE_SIP, 3);
>>> +       OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4);
>>> +       OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8);
>>> +
>>> +       /* Chroma key */
>>> +       for (i = 0; i < 4; i++) {
>>> +               gen8_emit_chroma_key(batch, i);
>>> +       }
>>> +
>>> +       OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3);
>>> +       OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3);
>>> +       OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5);
>>> +       OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5);
>>> +       OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3);
>>> +       OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2);
>>> +
>>> +       /* WaPSRandomCSNotDone:cnl */
>>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
>>> +       OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
>>> +       OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +
>>> +       OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2);
>>> +       OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2);
>>> +       OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32);
>>> +       OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16);
>>> +       OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16);
>>> +       OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5);
>>> +
>>> +       /* Vertex buffers */
>>> +       gen8_emit_vertex_buffers(batch);
>>> +       gen8_emit_vertex_elements(batch);
>>> +
>>> +       OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */);
>>> +
>>> +       /* 3D state binding table pointers */
>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2);
>>> +
>>> +       gen8_emit_cc_state_pointers(batch);
>>> +       gen8_emit_blend_state_pointers(batch);
>>> +       gen8_emit_ps_extra(batch);
>>> +       gen8_emit_ps_blend(batch);
>>> +
>>> +       /* 3D state sampler state pointers */
>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2);
>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2);
>>> +
>>> +       OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2);
>>> +
>>> +       gen8_emit_viewport_state_pointers_cc(batch);
>>> +       gen8_emit_viewport_state_pointers_sf_clip(batch);
>>> +
>>> +       /* WaPSRandomCSNotDone:cnl */
>>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
>>> +       OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
>>> +       OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0);
>>> +
>>> +       gen8_emit_raster(batch);
>>> +
>>> +       OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4);
>>> +       OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2);
>>> +
>>> +       /* Launch 3D operation */
>>> +       gen8_emit_primitive(batch);
>>> +
>>> +       /* WaRsGatherPoolEnable: cnl */
>>> +       OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE);
>>> +       OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2));
>>> +       OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE);
>>> +       OUT_BATCH(0);
>>> +       OUT_BATCH(0xfffff << 12);
>>> +       OUT_BATCH(GEN7_MI_RS_CONTROL);
>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4);
>>> +
>>> +       OUT_BATCH(MI_BATCH_BUFFER_END);
>>> +}
>>> --
>>> 1.9.1
>>>
>>> _______________________________________________
>>> Intel-gfx mailing list
>>> Intel-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>
>>
>>
>



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tools/null_state_gen: Add GEN10 golden context batch buffer creation
  2017-07-12 21:03         ` Rodrigo Vivi
@ 2017-07-13 22:30           ` Rodrigo Vivi
  0 siblings, 0 replies; 7+ messages in thread
From: Rodrigo Vivi @ 2017-07-13 22:30 UTC (permalink / raw)
  To: Oscar Mateo; +Cc: intel-gfx, Ben Widawsky, Mika Kuoppala

So, you were right: with your patches (v2 of that igt one) and
"drm/i915: Allow null render state batchbuffers bigger than one page"
in place, everything works...

However, it seems that patch is kind of NACKed for now... We need to
get a solution there first, before continuing with these patches here...
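
To make the size problem concrete, here is a rough sketch of the arithmetic. It assumes a 4096-byte page (as in the commit message) and 4 bytes per DWORD. SKETCH_PAGE_SIZE, batch_bytes() and batch_pages() are hypothetical names, not anything in igt or i915.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, as in the commit message */

/* Size in bytes of a batch made of n_dwords 32-bit entries. */
static size_t batch_bytes(size_t n_dwords)
{
	return n_dwords * sizeof(uint32_t);
}

/* Number of pages the batch needs, rounding up. */
static size_t batch_pages(size_t n_dwords)
{
	assert(n_dwords > 0);
	return (batch_bytes(n_dwords) + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

One page holds 1024 DWORDs. If OUT_CMD(cmd, 130) emits 130 DWORDs (my assumption about the macro), the five GATHER_CONSTANT commands alone take 650 DWORDs, i.e. 2600 bytes, so it is not surprising that the whole gen10 batch ends up past one page.
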

On Wed, Jul 12, 2017 at 2:03 PM, Rodrigo Vivi <rodrigo.vivi@gmail.com> wrote:
> On Wed, Jul 12, 2017 at 1:42 PM, Oscar Mateo <oscar.mateo@intel.com> wrote:
>>
>>
>> On 07/05/2017 05:50 PM, Rodrigo Vivi wrote:
>>>
>>> Hi Oscar,
>>
>>
>> Hey!
>>
>>> I had missed this patch here, but noticed now that I was refreshing
>>> and testing more cnl tests before re-submitting them.
>>>
>>> First of all I believe we need to remove the A0 w/a. I don't believe
>>> we will ever see one. So I'm removing all A0 exclusive W/a from the
>>> patches as well.
>>
>>
>> Be careful: I think both WAs in the patch are for all steppings (one was
>> incorrectly marked as A0 only in v1 of this patch).
>
> ah cool, so v2 is right...
>
>>
>>> I also gave a try here on your null state. However if I use the golden
>>> state generated by this version I get a blank screen because driver
>>> load fails with some strange faults:
>>
>>
>> Good. I don't have a CNL so it was only compile-tested.
>>
>>> any idea?
>>
>>
>> Did you also include the i915 patch to allow golden BBs over one page in
>> size? I sent it separately as "drm/i915: Allow null render state
>> batchbuffers bigger than one page". BTW: this patch was given a cold
>> shoulder in the mailing list, since I could not re-justify why null state
>> was needed in the first place (since UMD needs to configure the 3D pipeline
>> first thing anyway). I am still trying to get a better explanation from HW
>> people.
>
> hmmmm no... I missed that patch... sorry...
>
> I'm currently without access to CNL, but as soon as I have I will test
> it and if that works I will just merge igt one, review your kernel
> one, etc...
>
>>
>> -- Oscar
>>
>>> [    4.115243] Memory manager not clean during takedown.
>>>
>>> [    4.120389] ------------[ cut here ]------------
>>> [    4.125068] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mm.c:892
>>> drm_mm_takedown+0x25/0x30
>>> [    4.133574] Modules linked in:
>>> [    4.136707] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
>>> 4.12.0-eywa-46011-g9a19faf #360
>>> [    4.144650] Hardware name: Intel Corporation Cannonlake Client
>>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>>> [    4.158500] task: ffff880264ab8000 task.stack: ffffc90000038000
>>> [    4.164506] RIP: 0010:drm_mm_takedown+0x25/0x30
>>> [    4.169104] RSP: 0000:ffffc9000003bc28 EFLAGS: 00010292
>>> [    4.174409] RAX: 0000000000000029 RBX: ffff880260a54170 RCX:
>>> ffffffff82468740
>>> [    4.181654] RDX: 0000000000000001 RSI: 0000000000000082 RDI:
>>> 00000000ffffffff
>>> [    4.188839] RBP: ffffc9000003bc28 R08: 00000000fffffffe R09:
>>> 000000000000035a
>>> [    4.196028] R10: 0000000000000005 R11: 0000000000000000 R12:
>>> ffff880260a50000
>>> [    4.203215] R13: ffff880260a54348 R14: ffff880260a50070 R15:
>>> ffff880262844a00
>>> [    4.210402] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
>>> knlGS:0000000000000000
>>> [    4.218541] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    4.224344] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
>>> 00000000007406f0
>>> [    4.231529] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>> 0000000000000000
>>> [    4.238716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>> 0000000000000400
>>> [    4.245900] PKRU: 00000000
>>> [    4.248673] Call Trace:
>>> [    4.251193]  i915_gem_cleanup_stolen+0x1f/0x30
>>> [    4.255703]  i915_ggtt_cleanup_hw+0xa4/0x170
>>> [    4.260035]  i915_driver_cleanup_hw+0x36/0x40
>>> [    4.264455]  i915_driver_load+0x6a0/0xe70
>>> [    4.268535]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>>> [    4.273560]  i915_pci_probe+0x2c/0x50
>>> [    4.277293]  local_pci_probe+0x45/0xa0
>>> [    4.281106]  ? pci_match_device+0xe0/0x110
>>> [    4.285265]  pci_device_probe+0x135/0x150
>>> [    4.289343]  driver_probe_device+0x288/0x490
>>> [    4.293676]  __driver_attach+0xc9/0xf0
>>> [    4.297490]  ? driver_probe_device+0x490/0x490
>>> [    4.301999]  bus_for_each_dev+0x5d/0x90
>>> [    4.305902]  driver_attach+0x1e/0x20
>>> [    4.309543]  bus_add_driver+0x1d0/0x290
>>> [    4.313442]  driver_register+0x60/0xe0
>>> [    4.317257]  __pci_register_driver+0x5d/0x60
>>> [    4.321652]  i915_init+0x59/0x5c
>>> [    4.324944]  ? mipi_dsi_bus_init+0x17/0x17
>>> [    4.329103]  do_one_initcall+0x42/0x180
>>> [    4.333007]  kernel_init_freeable+0x17c/0x202
>>> [    4.337426]  ? set_debug_rodata+0x17/0x17
>>> [    4.341500]  ? rest_init+0x90/0x90
>>> [    4.344969]  kernel_init+0xe/0x110
>>> [    4.348438]  ret_from_fork+0x25/0x30
>>> [    4.352079] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48
>>> 83 c7 38 48 39 c7 75 01 c3 55 48 c7 c7 70 ac 20 82 31 c0 48 89 e5 e8
>>> 6b 62 b7 ff <0f> ff 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89
>>> e5 41
>>> [    4.371029] ---[ end trace 7d36c2dd72851315 ]---
>>> [    4.381680] WARN_ON(dev_priv->mm.object_count)
>>> [    4.381698] ------------[ cut here ]------------
>>> [    4.390921] WARNING: CPU: 0 PID: 1 at
>>> drivers/gpu/drm/i915/i915_gem.c:4964 i915_gem_load_cleanup+0x10b/0x120
>>> [    4.400797] Modules linked in:
>>> [    4.403927] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W
>>> 4.12.0-eywa-46011-g9a19faf #360
>>> [    4.413021] Hardware name: Intel Corporation Cannonlake Client
>>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>>> [    4.426884] task: ffff880264ab8000 task.stack: ffffc90000038000
>>> [    4.432865] RIP: 0010:i915_gem_load_cleanup+0x10b/0x120
>>> [    4.438157] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
>>> [    4.443450] RAX: 0000000000000022 RBX: ffff880260a50000 RCX:
>>> ffffffff82468740
>>> [    4.450642] RDX: 0000000000000001 RSI: 0000000000000082 RDI:
>>> 0000000000000202
>>> [    4.457839] RBP: ffffc9000003bc68 R08: 0000000000000022 R09:
>>> 0000000000000389
>>> [    4.465029] R10: 0000000000000000 R11: 0000000000000001 R12:
>>> ffff880260a54678
>>> [    4.472227] R13: ffff88026446c000 R14: ffff88026446c000 R15:
>>> ffff880262844a00
>>> [    4.479420] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
>>> knlGS:0000000000000000
>>> [    4.487564] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    4.493370] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
>>> 00000000007406f0
>>> [    4.500569] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>> 0000000000000000
>>> [    4.507763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>> 0000000000000400
>>> [    4.514959] PKRU: 00000000
>>> [    4.517737] Call Trace:
>>> [    4.520265]  i915_driver_cleanup_early+0x1a/0x50
>>> [    4.524955]  i915_driver_load+0x6b8/0xe70
>>> [    4.529038]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>>> [    4.534100] clocksource: Switched to clocksource tsc
>>> [    4.534105]  i915_pci_probe+0x2c/0x50
>>> [    4.534113]  local_pci_probe+0x45/0xa0
>>> [    4.534118]  ? pci_match_device+0xe0/0x110
>>> [    4.534124]  pci_device_probe+0x135/0x150
>>> [    4.534131]  driver_probe_device+0x288/0x490
>>> [    4.534137]  __driver_attach+0xc9/0xf0
>>> [    4.534142]  ? driver_probe_device+0x490/0x490
>>> [    4.534146]  bus_for_each_dev+0x5d/0x90
>>> [    4.534152]  driver_attach+0x1e/0x20
>>> [    4.534156]  bus_add_driver+0x1d0/0x290
>>> [    4.534162]  driver_register+0x60/0xe0
>>> [    4.534167]  __pci_register_driver+0x5d/0x60
>>> [    4.534173]  i915_init+0x59/0x5c
>>> [    4.534177]  ? mipi_dsi_bus_init+0x17/0x17
>>> [    4.534181]  do_one_initcall+0x42/0x180
>>> [    4.534187]  kernel_init_freeable+0x17c/0x202
>>> [    4.534191]  ? set_debug_rodata+0x17/0x17
>>> [    4.534196]  ? rest_init+0x90/0x90
>>> [    4.534200]  kernel_init+0xe/0x110
>>> [    4.534204]  ret_from_fork+0x25/0x30
>>> [    4.534208] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 21 4f b1 ff 0f
>>> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8
>>> 05 4f b1 ff <0f> ff e9 33 ff ff ff 66 66 66 66 66 2e 0f 1f 84 00 00 00
>>> 00 00
>>> [    4.534272] ---[ end trace 7d36c2dd72851316 ]---
>>> [    4.534277] WARN_ON(!list_empty(&dev_priv->gt.timelines))
>>> [    4.534293] ------------[ cut here ]------------
>>> [    4.534298] WARNING: CPU: 0 PID: 1 at
>>> drivers/gpu/drm/i915/i915_gem.c:4968 i915_gem_load_cleanup+0xef/0x120
>>> [    4.534299] Modules linked in:
>>> [    4.534304] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W
>>> 4.12.0-eywa-46011-g9a19faf #360
>>> [    4.534306] Hardware name: Intel Corporation Cannonlake Client
>>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>>> [    4.534308] task: ffff880264ab8000 task.stack: ffffc90000038000
>>> [    4.534312] RIP: 0010:i915_gem_load_cleanup+0xef/0x120
>>> [    4.534314] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
>>> [    4.534317] RAX: 000000000000002d RBX: ffff880260a50000 RCX:
>>> 0000000000000000
>>> [    4.534319] RDX: 0000000000000001 RSI: 0000000000000002 RDI:
>>> 0000000000000296
>>> [    4.534321] RBP: ffffc9000003bc68 R08: 000000000000002d R09:
>>> 000000000000002d
>>> [    4.534322] R10: 0000000000000000 R11: ffff880260a4e000 R12:
>>> ffff880260a50070
>>> [    4.534324] R13: ffff88026446c000 R14: ffff88026446c000 R15:
>>> ffff880262844a00
>>> [    4.534327] FS:  0000000000000000(0000) GS:ffff88026dc00000(0000)
>>> knlGS:0000000000000000
>>> [    4.534329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    4.534331] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4:
>>> 00000000007406f0
>>> [    4.534334] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>> 0000000000000000
>>> [    4.534335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>> 0000000000000400
>>> [    4.534337] PKRU: 00000000
>>> [    4.534338] Call Trace:
>>> [    4.534344]  i915_driver_cleanup_early+0x1a/0x50
>>> [    4.534350]  i915_driver_load+0x6b8/0xe70
>>> [    4.534356]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>>> [    4.534361]  i915_pci_probe+0x2c/0x50
>>> [    4.534366]  local_pci_probe+0x45/0xa0
>>> [    4.534371]  ? pci_match_device+0xe0/0x110
>>> [    4.534376]  pci_device_probe+0x135/0x150
>>> [    4.534382]  driver_probe_device+0x288/0x490
>>> [    4.534388]  __driver_attach+0xc9/0xf0
>>> [    4.534393]  ? driver_probe_device+0x490/0x490
>>> [    4.534398]  bus_for_each_dev+0x5d/0x90
>>> [    4.534403]  driver_attach+0x1e/0x20
>>> [    4.534408]  bus_add_driver+0x1d0/0x290
>>> [    4.534414]  driver_register+0x60/0xe0
>>> [    4.534419]  __pci_register_driver+0x5d/0x60
>>> [    4.534424]  i915_init+0x59/0x5c
>>> [    4.534428]  ? mipi_dsi_bus_init+0x17/0x17
>>> [    4.534431]  do_one_initcall+0x42/0x180
>>> [    4.534437]  kernel_init_freeable+0x17c/0x202
>>> [    4.534440]  ? set_debug_rodata+0x17/0x17
>>> [    4.534444]  ? rest_init+0x90/0x90
>>> [    4.534448]  kernel_init+0xe/0x110
>>> [    4.534451]  ret_from_fork+0x25/0x30
>>> [    4.534455] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 3d 4f b1 ff 0f
>>> ff e9 5d ff ff ff 48 c7 c6 b0 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8
>>> 21 4f b1 ff <0f> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98
>>> 1a 82
>>> [    4.534519] ---[ end trace 7d36c2dd72851317 ]---
>>> [    4.534605]
>>> =============================================================================
>>> [    4.534608] BUG drm_i915_gem_object (Tainted: G        W      ):
>>> Objects remaining in drm_i915_gem_object on __kmem_cache_shutdown()
>>> [    4.534609]
>>> -----------------------------------------------------------------------------
>>>
>>> [    4.534611] Disabling lock debugging due to kernel taint
>>> [    4.534614] INFO: Slab 0xffffea0009820600 objects=19 used=2
>>> fp=0xffff88026081ba80 flags=0x200000000008100
>>> [    4.534618] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B   W
>>> 4.12.0-eywa-46011-g9a19faf #360
>>> [    4.534620] Hardware name: Intel Corporation Cannonlake Client
>>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>>> [    4.534621] Call Trace:
>>> [    4.534626]  dump_stack+0x65/0x89
>>> [    4.534633]  slab_err+0xa1/0xb0
>>> [    4.534640]  ? __kmalloc+0x185/0x270
>>> [    4.534645]  ? kmem_cache_alloc_bulk+0x1f0/0x1f0
>>> [    4.534650]  ? __kmem_cache_shutdown+0x160/0x400
>>> [    4.534655]  __kmem_cache_shutdown+0x180/0x400
>>> [    4.534663]  shutdown_cache+0x18/0x1a0
>>> [    4.534667]  kmem_cache_destroy+0x1c1/0x1f0
>>> [    4.534672]  i915_gem_load_cleanup+0xb4/0x120
>>> [    4.534677]  i915_driver_cleanup_early+0x1a/0x50
>>> [    4.534682]  i915_driver_load+0x6b8/0xe70
>>> [    4.534689]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>>> [    4.534693]  i915_pci_probe+0x2c/0x50
>>> [    4.534698]  local_pci_probe+0x45/0xa0
>>> [    4.534703]  ? pci_match_device+0xe0/0x110
>>> [    4.534708]  pci_device_probe+0x135/0x150
>>> [    4.534714]  driver_probe_device+0x288/0x490
>>> [    4.534721]  __driver_attach+0xc9/0xf0
>>> [    4.534726]  ? driver_probe_device+0x490/0x490
>>> [    4.534730]  bus_for_each_dev+0x5d/0x90
>>> [    4.534736]  driver_attach+0x1e/0x20
>>> [    4.534741]  bus_add_driver+0x1d0/0x290
>>> [    4.534746]  driver_register+0x60/0xe0
>>> [    4.534751]  __pci_register_driver+0x5d/0x60
>>> [    4.534756]  i915_init+0x59/0x5c
>>> [    4.534760]  ? mipi_dsi_bus_init+0x17/0x17
>>> [    4.534763]  do_one_initcall+0x42/0x180
>>> [    4.534769]  kernel_init_freeable+0x17c/0x202
>>> [    4.534773]  ? set_debug_rodata+0x17/0x17
>>> [    4.534777]  ? rest_init+0x90/0x90
>>> [    4.534781]  kernel_init+0xe/0x110
>>> [    4.534784]  ret_from_fork+0x25/0x30
>>> [    4.534791] INFO: Object 0xffff880260818340 @offset=832
>>> [    4.534792] INFO: Object 0xffff880260818680 @offset=1664
>>> [    4.534795] kmem_cache_destroy drm_i915_gem_object: Slab cache
>>> still has objects
>>> [    4.534798] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B   W
>>> 4.12.0-eywa-46011-g9a19faf #360
>>> [    4.534800] Hardware name: Intel Corporation Cannonlake Client
>>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS
>>> CNLSFWR1.R00.X075.D01.1703021113 03/02
>>> [    4.534801] Call Trace:
>>> [    4.534805]  dump_stack+0x65/0x89
>>> [    4.534809]  kmem_cache_destroy+0x1e1/0x1f0
>>> [    4.534814]  i915_gem_load_cleanup+0xb4/0x120
>>> [    4.534819]  i915_driver_cleanup_early+0x1a/0x50
>>> [    4.534824]  i915_driver_load+0x6b8/0xe70
>>> [    4.534830]  ? _raw_spin_unlock_irqrestore+0x26/0x50
>>> [    4.534835]  i915_pci_probe+0x2c/0x50
>>> [    4.534840]  local_pci_probe+0x45/0xa0
>>> [    4.534844]  ? pci_match_device+0xe0/0x110
>>> [    4.534850]  pci_device_probe+0x135/0x150
>>> [    4.534856]  driver_probe_device+0x288/0x490
>>> [    4.534862]  __driver_attach+0xc9/0xf0
>>> [    4.534867]  ? driver_probe_device+0x490/0x490
>>> [    4.534871]  bus_for_each_dev+0x5d/0x90
>>> [    4.534877]  driver_attach+0x1e/0x20
>>> [    4.534882]  bus_add_driver+0x1d0/0x290
>>> [    4.534888]  driver_register+0x60/0xe0
>>> [    4.534893]  __pci_register_driver+0x5d/0x60
>>> [    4.534897]  i915_init+0x59/0x5c
>>> [    4.534901]  ? mipi_dsi_bus_init+0x17/0x17
>>> [    4.534904]  do_one_initcall+0x42/0x180
>>> [    4.534910]  kernel_init_freeable+0x17c/0x202
>>> [    4.534914]  ? set_debug_rodata+0x17/0x17
>>> [    4.534917]  ? rest_init+0x90/0x90
>>> [    4.534922]  kernel_init+0xe/0x110
>>> [    4.534925]  ret_from_fork+0x25/0x30
>>> [    4.535386] i915 0000:00:02.0: [drm:i915_driver_load] Device
>>> initialization failed (-22)
>>> [    4.535390] i915 0000:00:02.0: Please file a bug at
>>> https://bugs.freedesktop.org/enter_bug.cgi?product=DRI against
>>> DRM/Intel providing the dmesg log by booting with drm.debug=0xf
>>> [    4.535450] i915: probe of 0000:00:02.0 failed with error -22
>>>
>>>
>>> On Fri, Apr 28, 2017 at 7:36 AM, Oscar Mateo <oscar.mateo@intel.com>
>>> wrote:
>>>>
>>>> This batchbuffer is over 4096 bytes, so we need to increase the size of
>>>> the
>>>> array (and the KMD has to be modified to deal with more than one page).
>>>>
>>>> Notice that there are two workarounds embedded here, both applicable to all
>>>> CNL
>>>> steppings.
>>>>
>>>> v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so
>>>> update
>>>>      the comment in the code and in the commit message.
>>>>
>>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>>>> Cc: Ben Widawsky <ben@bwidawsk.net>
>>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>>>> ---
>>>>   lib/gen10_render.h                             |  63 +++
>>>>   tools/null_state_gen/Makefile.am               |   3 +-
>>>>   tools/null_state_gen/intel_batchbuffer.h       |   2 +-
>>>>   tools/null_state_gen/intel_null_state_gen.c    |   5 +-
>>>>   tools/null_state_gen/intel_renderstate.h       |   1 +
>>>>   tools/null_state_gen/intel_renderstate_gen10.c | 538
>>>> +++++++++++++++++++++++++
>>>>   6 files changed, 609 insertions(+), 3 deletions(-)
>>>>   create mode 100644 lib/gen10_render.h
>>>>   create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c
>>>>
>>>> diff --git a/lib/gen10_render.h b/lib/gen10_render.h
>>>> new file mode 100644
>>>> index 0000000..f4a7dff
>>>> --- /dev/null
>>>> +++ b/lib/gen10_render.h
>>>> @@ -0,0 +1,63 @@
>>>> +#ifndef GEN10_RENDER_H
>>>> +#define GEN10_RENDER_H
>>>> +
>>>> +#include "gen9_render.h"
>>>> +
>>>> +#define GEN7_MI_RS_CONTROL                     (0x6 << 23)
>>>> +# define GEN7_MI_RS_CONTROL_ENABLE             (1 << 0)
>>>> +
>>>> +#define GEN10_3DSTATE_GATHER_POOL_ALLOC                GEN6_3D(3, 1, 0x1a)
>>>> +# define GEN10_3DSTATE_GATHER_POOL_ENABLE      (1 << 11)
>>>> +
>>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_VS       GEN6_3D(3, 0, 0x34)
>>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_HS       GEN6_3D(3, 0, 0x36)
>>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_DS       GEN6_3D(3, 0, 0x37)
>>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_GS       GEN6_3D(3, 0, 0x35)
>>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_PS       GEN6_3D(3, 0, 0x38)
>>>> +
>>>> +#define GEN10_3DSTATE_WM_DEPTH_STENCIL         GEN6_3D(3, 0, 0x4e)
>>>> +#define GEN10_3DSTATE_WM_CHROMAKEY             GEN6_3D(3, 0, 0x4c)
>>>> +
>>>> +#define GEN8_REG_L3_CACHE_CONFIG       0x7034
>>>> +
>>>> +/*
>>>> + * Programming for L3 cache allocations can be made per bank. Based on the
>>>> + * programmed value HW will apply same allocations on other available banks.
>>>> + * Total L3 Cache size per bank = 256 KB.
>>>> + * {SLM,    URB,     DC,      RO(I/S, C, T),   L3 Client Pool}
>>>> + * {  0,    96,      32,      128,                 0      }
>>>> + */
>>>> +#define GEN10_L3_CACHE_CONFIG_VALUE    0x00420060
>>>> +
>>>> +#define URB_ALIGN(val, align)  ((val % align) ? (val - (val % align)) : val)
>>>> +
>>>> +#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES                64
>>>> +#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES                2752
>>>> +
>>>> +#define GEN10_KB_PER_URB_INDEX                 8
>>>> +#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB       96
>>>> +
>>>> +#define GEN10_URB_RESERVED_SIZE_KB             32
>>>> +#define GEN10_URB_RESERVED_END_SIZE_KB         8
>>>> +
>>>> +#define GEN10_VS_NUM_BITS_PER_URB_UNIT         512
>>>> +#define GEN10_VS_NUM_OF_URB_UNITS              1 // zero based
>>>> +#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS (GEN10_VS_NUM_BITS_PER_URB_UNIT * \
>>>> +                                        (GEN10_VS_NUM_OF_URB_UNITS + 1))
>>>> +
>>>> +#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX)
>>>> +
>>>> +#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count)        \
>>>> +       URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX)
>>>> +
>>>> +#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice)       \
>>>> +       (total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB)
>>>> +
>>>> +#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice)   \
>>>> +       ((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) *    \
>>>> +       1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS)
>>>> +
>>>> +#define GEN10_VS_END_URB_INDEX(urb_size_per_slice)                     \
>>>> +       ((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX)
>>>> +
>>>> +#endif
>>>> diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am
>>>> index 24884a7..2f90990 100644
>>>> --- a/tools/null_state_gen/Makefile.am
>>>> +++ b/tools/null_state_gen/Makefile.am
>>>> @@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES =       \
>>>>          intel_renderstate_gen7.c \
>>>>          intel_renderstate_gen8.c \
>>>>          intel_renderstate_gen9.c \
>>>> +       intel_renderstate_gen10.c \
>>>>          intel_null_state_gen.c
>>>>
>>>> -gens := 6 7 8 9
>>>> +gens := 6 7 8 9 10
>>>>
>>>>   h = /tmp/intel_renderstate_gen$$gen.c
>>>>   states: intel_null_state_gen
>>>> diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h
>>>> index 771d1c8..e40e01b 100644
>>>> --- a/tools/null_state_gen/intel_batchbuffer.h
>>>> +++ b/tools/null_state_gen/intel_batchbuffer.h
>>>> @@ -34,7 +34,7 @@
>>>>   #include <stdint.h>
>>>>
>>>>   #define MAX_RELOCS 64
>>>> -#define MAX_ITEMS 1024
>>>> +#define MAX_ITEMS 2048
>>>>   #define MAX_STRLEN 256
>>>>
>>>>   #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1))
>>>> diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c
>>>> index 06eb954..4f12f5f 100644
>>>> --- a/tools/null_state_gen/intel_null_state_gen.c
>>>> +++ b/tools/null_state_gen/intel_null_state_gen.c
>>>> @@ -41,7 +41,7 @@ static int debug = 0;
>>>>   static void print_usage(char *s)
>>>>   {
>>>>          fprintf(stderr, "%s: <gen>\n"
>>>> -               "     gen:     gen to generate for (6,7,8,9)\n",
>>>> +               "     gen:     gen to generate for (6,7,8,9,10)\n",
>>>>                  s);
>>>>   }
>>>>
>>>> @@ -173,6 +173,9 @@ static int do_generate(int gen)
>>>>          case 9:
>>>>                  null_state_gen = gen9_setup_null_render_state;
>>>>                  break;
>>>> +       case 10:
>>>> +               null_state_gen = gen10_setup_null_render_state;
>>>> +               break;
>>>>          }
>>>>
>>>>          if (null_state_gen == NULL) {
>>>> diff --git a/tools/null_state_gen/intel_renderstate.h b/tools/null_state_gen/intel_renderstate.h
>>>> index b27b434..b3c8c2b 100644
>>>> --- a/tools/null_state_gen/intel_renderstate.h
>>>> +++ b/tools/null_state_gen/intel_renderstate.h
>>>> @@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct intel_batchbuffer *batch);
>>>>   void gen7_setup_null_render_state(struct intel_batchbuffer *batch);
>>>>   void gen8_setup_null_render_state(struct intel_batchbuffer *batch);
>>>>   void gen9_setup_null_render_state(struct intel_batchbuffer *batch);
>>>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch);
>>>>
>>>>   #endif /* __INTEL_RENDERSTATE_H__ */
>>>> diff --git a/tools/null_state_gen/intel_renderstate_gen10.c b/tools/null_state_gen/intel_renderstate_gen10.c
>>>> new file mode 100644
>>>> index 0000000..f5678c3
>>>> --- /dev/null
>>>> +++ b/tools/null_state_gen/intel_renderstate_gen10.c
>>>> @@ -0,0 +1,538 @@
>>>> +/*
>>>> + * Copyright © 2014 Intel Corporation
>>>> + *
>>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>>> + * copy of this software and associated documentation files (the "Software"),
>>>> + * to deal in the Software without restriction, including without limitation
>>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>>> + * Software is furnished to do so, subject to the following conditions:
>>>> + *
>>>> + * The above copyright notice and this permission notice (including the next
>>>> + * paragraph) shall be included in all copies or substantial portions of the
>>>> + * Software.
>>>> + *
>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>>>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>>>> + * DEALINGS IN THE SOFTWARE.
>>>> + *
>>>> + * Authors:
>>>> + *     Oscar Mateo <oscar.mateo@intel.com>
>>>> + */
>>>> +
>>>> +#include "intel_renderstate.h"
>>>> +#include <lib/gen10_render.h>
>>>> +#include <lib/intel_reg.h>
>>>> +
>>>> +static void gen8_emit_wm(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
>>>> +       OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION);
>>>> +}
>>>> +
>>>> +static void gen8_emit_ps(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2));
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0); /* kernel hi */
>>>> +       OUT_BATCH(GEN7_PS_SPF_MODE);
>>>> +       OUT_BATCH(0); /* scratch space stuff */
>>>> +       OUT_BATCH(0); /* scratch hi */
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0); // kernel 1
>>>> +       OUT_BATCH(0); /* kernel 1 hi */
>>>> +       OUT_BATCH(0); // kernel 2
>>>> +       OUT_BATCH(0); /* kernel 2 hi */
>>>> +}
>>>> +
>>>> +static void gen8_emit_sf(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT |
>>>> +                 1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT |
>>>> +                 GEN7_SF_POINT_WIDTH_FROM_SOURCE |
>>>> +                 8);
>>>> +}
>>>> +
>>>> +static void gen8_emit_vs(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2));
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +}
>>>> +
>>>> +static void gen8_emit_hs(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2));
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT);
>>>> +       OUT_BATCH(0);
>>>> +}
>>>> +
>>>> +static void gen8_emit_raster(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
>>>> +       OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW);
>>>> +       OUT_BATCH(0.0);
>>>> +       OUT_BATCH(0.0);
>>>> +       OUT_BATCH(0.0);
>>>> +}
>>>> +
>>>> +static void gen10_emit_urb(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       /* Smallest SKU: 3x8 */
>>>> +       int l3_bank_count = 3;
>>>> +       int slice_count = 1;
>>>> +       int urb_size_per_slice = GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count);
>>>> +       int other_urb_start_addr = GEN10_VS_END_URB_INDEX(urb_size_per_slice);
>>>> +       const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX;
>>>> +       const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS;
>>>> +       int vs_urb_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice);
>>>> +
>>>> +       if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES)
>>>> +               vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES;
>>>> +       if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES)
>>>> +               vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES;
>>>> +
>>>> +       OUT_BATCH(GEN7_3DSTATE_URB_VS);
>>>> +       OUT_BATCH(vs_urb_entries |
>>>> +                (vs_urb_alloc_size << 16) |
>>>> +                (vs_urb_start_addr << 25));
>>>> +
>>>> +       OUT_BATCH(GEN7_3DSTATE_URB_HS);
>>>> +       OUT_BATCH(other_urb_start_addr << 25);
>>>> +
>>>> +       OUT_BATCH(GEN7_3DSTATE_URB_DS);
>>>> +       OUT_BATCH(other_urb_start_addr << 25);
>>>> +
>>>> +       OUT_BATCH(GEN7_3DSTATE_URB_GS);
>>>> +       OUT_BATCH(other_urb_start_addr << 25);
>>>> +}
>>>> +
>>>> +static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
>>>> +       OUT_BATCH(_3DPRIM_TRILIST);
>>>> +}
>>>> +
>>>> +static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       const int num_decls = 128;
>>>> +       int i;
>>>> +
>>>> +       OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST |
>>>> +               (((2 * num_decls) + 3) - 2) /* DWORD count - 2 */);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(num_decls);
>>>> +
>>>> +       for (i = 0; i < num_decls; i++) {
>>>> +               OUT_BATCH(0);
>>>> +               OUT_BATCH(0);
>>>> +       }
>>>> +}
>>>> +
>>>> +static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index)
>>>> +{
>>>> +       OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2));
>>>> +       OUT_BATCH(index << 29);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +}
>>>> +
>>>> +static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index)
>>>> +{
>>>> +       OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2));
>>>> +       OUT_BATCH(index << 30);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +}
>>>> +
>>>> +static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       const int buffers = 33;
>>>> +       int i;
>>>> +
>>>> +       OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS |
>>>> +               (((4 * buffers) + 1)- 2) /* DWORD count - 2 */);
>>>> +
>>>> +       for (i = 0; i < buffers; i++) {
>>>> +               OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT |
>>>> +                         GEN7_VB0_BUFFER_ADDR_MOD_EN);
>>>> +               OUT_BATCH(0); /* Address */
>>>> +               OUT_BATCH(0);
>>>> +               OUT_BATCH(0);
>>>> +       }
>>>> +}
>>>> +
>>>> +static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       const int elements = 34;
>>>> +       int i;
>>>> +
>>>> +       OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS |
>>>> +               (((2 * elements) + 1) - 2) /* DWORD count - 2 */);
>>>> +
>>>> +       /* Element 0 */
>>>> +       OUT_BATCH(VE0_VALID);
>>>> +       OUT_BATCH(
>>>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
>>>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
>>>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
>>>> +               GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
>>>> +       /* Elements 1 -> 33 */
>>>> +       for (i = 1; i < elements; i++) {
>>>> +               OUT_BATCH(0);
>>>> +               OUT_BATCH(0);
>>>> +       }
>>>> +}
>>>> +
>>>> +static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       union {
>>>> +               float fval;
>>>> +               uint32_t uval;
>>>> +       } u;
>>>> +
>>>> +       unsigned offset;
>>>> +
>>>> +       u.fval = 1.0f;
>>>> +
>>>> +       offset = intel_batch_state_offset(batch, 64);
>>>> +       OUT_STATE(0);
>>>> +       OUT_STATE(0);      /* Alpha reference value */
>>>> +       OUT_STATE(u.uval); /* Blend constant color RED */
>>>> +       OUT_STATE(u.uval); /* Blend constant color GREEN */
>>>> +       OUT_STATE(u.uval); /* Blend constant color BLUE */
>>>> +       OUT_STATE(u.uval); /* Blend constant color ALPHA */
>>>> +
>>>> +       OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
>>>> +       OUT_BATCH_STATE_OFFSET(offset | 1);
>>>> +}
>>>> +
>>>> +static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       unsigned offset;
>>>> +       int i;
>>>> +
>>>> +       offset = intel_batch_state_offset(batch, 64);
>>>> +
>>>> +       for (i = 0; i < 17; i++)
>>>> +               OUT_STATE(0);
>>>> +
>>>> +       OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
>>>> +       OUT_BATCH_STATE_OFFSET(offset | 1);
>>>> +}
>>>> +
>>>> +static void gen8_emit_ps_extra(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
>>>> +       OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID |
>>>> +                 GEN8_PSX_ATTRIBUTE_ENABLE);
>>>> +
>>>> +}
>>>> +
>>>> +static void gen8_emit_ps_blend(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
>>>> +       OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
>>>> +}
>>>> +
>>>> +static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       unsigned offset;
>>>> +
>>>> +       offset = intel_batch_state_offset(batch, 32);
>>>> +
>>>> +       OUT_STATE((uint32_t)0.0f); /* Minimum depth */
>>>> +       OUT_STATE((uint32_t)0.0f); /* Maximum depth */
>>>> +
>>>> +       OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
>>>> +       OUT_BATCH_STATE_OFFSET(offset);
>>>> +}
>>>> +
>>>> +static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       unsigned offset;
>>>> +       int i;
>>>> +
>>>> +       offset = intel_batch_state_offset(batch, 64);
>>>> +
>>>> +       for (i = 0; i < 16; i++)
>>>> +               OUT_STATE(0);
>>>> +
>>>> +       OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2));
>>>> +       OUT_BATCH_STATE_OFFSET(offset);
>>>> +}
>>>> +
>>>> +static void gen8_emit_primitive(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       OUT_BATCH(GEN6_3DPRIMITIVE | (10-2));
>>>> +       OUT_BATCH(4);   /* gen8+ ignore the topology type field */
>>>> +       OUT_BATCH(1);   /* vertex count */
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(1);   /* single instance */
>>>> +       OUT_BATCH(0);   /* start instance location */
>>>> +       OUT_BATCH(0);   /* index buffer offset, ignored */
>>>> +       OUT_BATCH(0);   /* extended parameter 0 */
>>>> +       OUT_BATCH(0);   /* extended parameter 1 */
>>>> +       OUT_BATCH(0);   /* extended parameter 2 */
>>>> +}
>>>> +
>>>> +static void gen9_emit_state_base_address(struct intel_batchbuffer *batch) {
>>>> +       const unsigned offset = 0;
>>>> +       OUT_BATCH(GEN6_STATE_BASE_ADDRESS |
>>>> +               (22 - 2) /* DWORD count - 2 */);
>>>> +
>>>> +       /* general state base address - requires BB address
>>>> +        * added to state offset to be stored in this location
>>>> +        */
>>>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       /* stateless data port */
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       /* surface state base address - requires BB address
>>>> +        * added to state offset to be stored in this location
>>>> +        */
>>>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       /* dynamic state base address - requires BB address
>>>> +        * added to state offset to be stored in this location
>>>> +        */
>>>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       /* indirect state base address */
>>>> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       /* instruction state base address - requires BB address
>>>> +        * added to state offset to be stored in this location
>>>> +        */
>>>> +       OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       /* general state buffer size */
>>>> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
>>>> +       /* dynamic state buffer size */
>>>> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
>>>> +       /* indirect object buffer size */
>>>> +       OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY);
>>>> +       /* instruction buffer size */
>>>> +       OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
>>>> +
>>>> +       /* bindless surface state base address */
>>>> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
>>>> +       OUT_BATCH(0);
>>>> +       /* bindless surface state size */
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       /* bindless sampler state base address */
>>>> +       OUT_BATCH(BASE_ADDRESS_MODIFY);
>>>> +       OUT_BATCH(0);
>>>> +       /* bindless sampler state size */
>>>> +       OUT_BATCH(0);
>>>> +}
>>>> +
>>>> +/*
>>>> + * Generate the batch buffer commands needed to initialize the 3D engine
>>>> + * to its "golden state".
>>>> + */
>>>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch)
>>>> +{
>>>> +       int i;
>>>> +
>>>> +       /* WaRsGatherPoolEnable: cnl */
>>>> +       OUT_BATCH(GEN7_MI_RS_CONTROL);
>>>> +
>>>> +#define GEN8_PIPE_CONTROL_GLOBAL_GTT   (1 << 24)
>>>> +       /* PIPE_CONTROL */
>>>> +       OUT_BATCH(GEN6_PIPE_CONTROL |
>>>> +                (6 - 2));      /* DWORD count - 2 */
>>>> +       OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       /* PIPELINE_SELECT */
>>>> +       OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D);
>>>> +
>>>> +       OUT_BATCH(MI_LOAD_REGISTER_IMM);
>>>> +       OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG);
>>>> +       OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE);
>>>> +
>>>> +       gen8_emit_wm(batch);
>>>> +       gen8_emit_ps(batch);
>>>> +       gen8_emit_sf(batch);
>>>> +
>>>> +       OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */
>>>> +       OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11);
>>>> +
>>>> +       gen8_emit_vs(batch);
>>>> +       gen8_emit_hs(batch);
>>>> +
>>>> +       OUT_CMD(GEN7_3DSTATE_GS, 10);
>>>> +       OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5);
>>>> +       OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */
>>>> +       OUT_CMD(GEN6_3DSTATE_CLIP, 4);
>>>> +       OUT_CMD(GEN7_3DSTATE_TE, 4);
>>>> +       OUT_CMD(GEN8_3DSTATE_VF, 2);
>>>> +       OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5);
>>>> +
>>>> +       /* URB States */
>>>> +       gen10_emit_urb(batch);
>>>> +
>>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130);
>>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130);
>>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130);
>>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130);
>>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130);
>>>> +
>>>> +       OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4);
>>>> +       OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4);
>>>> +       OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4);
>>>> +
>>>> +       /* Push Constants */
>>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2);
>>>> +
>>>> +       /* Constants */
>>>> +       OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11);
>>>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11);
>>>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11);
>>>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11);
>>>> +       OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11);
>>>> +
>>>> +       OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3);
>>>> +       OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2);
>>>> +       gen8_emit_vf_topology(batch);
>>>> +
>>>> +       /* Streamer out declaration list */
>>>> +       gen8_emit_so_decl_list(batch);
>>>> +
>>>> +       /* Streamer out buffers */
>>>> +       for (i = 0; i < 4; i++) {
>>>> +               gen8_emit_so_buffer(batch, i);
>>>> +       }
>>>> +
>>>> +       /* State base addresses */
>>>> +       gen9_emit_state_base_address(batch);
>>>> +
>>>> +       OUT_CMD(GEN6_STATE_SIP, 3);
>>>> +       OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4);
>>>> +       OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8);
>>>> +
>>>> +       /* Chroma key */
>>>> +       for (i = 0; i < 4; i++) {
>>>> +               gen8_emit_chroma_key(batch, i);
>>>> +       }
>>>> +
>>>> +       OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3);
>>>> +       OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3);
>>>> +       OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5);
>>>> +       OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5);
>>>> +       OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3);
>>>> +       OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2);
>>>> +
>>>> +       /* WaPSRandomCSNotDone:cnl */
>>>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
>>>> +       OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
>>>> +       OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2);
>>>> +       OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2);
>>>> +       OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32);
>>>> +       OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16);
>>>> +       OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16);
>>>> +       OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5);
>>>> +
>>>> +       /* Vertex buffers */
>>>> +       gen8_emit_vertex_buffers(batch);
>>>> +       gen8_emit_vertex_elements(batch);
>>>> +
>>>> +       OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */);
>>>> +
>>>> +       /* 3D state binding table pointers */
>>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2);
>>>> +
>>>> +       gen8_emit_cc_state_pointers(batch);
>>>> +       gen8_emit_blend_state_pointers(batch);
>>>> +       gen8_emit_ps_extra(batch);
>>>> +       gen8_emit_ps_blend(batch);
>>>> +
>>>> +       /* 3D state sampler state pointers */
>>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2);
>>>> +       OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2);
>>>> +
>>>> +       OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2);
>>>> +
>>>> +       gen8_emit_viewport_state_pointers_cc(batch);
>>>> +       gen8_emit_viewport_state_pointers_sf_clip(batch);
>>>> +
>>>> +       /* WaPSRandomCSNotDone:cnl */
>>>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE   (1 << 20)
>>>> +       OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
>>>> +       OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0);
>>>> +
>>>> +       gen8_emit_raster(batch);
>>>> +
>>>> +       OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4);
>>>> +       OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2);
>>>> +
>>>> +       /* Launch 3D operation */
>>>> +       gen8_emit_primitive(batch);
>>>> +
>>>> +       /* WaRsGatherPoolEnable: cnl */
>>>> +       OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE);
>>>> +       OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2));
>>>> +       OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE);
>>>> +       OUT_BATCH(0);
>>>> +       OUT_BATCH(0xfffff << 12);
>>>> +       OUT_BATCH(GEN7_MI_RS_CONTROL);
>>>> +       OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4);
>>>> +
>>>> +       OUT_BATCH(MI_BATCH_BUFFER_END);
>>>> +}
>>>> --
>>>> 1.9.1
>>>>
>>>> _______________________________________________
>>>> Intel-gfx mailing list
>>>> Intel-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> --
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


end of thread, other threads:[~2017-07-13 22:30 UTC | newest]

Thread overview: 7+ messages
2017-04-28  9:10 [PATCH 1/2] tools/null_state_gen: Automatically generate the copyright header Oscar Mateo
2017-04-28  9:10 ` [PATCH 2/2] tools/null_state_gen: Add GEN10 golden context batch buffer creation Oscar Mateo
2017-04-28 14:36   ` [PATCH] " Oscar Mateo
2017-07-06  0:50     ` Rodrigo Vivi
2017-07-12 20:42       ` Oscar Mateo
2017-07-12 21:03         ` Rodrigo Vivi
2017-07-13 22:30           ` Rodrigo Vivi
