* [PATCH 1/6] tools/null_state_gen: Add copyrights
@ 2014-10-09 16:54 Mika Kuoppala
  2014-10-09 16:54 ` [PATCH 2/6] tools/null_state_gen: Add more debug output Mika Kuoppala
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Mika Kuoppala @ 2014-10-09 16:54 UTC
  To: intel-gfx

Add copyright headers to the files where they were missing.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 tools/null_state_gen/intel_null_state_gen.c   | 27 +++++++++++++++++++++++++++
 tools/null_state_gen/intel_renderstate_gen6.c | 26 ++++++++++++++++++++++++++
 tools/null_state_gen/intel_renderstate_gen7.c |  9 +++++----
 tools/null_state_gen/intel_renderstate_gen8.c | 26 ++++++++++++++++++++++++++
 4 files changed, 84 insertions(+), 4 deletions(-)

diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c
index b337706..c72796b 100644
--- a/tools/null_state_gen/intel_null_state_gen.c
+++ b/tools/null_state_gen/intel_null_state_gen.c
@@ -1,3 +1,30 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ *	Mika Kuoppala <mika.kuoppala@intel.com>
+ *	Armin Reese <armin.c.reese@intel.com>
+ */
+
 #include <stdio.h>
 #include <stdlib.h>
 #include <errno.h>
diff --git a/tools/null_state_gen/intel_renderstate_gen6.c b/tools/null_state_gen/intel_renderstate_gen6.c
index 5f922f7..f18bb12 100644
--- a/tools/null_state_gen/intel_renderstate_gen6.c
+++ b/tools/null_state_gen/intel_renderstate_gen6.c
@@ -1,3 +1,29 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ *	Mika Kuoppala <mika.kuoppala@intel.com>
+ */
+
 #include "intel_batchbuffer.h"
 #include <lib/gen6_render.h>
 #include <lib/intel_reg.h>
diff --git a/tools/null_state_gen/intel_renderstate_gen7.c b/tools/null_state_gen/intel_renderstate_gen7.c
index 22cd268..a48fb27 100644
--- a/tools/null_state_gen/intel_renderstate_gen7.c
+++ b/tools/null_state_gen/intel_renderstate_gen7.c
@@ -17,16 +17,17 @@
  * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
  * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
  * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ *	Mika Kuoppala <mika.kuoppala@intel.com>
  */
 
-
 #include "intel_batchbuffer.h"
 #include <lib/gen7_render.h>
 #include <lib/intel_reg.h>
 #include <string.h>
-#include <stdio.h>
 
 static const uint32_t ps_kernel[][4] = {
 	{ 0x0080005a, 0x2e2077bd, 0x000000c0, 0x008d0040 },
diff --git a/tools/null_state_gen/intel_renderstate_gen8.c b/tools/null_state_gen/intel_renderstate_gen8.c
index 4812b51..73375a0 100644
--- a/tools/null_state_gen/intel_renderstate_gen8.c
+++ b/tools/null_state_gen/intel_renderstate_gen8.c
@@ -1,3 +1,29 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ *	Mika Kuoppala <mika.kuoppala@intel.com>
+ */
+
 #include "intel_batchbuffer.h"
 #include <lib/gen8_render.h>
 #include <lib/intel_reg.h>
-- 
1.9.1

* [PATCH 2/6] tools/null_state_gen: Add more debug output
  2014-10-09 16:54 [PATCH 1/6] tools/null_state_gen: Add copyrights Mika Kuoppala
@ 2014-10-09 16:54 ` Mika Kuoppala
  2014-10-09 16:54 ` [PATCH 3/6] tools/null_state_gen: Limit the total state len to 4096 bytes Mika Kuoppala
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Mika Kuoppala @ 2014-10-09 16:54 UTC
  To: intel-gfx

Be more verbose about the state size we generate.
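
Among the comments this patch adds is a description of the relocation
table: byte offsets into the batch where the buffer's graphics address
is added to the state offset already stored there. A minimal sketch of
how a consumer might apply such a table follows. The helper name
apply_relocs and its signature are hypothetical and not part of this
series; only the -1 terminator and the byte-offset convention come from
the generated output.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: walk a relocation table
 * terminated by -1 (0xffffffff when stored as u32) and add the batch
 * buffer's base address to the offset already stored in each slot.
 * Relocation entries are byte offsets, so divide by 4 to index dwords.
 * Returns the number of relocations applied.
 */
static size_t apply_relocs(uint32_t *batch, const uint32_t *relocs,
			   uint32_t base)
{
	size_t n;

	for (n = 0; relocs[n] != (uint32_t)-1; n++)
		batch[relocs[n] / 4] += base;

	return n;
}
```

The sketch assumes a 32-bit base address for brevity; it says nothing
about how the kernel side actually performs the fixup.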

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 tools/null_state_gen/intel_null_state_gen.c | 36 ++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c
index c72796b..353556a 100644
--- a/tools/null_state_gen/intel_null_state_gen.c
+++ b/tools/null_state_gen/intel_null_state_gen.c
@@ -32,8 +32,6 @@
 
 #include "intel_batchbuffer.h"
 
-#define STATE_ALIGN 64
-
 extern int gen6_setup_null_render_state(struct intel_batchbuffer *batch);
 extern int gen7_setup_null_render_state(struct intel_batchbuffer *batch);
 extern int gen8_setup_null_render_state(struct intel_batchbuffer *batch);
@@ -47,36 +45,52 @@ static void print_usage(char *s)
 	       s);
 }
 
+/* Creates the intel_renderstate_genX.c file for the particular
+ * GEN product
+ */
 static int print_state(int gen, struct intel_batchbuffer *batch)
 {
 	int i;
+	unsigned long cmds;
+
+	fprintf(stderr, "Generating for gen%d\n", gen);
 
 	printf("#include \"intel_renderstate.h\"\n\n");
 
+	/* Relocation offsets.  These are byte offsets in the golden context
+	 * batch buffer where the BB graphics address will be added to
+	 * the indirect state offset already stored in those locations.  The
+	 * resulting value will inform the GPU where the indirect states are.
+	 */
 	printf("static const u32 gen%d_null_state_relocs[] = {\n", gen);
 	for (i = 0; i < batch->cmds->num_items; i++) {
 		if (intel_batch_is_reloc(batch, i))
 			printf("\t0x%08x,\n", i * 4);
 	}
-	printf("\t%d,\n", -1);
-	printf("};\n\n");
+	printf("\t-1,\n};\n\n");
 
+	/* GPU commands to execute to set up the RCS golden state.  This
+	 * state will become the default config.
+	 */
 	printf("static const u32 gen%d_null_state_batch[] = {\n", gen);
 	for (i = 0; i < intel_batch_num_cmds(batch); i++) {
+		const int offset = i * 4;
 		const struct bb_item *cmd = intel_batch_cmd_get(batch, i);
 		printf("\t0x%08x,", cmd->data);
 
 		if (debug)
-			printf("\t /* 0x%08x %s '%s' */", i * 4,
-			       intel_batch_type_as_str(cmd), cmd->str);
+			printf("\t /* 0x%08x %s '%s' */", offset,
+				intel_batch_type_as_str(cmd), cmd->str);
 
-		if (i * 4 == batch->cmds_end_offset)
+		if (offset == batch->cmds_end_offset) {
+			cmds = i + 1;
 			printf("\t /* cmds end */");
+		}
 
 		if (intel_batch_is_reloc(batch, i))
 			printf("\t /* reloc */");
 
-		if (i * 4 == batch->state_start_offset)
+		if (offset == batch->state_start_offset)
 			printf("\t /* state start */");
 
 		if (i == intel_batch_num_cmds(batch) - 1)
@@ -87,9 +101,15 @@ static int print_state(int gen, struct intel_batchbuffer *batch)
 
 	printf("};\n\nRO_RENDERSTATE(%d);\n", gen);
 
+	fprintf(stderr, "Commands %lu (%lu bytes)\n", cmds, cmds * 4);
+	fprintf(stderr, "State    %lu (%lu bytes)\n", batch->state->num_items, batch->state->num_items * 4);
+	fprintf(stderr, "Total    %lu (%lu bytes)\n", batch->cmds->num_items, batch->cmds->num_items * 4);
+	fprintf(stderr, "\n");
+
 	return 0;
 }
 
+/* Selects generator function for the given product and executes it. */
 static int do_generate(int gen)
 {
 	struct intel_batchbuffer *batch;
-- 
1.9.1

* [PATCH 3/6] tools/null_state_gen: Limit the total state len to 4096 bytes
  2014-10-09 16:54 [PATCH 1/6] tools/null_state_gen: Add copyrights Mika Kuoppala
  2014-10-09 16:54 ` [PATCH 2/6] tools/null_state_gen: Add more debug output Mika Kuoppala
@ 2014-10-09 16:54 ` Mika Kuoppala
  2014-10-09 16:54 ` [PATCH 4/6] tools/null_state_gen: Add macro to emit commands with null state Mika Kuoppala
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Mika Kuoppala @ 2014-10-09 16:54 UTC
  To: intel-gfx

Currently our kernel-side buffer object is only one page (4096 bytes).
Limit the number of dwords to 1024 to enforce this.
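
The arithmetic behind the limit can be sketched as below, assuming a
4096-byte page and 4-byte dwords. PAGE_SIZE_BYTES and fits_in_one_page
are illustrative names that do not appear in the tool; MAX_ITEMS is the
value this patch sets.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u	/* assumption: one 4 KiB page */
#define MAX_ITEMS 1024		/* value introduced by this patch */

/* 1024 dwords * 4 bytes per dword is exactly one page. */
_Static_assert(MAX_ITEMS * sizeof(uint32_t) == PAGE_SIZE_BYTES,
	       "MAX_ITEMS dwords must fill exactly one page");

/* Hypothetical check: does a batch of num_items dwords fit in a page? */
static int fits_in_one_page(size_t num_items)
{
	return num_items * sizeof(uint32_t) <= PAGE_SIZE_BYTES;
}
```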

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 tools/null_state_gen/intel_batchbuffer.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h
index e44c5c9..f10831c 100644
--- a/tools/null_state_gen/intel_batchbuffer.h
+++ b/tools/null_state_gen/intel_batchbuffer.h
@@ -34,7 +34,7 @@
 #include <stdint.h>
 
 #define MAX_RELOCS 64
-#define MAX_ITEMS 4096
+#define MAX_ITEMS 1024
 #define MAX_STRLEN 256
 
 #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1))
-- 
1.9.1

* [PATCH 4/6] tools/null_state_gen: Add macro to emit commands with null state
  2014-10-09 16:54 [PATCH 1/6] tools/null_state_gen: Add copyrights Mika Kuoppala
  2014-10-09 16:54 ` [PATCH 2/6] tools/null_state_gen: Add more debug output Mika Kuoppala
  2014-10-09 16:54 ` [PATCH 3/6] tools/null_state_gen: Limit the total state len to 4096 bytes Mika Kuoppala
@ 2014-10-09 16:54 ` Mika Kuoppala
  2014-10-09 16:54 ` [PATCH 5/6] tools/null_state_gen: Add Gen8 golden state Mika Kuoppala
  2014-10-09 16:54 ` [PATCH 6/6] tools/null_state_gen: Add GEN9 golden context batch buffer creation Mika Kuoppala
  4 siblings, 0 replies; 8+ messages in thread
From: Mika Kuoppala @ 2014-10-09 16:54 UTC
  To: intel-gfx

In the null/golden context there are multiple state commands whose
actual state is always zero. For a more compact batch representation,
add a macro that emits the command header and fills the rest of the
state with zeros.
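
The idea can be sketched in a self-contained form as follows. All names
prefixed with sketch_ are hypothetical stand-ins for the tool's batch
buffer types, and the sketch deliberately emits exactly len dwords in
total (header plus len - 1 zeros). The loop in the patch itself starts
at len_bias - 1, which coincides with this sketch for the default bias
of 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ITEMS 1024

/* Hypothetical stand-in for the tool's command area. */
struct sketch_batch {
	uint32_t data[SKETCH_MAX_ITEMS];
	size_t num_items;
};

/* Append one dword to the batch. */
static void sketch_emit(struct sketch_batch *b, uint32_t dword)
{
	assert(b->num_items < SKETCH_MAX_ITEMS);
	b->data[b->num_items++] = dword;
}

/*
 * Emit a command that occupies len dwords in total. The header's length
 * field holds len - len_bias; every remaining dword is zero.
 */
static void sketch_emit_null(struct sketch_batch *b, uint32_t cmd,
			     int len, int len_bias)
{
	int i;

	assert(len > 1);
	assert(len - len_bias >= 0);

	sketch_emit(b, cmd | (uint32_t)(len - len_bias));

	for (i = 1; i < len; i++)
		sketch_emit(b, 0);
}
```

With a bias of 2, a 12-dword command ends up with 10 in its length field
followed by 11 zero dwords, replacing twelve separate OUT_BATCH() calls.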

v2: - Be more verbose about length bias (Bradley Volkin)
    - strip out unrelated state_offset declaration (Bradley Volkin)

Cc: Volkin, Bradley D <bradley.d.volkin@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 tools/null_state_gen/intel_batchbuffer.c | 15 +++++++++++++++
 tools/null_state_gen/intel_batchbuffer.h |  7 +++++++
 2 files changed, 22 insertions(+)

diff --git a/tools/null_state_gen/intel_batchbuffer.c b/tools/null_state_gen/intel_batchbuffer.c
index 2a0b340..a31ea38 100644
--- a/tools/null_state_gen/intel_batchbuffer.c
+++ b/tools/null_state_gen/intel_batchbuffer.c
@@ -274,3 +274,18 @@ const char *intel_batch_type_as_str(const struct bb_item *item)
 
 	return "UNKNOWN";
 }
+
+void intel_batch_cmd_emit_null(struct intel_batchbuffer *batch,
+			       const int cmd, const int len, const int len_bias,
+			       const char *str)
+{
+	int i;
+
+	assert(len > 1);
+	assert((len - len_bias) >= 0);
+
+	bb_area_emit(batch->cmds, (cmd | (len - len_bias)), CMD, str);
+
+	for (i = len_bias-1; i < len; i++)
+		OUT_BATCH(0);
+}
diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h
index f10831c..f85f31d 100644
--- a/tools/null_state_gen/intel_batchbuffer.h
+++ b/tools/null_state_gen/intel_batchbuffer.h
@@ -69,6 +69,9 @@ struct intel_batchbuffer {
 
 struct intel_batchbuffer *intel_batchbuffer_create(void);
 
+#define OUT_CMD_B(cmd, len, bias) intel_batch_cmd_emit_null(batch, (cmd), (len), (bias), #cmd " " #len)
+#define OUT_CMD(cmd, len) OUT_CMD_B(cmd, len, 2)
+
 #define OUT_BATCH(d) bb_area_emit(batch->cmds, d, CMD, #d)
 #define OUT_BATCH_STATE_OFFSET(d) bb_area_emit(batch->cmds, d, STATE_OFFSET, #d)
 #define OUT_RELOC(batch, read_domain, write_domain, d) bb_area_emit(batch->cmds, d, RELOC, #d)
@@ -94,4 +97,8 @@ const char *intel_batch_type_as_str(const struct bb_item *item);
 void bb_area_emit(struct bb_area *a, uint32_t dword, item_type type, const char *str);
 void bb_area_emit_offset(struct bb_area *a, unsigned i, uint32_t dword, item_type type, const char *str);
 
+void intel_batch_cmd_emit_null(struct intel_batchbuffer *batch,
+			       const int cmd,
+			       const int len, const int len_bias,
+			       const char *str);
 #endif
-- 
1.9.1

* [PATCH 5/6] tools/null_state_gen: Add Gen8 golden state
  2014-10-09 16:54 [PATCH 1/6] tools/null_state_gen: Add copyrights Mika Kuoppala
                   ` (2 preceding siblings ...)
  2014-10-09 16:54 ` [PATCH 4/6] tools/null_state_gen: Add macro to emit commands with null state Mika Kuoppala
@ 2014-10-09 16:54 ` Mika Kuoppala
  2014-10-09 16:54 ` [PATCH 6/6] tools/null_state_gen: Add GEN9 golden context batch buffer creation Mika Kuoppala
  4 siblings, 0 replies; 8+ messages in thread
From: Mika Kuoppala @ 2014-10-09 16:54 UTC
  To: intel-gfx

Previously we didn't have a clear understanding of what is necessary
for a pipeline state to be properly initialized, so we had to improvise
and use a stripped-down render copy.

Now that we have a clearer understanding, replace the render-copy-based
frankenstate with a state we can call golden state.

v2: - export intel_batch_state_offset
    - add 3DSTATE_RASTER (Bradley Volkin)

Cc: Volkin, Bradley D <bradley.d.volkin@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 lib/gen6_render.h                             |   5 +-
 lib/gen8_render.h                             |  24 +
 tools/null_state_gen/intel_batchbuffer.h      |   2 +-
 tools/null_state_gen/intel_renderstate_gen8.c | 850 +++++++++-----------------
 4 files changed, 305 insertions(+), 576 deletions(-)

diff --git a/lib/gen6_render.h b/lib/gen6_render.h
index c3e85eb..8a4ec53 100644
--- a/lib/gen6_render.h
+++ b/lib/gen6_render.h
@@ -41,7 +41,7 @@
 /* These two are BLC and CTG only, not BW or CL */
 #define GEN6_3DSTATE_AA_LINE_PARAMS		GEN6_3D(3, 1, 0xa)
 #define GEN6_3DSTATE_GS_SVB_INDEX		GEN6_3D(3, 1, 0xb)
-
+#define GEN6_3DSTATE_MONOFILTER_SIZE		GEN6_3D(3, 1, 0x11)
 #define GEN6_3DPRIMITIVE				GEN6_3D(3, 3, 0)
 
 #define GEN6_3DSTATE_CLEAR_PARAMS		GEN6_3D(3, 1, 0x10)
@@ -91,6 +91,7 @@
 # define GEN6_3DSTATE_SF_TRI_PROVOKE_SHIFT		29
 # define GEN6_3DSTATE_SF_LINE_PROVOKE_SHIFT		27
 # define GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT		25
+# define GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT 12
 
 #define GEN6_3DSTATE_WM				GEN6_3D(3, 0, 0x14)
 /* DW2 */
@@ -303,7 +304,6 @@
 #define GEN6_EU_ATT_CLR_1	       0x8834
 #define GEN6_EU_RDATA		       0x8840
 
-
 #define GEN6_PIPE_CONTROL			GEN6_3D(3, 2, 0)
 
 #define GEN6_3DPRIMITIVE				GEN6_3D(3, 3, 0)
@@ -411,6 +411,7 @@
 
 /* for GEN6_STATE_BASE_ADDRESS */
 #define BASE_ADDRESS_MODIFY		(1 << 0)
+#define BUFFER_SIZE_MODIFY		(1 << 0)
 
 /* for GEN6_3DSTATE_PIPELINED_POINTERS */
 #define GEN6_GS_DISABLE		       0
diff --git a/lib/gen8_render.h b/lib/gen8_render.h
index 0eec80c..ba3f9f2 100644
--- a/lib/gen8_render.h
+++ b/lib/gen8_render.h
@@ -8,6 +8,8 @@
 #define GEN7_3DSTATE_URB_DS (0x7832 << 16)
 #define GEN7_3DSTATE_URB_GS (0x7833 << 16)
 
+# define GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION	(1 << 26)
+
 #define GEN6_3DSTATE_SCISSOR_STATE_POINTERS	GEN6_3D(3, 0, 0xf)
 #define GEN7_3DSTATE_CLEAR_PARAMS		GEN6_3D(3, 0, 0x04)
 #define GEN7_3DSTATE_DEPTH_BUFFER		GEN6_3D(3, 0, 0x05)
@@ -29,6 +31,7 @@
 #define GEN7_3DSTATE_CONSTANT_GS		GEN6_3D(3, 0, 0x16)
 #define GEN7_3DSTATE_CONSTANT_HS		GEN6_3D(3, 0, 0x19)
 #define GEN7_3DSTATE_CONSTANT_DS		GEN6_3D(3, 0, 0x1a)
+#define GEN7_3DSTATE_CONSTANT_PS		GEN6_3D(3, 0, 0x17)
 #define GEN7_3DSTATE_HS				GEN6_3D(3, 0, 0x1b)
 #define GEN7_3DSTATE_TE				GEN6_3D(3, 0, 0x1c)
 #define GEN7_3DSTATE_DS				GEN6_3D(3, 0, 0x1d)
@@ -44,6 +47,12 @@
 # define GEN8_RASTER_FRONT_WINDING_CCW			(1 << 21)
 # define GEN8_RASTER_CULL_NONE                          (1 << 16)
 #define GEN7_3DSTATE_PS				GEN6_3D(3, 0, 0x20)
+# define GEN7_PS_SPF_MODE                               (1 << 31)
+
+# define GEN7_SF_POINT_WIDTH_FROM_SOURCE                (1 << 11)
+
+# define GEN7_VS_FLOATING_POINT_MODE_ALTERNATE          (1 << 16)
+
 #define GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP	\
 						GEN6_3D(3, 0, 0x21)
 #define GEN8_3DSTATE_PS_BLEND			GEN6_3D(3, 0, 0x4d)
@@ -68,14 +77,26 @@
 #define GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS	GEN6_3D(3, 0, 0x2e)
 #define GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS	GEN6_3D(3, 0, 0x2f)
 
+#define GEN8_3DSTATE_VF				GEN6_3D(3, 0, 0x0c)
 #define GEN8_3DSTATE_VF_TOPOLOGY		GEN6_3D(3, 0, 0x4b)
 
+#define GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC	GEN6_3D(3, 1, 0x19)
+#define GEN8_3DSTATE_GATHER_POOL_ALLOC		GEN6_3D(3, 1, 0x1a)
+#define GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC 	GEN6_3D(3, 1, 0x1b)
 #define GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS	GEN6_3D(3, 1, 0x12)
 #define GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS	GEN6_3D(3, 1, 0x13)
 #define GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS	GEN6_3D(3, 1, 0x14)
 #define GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS	GEN6_3D(3, 1, 0x15)
 #define GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS	GEN6_3D(3, 1, 0x16)
 
+#define GEN8_3DSTATE_VF_SGVS			GEN6_3D(3, 0, 0x4a)
+#define GEN8_3DSTATE_SO_DECL_LIST		GEN6_3D(3, 1, 0x17)
+#define GEN8_3DSTATE_SO_BUFFER			GEN6_3D(3, 1, 0x18)
+#define GEN8_3DSTATE_POLY_STIPPLE_OFFSET	GEN6_3D(3, 1, 0x06)
+#define GEN8_3DSTATE_POLY_STIPPLE_PATTERN	GEN6_3D(3, 1, 0x07)
+#define GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0	GEN6_3D(3, 1, 0x02)
+#define GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1	GEN6_3D(3, 1, 0x0c)
+
 /* Some random bits that we care about */
 #define GEN7_VB0_BUFFER_ADDR_MOD_EN		(1 << 14)
 #define GEN7_3DSTATE_PS_PERSPECTIVE_PIXEL_BARYCENTRIC (1 << 11)
@@ -84,6 +105,9 @@
 /* Random shifts */
 #define GEN8_3DSTATE_PS_MAX_THREADS_SHIFT 23
 
+/* STATE_BASE_ADDRESS state size in pages*/
+#define GEN8_STATE_SIZE_PAGES(x) ((x) << 12)
+
 /* Shamelessly ripped from mesa */
 struct gen8_surface_state
 {
diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h
index f85f31d..8b87c02 100644
--- a/tools/null_state_gen/intel_batchbuffer.h
+++ b/tools/null_state_gen/intel_batchbuffer.h
@@ -84,7 +84,7 @@ uint32_t intel_batch_state_copy(struct intel_batchbuffer *batch, void *d, unsign
 				const char *name);
 uint32_t intel_batch_state_alloc(struct intel_batchbuffer *batch, unsigned bytes, unsigned align,
 				 const char *name);
-
+uint32_t intel_batch_state_offset(struct intel_batchbuffer *batch, unsigned align);
 unsigned intel_batch_num_cmds(struct intel_batchbuffer *batch);
 
 struct bb_item *intel_batch_cmd_get(struct intel_batchbuffer *batch, unsigned i);
diff --git a/tools/null_state_gen/intel_renderstate_gen8.c b/tools/null_state_gen/intel_renderstate_gen8.c
index 73375a0..2d7a4b0 100644
--- a/tools/null_state_gen/intel_renderstate_gen8.c
+++ b/tools/null_state_gen/intel_renderstate_gen8.c
@@ -29,708 +29,412 @@
 #include <lib/intel_reg.h>
 #include <string.h>
 
-struct {
-	uint32_t cc_state;
-	uint32_t blend_state;
-} cc;
-
-struct {
-	uint32_t cc_state;
-	uint32_t sf_clip_state;
-} viewport;
-
-/* see shaders/ps/blit.g7a */
-static const uint32_t ps_kernel[][4] = {
-#if 1
-   { 0x0060005a, 0x21403ae8, 0x3a0000c0, 0x008d0040 },
-   { 0x0060005a, 0x21603ae8, 0x3a0000c0, 0x008d0080 },
-   { 0x0060005a, 0x21803ae8, 0x3a0000d0, 0x008d0040 },
-   { 0x0060005a, 0x21a03ae8, 0x3a0000d0, 0x008d0080 },
-   { 0x02800031, 0x2e0022e8, 0x0e000140, 0x08840001 },
-   { 0x05800031, 0x200022e0, 0x0e000e00, 0x90031000 },
-#else
-   /* Write all -1 */
-   { 0x00600001, 0x2e000608, 0x00000000, 0x3f800000 },
-   { 0x00600001, 0x2e200608, 0x00000000, 0x3f800000 },
-   { 0x00600001, 0x2e400608, 0x00000000, 0x3f800000 },
-   { 0x00600001, 0x2e600608, 0x00000000, 0x3f800000 },
-   { 0x00600001, 0x2e800608, 0x00000000, 0x3f800000 },
-   { 0x00600001, 0x2ea00608, 0x00000000, 0x3f800000 },
-   { 0x00600001, 0x2ec00608, 0x00000000, 0x3f800000 },
-   { 0x00600001, 0x2ee00608, 0x00000000, 0x3f800000 },
-   { 0x05800031, 0x200022e0, 0x0e000e00, 0x90031000 },
-#endif
-};
-
-static uint32_t
-gen8_bind_buf_null(struct intel_batchbuffer *batch)
+static void gen8_emit_wm(struct intel_batchbuffer *batch)
 {
-	struct gen8_surface_state ss;
-	memset(&ss, 0, sizeof(ss));
-
-	return OUT_STATE_STRUCT(ss, 64);
-}
-
-static uint32_t
-gen8_bind_surfaces(struct intel_batchbuffer *batch)
-{
-	unsigned offset;
-
-	offset = intel_batch_state_alloc(batch, 8, 32, "bind surfaces");
-
-	bb_area_emit_offset(batch->state, offset, gen8_bind_buf_null(batch), STATE_OFFSET, "bind 1");
-	bb_area_emit_offset(batch->state, offset + 4, gen8_bind_buf_null(batch), STATE_OFFSET, "bind 2");
-
-	return offset;
-}
-
-/* Mostly copy+paste from gen6, except wrap modes moved */
-static uint32_t
-gen8_create_sampler(struct intel_batchbuffer *batch) {
-	struct gen8_sampler_state ss;
-	memset(&ss, 0, sizeof(ss));
-
-	ss.ss0.min_filter = GEN6_MAPFILTER_NEAREST;
-	ss.ss0.mag_filter = GEN6_MAPFILTER_NEAREST;
-	ss.ss3.r_wrap_mode = GEN6_TEXCOORDMODE_CLAMP;
-	ss.ss3.s_wrap_mode = GEN6_TEXCOORDMODE_CLAMP;
-	ss.ss3.t_wrap_mode = GEN6_TEXCOORDMODE_CLAMP;
-
-	/* I've experimented with non-normalized coordinates and using the LD
-	 * sampler fetch, but couldn't make it work. */
-	ss.ss3.non_normalized_coord = 0;
-
-	return OUT_STATE_STRUCT(ss, 64);
-}
-
-static uint32_t
-gen8_fill_ps(struct intel_batchbuffer *batch,
-	     const uint32_t kernel[][4],
-	     size_t size)
-{
-	return intel_batch_state_copy(batch, kernel, size, 64, "ps kernel");
+	OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
+	OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION);
 }
 
-/**
- * gen7_fill_vertex_buffer_data populate vertex buffer with data.
- *
- * The vertex buffer consists of 3 vertices to construct a RECTLIST. The 4th
- * vertex is implied (automatically derived by the HW). Each element has the
- * destination offset, and the normalized texture offset (src). The rectangle
- * itself will span the entire subsurface to be copied.
- *
- * see gen6_emit_vertex_elements
- */
-static uint32_t
-gen7_fill_vertex_buffer_data(struct intel_batchbuffer *batch)
+static void gen8_emit_ps(struct intel_batchbuffer *batch)
 {
-	uint16_t *v;
-
-	return intel_batch_state_alloc(batch, 2 * sizeof(*v), 8, "vertex buffer");
-}
-
-/**
- * gen6_emit_vertex_elements - The vertex elements describe the contents of the
- * vertex buffer. We pack the vertex buffer in a semi weird way, conforming to
- * what gen6_rendercopy did. The most straightforward would be to store
- * everything as floats.
- *
- * see gen7_fill_vertex_buffer_data() for where the corresponding elements are
- * packed.
- */
-static void
-gen6_emit_vertex_elements(struct intel_batchbuffer *batch) {
-	/*
-	 * The VUE layout
-	 *    dword 0-3: pad (0, 0, 0. 0)
-	 *    dword 4-7: position (x, y, 0, 1.0),
-	 *    dword 8-11: texture coordinate 0 (u0, v0, 0, 1.0)
-	 */
-	OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS | (3 * 2 + 1 - 2));
-
-	/* Element state 0. These are 4 dwords of 0 required for the VUE format.
-	 * We don't really know or care what they do.
-	 */
-	OUT_BATCH(0 << VE0_VERTEX_BUFFER_INDEX_SHIFT | VE0_VALID |
-		  GEN6_SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT); /* we specify 0, but it's really does not exist */
-	OUT_BATCH(GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
-
-	/* Element state 1 - Our "destination" vertices. These are passed down
-	 * through the pipeline, and eventually make it to the pixel shader as
-	 * the offsets in the destination surface. It's packed as the 16
-	 * signed/scaled because of gen6 rendercopy. I see no particular reason
-	 * for doing this though.
-	 */
-	OUT_BATCH(0 << VE0_VERTEX_BUFFER_INDEX_SHIFT | VE0_VALID |
-		  GEN6_SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT); /* offsets vb in bytes */
-	OUT_BATCH(GEN6_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN6_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN6_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
-
-	/* Element state 2. Last but not least we store the U,V components as
-	 * normalized floats. These will be used in the pixel shader to sample
-	 * from the source buffer.
-	 */
-	OUT_BATCH(0 << VE0_VERTEX_BUFFER_INDEX_SHIFT | VE0_VALID |
-		  GEN6_SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
-		  4 << VE0_OFFSET_SHIFT);	/* offset vb in bytes */
-	OUT_BATCH(GEN6_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN6_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN6_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
-}
-
-/**
- * gen7_emit_vertex_buffer emit the vertex buffers command
- *
- * @batch
- * @offset - bytw offset within the @batch where the vertex buffer starts.
- */
-static void gen7_emit_vertex_buffer(struct intel_batchbuffer *batch,
-				    uint32_t offset) {
-	OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS | (1 + (4 * 1) - 2));
-	OUT_BATCH(0 << VB0_BUFFER_INDEX_SHIFT | /* VB 0th index */
-		  GEN7_VB0_BUFFER_ADDR_MOD_EN | /* Address Modify Enable */
-		  VB0_NULL_VERTEX_BUFFER |
-		  0 << VB0_BUFFER_PITCH_SHIFT);
-	OUT_RELOC_STATE(batch, I915_GEM_DOMAIN_VERTEX, 0, offset);
+	OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2));
 	OUT_BATCH(0);
+	OUT_BATCH(0); /* kernel hi */
+	OUT_BATCH(GEN7_PS_SPF_MODE);
+	OUT_BATCH(0); /* scratch space stuff */
+	OUT_BATCH(0); /* scratch hi */
 	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0); // kernel 1
+	OUT_BATCH(0); /* kernel 1 hi */
+	OUT_BATCH(0); // kernel 2
+	OUT_BATCH(0); /* kernel 2 hi */
 }
 
-static uint32_t
-gen6_create_cc_state(struct intel_batchbuffer *batch)
-{
-	struct gen6_color_calc_state cc_state;
-	memset(&cc_state, 0, sizeof(cc_state));
-
-	return OUT_STATE_STRUCT(cc_state, 64);
-}
-
-static uint32_t
-gen8_create_blend_state(struct intel_batchbuffer *batch)
-{
-	struct gen8_blend_state blend;
-	int i;
-
-	memset(&blend, 0, sizeof(blend));
-
-	for (i = 0; i < 16; i++) {
-		blend.bs[i].dest_blend_factor = GEN6_BLENDFACTOR_ZERO;
-		blend.bs[i].source_blend_factor = GEN6_BLENDFACTOR_ONE;
-		blend.bs[i].color_blend_func = GEN6_BLENDFUNCTION_ADD;
-		blend.bs[i].pre_blend_color_clamp = 1;
-		blend.bs[i].color_buffer_blend = 0;
-	}
-
-	return OUT_STATE_STRUCT(blend, 64);
-}
-
-static uint32_t
-gen6_create_cc_viewport(struct intel_batchbuffer *batch)
+static void gen8_emit_sf(struct intel_batchbuffer *batch)
 {
-	struct gen6_cc_viewport vp;
-
-	memset(&vp, 0, sizeof(vp));
-
-	/* XXX I don't understand this */
-	vp.min_depth = -1.e35;
-	vp.max_depth = 1.e35;
-
-	return OUT_STATE_STRUCT(vp, 32);
-}
-
-static uint32_t
-gen7_create_sf_clip_viewport(struct intel_batchbuffer *batch) {
-	/* XXX these are likely not needed */
-	struct gen7_sf_clip_viewport scv_state;
-
-	memset(&scv_state, 0, sizeof(scv_state));
-
-	scv_state.guardband.xmin = 0;
-	scv_state.guardband.xmax = 1.0f;
-	scv_state.guardband.ymin = 0;
-	scv_state.guardband.ymax = 1.0f;
-
-	return OUT_STATE_STRUCT(scv_state, 64);
+	OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT |
+		  1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT |
+		  GEN7_SF_POINT_WIDTH_FROM_SOURCE |
+		  8);
 }
 
-static uint32_t
-gen6_create_scissor_rect(struct intel_batchbuffer *batch)
+static void gen8_emit_vs(struct intel_batchbuffer *batch)
 {
-	struct gen6_scissor_rect scissor;
-
-	memset(&scissor, 0, sizeof(scissor));
-
-	return OUT_STATE_STRUCT(scissor, 64);
-}
-
-static void
-gen8_emit_sip(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN6_STATE_SIP | (3 - 2));
+	OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2));
 	OUT_BATCH(0);
 	OUT_BATCH(0);
-}
-
-static void
-gen7_emit_push_constants(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS);
+	OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE);
 	OUT_BATCH(0);
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS);
 	OUT_BATCH(0);
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS);
 	OUT_BATCH(0);
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS);
 	OUT_BATCH(0);
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS);
 	OUT_BATCH(0);
 }
 
-static void
-gen8_emit_state_base_address(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN6_STATE_BASE_ADDRESS | (16 - 2));
-
-	/* general */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+static void gen8_emit_hs(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2));
 	OUT_BATCH(0);
-
-	/* stateless data port */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-
-	/* surface */
-	OUT_RELOC(batch, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
 	OUT_BATCH(0);
-
-	/* dynamic */
-	OUT_RELOC(batch, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
-		  0, BASE_ADDRESS_MODIFY);
 	OUT_BATCH(0);
-
-	/* indirect */
 	OUT_BATCH(0);
 	OUT_BATCH(0);
-
-	/* instruction */
-	OUT_RELOC(batch, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
 	OUT_BATCH(0);
+	OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT);
+	OUT_BATCH(0);
+}
 
-	/* general state buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* dynamic state buffer size */
-	OUT_BATCH(1 << 12 | 1);
-	/* indirect object buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* intruction buffer size */
-	OUT_BATCH(1 << 12 | 1);
+static void gen8_emit_raster(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0.0);
+	OUT_BATCH(0.0);
+	OUT_BATCH(0.0);
 }
 
-static void
-gen7_emit_urb(struct intel_batchbuffer *batch) {
-	/* XXX: Min valid values from mesa */
+static void gen8_emit_urb(struct intel_batchbuffer *batch)
+{
 	const int vs_entries = 64;
 	const int vs_size = 2;
 	const int vs_start = 4;
 
 	OUT_BATCH(GEN7_3DSTATE_URB_VS);
 	OUT_BATCH(vs_entries | ((vs_size - 1) << 16) | (vs_start << 25));
-	OUT_BATCH(GEN7_3DSTATE_URB_GS);
-	OUT_BATCH(vs_start << 25);
+
 	OUT_BATCH(GEN7_3DSTATE_URB_HS);
-	OUT_BATCH(vs_start << 25);
-	OUT_BATCH(GEN7_3DSTATE_URB_DS);
-	OUT_BATCH(vs_start << 25);
-}
+	OUT_BATCH(0x0f << 25);
 
-static void
-gen8_emit_cc(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS);
-	OUT_BATCH_STATE_OFFSET(cc.blend_state | 1);
+	OUT_BATCH(GEN7_3DSTATE_URB_DS);
+	OUT_BATCH(0x0f << 25);
 
-	OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
-	OUT_BATCH_STATE_OFFSET(cc.cc_state | 1);
+	OUT_BATCH(GEN7_3DSTATE_URB_GS);
+	OUT_BATCH(0x0f << 25);
 }
 
-static void
-gen8_emit_multisample(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN8_3DSTATE_MULTISAMPLE);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_SAMPLE_MASK);
-	OUT_BATCH(1);
+static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
+	OUT_BATCH(_3DPRIM_TRILIST);
 }
 
-static void
-gen8_emit_vs(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS);
-	OUT_BATCH(0);
+static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch)
+{
+	const int num_decls = 128;
+	int i;
 
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_VS | (11 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST | ((2 * num_decls) + 1));
 	OUT_BATCH(0);
+	OUT_BATCH(num_decls);
 
-	OUT_BATCH(GEN6_3DSTATE_VS | (9-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	for (i = 0; i < num_decls; i++) {
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
 }
 
-static void
-gen8_emit_hs(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CONSTANT_HS | (11 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_HS | (9-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index)
+{
+	OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2));
+	OUT_BATCH(index << 29);
 	OUT_BATCH(0);
 	OUT_BATCH(0);
 	OUT_BATCH(0);
 	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS);
 	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS);
 	OUT_BATCH(0);
 }
 
-static void
-gen8_emit_gs(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CONSTANT_GS | (11 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_GS | (10-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS);
-	OUT_BATCH(0);
+static void gen8_emit_state_base_address(struct intel_batchbuffer *batch) {
+	const unsigned offset = 0;
+	OUT_BATCH(GEN6_STATE_BASE_ADDRESS | (16 - 2));
 
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS);
+	/* general */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
 	OUT_BATCH(0);
-}
 
-static void
-gen8_emit_ds(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CONSTANT_DS | (11 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	/* stateless data port */
 	OUT_BATCH(0);
 
-	OUT_BATCH(GEN7_3DSTATE_DS | (9-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	/* surface state base address */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
 	OUT_BATCH(0);
 
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS);
+	/* dynamic state base address */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
 	OUT_BATCH(0);
 
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS);
+	/* indirect */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
 	OUT_BATCH(0);
-}
 
-static void
-gen8_emit_wm_hz_op(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN8_3DSTATE_WM_HZ_OP | (5-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	/* instruction */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
 	OUT_BATCH(0);
-}
 
-static void
-gen8_emit_null_state(struct intel_batchbuffer *batch) {
-	gen8_emit_wm_hz_op(batch);
-	gen8_emit_hs(batch);
-	OUT_BATCH(GEN7_3DSTATE_TE | (4-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	gen8_emit_gs(batch);
-	gen8_emit_ds(batch);
-	gen8_emit_vs(batch);
+	/* general state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* dynamic state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* indirect object buffer size */
+	OUT_BATCH(0 | BUFFER_SIZE_MODIFY);
+	/* instruction buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
 }
 
-static void
-gen7_emit_clip(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN6_3DSTATE_CLIP | (4 - 2));
+static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index)
+{
+	OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2));
+	OUT_BATCH(index << 30);
 	OUT_BATCH(0);
-	OUT_BATCH(0); /*  pass-through */
 	OUT_BATCH(0);
 }
 
-static void
-gen8_emit_sf(struct intel_batchbuffer *batch)
+static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch)
 {
+	const int buffers = 33;
 	int i;
 
-	OUT_BATCH(GEN7_3DSTATE_SBE | (4 - 2));
-	OUT_BATCH(1 << GEN7_SBE_NUM_OUTPUTS_SHIFT |
-		  GEN8_SBE_FORCE_URB_ENTRY_READ_LENGTH |
-		  GEN8_SBE_FORCE_URB_ENTRY_READ_OFFSET |
-		  1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
-		  1 << GEN8_SBE_URB_ENTRY_READ_OFFSET_SHIFT);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS | ((4 * buffers) - 1));
 
-	OUT_BATCH(GEN8_3DSTATE_SBE_SWIZ | (11 - 2));
-	for (i = 0; i < 8; i++)
+	for (i = 0; i < buffers; i++) {
+		OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT |
+			  GEN7_VB0_BUFFER_ADDR_MOD_EN);
+		OUT_BATCH(0); /* Addr */
 		OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
 
-	OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
-	OUT_BATCH(GEN8_RASTER_FRONT_WINDING_CCW | GEN8_RASTER_CULL_NONE);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+static void gen6_emit_vertex_elements(struct intel_batchbuffer *batch)
+{
+	const int elements = 34;
+	int i;
 
-	OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS | ((2 * elements - 1)));
+
+	for (i = 0; i < elements; i++) {
+		if (i == 0) {
+			OUT_BATCH(VE0_VALID | i);
+			OUT_BATCH(
+				GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+				GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+				GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+				GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT
+				);
+		} else {
+			OUT_BATCH(0);
+			OUT_BATCH(0);
+		}
+	}
 }
 
-static void
-gen8_emit_ps(struct intel_batchbuffer *batch, uint32_t kernel) {
-	const int max_threads = 63;
-
-	OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
-	OUT_BATCH(/* XXX: I don't understand the BARYCENTRIC stuff, but it
-		   * appears we need it to put our setup data in the place we
-		   * expect (g6, see below) */
-		  GEN7_3DSTATE_PS_PERSPECTIVE_PIXEL_BARYCENTRIC);
+static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch)
+{
+	union {
+		float fval;
+		uint32_t uval;
+	} u;
 
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_PS | (11-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	unsigned offset;
 
-	OUT_BATCH(GEN7_3DSTATE_PS | (12-2));
-	OUT_BATCH_STATE_OFFSET(kernel);
-	OUT_BATCH(0); /* kernel hi */
-	OUT_BATCH(1 << GEN6_3DSTATE_WM_SAMPLER_COUNT_SHIFT |
-		  2 << GEN6_3DSTATE_WM_BINDING_TABLE_ENTRY_COUNT_SHIFT);
-	OUT_BATCH(0); /* scratch space stuff */
-	OUT_BATCH(0); /* scratch hi */
-	OUT_BATCH((max_threads - 1) << GEN8_3DSTATE_PS_MAX_THREADS_SHIFT |
-		  GEN6_3DSTATE_WM_16_DISPATCH_ENABLE);
-	OUT_BATCH(6 << GEN6_3DSTATE_WM_DISPATCH_START_GRF_0_SHIFT);
-	OUT_BATCH(0); // kernel 1
-	OUT_BATCH(0); /* kernel 1 hi */
-	OUT_BATCH(0); // kernel 2
-	OUT_BATCH(0); /* kernel 2 hi */
+	u.fval = 1.0f;
 
-	OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
-	OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
+	offset = intel_batch_state_offset(batch, 64);
+	OUT_STATE(0);
+	OUT_STATE(0);      /* Alpha reference value */
+	OUT_STATE(u.uval); /* Blend constant color RED */
+	OUT_STATE(u.uval); /* Blend constant color GREEN */
+	OUT_STATE(u.uval); /* Blend constant color BLUE */
+	OUT_STATE(u.uval); /* Blend constant color ALPHA */
 
-	OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
-	OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID | GEN8_PSX_ATTRIBUTE_ENABLE);
+	OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
+	OUT_BATCH_STATE_OFFSET(offset | 1);
 }
 
-static void
-gen8_emit_depth(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_DEPTH_BUFFER | (8-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
 
-	OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER | (5 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	offset = intel_batch_state_offset(batch, 64);
 
-	OUT_BATCH(GEN7_3DSTATE_STENCIL_BUFFER | (5 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-}
+	for (i = 0; i < 17; i++)
+		OUT_STATE(0);
 
-static void
-gen7_emit_clear(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CLEAR_PARAMS | (3-2));
-	OUT_BATCH(0);
-	OUT_BATCH(1); // clear valid
+	OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset | 1);
 }
 
-static void
-gen6_emit_drawing_rectangle(struct intel_batchbuffer *batch)
+static void gen8_emit_ps_extra(struct intel_batchbuffer *batch)
 {
-	OUT_BATCH(GEN6_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
-	OUT_BATCH(0xffffffff);
-	OUT_BATCH(0 | 0);
-	OUT_BATCH(0);
-}
+	OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
+	OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID |
+		  GEN8_PSX_ATTRIBUTE_ENABLE);
 
-static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
-	OUT_BATCH(_3DPRIM_RECTLIST);
 }
 
-/* Vertex elements MUST be defined before this according to spec */
-static void gen8_emit_primitive(struct intel_batchbuffer *batch)
+static void gen8_emit_ps_blend(struct intel_batchbuffer *batch)
 {
-	OUT_BATCH(GEN8_3DSTATE_VF_INSTANCING | (3 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DPRIMITIVE | (7-2));
-	OUT_BATCH(0);	/* gen8+ ignore the topology type field */
-	OUT_BATCH(3);	/* vertex count */
-	OUT_BATCH(0);	/*  We're specifying this instead with offset in GEN6_3DSTATE_VERTEX_BUFFERS */
-	OUT_BATCH(1);	/* single instance */
-	OUT_BATCH(0);	/* start instance location */
-	OUT_BATCH(0);	/* index buffer offset, ignored */
+	OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
+	OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
 }
 
-void gen8_setup_null_render_state(struct intel_batchbuffer *batch)
+static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch)
 {
-	uint32_t ps_sampler_state, ps_kernel_off, ps_binding_table;
-	uint32_t scissor_state;
-	uint32_t vertex_buffer;
-	uint32_t batch_end;
-	int ret;
+	unsigned offset;
 
-	ps_binding_table  = gen8_bind_surfaces(batch);
-	ps_sampler_state  = gen8_create_sampler(batch);
-	ps_kernel_off = gen8_fill_ps(batch, ps_kernel, sizeof(ps_kernel));
-	vertex_buffer = gen7_fill_vertex_buffer_data(batch);
-	cc.cc_state = gen6_create_cc_state(batch);
-	cc.blend_state = gen8_create_blend_state(batch);
-	viewport.cc_state = gen6_create_cc_viewport(batch);
-	viewport.sf_clip_state = gen7_create_sf_clip_viewport(batch);
-	scissor_state = gen6_create_scissor_rect(batch);
-	/* TODO: theree is other state which isn't setup */
-
-	/* Start emitting the commands. The order roughly follows the mesa blorp
-	 * order */
-	OUT_BATCH(GEN6_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+	offset = intel_batch_state_offset(batch, 32);
 
-	gen8_emit_sip(batch);
+	OUT_STATE((uint32_t)0.0f); /* Minimum depth */
+	OUT_STATE((uint32_t)0.0f); /* Maximum depth */
 
-	gen7_emit_push_constants(batch);
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
 
-	gen8_emit_state_base_address(batch);
+static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
 
-	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC);
-	OUT_BATCH_STATE_OFFSET(viewport.cc_state);
-	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP);
-	OUT_BATCH_STATE_OFFSET(viewport.sf_clip_state);
+	offset = intel_batch_state_offset(batch, 64);
 
-	gen7_emit_urb(batch);
+	for (i = 0; i < 16; i++)
+		OUT_STATE(0);
 
-	gen8_emit_cc(batch);
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
 
-	gen8_emit_multisample(batch);
+static void gen8_emit_primitive(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DPRIMITIVE | (7 - 2));
+	OUT_BATCH(4);   /* gen8+ ignore the topology type field */
+	OUT_BATCH(1);   /* vertex count */
+	OUT_BATCH(0);
+	OUT_BATCH(1);   /* single instance */
+	OUT_BATCH(0);   /* start instance location */
+	OUT_BATCH(0);   /* index buffer offset, ignored */
+}
 
-	gen8_emit_null_state(batch);
+int gen8_setup_null_render_state(struct intel_batchbuffer *batch)
+{
+	int ret;
+	int i;
 
-	OUT_BATCH(GEN7_3DSTATE_STREAMOUT | (5-2));
+#define GEN8_PIPE_CONTROL_GLOBAL_GTT   (1 << 24)
+
+	OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
+	OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT);
 	OUT_BATCH(0);
 	OUT_BATCH(0);
 	OUT_BATCH(0);
 	OUT_BATCH(0);
 
-	gen7_emit_clip(batch);
+	OUT_BATCH(GEN6_PIPELINE_SELECT | PIPELINE_SELECT_3D);
 
+	gen8_emit_wm(batch);
+	gen8_emit_ps(batch);
 	gen8_emit_sf(batch);
 
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS);
-	OUT_BATCH_STATE_OFFSET(ps_binding_table);
+	OUT_CMD(GEN7_3DSTATE_SBE, 4);
+	OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11);
 
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS);
-	OUT_BATCH_STATE_OFFSET(ps_sampler_state);
-
-	gen8_emit_ps(batch, ps_kernel_off);
+	gen8_emit_vs(batch);
+	gen8_emit_hs(batch);
 
-	OUT_BATCH(GEN6_3DSTATE_SCISSOR_STATE_POINTERS);
-	OUT_BATCH_STATE_OFFSET(scissor_state);
+	OUT_CMD(GEN7_3DSTATE_GS, 10);
+	OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5);
+	OUT_CMD(GEN7_3DSTATE_DS, 9);
+	OUT_CMD(GEN6_3DSTATE_CLIP, 4);
+	gen8_emit_raster(batch);
+	OUT_CMD(GEN7_3DSTATE_TE, 4);
+	OUT_CMD(GEN8_3DSTATE_VF, 2);
+	OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5);
+
+	gen8_emit_urb(batch);
+
+	OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2);
+	OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11);
+	OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3);
+	OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2);
 
-	gen8_emit_depth(batch);
+	gen8_emit_vf_topology(batch);
+	gen8_emit_so_decl_list(batch);
 
-	gen7_emit_clear(batch);
+	gen8_emit_so_buffer(batch, 0);
+	gen8_emit_so_buffer(batch, 1);
+	gen8_emit_so_buffer(batch, 2);
+	gen8_emit_so_buffer(batch, 3);
 
-	gen6_emit_drawing_rectangle(batch);
+	gen8_emit_state_base_address(batch);
 
-	gen7_emit_vertex_buffer(batch, vertex_buffer);
+	OUT_CMD(GEN6_STATE_SIP, 3);
+	OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4);
+	OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8);
+
+	gen8_emit_chroma_key(batch, 0);
+	gen8_emit_chroma_key(batch, 1);
+	gen8_emit_chroma_key(batch, 2);
+	gen8_emit_chroma_key(batch, 3);
+
+	OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3);
+	OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3);
+	OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3);
+	OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2);
+	OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 33);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 16 + 1);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 16 + 1);
+	OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5);
+
+	gen8_emit_vertex_buffers(batch);
 	gen6_emit_vertex_elements(batch);
 
-	gen8_emit_vf_topology(batch);
+	OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1); /* Enable */
+
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2);
+
+	gen8_emit_cc_state_pointers(batch);
+	gen8_emit_blend_state_pointers(batch);
+
+	gen8_emit_ps_extra(batch);
+	gen8_emit_ps_blend(batch);
+
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2);
+
+	OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2);
+
+	gen8_emit_viewport_state_pointers_cc(batch);
+	gen8_emit_viewport_state_pointers_sf_clip(batch);
+
 	gen8_emit_primitive(batch);
 
 	OUT_BATCH(MI_BATCH_BUFFER_END);
-- 
1.9.1
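
Editor's note on the length fields used throughout the patch above: most
command headers are OR'ed with a "DWORD count - 2" value, where the count
is the total size of the command including its header DWORD. The sketch
below is only an illustration of that convention, not code from the tool.
The names toy_batch, toy_out, toy_out_cmd and toy_len_field are
hypothetical, and toy_out_cmd is an assumption about what a helper such as
OUT_CMD might do (its definition is not part of this patch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX 256 /* hypothetical capacity, in DWORDs */

/* Hypothetical stand-in for a batch buffer: a flat array of DWORDs. */
struct toy_batch {
	uint32_t dw[BATCH_MAX];
	size_t used;
};

/* Append one DWORD to the toy batch. */
static void toy_out(struct toy_batch *b, uint32_t v)
{
	assert(b->used < BATCH_MAX);
	b->dw[b->used++] = v;
}

/*
 * Emit a command that is `total` DWORDs long, header included.
 * The length field in the header is (total - 2); the body is zeroed.
 */
static void toy_out_cmd(struct toy_batch *b, uint32_t opcode, unsigned total)
{
	unsigned i;

	assert(total >= 2);
	toy_out(b, opcode | (total - 2));
	for (i = 1; i < total; i++)
		toy_out(b, 0);
}

/*
 * Length field for a variable-size command made of `fixed` leading
 * DWORDs (header included) followed by `n` entries of `per_entry`
 * DWORDs each.
 */
static unsigned toy_len_field(unsigned fixed, unsigned per_entry, unsigned n)
{
	return fixed + per_entry * n - 2;
}
```

With these helpers, toy_len_field(3, 2, 128) gives the same value as the
(2 * num_decls) + 1 expression used for 3DSTATE_SO_DECL_LIST, and
toy_len_field(1, 4, 33) matches (4 * buffers) - 1 for
3DSTATE_VERTEX_BUFFERS.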


* [PATCH 6/6] tools/null_state_gen: Add GEN9 golden context batch buffer creation
  2014-10-09 16:54 [PATCH 1/6] tools/null_state_gen: Add copyrights Mika Kuoppala
                   ` (3 preceding siblings ...)
  2014-10-09 16:54 ` [PATCH 5/6] tools/null_state_gen: Add Gen8 golden state Mika Kuoppala
@ 2014-10-09 16:54 ` Mika Kuoppala
  2014-10-10 12:03   ` Damien Lespiau
  4 siblings, 1 reply; 8+ messages in thread
From: Mika Kuoppala @ 2014-10-09 16:54 UTC (permalink / raw)
  To: intel-gfx

From: Armin Reese <armin.c.reese@intel.com>

Modify 'null_state_gen' so that it can generate the GEN9 golden
context batch buffer source for SKL.

v2: - rebased on top of gen8 changes (Mika)
    - fixed state base address command size (Mika)
    - base address size macro as pages (Mika)

v3: - rebased on top of current master (Mika)
    - removed obsolete #includes (Mika)
    - added copyright (Mika)
    - render and component packing added (Mika)

Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Armin Reese <armin.c.reese@intel.com>
Cc: Volkin, Bradley D <bradley.d.volkin@intel.com>
Signed-off-by: Armin Reese <armin.c.reese@intel.com> (v1)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 lib/gen9_render.h                             |   2 +
 tools/null_state_gen/Makefile.am              |   3 +-
 tools/null_state_gen/intel_null_state_gen.c   |   8 +-
 tools/null_state_gen/intel_renderstate_gen9.c | 477 ++++++++++++++++++++++++++
 4 files changed, 487 insertions(+), 3 deletions(-)
 create mode 100644 tools/null_state_gen/intel_renderstate_gen9.c

diff --git a/lib/gen9_render.h b/lib/gen9_render.h
index 2cd7530..aac620a 100644
--- a/lib/gen9_render.h
+++ b/lib/gen9_render.h
@@ -4,6 +4,7 @@
 #include "gen8_render.h"
 
 #define GEN7_3DSTATE_VF				GEN6_3D(3, 0, 0x0c)
+#define GEN9_3DSTATE_COMPONENT_PACKING		GEN6_3D(3, 0, 0x55)
 
 #define GEN9_SBE_ACTIVE_COMPONENT_NONE		0
 #define GEN9_SBE_ACTIVE_COMPONENT_XY		1
@@ -11,5 +12,6 @@
 #define GEN9_SBE_ACTIVE_COMPONENT_XYZW		3
 
 #define GEN9_PIPELINE_SELECTION_MASK		(3 << 8)
+#define GEN9_PIPELINE_SELECT			(GEN6_3D(1, 1, 4) | (3 << 8))
 
 #endif
diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am
index 58fbd53..b131e0d 100644
--- a/tools/null_state_gen/Makefile.am
+++ b/tools/null_state_gen/Makefile.am
@@ -8,9 +8,10 @@ intel_null_state_gen_SOURCES = 	\
 	intel_renderstate_gen6.c \
 	intel_renderstate_gen7.c \
 	intel_renderstate_gen8.c \
+	intel_renderstate_gen9.c \
 	intel_null_state_gen.c
 
-gens := 6 7 8
+gens := 6 7 8 9
 
 h = /tmp/intel_renderstate_gen$$gen.c
 state_headers: intel_null_state_gen
diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c
index 353556a..8024ac3 100644
--- a/tools/null_state_gen/intel_null_state_gen.c
+++ b/tools/null_state_gen/intel_null_state_gen.c
@@ -35,14 +35,15 @@
 extern int gen6_setup_null_render_state(struct intel_batchbuffer *batch);
 extern int gen7_setup_null_render_state(struct intel_batchbuffer *batch);
 extern int gen8_setup_null_render_state(struct intel_batchbuffer *batch);
+extern int gen9_setup_null_render_state(struct intel_batchbuffer *batch);
 
 static int debug = 0;
 
 static void print_usage(char *s)
 {
 	fprintf(stderr, "%s: <gen>\n"
-		"     gen:     gen to generate for (6,7,8)\n",
-	       s);
+		"     gen:     gen to generate for (6,7,8,9)\n",
+		s);
 }
 
 /* Creates the intel_renderstate_genX.c file for the particular
@@ -132,6 +133,9 @@ static int do_generate(int gen)
 	case 8:
 		null_state_gen = gen8_setup_null_render_state;
 		break;
+	case 9:
+		null_state_gen = gen9_setup_null_render_state;
+		break;
 	}
 
 	if (null_state_gen == NULL) {
diff --git a/tools/null_state_gen/intel_renderstate_gen9.c b/tools/null_state_gen/intel_renderstate_gen9.c
new file mode 100644
index 0000000..6f808f8
--- /dev/null
+++ b/tools/null_state_gen/intel_renderstate_gen9.c
@@ -0,0 +1,477 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ *	Armin Reese <armin.c.reese@intel.com>
+ *	Mika Kuoppala <mika.kuoppala@intel.com>
+ */
+
+#include "intel_batchbuffer.h"
+#include <lib/gen9_render.h>
+#include <lib/intel_reg.h>
+
+static void gen8_emit_wm(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
+	OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION);
+}
+
+static void gen8_emit_ps(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0); /* kernel hi */
+	OUT_BATCH(GEN7_PS_SPF_MODE);
+	OUT_BATCH(0); /* scratch space stuff */
+	OUT_BATCH(0); /* scratch hi */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0); /* kernel 1 */
+	OUT_BATCH(0); /* kernel 1 hi */
+	OUT_BATCH(0); /* kernel 2 */
+	OUT_BATCH(0); /* kernel 2 hi */
+}
+
+static void gen8_emit_sf(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT |
+		  1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT |
+		  GEN7_SF_POINT_WIDTH_FROM_SOURCE |
+		  8);
+}
+
+static void gen8_emit_vs(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_hs(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_raster(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
+	OUT_BATCH(0);
+	OUT_BATCH(0.0);
+	OUT_BATCH(0.0);
+	OUT_BATCH(0.0);
+}
+
+static void gen8_emit_urb(struct intel_batchbuffer *batch)
+{
+	const int vs_entries = 64;
+	const int vs_size = 2;
+	const int vs_start = 4;
+
+	OUT_BATCH(GEN7_3DSTATE_URB_VS);
+	OUT_BATCH(vs_entries | ((vs_size - 1) << 16) | (vs_start << 25));
+
+	OUT_BATCH(GEN7_3DSTATE_URB_HS);
+	OUT_BATCH(0x0f << 25);
+
+	OUT_BATCH(GEN7_3DSTATE_URB_DS);
+	OUT_BATCH(0x0f << 25);
+
+	OUT_BATCH(GEN7_3DSTATE_URB_GS);
+	OUT_BATCH(0x0f << 25);
+}
+
+static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
+	OUT_BATCH(_3DPRIM_TRILIST);
+}
+
+static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch)
+{
+	const int num_decls = 128;
+	int i;
+
+	OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST |
+		(((2 * num_decls) + 3) - 2) /* DWORD count - 2 */);
+	OUT_BATCH(0);
+	OUT_BATCH(num_decls);
+
+	for (i = 0; i < num_decls; i++) {
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
+
+static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index)
+{
+	OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2));
+	OUT_BATCH(index << 29);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index)
+{
+	OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2));
+	OUT_BATCH(index << 30);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch)
+{
+	const int buffers = 33;
+	int i;
+
+	OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS |
+		(((4 * buffers) + 1) - 2) /* DWORD count - 2 */);
+
+	for (i = 0; i < buffers; i++) {
+		OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT |
+			  GEN7_VB0_BUFFER_ADDR_MOD_EN);
+		OUT_BATCH(0); /* Address */
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
+
+static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch)
+{
+	const int elements = 34;
+	int i;
+
+	OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS |
+		(((2 * elements) + 1) - 2) /* DWORD count - 2 */);
+
+	/* Element 0 */
+	OUT_BATCH(VE0_VALID);
+	OUT_BATCH(
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+	/* Elements 1 -> 33 */
+	for (i = 1; i < elements; i++) {
+		OUT_BATCH(0);
+		OUT_BATCH(0);
+	}
+}
+
+static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch)
+{
+	union {
+		float fval;
+		uint32_t uval;
+	} u;
+
+	unsigned offset;
+
+	u.fval = 1.0f;
+
+	offset = intel_batch_state_offset(batch, 64);
+	OUT_STATE(0);
+	OUT_STATE(0);      /* Alpha reference value */
+	OUT_STATE(u.uval); /* Blend constant color RED */
+	OUT_STATE(u.uval); /* Blend constant color GREEN */
+	OUT_STATE(u.uval); /* Blend constant color BLUE */
+	OUT_STATE(u.uval); /* Blend constant color ALPHA */
+
+	OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
+	OUT_BATCH_STATE_OFFSET(offset | 1);
+}
+
+static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
+
+	offset = intel_batch_state_offset(batch, 64);
+
+	for (i = 0; i < 17; i++)
+		OUT_STATE(0);
+
+	OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset | 1);
+}
+
+static void gen8_emit_ps_extra(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
+	OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID |
+		  GEN8_PSX_ATTRIBUTE_ENABLE);
+
+}
+
+static void gen8_emit_ps_blend(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
+	OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
+}
+
+static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+
+	offset = intel_batch_state_offset(batch, 32);
+
+	OUT_STATE((uint32_t)0.0f); /* Minimum depth */
+	OUT_STATE((uint32_t)0.0f); /* Maximum depth */
+
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
+
+static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
+
+	offset = intel_batch_state_offset(batch, 64);
+
+	for (i = 0; i < 16; i++)
+		OUT_STATE(0);
+
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
+
+static void gen8_emit_primitive(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DPRIMITIVE | (7 - 2));
+	OUT_BATCH(4);   /* gen8+ ignore the topology type field */
+	OUT_BATCH(1);   /* vertex count */
+	OUT_BATCH(0);
+	OUT_BATCH(1);   /* single instance */
+	OUT_BATCH(0);   /* start instance location */
+	OUT_BATCH(0);   /* index buffer offset, ignored */
+}
+
+static void gen9_emit_state_base_address(struct intel_batchbuffer *batch) {
+	const unsigned offset = 0;
+	OUT_BATCH(GEN6_STATE_BASE_ADDRESS |
+		(19 - 2) /* DWORD count - 2 */);
+
+	/* general state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0);
+
+	/* surface state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* dynamic state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* indirect state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* instruction state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* general state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* dynamic state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* indirect object buffer size */
+	OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY);
+	/* instruction buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+
+	/* bindless surface state base address */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	/* bindless surface state size */
+	OUT_BATCH(0);
+}
+
+/*
+ * Generate the batch buffer commands needed to initialize the 3D engine
+ * to its "golden state".
+ */
+int gen9_setup_null_render_state(struct intel_batchbuffer *batch)
+{
+	int ret = 0;
+	int i;
+
+#define GEN8_PIPE_CONTROL_GLOBAL_GTT   (1 << 24)
+	/* PIPE_CONTROL */
+	OUT_BATCH(GEN6_PIPE_CONTROL |
+	         (6 - 2));	/* DWORD count - 2 */
+	OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* PIPELINE_SELECT */
+	OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+
+	gen8_emit_wm(batch);
+	gen8_emit_ps(batch);
+	gen8_emit_sf(batch);
+
+	OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */
+	OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11);
+
+	gen8_emit_vs(batch);
+	gen8_emit_hs(batch);
+
+	OUT_CMD(GEN7_3DSTATE_GS, 10);
+	OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5);
+	OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */
+	OUT_CMD(GEN6_3DSTATE_CLIP, 4);
+	gen8_emit_raster(batch);
+	OUT_CMD(GEN7_3DSTATE_TE, 4);
+	OUT_CMD(GEN8_3DSTATE_VF, 2);
+	OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5);
+
+	/* URB States */
+	gen8_emit_urb(batch);
+
+	OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4);
+
+	/* Push Constants */
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2);
+
+	/* Constants */
+	OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11);
+
+	OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3);
+	OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2);
+	gen8_emit_vf_topology(batch);
+
+	/* Stream-out declaration list */
+	gen8_emit_so_decl_list(batch);
+
+	/* Stream-out buffers */
+	for (i = 0; i < 4; i++) {
+		gen8_emit_so_buffer(batch, i);
+	}
+
+	/* State base addresses */
+	gen9_emit_state_base_address(batch);
+
+	OUT_CMD(GEN6_STATE_SIP, 3);
+	OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4);
+	OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8);
+
+	/* Chroma key */
+	for (i = 0; i < 4; i++) {
+		gen8_emit_chroma_key(batch, i);
+	}
+
+	OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3);
+	OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3);
+	OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3);
+	OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2);
+	OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16);
+	OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5);
+
+	/* Vertex buffers */
+	gen8_emit_vertex_buffers(batch);
+	gen8_emit_vertex_elements(batch);
+	OUT_CMD(GEN9_3DSTATE_COMPONENT_PACKING, 5);
+
+	OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */);
+
+	gen8_emit_cc_state_pointers(batch);
+	gen8_emit_blend_state_pointers(batch);
+	gen8_emit_ps_extra(batch);
+	gen8_emit_ps_blend(batch);
+
+	/* 3D state sampler state pointers */
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2);
+
+	OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2);
+
+	gen8_emit_viewport_state_pointers_cc(batch);
+	gen8_emit_viewport_state_pointers_sf_clip(batch);
+
+	/* 3D state binding table pointers */
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2);
+
+	/* Launch 3D operation */
+	gen8_emit_primitive(batch);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+
+	return ret;
+}
-- 
1.9.1
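
A note on the "DWORD count - 2" convention used throughout the patch above: each multi-DWORD command header carries its total length minus two, so `GEN6_STATE_BASE_ADDRESS | (19 - 2)` is followed by 18 payload DWORDs, for 19 in total. The sketch below illustrates that bookkeeping only. Every name in it (`sketch_batch`, `sketch_emit_null_cmd`, `SKETCH_OPCODE`, and so on) is hypothetical and not part of null_state_gen, and it assumes the length lives in the low 8 bits of the header and that a helper like `OUT_CMD(cmd, len)` emits a header plus zeroed payload, neither of which this excerpt confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAPACITY 64
/* Placeholder opcode bits with the low byte clear (assumption). */
#define SKETCH_OPCODE   0x61010000u

/* Hypothetical stand-in for the tool's batch buffer; not the real struct. */
struct sketch_batch {
	uint32_t dw[SKETCH_CAPACITY];
	size_t used;	/* number of DWORDs emitted so far */
};

/* Build a command header: opcode bits OR'ed with (total DWORDs - 2). */
static uint32_t sketch_header(uint32_t opcode, uint32_t total_dwords)
{
	return opcode | (total_dwords - 2);
}

/* Recover the total DWORD count, assuming an 8-bit length field. */
static uint32_t sketch_cmd_len(uint32_t header)
{
	return (header & 0xffu) + 2;
}

/*
 * Emit a zero-filled command of total_dwords DWORDs, header included.
 * Returns 0 on success, -1 if the length is invalid or the buffer is full.
 */
static int sketch_emit_null_cmd(struct sketch_batch *b, uint32_t opcode,
				uint32_t total_dwords)
{
	uint32_t i;

	if (total_dwords < 2 || b->used + total_dwords > SKETCH_CAPACITY)
		return -1;

	b->dw[b->used++] = sketch_header(opcode, total_dwords);
	for (i = 1; i < total_dwords; i++)
		b->dw[b->used++] = 0;

	return 0;
}
```

Calling `sketch_emit_null_cmd(&b, SKETCH_OPCODE, 19)` on an empty buffer leaves `b.used` at 19 with a header whose low byte is 17, which is the same relationship the hand-written `gen9_emit_state_base_address()` has to maintain between its header and the DWORDs it emits.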

* Re: [PATCH 6/6] tools/null_state_gen: Add GEN9 golden context batch buffer creation
  2014-10-09 16:54 ` [PATCH 6/6] tools/null_state_gen: Add GEN9 golden context batch buffer creation Mika Kuoppala
@ 2014-10-10 12:03   ` Damien Lespiau
  2014-10-10 14:35     ` [PATCH] tools/null_state_gen: Add copyright notice to state output Mika Kuoppala
  0 siblings, 1 reply; 8+ messages in thread
From: Damien Lespiau @ 2014-10-10 12:03 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Thu, Oct 09, 2014 at 07:54:39PM +0300, Mika Kuoppala wrote:
> From: Armin Reese <armin.c.reese@intel.com>
> 
> Modifications to 'null_state_gen' so it can generate GEN9
> golden context batch buffer source for SKL.
> 
> v2: - rebased on top of gen8 changes (Mika)
>     - fixed state base address command size (Mika)
>     - base address size macro as pages (Mika)
> 
> v3: - rebased on top of current master (Mika)
>     - removed obsolete #includes (Mika)
>     - added copyright (Mika)
>     - render and component packing added (Mika)
> 
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Armin Reese <armin.c.reese@intel.com>
> Cc: Volkin, Bradley D <bradley.d.volkin@intel.com>
> Signed-off-by: Armin Reese <armin.c.reese@intel.com> (v1)
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>

Hi,

This looks really good, I think we should just push it. There was a
suggestion to make the null state generator output the copyright notice
along with the generated file as well, but that can be done whenever.

-- 
Damien

* [PATCH] tools/null_state_gen: Add copyright notice to state output
  2014-10-10 12:03   ` Damien Lespiau
@ 2014-10-10 14:35     ` Mika Kuoppala
  0 siblings, 0 replies; 8+ messages in thread
From: Mika Kuoppala @ 2014-10-10 14:35 UTC (permalink / raw)
  To: intel-gfx

along with info about what generated it.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 tools/null_state_gen/Makefile.am | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am
index b131e0d..bf8cbdb 100644
--- a/tools/null_state_gen/Makefile.am
+++ b/tools/null_state_gen/Makefile.am
@@ -1,3 +1,4 @@
+GPU_TOOLS_PATH := $(top_srcdir)
 AM_CPPFLAGS = -I$(top_srcdir)
 
 noinst_PROGRAMS = intel_null_state_gen
@@ -14,7 +15,14 @@ intel_null_state_gen_SOURCES = 	\
 gens := 6 7 8 9
 
 h = /tmp/intel_renderstate_gen$$gen.c
-state_headers: intel_null_state_gen
+states: intel_null_state_gen
 	for gen in $(gens); do \
-		./intel_null_state_gen $$gen >$(h) ;\
+		head -n 22 intel_null_state_gen.c >$(h); \
+		if test -d $(GPU_TOOLS_PATH)/.git; then \
+			echo -n " * Generated by: " >>$(h); \
+			git describe >>$(h); \
+		fi; \
+		echo " */" >>$(h); \
+		echo "" >>$(h); \
+		./intel_null_state_gen $$gen >>$(h); \
 	done
-- 
1.9.1
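
For readers following the Makefile change above: the rule copies the first 22 lines of intel_null_state_gen.c (presumably the copyright comment without its terminator), optionally appends a " * Generated by: " line from `git describe`, closes the comment, adds a blank line, and only then appends the generator output. The C sketch below reproduces just the text written between the copied header and the generated state. The function name `sketch_header_tail` and the version string used with it are hypothetical illustrations, not part of the tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the tail of the header comment into buf: an optional
 * "Generated by" line, then the comment terminator and a blank line.
 * Pass version == NULL to mimic a build outside a git checkout.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int sketch_header_tail(char *buf, size_t size, const char *version)
{
	int n;

	if (version != NULL)
		n = snprintf(buf, size, " * Generated by: %s\n */\n\n", version);
	else
		n = snprintf(buf, size, " */\n\n");

	if (n < 0 || (size_t)n >= size)
		return -1;

	return n;
}
```

With a made-up version such as "v1.8-42-gabc1234" the result is a " * Generated by: v1.8-42-gabc1234" line followed by the comment terminator and an empty line; with `NULL` only the terminator and the empty line are produced, matching the `test -d $(GPU_TOOLS_PATH)/.git` branch in the rule.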
