intel-gfx.lists.freedesktop.org archive mirror
* Sync the assembler with Mesa's opcode emission code
@ 2013-02-04 15:26 Damien Lespiau
  2013-02-04 15:26 ` [PATCH 01/90] build: Add CAIRO_FLAGS to the debugger compilation Damien Lespiau
                   ` (90 more replies)
  0 siblings, 91 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:26 UTC (permalink / raw)
  To: intel-gfx

Hey,

Some time ago, Daniel mentioned merging the assembler into intel-gpu-tools to
lower maintenance cost and have more eyes on the code.

This series is the result of that, combined with an effort to sync Mesa's
opcode emission code with the assembler. It's also available in my fdo i-g-t tree:

http://cgit.freedesktop.org/~damien/intel-gpu-tools/log/?h=wip/mesa-sync

The list of changes is pretty large, but straightforward. The big picture is:
   - Sync the brw_eu* files from Mesa and split them into a library,
   - Gradually transform the assembler code to be able to use Mesa's structures
     and functions. The big change here is to collect the operands in struct
     brw_reg to be able to use the other brw_*() functions from Mesa,
   - Add some nice little details to the assembler (like cleaning up the
     useless warning messages when compiling libva shaders, adding the line
     number to warning and error messages, adding region warnings, ...),
   - Port the few features that we need for the libva shaders to brw_eu_*,
   - Fix a few things I came across,
   - Lots of small refactorings.

The regression test used to make sure that this series is not too wrong was to
ensure libva's shaders generate the same opcodes. This also means "bug
compatible", as there are cases where the assembler outputs opcodes that don't
respect region constraints (for instance).
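
To make the "same opcodes" check concrete, here is a minimal sketch of a
comparison between a reference opcode stream and a newly generated one. It is
not code from this series: first_mismatch() and INSN_BYTES are hypothetical
names, and the sketch assumes every native instruction is 128 bits (four
DWORDs) and that both buffers hold whole instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: one native instruction is 128 bits, i.e. 16 bytes. */
#define INSN_BYTES 16

/*
 * Return the index of the first instruction that differs between the
 * reference stream and the candidate stream, or -1 when they are identical.
 * When one stream is a strict prefix of the other, the index of the first
 * instruction past the shorter stream is returned.
 */
static long first_mismatch(const uint8_t *ref, size_t ref_len,
                           const uint8_t *out, size_t out_len)
{
    /* Both buffers are expected to contain whole instructions only. */
    assert(ref_len % INSN_BYTES == 0);
    assert(out_len % INSN_BYTES == 0);

    size_t common = (ref_len < out_len ? ref_len : out_len) / INSN_BYTES;

    for (size_t i = 0; i < common; i++) {
        if (memcmp(ref + i * INSN_BYTES, out + i * INSN_BYTES,
                   INSN_BYTES) != 0)
            return (long)i;
    }

    return ref_len == out_len ? -1 : (long)common;
}
```

Reporting the instruction index rather than a byte offset makes it easier to
map a difference back to a line of the shader source.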

This means that there's still a (documented) diff between Mesa's copy of
brw_eu* and ours. But hopefully, with time, it'll shrink down.

There is a sister series for Mesa that I'll post later on the Mesa mailing
list: the sync goes both ways.


* [PATCH 01/90] build: Add CAIRO_FLAGS to the debugger compilation
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
@ 2013-02-04 15:26 ` Damien Lespiau
  2013-02-04 15:26 ` [PATCH 02/90] gitignore: Ignore TAGS files Damien Lespiau
                   ` (89 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:26 UTC (permalink / raw)
  To: intel-gfx

The library in lib/ exposes <cairo.h> in its main header and thus users
must be able to include it.
---
 debugger/Makefile.am |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/debugger/Makefile.am b/debugger/Makefile.am
index d76e2ac..f1e49b9 100644
--- a/debugger/Makefile.am
+++ b/debugger/Makefile.am
@@ -11,6 +11,7 @@ AM_CPPFLAGS = 			\
 AM_CFLAGS = 			\
 	$(DRM_CFLAGS) 		\
 	$(PCIACCESS_CFLAGS) 	\
+	$(CAIRO_CFLAGS)		\
 	$(CWARNFLAGS)
 
 LDADD = $(top_builddir)/lib/libintel_tools.la $(DRM_LIBS) $(PCIACCESS_LIBS) $(CAIRO_LIBS)
-- 
1.7.7.5


* [PATCH 02/90] gitignore: Ignore TAGS files
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
  2013-02-04 15:26 ` [PATCH 01/90] build: Add CAIRO_FLAGS to the debugger compilation Damien Lespiau
@ 2013-02-04 15:26 ` Damien Lespiau
  2013-02-04 15:26 ` [PATCH 03/90] build: Don't use AM_MAINTAINER_MODE Damien Lespiau
                   ` (88 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:26 UTC (permalink / raw)
  To: intel-gfx

TAGS files are generated with "make tags" to quickly jump through the
code. Ignore those by-products of automake/ctags.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 .gitignore |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/.gitignore b/.gitignore
index 611ab06..bfd59dc 100644
--- a/.gitignore
+++ b/.gitignore
@@ -79,6 +79,7 @@ core
 *.swo
 *.swp
 cscope.*
+TAGS
 
 /assembler/gram.c
 /assembler/gram.h
-- 
1.7.7.5


* [PATCH 03/90] build: Don't use AM_MAINTAINER_MODE
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
  2013-02-04 15:26 ` [PATCH 01/90] build: Add CAIRO_FLAGS to the debugger compilation Damien Lespiau
  2013-02-04 15:26 ` [PATCH 02/90] gitignore: Ignore TAGS files Damien Lespiau
@ 2013-02-04 15:26 ` Damien Lespiau
  2013-02-04 15:26 ` [PATCH 04/90] build: Only build the assembler if flex and bison are found Damien Lespiau
                   ` (87 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:26 UTC (permalink / raw)
  To: intel-gfx

This does not bring us anything these days: not using the macro at all
is the same thing as having it always on.

See this discussion:
https://www.redhat.com/archives/virt-tools-list/2010-October/msg00049.html

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 autogen.sh   |    2 +-
 configure.ac |    1 -
 2 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/autogen.sh b/autogen.sh
index 904cd67..354f254 100755
--- a/autogen.sh
+++ b/autogen.sh
@@ -9,4 +9,4 @@ cd $srcdir
 autoreconf -v --install || exit 1
 cd $ORIGDIR || exit $?
 
-$srcdir/configure --enable-maintainer-mode "$@"
+$srcdir/configure "$@"
diff --git a/configure.ac b/configure.ac
index 70a4651..fb3450b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -36,7 +36,6 @@ AC_GNU_SOURCE
 
 AM_INIT_AUTOMAKE([foreign dist-bzip2])
 AM_PATH_PYTHON([3],, [:])
-AM_MAINTAINER_MODE
 
 AC_PROG_CC
 AM_PROG_LEX
-- 
1.7.7.5


* [PATCH 04/90] build: Only build the assembler if flex and bison are found
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (2 preceding siblings ...)
  2013-02-04 15:26 ` [PATCH 03/90] build: Don't use AM_MAINTAINER_MODE Damien Lespiau
@ 2013-02-04 15:26 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 05/90] build: Add the debugger compilation status to the summary Damien Lespiau
                   ` (86 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:26 UTC (permalink / raw)
  To: intel-gfx

And start displaying a nice summary of what we are going to compile.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 Makefile.am  |    6 +++++-
 configure.ac |   15 +++++++++++++++
 2 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index a135531..60fc03e 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -21,7 +21,11 @@
 
 ACLOCAL_AMFLAGS = ${ACLOCAL_FLAGS}
 
-SUBDIRS = lib man tools scripts tests benchmarks demos assembler
+SUBDIRS = lib man tools scripts tests benchmarks demos
+
+if BUILD_ASSEMBLER
+SUBDIRS += assembler
+endif
 
 if BUILD_SHADER_DEBUGGER
 SUBDIRS += debugger
diff --git a/configure.ac b/configure.ac
index fb3450b..ff7e779 100644
--- a/configure.ac
+++ b/configure.ac
@@ -93,6 +93,12 @@ if test x"$udev" = xyes; then
 fi
 PKG_CHECK_MODULES(GLIB, glib-2.0)
 
+# can we build the assembler?
+AS_IF([test x"$LEX" != "x:" -a x"$YACC" != xyacc],
+      [enable_assembler=yes],
+      [enable_assembler=no])
+AM_CONDITIONAL(BUILD_ASSEMBLER, [test "x$enable_assembler" = xyes])
+
 # -----------------------------------------------------------------------------
 #			Configuration options
 # -----------------------------------------------------------------------------
@@ -155,3 +161,12 @@ AC_CONFIG_FILES([
 	assembler/intel-gen4asm.pc
 ])
 AC_OUTPUT
+
+# Print a summary of the compilation
+echo ""
+echo "Intel GPU tools"
+
+echo ""
+echo " • Tools:"
+echo "       Assembler: ${enable_assembler}"
+echo ""
-- 
1.7.7.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 05/90] build: Add the debugger compilation status to the summary
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (3 preceding siblings ...)
  2013-02-04 15:26 ` [PATCH 04/90] build: Only build the assembler if flex and bison are found Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 06/90] assembler: Sync brw_instruction's header with mesa's Damien Lespiau
                   ` (85 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 configure.ac |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/configure.ac b/configure.ac
index ff7e779..832c6e4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -109,6 +109,7 @@ AC_ARG_ENABLE(shader-debugger, AS_HELP_STRING([--enable-shader-debugger],
 
 # Shadder debugger depends on python3, intel-genasm and objcopy
 if test "x$BUILD_SHADER_DEBUGGER" != xno; then
+
     # Check Python 3 is installed
     if test "$PYTHON" = ":" ; then
 	if test "x$BUILD_SHADER_DEBUGGER" = xyes; then
@@ -138,6 +139,9 @@ if test "x$BUILD_SHADER_DEBUGGER" != xno; then
 fi
 
 AM_CONDITIONAL(BUILD_SHADER_DEBUGGER, [test "x$BUILD_SHADER_DEBUGGER" != xno])
+AS_IF([test "x$BUILD_SHADER_DEBUGGER" != xno],
+      [enable_debugger=yes], [enable_debugger=no])
+
 # -----------------------------------------------------------------------------
 
 # To build multithread code, gcc uses -pthread, Solaris Studio cc uses -mt
@@ -169,4 +173,5 @@ echo "Intel GPU tools"
 echo ""
 echo " • Tools:"
 echo "       Assembler: ${enable_assembler}"
+echo "       Debugger: ${enable_debugger}"
 echo ""
-- 
1.7.7.5

* [PATCH 06/90] assembler: Sync brw_instruction's header with mesa's
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (4 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 05/90] build: Add the debugger compilation status to the summary Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 07/90] assembler: Rename three_src_gen6 to da3src Damien Lespiau
                   ` (84 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Two changes here: a field has been renamed, and one bit of padding is
now cmpt_control, which flags compacted instructions.
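
For reference, here is a small sketch of where the two affected fields sit in
instruction DWORD 0. It is not part of the patch: the hdr_* names are
hypothetical, and the bit positions are derived from the mask comments this
patch removes (execution_size at 0x00e00000, acc_wr_control at 0x10000000,
the former pad0 at 0x20000000). Explicit masks are used because C bitfield
layout is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in DWORD 0, derived from the removed mask comments. */
#define HDR_OPCODE_MASK    0x0000007fu
#define HDR_CONDMOD_SHIFT  24
#define HDR_CONDMOD_MASK   (0xfu << HDR_CONDMOD_SHIFT)  /* bits 24..27 */
#define HDR_CMPT_CONTROL   (1u << 29)                   /* formerly pad0 */
#define HDR_SATURATE       (1u << 31)

/* Store a 4-bit value in the destreg__conditionalmod field. */
static uint32_t hdr_set_condmod(uint32_t dw0, unsigned value)
{
    assert(value <= 0xf);
    return (dw0 & ~HDR_CONDMOD_MASK) | ((uint32_t)value << HDR_CONDMOD_SHIFT);
}

/* Read the destreg__conditionalmod field back. */
static unsigned hdr_get_condmod(uint32_t dw0)
{
    return (dw0 & HDR_CONDMOD_MASK) >> HDR_CONDMOD_SHIFT;
}

/* Report whether the bit now named cmpt_control is set. */
static int hdr_cmpt_control(uint32_t dw0)
{
    return (dw0 & HDR_CMPT_CONTROL) != 0;
}
```

The same four bits carry the conditional modifier, the message register
index or the SFID depending on the opcode and generation, which is why the
grammar writes different values into the one field.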

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |   34 +++++++++++++++++++---------------
 assembler/disasm.c      |    8 ++++----
 assembler/gram.y        |   30 +++++++++++++++---------------
 3 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index 3a3b160..59b28fa 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1043,21 +1043,25 @@ struct brw_instruction
 {
    struct 
    {
-      GLuint opcode:7;			/* 0x0000007f */
-      GLuint pad:1;			/* 0x00000080 */ /* reserved for Opcode */
-      GLuint access_mode:1;		/* 0x00000100 */
-      GLuint mask_control:1;		/* 0x00000200 */
-      GLuint dependency_control:2;	/* 0x00000c00 */
-      GLuint compression_control:2;	/* 0x00003000 */
-      GLuint thread_control:2;		/* 0x0000c000 */
-      GLuint predicate_control:4;	/* 0x000f0000 */
-      GLuint predicate_inverse:1;	/* 0x00100000 */
-      GLuint execution_size:3;		/* 0x00e00000 */
-      GLuint sfid_destreg__conditionalmod:4; /* sfid - send on GEN6+, destreg - send on Prev GEN6, conditionalmod - others */
-      GLuint acc_wr_control:1;          /* 0x10000000 */
-      GLuint pad0:1;                    /* 0x20000000 */
-      GLuint debug_control:1;		/* 0x40000000 */
-      GLuint saturate:1;		/* 0x80000000 */
+      GLuint opcode:7;
+      GLuint pad:1;
+      GLuint access_mode:1;
+      GLuint mask_control:1;
+      GLuint dependency_control:2;
+      GLuint compression_control:2; /* gen6: quarter control */
+      GLuint thread_control:2;
+      GLuint predicate_control:4;
+      GLuint predicate_inverse:1;
+      GLuint execution_size:3;
+      /**
+       * Conditional Modifier for most instructions.  On Gen6+, this is also
+       * used for the SEND instruction's Message Target/SFID.
+       */
+      GLuint destreg__conditionalmod:4;
+      GLuint acc_wr_control:1;
+      GLuint cmpt_control:1;
+      GLuint debug_control:1;
+      GLuint saturate:1;
    } header;
 
    union {
diff --git a/assembler/disasm.c b/assembler/disasm.c
index 1ec6ae5..1cb0924 100644
--- a/assembler/disasm.c
+++ b/assembler/disasm.c
@@ -798,7 +798,7 @@ int disasm (FILE *file, struct brw_instruction *inst)
     if (inst->header.opcode != BRW_OPCODE_SEND &&
 	inst->header.opcode != BRW_OPCODE_SENDC)
 	err |= control (file, "conditional modifier", conditional_modifier,
-			inst->header.sfid_destreg__conditionalmod, NULL);
+			inst->header.destreg__conditionalmod, NULL);
 
     if (inst->header.opcode != BRW_OPCODE_NOP) {
 	string (file, "(");
@@ -808,7 +808,7 @@ int disasm (FILE *file, struct brw_instruction *inst)
 
     if (inst->header.opcode == BRW_OPCODE_SEND ||
 	inst->header.opcode == BRW_OPCODE_SENDC)
-	format (file, " %d", inst->header.sfid_destreg__conditionalmod);
+	format (file, " %d", inst->header.destreg__conditionalmod);
 
     if (opcode[inst->header.opcode].ndst > 0) {
 	pad (file, 16);
@@ -829,8 +829,8 @@ int disasm (FILE *file, struct brw_instruction *inst)
 	pad (file, 16);
 	space = 0;
 	err |= control (file, "target function", target_function,
-			inst->header.sfid_destreg__conditionalmod, &space);
-	switch (inst->header.sfid_destreg__conditionalmod) {
+			inst->header.destreg__conditionalmod, &space);
+	switch (inst->header.destreg__conditionalmod) {
 	case BRW_MESSAGE_TARGET_MATH:
 	    err |= control (file, "math function", math_function,
 			    inst->bits3.math.function, &space);
diff --git a/assembler/gram.y b/assembler/gram.y
index 2ed79c1..a762835 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -678,7 +678,7 @@ unaryinstruction:
 		{
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.sfid_destreg__conditionalmod = $3.cond;
+		  $$.header.destreg__conditionalmod = $3.cond;
 		  $$.header.saturate = $4;
 		  $$.header.execution_size = $5;
 		  set_instruction_options(&$$, &$8);
@@ -715,7 +715,7 @@ binaryinstruction:
 		{
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.sfid_destreg__conditionalmod = $3.cond;
+		  $$.header.destreg__conditionalmod = $3.cond;
 		  $$.header.saturate = $4;
 		  $$.header.execution_size = $5;
 		  set_instruction_options(&$$, &$9);
@@ -754,7 +754,7 @@ binaryaccinstruction:
 		{
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.sfid_destreg__conditionalmod = $3.cond;
+		  $$.header.destreg__conditionalmod = $3.cond;
 		  $$.header.saturate = $4;
 		  $$.header.execution_size = $5;
 		  set_instruction_options(&$$, &$9);
@@ -801,7 +801,7 @@ trinaryinstruction:
 		  $$.bits1.three_src_gen6.flag_subreg_nr = $1.bits2.da1.flag_subreg_nr;
 
 		  $$.header.opcode = $2;
-		  $$.header.sfid_destreg__conditionalmod = $3.cond;
+		  $$.header.destreg__conditionalmod = $3.cond;
 		  $$.header.saturate = $4;
 		  $$.header.execution_size = $5;
 
@@ -839,7 +839,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-		  $$.header.sfid_destreg__conditionalmod = $4; /* msg reg index */
+		  $$.header.destreg__conditionalmod = $4; /* msg reg index */
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$5) != 0)
 		    YYERROR;
@@ -869,9 +869,9 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 
 		  if (IS_GENp(5)) {
                       if (IS_GENp(6)) {
-                          $$.header.sfid_destreg__conditionalmod = $7.bits2.send_gen5.sfid;
+                          $$.header.destreg__conditionalmod = $7.bits2.send_gen5.sfid;
                       } else {
-                          $$.header.sfid_destreg__conditionalmod = $4; /* msg reg index */
+                          $$.header.destreg__conditionalmod = $4; /* msg reg index */
                           $$.bits2.send_gen5.sfid = $7.bits2.send_gen5.sfid;
                           $$.bits2.send_gen5.end_of_thread = $12.bits3.generic_gen5.end_of_thread;
                       }
@@ -882,7 +882,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       $$.bits3.generic_gen5.end_of_thread =
                           $12.bits3.generic_gen5.end_of_thread;
 		  } else {
-                      $$.header.sfid_destreg__conditionalmod = $4; /* msg reg index */
+                      $$.header.destreg__conditionalmod = $4; /* msg reg index */
                       $$.bits3.generic = $7.bits3.generic;
                       $$.bits3.generic.msg_length = $9;
                       $$.bits3.generic.response_length = $11;
@@ -895,7 +895,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-		  $$.header.sfid_destreg__conditionalmod = $5.reg_nr; /* msg reg index */
+		  $$.header.destreg__conditionalmod = $5.reg_nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 
@@ -918,7 +918,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-		  $$.header.sfid_destreg__conditionalmod = $5.reg_nr; /* msg reg index */
+		  $$.header.destreg__conditionalmod = $5.reg_nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$4) != 0)
@@ -948,7 +948,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-                  $$.header.sfid_destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
+                  $$.header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
 		  set_instruction_predicate(&$$, &$1);
 
 		  if (set_instruction_dest(&$$, &$4) != 0)
@@ -994,7 +994,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-                  $$.header.sfid_destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
+                  $$.header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
 		  set_instruction_predicate(&$$, &$1);
 
 		  if (set_instruction_dest(&$$, &$4) != 0)
@@ -1029,7 +1029,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-		  $$.header.sfid_destreg__conditionalmod = $5.reg_nr; /* msg reg index */
+		  $$.header.destreg__conditionalmod = $5.reg_nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$4) != 0)
@@ -1051,7 +1051,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-		  $$.header.sfid_destreg__conditionalmod = $5.reg_nr; /* msg reg index */
+		  $$.header.destreg__conditionalmod = $5.reg_nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 
@@ -1100,7 +1100,7 @@ mathinstruction: predicate MATH_INST execsize dst src srcimm math_function insto
 		{
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.sfid_destreg__conditionalmod = $7;
+		  $$.header.destreg__conditionalmod = $7;
 		  $$.header.execution_size = $3;
 		  set_instruction_options(&$$, &$8);
 		  set_instruction_predicate(&$$, &$1);
-- 
1.7.7.5


* [PATCH 07/90] assembler: Rename three_src_gen6 to da3src
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (5 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 06/90] assembler: Sync brw_instruction's header with mesa's Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 08/90] assembler: Rename dp_read_gen6 to gen6_dp_sampler_const_cache Damien Lespiau
                   ` (83 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Mesa's brw_structs.h names this field da3src. Sync with it.
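
As a reminder of what the renamed da3src fields encode, here is a hedged
sketch of how the src1 sub-register number is split across two DWORDs,
following the "v % 4" / "v / 4" arithmetic in
set_instruction_src1_three_src(). The struct and function names below are
hypothetical, and the input is assumed to be a byte offset, as suggested by
the "/ 4; // in DWORD" conversion in gram.y.

```c
#include <assert.h>

/* Hypothetical helper type: the two halves of src1_subreg_nr. */
struct src1_subreg {
    unsigned low;   /* lower 2 bits, stored in bits2 */
    unsigned high;  /* highest bit, stored in bits3 */
};

/* Split a byte offset into the low/high parts of the DWORD sub-register. */
static struct src1_subreg split_src1_subreg(unsigned byte_offset)
{
    unsigned v = byte_offset / 4;   /* sub-register number in DWORDs */

    assert(v < 8);                  /* 3 bits in total: 2 low + 1 high */

    struct src1_subreg s = { v % 4, v / 4 };
    return s;
}

/* Rebuild the 3-bit sub-register number from its two halves. */
static unsigned join_src1_subreg(struct src1_subreg s)
{
    return (s.high << 2) | s.low;
}
```

A byte offset of 20, for instance, is DWORD 5, which splits into low = 1 and
high = 1.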

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |    6 +++---
 assembler/gram.y        |   30 +++++++++++++++---------------
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index 59b28fa..e4fdb51 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1142,7 +1142,7 @@ struct brw_instruction
 	 GLuint dest_writemask:4;
 	 GLuint dest_subreg_nr:3;
 	 GLuint dest_reg_nr:8;
-      } three_src_gen6; /* Three-source-operator instructions for Gen6+ */
+      } da3src;
 
       struct
       {
@@ -1229,7 +1229,7 @@ struct brw_instruction
 	 GLuint src1_rep_ctrl:1;
 	 GLuint src1_swizzle:8;
 	 GLuint src1_subreg_nr_low:2; /* src1_subreg_nr spans on two DWORDs */
-      } three_src_gen6; /* Three-source-operator instructions for Gen6+ */
+      } da3src;
 
        struct 
        {
@@ -1315,7 +1315,7 @@ struct brw_instruction
 	 GLuint src2_subreg_nr:3;
 	 GLuint src2_reg_nr:8;
 	 GLuint pad1:2; /* reserved */
-      } three_src_gen6; /* Three-source-operator instructions for Gen6+ */
+      } da3src;
 
       struct
       {
diff --git a/assembler/gram.y b/assembler/gram.y
index a762835..9380f44 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -797,8 +797,8 @@ trinaryinstruction:
 
 		  $$.header.predicate_control = $1.header.predicate_control;
 		  $$.header.predicate_inverse = $1.header.predicate_inverse;
-		  $$.bits1.three_src_gen6.flag_reg_nr = $1.bits2.da1.flag_reg_nr;
-		  $$.bits1.three_src_gen6.flag_subreg_nr = $1.bits2.da1.flag_subreg_nr;
+		  $$.bits1.da3src.flag_reg_nr = $1.bits2.da1.flag_reg_nr;
+		  $$.bits1.da3src.flag_subreg_nr = $1.bits2.da1.flag_subreg_nr;
 
 		  $$.header.opcode = $2;
 		  $$.header.destreg__conditionalmod = $3.cond;
@@ -3064,11 +3064,11 @@ static int reg_type_2_to_3(int reg_type)
 int set_instruction_dest_three_src(struct brw_instruction *instr,
                                    struct dst_operand *dest)
 {
-	instr->bits1.three_src_gen6.dest_reg_file = dest->reg_file;
-	instr->bits1.three_src_gen6.dest_reg_nr = dest->reg_nr;
-	instr->bits1.three_src_gen6.dest_subreg_nr = get_subreg_address(dest->reg_file, dest->reg_type, dest->subreg_nr, dest->address_mode) / 4; // in DWORD
-	instr->bits1.three_src_gen6.dest_writemask = dest->writemask;
-	instr->bits1.three_src_gen6.dest_reg_type = reg_type_2_to_3(dest->reg_type);
+	instr->bits1.da3src.dest_reg_file = dest->reg_file;
+	instr->bits1.da3src.dest_reg_nr = dest->reg_nr;
+	instr->bits1.da3src.dest_subreg_nr = get_subreg_address(dest->reg_file, dest->reg_type, dest->subreg_nr, dest->address_mode) / 4; // in DWORD
+	instr->bits1.da3src.dest_writemask = dest->writemask;
+	instr->bits1.da3src.dest_reg_type = reg_type_2_to_3(dest->reg_type);
 	return 0;
 }
 
@@ -3079,9 +3079,9 @@ int set_instruction_src0_three_src(struct brw_instruction *instr,
 		reset_instruction_src_region(instr, src);
 	}
 	// TODO: supporting src0 swizzle, src0 modifier, src0 rep_ctrl
-	instr->bits1.three_src_gen6.src_reg_type = reg_type_2_to_3(src->reg_type);
-	instr->bits2.three_src_gen6.src0_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode) / 4; // in DWORD
-	instr->bits2.three_src_gen6.src0_reg_nr = src->reg_nr;
+	instr->bits1.da3src.src_reg_type = reg_type_2_to_3(src->reg_type);
+	instr->bits2.da3src.src0_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode) / 4; // in DWORD
+	instr->bits2.da3src.src0_reg_nr = src->reg_nr;
 	return 0;
 }
 
@@ -3093,9 +3093,9 @@ int set_instruction_src1_three_src(struct brw_instruction *instr,
 	}
 	// TODO: supporting src1 swizzle, src1 modifier, src1 rep_ctrl
 	int v = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode) / 4; // in DWORD
-	instr->bits2.three_src_gen6.src1_subreg_nr_low = v % 4; // lower 2 bits
-	instr->bits3.three_src_gen6.src1_subreg_nr_high = v / 4; // highest bit
-	instr->bits3.three_src_gen6.src1_reg_nr = src->reg_nr;
+	instr->bits2.da3src.src1_subreg_nr_low = v % 4; // lower 2 bits
+	instr->bits3.da3src.src1_subreg_nr_high = v / 4; // highest bit
+	instr->bits3.da3src.src1_reg_nr = src->reg_nr;
 	return 0;
 }
 
@@ -3106,8 +3106,8 @@ int set_instruction_src2_three_src(struct brw_instruction *instr,
 		reset_instruction_src_region(instr, src);
 	}
 	// TODO: supporting src2 swizzle, src2 modifier, src2 rep_ctrl
-	instr->bits3.three_src_gen6.src2_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode) / 4; // in DWORD
-	instr->bits3.three_src_gen6.src2_reg_nr = src->reg_nr;
+	instr->bits3.da3src.src2_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode) / 4; // in DWORD
+	instr->bits3.da3src.src2_reg_nr = src->reg_nr;
 	return 0;
 }
 
-- 
1.7.7.5


* [PATCH 08/90] assembler: Rename dp_read_gen6 to gen6_dp_sampler_const_cache
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (6 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 07/90] assembler: Rename three_src_gen6 to da3src Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 09/90] assembler: Rename dp_gen6 to gen6_dp and sync with Mesa's Damien Lespiau
                   ` (82 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The purpose of this commit is to synchronize opcode definitions across
the gen4asm assembler and Mesa.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |   27 ++++++++++++++++-----------
 assembler/gram.y        |    6 +++---
 2 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index e4fdb51..e2be147 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1469,17 +1469,22 @@ struct brw_instruction
            GLuint end_of_thread:1;
        } dp_read_gen5;
 
-       struct {
-           GLuint binding_table_index:8;
-           GLuint msg_control:5;  
-           GLuint msg_type:3;  
-           GLuint pad0:3;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } dp_read_gen6;
+      /**
+       * Message for the Sandybridge Sampler Cache or Constant Cache Data Port.
+       *
+       * See the Sandybridge PRM, Volume 4 Part 1, Section 3.9.2.1.1.
+       **/
+      struct {
+	 GLuint binding_table_index:8;
+	 GLuint msg_control:5;
+	 GLuint msg_type:3;
+	 GLuint pad0:3;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } gen6_dp_sampler_const_cache;
 
        struct {
            GLuint binding_table_index:8;
diff --git a/assembler/gram.y b/assembler/gram.y
index 9380f44..70caeb4 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1260,9 +1260,9 @@ msgtarget:	NULL_TOKEN
                       $$.bits2.send_gen5.sfid = 
                           BRW_MESSAGE_TARGET_DP_SC;
                       $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.dp_read_gen6.binding_table_index = $3;
-                      $$.bits3.dp_read_gen6.msg_control = $7;
-                      $$.bits3.dp_read_gen6.msg_type = $9;
+                      $$.bits3.gen6_dp_sampler_const_cache.binding_table_index = $3;
+                      $$.bits3.gen6_dp_sampler_const_cache.msg_control = $7;
+                      $$.bits3.gen6_dp_sampler_const_cache.msg_type = $9;
 		  } else if (IS_GENx(5)) {
                       $$.bits2.send_gen5.sfid = 
                           BRW_MESSAGE_TARGET_DATAPORT_READ;
-- 
1.7.7.5


* [PATCH 09/90] assembler: Rename dp_gen6 to gen6_dp and sync with Mesa's
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (7 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 08/90] assembler: Rename dp_read_gen6 to gen6_dp_sampler_const_cache Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 10/90] assembler: Rename dp_gen7 to gen7_dp and sync it " Damien Lespiau
                   ` (81 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The purpose of this commit is to synchronize opcode definitions across
the gen4asm assembler and Mesa.

I had to drop how Mesa splits msg_control as the current assembly
language gives access to the whole msg_control field.

Recompiling the shaders of the xorg driver and of libva's intel driver
doesn't show any difference in the assembly created.
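
To show what "the whole msg_control field" means in practice, here is a
minimal sketch that packs the low part of a gen6_dp message descriptor by
hand. The function names are hypothetical, and the shifts are simply the
running sum of the field widths in the struct below (8 + 5 + 4 + 1 ...),
assuming fields are allocated from the least significant bit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack the fields the grammar sets for a Gen6 data port message:
 *   bits  0..7   binding_table_index
 *   bits  8..12  msg_control (written as one 5-bit value, not split)
 *   bits 13..16  msg_type
 *   bit  17      send_commit_msg
 */
static uint32_t gen6_dp_desc(unsigned binding_table_index,
                             unsigned msg_control,
                             unsigned msg_type,
                             unsigned send_commit_msg)
{
    assert(binding_table_index <= 0xff);
    assert(msg_control <= 0x1f);
    assert(msg_type <= 0xf);
    assert(send_commit_msg <= 1);

    return (uint32_t)binding_table_index
         | (uint32_t)msg_control << 8
         | (uint32_t)msg_type << 13
         | (uint32_t)send_commit_msg << 17;
}

/* Extract the 5-bit msg_control field from a descriptor. */
static unsigned gen6_dp_msg_control(uint32_t desc)
{
    return (desc >> 8) & 0x1f;
}
```

Keeping msg_control as a single 5-bit value matches what the assembly syntax
exposes, whereas splitting it into sub-fields would only change how the same
bits are named.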

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |   34 ++++++++++++++++++++++------------
 assembler/gram.y        |    8 ++++----
 2 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index e2be147..e571052 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1513,18 +1513,28 @@ struct brw_instruction
            GLuint end_of_thread:1;
        } dp_write_gen6;
 
-       struct {
-           GLuint binding_table_index:8;
-           GLuint msg_control:5;
-           GLuint msg_type:4;    
-           GLuint send_commit_msg:1; /* ignore on read message */
-           GLuint pad0:1;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } dp_gen6;
+      /**
+       * Message for the Sandybridge Render Cache Data Port.
+       *
+       * Most fields are defined in the Sandybridge PRM, Volume 4 Part 1,
+       * Section 3.9.2.1.1: Message Descriptor.
+       *
+       * "Slot Group Select" and "Last Render Target" are part of the
+       * 5-bit message control for Render Target Write messages.  See
+       * Section 3.9.9.2.1 of the same volume.
+       */
+      struct {
+	 GLuint binding_table_index:8;
+	 GLuint msg_control:5;
+	 GLuint msg_type:4;
+	 GLuint send_commit_msg:1;
+	 GLuint pad0:1;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } gen6_dp;
 
        struct {
            GLuint binding_table_index:8;
diff --git a/assembler/gram.y b/assembler/gram.y
index 70caeb4..df26393 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1471,10 +1471,10 @@ msgtarget:	NULL_TOKEN
                             YYERROR;
                         }
 
-                        $$.bits3.dp_gen6.send_commit_msg = $11;
-                        $$.bits3.dp_gen6.binding_table_index = $9;
-                        $$.bits3.dp_gen6.msg_control = $7;
-                        $$.bits3.dp_gen6.msg_type = $5;
+                        $$.bits3.gen6_dp.send_commit_msg = $11;
+                        $$.bits3.gen6_dp.binding_table_index = $9;
+                        $$.bits3.gen6_dp.msg_control = $7;
+                        $$.bits3.gen6_dp.msg_type = $5;
                     } else if (!IS_GENp(5)) {
                         fprintf (stderr, "Gen6- doesn't support data port for sampler/render/constant/data cache\n");
                         YYERROR;
-- 
1.7.7.5


* [PATCH 10/90] assembler: Rename dp_gen7 to gen7_dp and sync it with Mesa's
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (8 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 09/90] assembler: Rename dp_gen6 to gen6_dp and sync with Mesa's Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 11/90] assembler: Remove struct dp_write_gen6 and struct use gen6_dp Damien Lespiau
                   ` (80 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The purpose of this commit is to synchronize opcode definitions across
the gen4asm assembler and mesa.

I had to drop Mesa's split of msg_control, as the current assembly
language gives access to the whole msg_control field.

Recompiling the xorg and the intel driver of libva shaders doesn't show
any difference in the assembly created.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |   31 ++++++++++++++++++++-----------
 assembler/gram.y        |   26 +++++++++++++-------------
 2 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index e571052..cfb3028 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1536,17 +1536,26 @@ struct brw_instruction
 	 GLuint end_of_thread:1;
       } gen6_dp;
 
-       struct {
-           GLuint binding_table_index:8;
-           GLuint msg_control:6;
-           GLuint msg_type:4;    
-           GLuint category:1;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } dp_gen7;
+      /**
+       * Message for any of the Gen7 Data Port caches.
+       *
+       * Most fields are defined in BSpec volume 5c.2 Data Port / Messages /
+       * Data Port Messages / Message Descriptor.  Once again, "Slot Group
+       * Select" and "Last Render Target" are part of the 6-bit message
+       * control for Render Target Writes.
+       */
+      struct {
+	 GLuint binding_table_index:8;
+	 GLuint msg_control:6;
+	 GLuint msg_type:4;
+	 GLuint category:1;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad2:2;
+	 GLuint end_of_thread:1;
+      } gen7_dp;
+      /** @} */
 
        struct {
            GLuint opcode:1;
diff --git a/assembler/gram.y b/assembler/gram.y
index df26393..1295d60 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1253,9 +1253,9 @@ msgtarget:	NULL_TOKEN
                       $$.bits2.send_gen5.sfid = 
                           BRW_MESSAGE_TARGET_DP_SC;
                       $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.dp_gen7.binding_table_index = $3;
-                      $$.bits3.dp_gen7.msg_control = $7;
-                      $$.bits3.dp_gen7.msg_type = $9;
+                      $$.bits3.gen7_dp.binding_table_index = $3;
+                      $$.bits3.gen7_dp.msg_control = $7;
+                      $$.bits3.gen7_dp.msg_type = $9;
 		  } else if (IS_GENx(6)) {
                       $$.bits2.send_gen5.sfid = 
                           BRW_MESSAGE_TARGET_DP_SC;
@@ -1287,9 +1287,9 @@ msgtarget:	NULL_TOKEN
                       $$.bits2.send_gen5.sfid =
                           BRW_MESSAGE_TARGET_DP_RC;
                       $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.dp_gen7.binding_table_index = $3;
-                      $$.bits3.dp_gen7.msg_control = $5;
-                      $$.bits3.dp_gen7.msg_type = $7;
+                      $$.bits3.gen7_dp.binding_table_index = $3;
+                      $$.bits3.gen7_dp.msg_control = $5;
+                      $$.bits3.gen7_dp.msg_type = $7;
                   } else if (IS_GENx(6)) {
                       $$.bits2.send_gen5.sfid =
                           BRW_MESSAGE_TARGET_DP_RC;
@@ -1332,9 +1332,9 @@ msgtarget:	NULL_TOKEN
                       $$.bits2.send_gen5.sfid =
                           BRW_MESSAGE_TARGET_DP_RC;
                       $$.bits3.generic_gen5.header_present = ($11 != 0);
-                      $$.bits3.dp_gen7.binding_table_index = $3;
-                      $$.bits3.dp_gen7.msg_control = $5;
-                      $$.bits3.dp_gen7.msg_type = $7;
+                      $$.bits3.gen7_dp.binding_table_index = $3;
+                      $$.bits3.gen7_dp.msg_control = $5;
+                      $$.bits3.gen7_dp.msg_type = $7;
 		  } else if (IS_GENx(6)) {
                       $$.bits2.send_gen5.sfid =
                           BRW_MESSAGE_TARGET_DP_RC;
@@ -1459,10 +1459,10 @@ msgtarget:	NULL_TOKEN
                             YYERROR;
                         }
 
-                        $$.bits3.dp_gen7.category = $11;
-                        $$.bits3.dp_gen7.binding_table_index = $9;
-                        $$.bits3.dp_gen7.msg_control = $7;
-                        $$.bits3.dp_gen7.msg_type = $5;
+                        $$.bits3.gen7_dp.category = $11;
+                        $$.bits3.gen7_dp.binding_table_index = $9;
+                        $$.bits3.gen7_dp.msg_control = $7;
+                        $$.bits3.gen7_dp.msg_type = $5;
                     } else if (IS_GENx(6)) {
                         if ($3 != BRW_MESSAGE_TARGET_DP_SC &&
                             $3 != BRW_MESSAGE_TARGET_DP_RC &&
-- 
1.7.7.5


* [PATCH 11/90] assembler: Remove struct dp_write_gen6 and struct use gen6_dp
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (9 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 10/90] assembler: Rename dp_gen7 to gen7_dp and sync it " Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 12/90] assembler: Rename gen5 DP pixel_scoreboard_clear to last_render_target Damien Lespiau
                   ` (79 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

We ended up with 2 structures that were exactly the same, so just use
one, which happens to be the one Mesa has.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |   13 -------------
 assembler/gram.y        |   16 ++++++++--------
 2 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index cfb3028..0e3c430 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1500,19 +1500,6 @@ struct brw_instruction
            GLuint end_of_thread:1;
        } dp_write_gen5;
 
-       struct {
-           GLuint binding_table_index:8;
-           GLuint msg_control:5;
-           GLuint msg_type:4;    
-           GLuint send_commit_msg:1;
-           GLuint pad0:1;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } dp_write_gen6;
-
       /**
        * Message for the Sandybridge Render Cache Data Port.
        *
diff --git a/assembler/gram.y b/assembler/gram.y
index 1295d60..538f8f7 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1298,10 +1298,10 @@ msgtarget:	NULL_TOKEN
                        * message header
                        */
                       $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.dp_write_gen6.binding_table_index = $3;
-                      $$.bits3.dp_write_gen6.msg_control = $5;
-                     $$.bits3.dp_write_gen6.msg_type = $7;
-                      $$.bits3.dp_write_gen6.send_commit_msg = $9;
+                      $$.bits3.gen6_dp.binding_table_index = $3;
+                      $$.bits3.gen6_dp.msg_control = $5;
+                     $$.bits3.gen6_dp.msg_type = $7;
+                      $$.bits3.gen6_dp.send_commit_msg = $9;
 		  } else if (IS_GENx(5)) {
                       $$.bits2.send_gen5.sfid =
                           BRW_MESSAGE_TARGET_DATAPORT_WRITE;
@@ -1339,10 +1339,10 @@ msgtarget:	NULL_TOKEN
                       $$.bits2.send_gen5.sfid =
                           BRW_MESSAGE_TARGET_DP_RC;
                       $$.bits3.generic_gen5.header_present = ($11 != 0);
-                      $$.bits3.dp_write_gen6.binding_table_index = $3;
-                      $$.bits3.dp_write_gen6.msg_control = $5;
-                     $$.bits3.dp_write_gen6.msg_type = $7;
-                      $$.bits3.dp_write_gen6.send_commit_msg = $9;
+                      $$.bits3.gen6_dp.binding_table_index = $3;
+                      $$.bits3.gen6_dp.msg_control = $5;
+                     $$.bits3.gen6_dp.msg_type = $7;
+                      $$.bits3.gen6_dp.send_commit_msg = $9;
 		  } else if (IS_GENx(5)) {
                       $$.bits2.send_gen5.sfid =
                           BRW_MESSAGE_TARGET_DATAPORT_WRITE;
-- 
1.7.7.5


* [PATCH 12/90] assembler: Rename gen5 DP pixel_scoreboard_clear to last_render_target
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (10 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 11/90] assembler: Remove struct dp_write_gen6 and struct use gen6_dp Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 13/90] assembler: Rename branch to branch_gen6 Damien Lespiau
                   ` (78 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The purpose of this commit is to synchronize opcode definitions across
the gen4asm assembler and mesa.
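
The renamed field is the top bit of the 4-bit message control value written
in the assembly source. As a minimal sketch, the split done in gram.y and
the recombination done in disasm.c can be written as two small C functions.
The masks and shifts are taken from the diff below; the struct and function
names are hypothetical.

```c
#include <assert.h>

/* The two pieces the pre-Gen6 write descriptor stores separately. */
struct dp_write_split {
    unsigned last_render_target;  /* bit 3 of the source value */
    unsigned msg_control;         /* bits 2:0 of the source value */
};

/* Split the 4-bit value from the assembly source, as gram.y does. */
static struct dp_write_split dp_write_split_control(unsigned value)
{
    struct dp_write_split s;

    s.last_render_target = (value & 0x8) >> 3;
    s.msg_control = value & 0x7;
    return s;
}

/* Rebuild the 4-bit value for printing, as disasm.c does. */
static unsigned dp_write_join_control(struct dp_write_split s)
{
    return (s.last_render_target << 3) | s.msg_control;
}
```

Splitting and joining are inverses for any value that fits in four bits, so
the rename does not change the encoded instruction.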

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |    4 ++--
 assembler/disasm.c      |    2 +-
 assembler/gram.y        |   16 ++++++++--------
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index 0e3c430..511c326 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1368,7 +1368,7 @@ struct brw_instruction
       struct {
 	 GLuint binding_table_index:8;
 	 GLuint msg_control:3;
-	 GLuint pixel_scoreboard_clear:1;
+	 GLuint last_render_target:1;
 	 GLuint msg_type:3;    
 	 GLuint send_commit_msg:1;
 	 GLuint response_length:4;
@@ -1489,7 +1489,7 @@ struct brw_instruction
        struct {
            GLuint binding_table_index:8;
            GLuint msg_control:3;
-           GLuint pixel_scoreboard_clear:1;
+           GLuint last_render_target:1;
            GLuint msg_type:3;    
            GLuint send_commit_msg:1;
            GLuint pad0:3;
diff --git a/assembler/disasm.c b/assembler/disasm.c
index 1cb0924..6260a4e 100644
--- a/assembler/disasm.c
+++ b/assembler/disasm.c
@@ -854,7 +854,7 @@ int disasm (FILE *file, struct brw_instruction *inst)
 	case BRW_MESSAGE_TARGET_DATAPORT_WRITE:
 	    format (file, " (%d, %d, %d, %d)",
 		    inst->bits3.dp_write.binding_table_index,
-		    (inst->bits3.dp_write.pixel_scoreboard_clear << 3) |
+		    (inst->bits3.dp_write.last_render_target << 3) |
 		    inst->bits3.dp_write.msg_control,
 		    inst->bits3.dp_write.msg_type,
 		    inst->bits3.dp_write.send_commit_msg);
diff --git a/assembler/gram.y b/assembler/gram.y
index 538f8f7..f71f960 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1307,7 +1307,7 @@ msgtarget:	NULL_TOKEN
                           BRW_MESSAGE_TARGET_DATAPORT_WRITE;
                       $$.bits3.generic_gen5.header_present = 1;
                       $$.bits3.dp_write_gen5.binding_table_index = $3;
-                      $$.bits3.dp_write_gen5.pixel_scoreboard_clear = ($5 & 0x8) >> 3;
+                      $$.bits3.dp_write_gen5.last_render_target = ($5 & 0x8) >> 3;
                       $$.bits3.dp_write_gen5.msg_control = $5 & 0x7;
                       $$.bits3.dp_write_gen5.msg_type = $7;
                       $$.bits3.dp_write_gen5.send_commit_msg = $9;
@@ -1316,10 +1316,10 @@ msgtarget:	NULL_TOKEN
                           BRW_MESSAGE_TARGET_DATAPORT_WRITE;
                       $$.bits3.dp_write.binding_table_index = $3;
                       /* The msg control field of brw_struct.h is split into
-                       * msg control and pixel_scoreboard_clear, even though
-                       * pixel_scoreboard_clear isn't common to all write messages.
+                       * msg control and last_render_target, even though
+                       * last_render_target isn't common to all write messages.
                        */
-                      $$.bits3.dp_write.pixel_scoreboard_clear = ($5 & 0x8) >> 3;
+                      $$.bits3.dp_write.last_render_target = ($5 & 0x8) >> 3;
                       $$.bits3.dp_write.msg_control = $5 & 0x7;
                       $$.bits3.dp_write.msg_type = $7;
                       $$.bits3.dp_write.send_commit_msg = $9;
@@ -1348,7 +1348,7 @@ msgtarget:	NULL_TOKEN
                           BRW_MESSAGE_TARGET_DATAPORT_WRITE;
                       $$.bits3.generic_gen5.header_present = ($11 != 0);
                       $$.bits3.dp_write_gen5.binding_table_index = $3;
-                      $$.bits3.dp_write_gen5.pixel_scoreboard_clear = ($5 & 0x8) >> 3;
+                      $$.bits3.dp_write_gen5.last_render_target = ($5 & 0x8) >> 3;
                       $$.bits3.dp_write_gen5.msg_control = $5 & 0x7;
                       $$.bits3.dp_write_gen5.msg_type = $7;
                       $$.bits3.dp_write_gen5.send_commit_msg = $9;
@@ -1357,10 +1357,10 @@ msgtarget:	NULL_TOKEN
                           BRW_MESSAGE_TARGET_DATAPORT_WRITE;
                       $$.bits3.dp_write.binding_table_index = $3;
                       /* The msg control field of brw_struct.h is split into
-                       * msg control and pixel_scoreboard_clear, even though
-                       * pixel_scoreboard_clear isn't common to all write messages.
+                       * msg control and last_render_target, even though
+                       * last_render_target isn't common to all write messages.
                        */
-                      $$.bits3.dp_write.pixel_scoreboard_clear = ($5 & 0x8) >> 3;
+                      $$.bits3.dp_write.last_render_target = ($5 & 0x8) >> 3;
                       $$.bits3.dp_write.msg_control = $5 & 0x7;
                       $$.bits3.dp_write.msg_type = $7;
                       $$.bits3.dp_write.send_commit_msg = $9;
-- 
1.7.7.5


* [PATCH 13/90] assembler: Rename branch to branch_gen6
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (11 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 12/90] assembler: Rename gen5 DP pixel_scoreboard_clear to last_render_target Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 14/90] assembler: Rename branch_2_offset to break_cont Damien Lespiau
                   ` (77 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The purpose of this commit is to synchronize opcode definitions across
the gen4asm assembler and mesa.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |   17 ++++++++++++-----
 assembler/main.c        |    2 +-
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index 511c326..7b0b0da 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1125,6 +1125,18 @@ struct brw_instruction
 	 GLuint dest_address_mode:1;
       } ia16; /* indirect align16 */
 
+      struct {
+	 GLuint dest_reg_file:2;
+	 GLuint dest_reg_type:3;
+	 GLuint src0_reg_file:2;
+	 GLuint src0_reg_type:3;
+	 GLuint src1_reg_file:2;
+	 GLuint src1_reg_type:3;
+	 GLuint pad:1;
+
+	 GLint jump_count:16;
+      } branch_gen6;
+
       struct
       {
 	 GLuint dest_reg_file:1; /* used in Gen6, deleted in Gen7 */
@@ -1144,11 +1156,6 @@ struct brw_instruction
 	 GLuint dest_reg_nr:8;
       } da3src;
 
-      struct
-      {
-	 GLuint pad:16;
-	 GLint JIP:16;
-      } branch; /* conditional branch JIP for Gen6 only */
    } bits1;
 
 
diff --git a/assembler/main.c b/assembler/main.c
index 15ed517..ae271b4 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -448,7 +448,7 @@ int main(int argc, char **argv)
 		    if(opcode == BRW_OPCODE_CALL || opcode == BRW_OPCODE_JMPI)
 			entry->instruction.bits3.JIP = offset; // for CALL, JMPI
 		    else
-			entry->instruction.bits1.branch.JIP = offset; // for CASE,ELSE,FORK,IF,WHILE
+			entry->instruction.bits1.branch_gen6.jump_count = offset; // for CASE,ELSE,FORK,IF,WHILE
 		} else if(IS_GENp(7)) {
 		    int opcode = entry->instruction.header.opcode;
 		    /* Gen7 JMPI Restrictions in bspec:
-- 
1.7.7.5


* [PATCH 14/90] assembler: Rename branch_2_offset to break_cont
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (12 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 13/90] assembler: Rename branch to branch_gen6 Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 15/90] assembler: Rename bits3.id and bits3.fd Damien Lespiau
                   ` (76 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Once again, import the equivalent struct from mesa.
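
Both offsets in the imported struct are signed 16-bit bit-fields, so
backward jumps are stored as negative distances. The sketch below shows
that behaviour under two assumptions: the fields are declared
"signed int" so the sign does not depend on the compiler (the patch uses
plain "int", which GCC treats as signed), and the names break_cont_bits,
break_cont_set() and jump_fits_16() are hypothetical helpers, not assembler
code.

```c
#include <assert.h>

/* Same two fields as break_cont, with the signedness spelled out. */
struct break_cont_bits {
    signed int jip:16;  /* jump target if all channels are disabled */
    signed int uip:16;  /* where an enabled channel resumes execution */
};

/* Return 1 if a jump distance is representable in a signed 16-bit field. */
static int jump_fits_16(int distance)
{
    return distance >= -32768 && distance <= 32767;
}

/* Store both distances; callers should check jump_fits_16() first. */
static struct break_cont_bits break_cont_set(int jip, int uip)
{
    struct break_cont_bits b;

    b.jip = jip;
    b.uip = uip;
    return b;
}
```

A relocation pass along the lines of main.c could call jump_fits_16() before
storing a distance, since an out-of-range value would otherwise be truncated
without any diagnostic.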

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |   16 +++++++++++++---
 assembler/main.c        |    8 ++++----
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index 7b0b0da..2815256 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1324,11 +1324,21 @@ struct brw_instruction
 	 GLuint pad1:2; /* reserved */
       } da3src;
 
+      /* This is also used for gen7 IF/ELSE instructions */
       struct
       {
-	 GLint JIP:16; /* Gen7 bspec: both the JIP and UIP are signed 16-bit numbers */
-	 GLint UIP:16;
-      } branch_2_offset; /* for Gen6, Gen7 2-offsets branch; for Gen7 1-offset branch */
+	 /* Signed jump distance to the ip to jump to if all channels
+	  * are disabled after the break or continue.  It should point
+	  * to the end of the innermost control flow block, as that's
+	  * where some channel could get re-enabled.
+	  */
+	 int jip:16;
+
+	 /* Signed jump distance to the location to resume execution
+	  * of this channel if it's enabled for the break or continue.
+	  */
+	 int uip:16;
+      } break_cont;
 
       GLint JIP; /* used by Gen6 CALL instructions; Gen7 JMPI */
 
diff --git a/assembler/main.c b/assembler/main.c
index ae271b4..1b411c7 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -424,8 +424,8 @@ int main(int argc, char **argv)
 
 	    if (inst->second_reloc_offset) {
 		// this is a branch instruction with two offset arguments
-		entry->instruction.bits3.branch_2_offset.JIP = jump_distance(inst->first_reloc_offset);
-		entry->instruction.bits3.branch_2_offset.UIP = jump_distance(inst->second_reloc_offset);
+		entry->instruction.bits3.break_cont.jip = jump_distance(inst->first_reloc_offset);
+		entry->instruction.bits3.break_cont.uip = jump_distance(inst->second_reloc_offset);
 	    } else if (inst->first_reloc_offset) {
 		// this is a branch instruction with one offset argument
 		int offset = inst->first_reloc_offset;
@@ -441,7 +441,7 @@ int main(int argc, char **argv)
 		if(!IS_GENp(6)) {
 		    entry->instruction.bits3.JIP = offset;
 		    if(entry->instruction.header.opcode == BRW_OPCODE_ELSE)
-			entry->instruction.bits3.branch_2_offset.UIP = 1; /* Set the istack pop count, which must always be 1. */
+			entry->instruction.bits3.break_cont.uip = 1; /* Set the istack pop count, which must always be 1. */
 		} else if(IS_GENx(6)) {
 		    /* TODO: endif JIP pos is not in Gen6 spec. may be bits1 */
 		    int opcode = entry->instruction.header.opcode;
@@ -457,7 +457,7 @@ int main(int argc, char **argv)
 		    if(opcode == BRW_OPCODE_JMPI)
 			entry->instruction.bits3.JIP = offset;
 		    else
-			entry->instruction.bits3.branch_2_offset.JIP = offset;
+			entry->instruction.bits3.break_cont.jip = offset;
 		}
 	    }
 	}
-- 
1.7.7.5


* [PATCH 15/90] assembler: Rename bits3.id and bits3.fd
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (13 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 14/90] assembler: Rename branch_2_offset to break_cont Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 16/90] assembler: Adopt brw_structs.h from mesa Damien Lespiau
                   ` (75 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

As always, to sync with mesa.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |    4 ++--
 assembler/disasm.c      |    6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index 2815256..c442f4a 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1599,8 +1599,8 @@ struct brw_instruction
        } generic_gen5;
 
       GLuint ud;
-      GLint id;
-      GLfloat fd;
+      GLint d;
+      GLfloat f;
    } bits3;
 
    char *first_reloc_target, *second_reloc_target; // first for JIP, second for UIP
diff --git a/assembler/disasm.c b/assembler/disasm.c
index 6260a4e..b6fdc2e 100644
--- a/assembler/disasm.c
+++ b/assembler/disasm.c
@@ -628,13 +628,13 @@ static int imm (FILE *file, GLuint type, struct brw_instruction *inst) {
 	format (file, "0x%08xUD", inst->bits3.ud);
 	break;
     case BRW_REGISTER_TYPE_D:
-	format (file, "%dD", inst->bits3.id);
+	format (file, "%dD", inst->bits3.d);
 	break;
     case BRW_REGISTER_TYPE_UW:
 	format (file, "0x%04xUW", (uint16_t) inst->bits3.ud);
 	break;
     case BRW_REGISTER_TYPE_W:
-	format (file, "%dW", (int16_t) inst->bits3.id);
+	format (file, "%dW", (int16_t) inst->bits3.d);
 	break;
     case BRW_REGISTER_TYPE_UB:
 	format (file, "0x%02xUB", (int8_t) inst->bits3.ud);
@@ -646,7 +646,7 @@ static int imm (FILE *file, GLuint type, struct brw_instruction *inst) {
 	format (file, "0x%08xV", inst->bits3.ud);
 	break;
     case BRW_REGISTER_TYPE_F:
-	format (file, "%-gF", inst->bits3.fd);
+	format (file, "%-gF", inst->bits3.f);
     }
     return 0;
 }
-- 
1.7.7.5


* [PATCH 16/90] assembler: Adopt brw_structs.h from mesa
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (14 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 15/90] assembler: Rename bits3.id and bits3.fd Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 17/90] assembler: Remove trailing white spaces from brw_structs.h Damien Lespiau
                   ` (74 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Finally, merge our brw_structs.h with Mesa's. One detail came up in this
last commit: the msg_control field of data port message descriptors.

Mesa's msg_control field is sometimes split into message-specific
fields, whereas the assembler (at least for recent generations) exposes the
full msg_control value in the send instruction.

As libva's shaders encode the full msg_control value in their send
instructions, I've chosen not to take the split msg_control from Mesa.
It's absolutely possible to have a patch fixing that divergence at some
later point.

I've also kept a hack introduced with Ironlake to avoid rewriting
shaders (which encode msg_control in the text, remember). This
creates another difference with Mesa:

-	 GLuint msg_control:3;
-	 GLuint msg_type:3;
+	 GLuint msg_control:4;
+	 GLuint msg_type:2;

Once again, I've made sure that re-generating libva's shaders doesn't show
any difference.
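
The 3/3 versus 4/2 widths quoted above describe the same six descriptor
bits cut at a different position. As a rough sketch, assuming the fields are
allocated from the least significant bit upward (which is what GCC does on
little-endian targets), the two views can be modelled with plain shifts. The
function names pack_3_3() and pack_4_2() are hypothetical.

```c
#include <assert.h>

/* Six bits seen as a 3-bit msg_control followed by a 3-bit msg_type. */
static unsigned pack_3_3(unsigned msg_control, unsigned msg_type)
{
    return (msg_control & 0x7) | ((msg_type & 0x7) << 3);
}

/* The same six bits seen as a 4-bit msg_control and a 2-bit msg_type. */
static unsigned pack_4_2(unsigned msg_control, unsigned msg_type)
{
    return (msg_control & 0xf) | ((msg_type & 0x3) << 4);
}
```

For instance, pack_4_2(0xd, 2) and pack_3_3(5, 5) produce the same bits, so
a shader written against one split keeps assembling to the same opcode only
as long as the struct keeps that split.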

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h | 1365 ++++++++++++++++++++++-------------------------
 1 files changed, 624 insertions(+), 741 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index c442f4a..218f11e 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1,91 +1,38 @@
- /**************************************************************************
- * 
- * Copyright 2005 Tungsten Graphics, Inc., Cedar Park, Texas.
- * All Rights Reserved.
- * 
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation the rights to use, copy, modify, merge, publish,
- * distribute, sub license, and/or sell copies of the Software, and to
- * permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- * 
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial portions
- * of the Software.
- * 
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
- * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
- * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
- * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
- * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
- * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- * 
- **************************************************************************/
+/*
+ Copyright (C) Intel Corp.  2006.  All Rights Reserved.
+ Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
+ develop this 3D driver.
+ 
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+ 
+ The above copyright notice and this permission notice (including the
+ next paragraph) shall be included in all copies or substantial
+ portions of the Software.
+ 
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ 
+ **********************************************************************/
+ /*
+  * Authors:
+  *   Keith Whitwell <keith@tungstengraphics.com>
+  */
+        
 
 #ifndef BRW_STRUCTS_H
 #define BRW_STRUCTS_H
 
-/* Command packets:
- */
-struct header 
-{
-   GLuint length:16; 
-   GLuint opcode:16; 
-} bits;
-
-
-union header_union
-{
-   struct header bits;
-   GLuint dword;
-};
-
-struct brw_3d_control
-{   
-   struct 
-   {
-      GLuint length:8;
-      GLuint notify_enable:1;
-      GLuint pad:3;
-      GLuint wc_flush_enable:1; 
-      GLuint depth_stall_enable:1; 
-      GLuint operation:2; 
-      GLuint opcode:16; 
-   } header;
-   
-   struct
-   {
-      GLuint pad:2;
-      GLuint dest_addr_type:1; 
-      GLuint dest_addr:29; 
-   } dest;
-   
-   GLuint dword2;   
-   GLuint dword3;   
-};
-
-
-struct brw_3d_primitive
-{
-   struct
-   {
-      GLuint length:8; 
-      GLuint pad:2;
-      GLuint topology:5; 
-      GLuint indexed:1; 
-      GLuint opcode:16; 
-   } header;
-
-   GLuint verts_per_instance;  
-   GLuint start_vert_location;  
-   GLuint instance_count;  
-   GLuint start_instance_location;  
-   GLuint base_vert_location;  
-};
-
 /* These seem to be passed around as function args, so it works out
  * better to keep them as #defines:
  */
@@ -94,243 +41,6 @@ struct brw_3d_primitive
 #define BRW_INHIBIT_FLUSH_RENDER_CACHE 0x4
 #define BRW_FLUSH_SNAPSHOT_COUNTERS    0x8
 
-struct brw_mi_flush
-{
-   GLuint flags:4;
-   GLuint pad:12;
-   GLuint opcode:16;
-};
-
-struct brw_vf_statistics
-{
-   GLuint statistics_enable:1;
-   GLuint pad:15;
-   GLuint opcode:16;
-};
-
-
-
-struct brw_binding_table_pointers
-{
-   struct header header;
-   GLuint vs; 
-   GLuint gs; 
-   GLuint clp; 
-   GLuint sf; 
-   GLuint wm; 
-};
-
-
-struct brw_blend_constant_color
-{
-   struct header header;
-   GLfloat blend_constant_color[4];  
-};
-
-
-struct brw_depthbuffer
-{
-   union header_union header;
-   
-   union {
-      struct {
-	 GLuint pitch:18; 
-	 GLuint format:3; 
-	 GLuint pad:4;
-	 GLuint depth_offset_disable:1; 
-	 GLuint tile_walk:1; 
-	 GLuint tiled_surface:1; 
-	 GLuint pad2:1;
-	 GLuint surface_type:3; 
-      } bits;
-      GLuint dword;
-   } dword1;
-   
-   GLuint dword2_base_addr; 
- 
-   union {
-      struct {
-	 GLuint pad:1;
-	 GLuint mipmap_layout:1; 
-	 GLuint lod:4; 
-	 GLuint width:13; 
-	 GLuint height:13; 
-      } bits;
-      GLuint dword;
-   } dword3;
-
-   union {
-      struct {
-	 GLuint pad:12;
-	 GLuint min_array_element:9; 
-	 GLuint depth:11; 
-      } bits;
-      GLuint dword;
-   } dword4;
-};
-
-struct brw_drawrect
-{
-   struct header header;
-   GLuint xmin:16; 
-   GLuint ymin:16; 
-   GLuint xmax:16; 
-   GLuint ymax:16; 
-   GLuint xorg:16;  
-   GLuint yorg:16;  
-};
-
-
-
-
-struct brw_global_depth_offset_clamp
-{
-   struct header header;
-   GLfloat depth_offset_clamp;  
-};
-
-struct brw_indexbuffer
-{   
-   union {
-      struct
-      {
-	 GLuint length:8; 
-	 GLuint index_format:2; 
-	 GLuint cut_index_enable:1; 
-	 GLuint pad:5; 
-	 GLuint opcode:16; 
-      } bits;
-      GLuint dword;
-
-   } header;
-
-   GLuint buffer_start; 
-   GLuint buffer_end; 
-};
-
-
-struct brw_line_stipple
-{   
-   struct header header;
-  
-   struct
-   {
-      GLuint pattern:16; 
-      GLuint pad:16;
-   } bits0;
-   
-   struct
-   {
-      GLuint repeat_count:9; 
-      GLuint pad:7;
-      GLuint inverse_repeat_count:16; 
-   } bits1;
-};
-
-
-struct brw_pipelined_state_pointers
-{
-   struct header header;
-   
-   struct {
-      GLuint pad:5;
-      GLuint offset:27; 
-   } vs;
-   
-   struct
-   {
-      GLuint enable:1;
-      GLuint pad:4;
-      GLuint offset:27; 
-   } gs;
-   
-   struct
-   {
-      GLuint enable:1;
-      GLuint pad:4;
-      GLuint offset:27; 
-   } clp;
-   
-   struct
-   {
-      GLuint pad:5;
-      GLuint offset:27; 
-   } sf;
-
-   struct
-   {
-      GLuint pad:5;
-      GLuint offset:27; 
-   } wm;
-   
-   struct
-   {
-      GLuint pad:5;
-      GLuint offset:27; /* KW: check me! */
-   } cc;
-};
-
-
-struct brw_polygon_stipple_offset
-{
-   struct header header;
-
-   struct {
-      GLuint y_offset:5; 
-      GLuint pad:3;
-      GLuint x_offset:5; 
-      GLuint pad0:19;
-   } bits0;
-};
-
-
-
-struct brw_polygon_stipple
-{
-   struct header header;
-   GLuint stipple[32];
-};
-
-
-
-struct brw_pipeline_select
-{
-   struct
-   {
-      GLuint pipeline_select:1;   
-      GLuint pad:15;
-      GLuint opcode:16;   
-   } header;
-};
-
-
-struct brw_pipe_control
-{
-   struct
-   {
-      GLuint length:8;
-      GLuint notify_enable:1;
-      GLuint pad:2;
-      GLuint instruction_state_cache_flush_enable:1;
-      GLuint write_cache_flush_enable:1;
-      GLuint depth_stall_enable:1;
-      GLuint post_sync_operation:2;
-
-      GLuint opcode:16;
-   } header;
-
-   struct
-   {
-      GLuint pad:2;
-      GLuint dest_addr_type:1;
-      GLuint dest_addr:29;
-   } bits1;
-
-   GLuint data0;
-   GLuint data1;
-};
-
-
 struct brw_urb_fence
 {
    struct
@@ -358,107 +68,11 @@ struct brw_urb_fence
    {
       GLuint sf_fence:10;  
       GLuint vf_fence:10;  
-      GLuint cs_fence:10;  
-      GLuint pad:2;
-   } bits1;
-};
-
-struct brw_constant_buffer_state /* previously brw_command_streamer */
-{
-   struct header header;
-
-   struct
-   {
-      GLuint nr_urb_entries:3;   
+      GLuint cs_fence:11;  
       GLuint pad:1;
-      GLuint urb_entry_size:5;   
-      GLuint pad0:23;
-   } bits0;
-};
-
-struct brw_constant_buffer
-{
-   struct
-   {
-      GLuint length:8;   
-      GLuint valid:1;   
-      GLuint pad:7;
-      GLuint opcode:16;   
-   } header;
-
-   struct
-   {
-      GLuint buffer_length:6;   
-      GLuint buffer_address:26;  
-   } bits0;
-};
-
-struct brw_state_base_address
-{
-   struct header header;
-
-   struct
-   {
-      GLuint modify_enable:1;
-      GLuint pad:4;
-      GLuint general_state_address:27;  
-   } bits0;
-
-   struct
-   {
-      GLuint modify_enable:1;
-      GLuint pad:4;
-      GLuint surface_state_address:27;  
    } bits1;
-
-   struct
-   {
-      GLuint modify_enable:1;
-      GLuint pad:4;
-      GLuint indirect_object_state_address:27;  
-   } bits2;
-
-   struct
-   {
-      GLuint modify_enable:1;
-      GLuint pad:11;
-      GLuint general_state_upper_bound:20;  
-   } bits3;
-
-   struct
-   {
-      GLuint modify_enable:1;
-      GLuint pad:11;
-      GLuint indirect_object_state_upper_bound:20;  
-   } bits4;
 };
 
-struct brw_state_prefetch
-{
-   struct header header;
-
-   struct
-   {
-      GLuint prefetch_count:3;   
-      GLuint pad:3;
-      GLuint prefetch_pointer:26;  
-   } bits0;
-};
-
-struct brw_system_instruction_pointer
-{
-   struct header header;
-
-   struct
-   {
-      GLuint pad:4;
-      GLuint system_instruction_pointer:28;  
-   } bits0;
-};
-
-
-
-
 /* State structs for the various fixed function units:
  */
 
@@ -468,7 +82,7 @@ struct thread0
    GLuint pad0:1;
    GLuint grf_reg_count:3; 
    GLuint pad1:2;
-   GLuint kernel_start_pointer:26; 
+   GLuint kernel_start_pointer:26; /* Offset from GENERAL_STATE_BASE */
 };
 
 struct thread1
@@ -514,7 +128,22 @@ struct thread3
 struct brw_clip_unit_state
 {
    struct thread0 thread0;
-   struct thread1 thread1;
+   struct
+   {
+      GLuint pad0:7;
+      GLuint sw_exception_enable:1;
+      GLuint pad1:3;
+      GLuint mask_stack_exception_enable:1;
+      GLuint pad2:1;
+      GLuint illegal_op_exception_enable:1;
+      GLuint pad3:2;
+      GLuint floating_point_mode:1;
+      GLuint thread_priority:1;
+      GLuint binding_table_entry_count:8;
+      GLuint pad4:5;
+      GLuint single_program_flow:1;
+   } thread1;
+
    struct thread2 thread2;
    struct thread3 thread3;
 
@@ -527,8 +156,8 @@ struct brw_clip_unit_state
       GLuint pad1:1;
       GLuint urb_entry_allocation_size:5; 
       GLuint pad2:1;
-      GLuint max_threads:6; 	/* may be less */
-      GLuint pad3:1;
+      GLuint max_threads:5; 	/* may be less */
+      GLuint pad3:2;
    } thread4;   
       
    struct
@@ -537,7 +166,7 @@ struct brw_clip_unit_state
       GLuint clip_mode:3; 
       GLuint userclip_enable_flags:8; 
       GLuint userclip_must_clip:1; 
-      GLuint pad1:1;
+      GLuint negative_w_clip_test:1;
       GLuint guard_band_enable:1; 
       GLuint viewport_z_clip_enable:1; 
       GLuint viewport_xy_clip_enable:1; 
@@ -559,7 +188,105 @@ struct brw_clip_unit_state
    GLfloat viewport_ymax;  
 };
 
+struct gen6_blend_state
+{
+   struct {
+      GLuint dest_blend_factor:5;
+      GLuint source_blend_factor:5;
+      GLuint pad3:1;
+      GLuint blend_func:3;
+      GLuint pad2:1;
+      GLuint ia_dest_blend_factor:5;
+      GLuint ia_source_blend_factor:5;
+      GLuint pad1:1;
+      GLuint ia_blend_func:3;
+      GLuint pad0:1;
+      GLuint ia_blend_enable:1;
+      GLuint blend_enable:1;
+   } blend0;
+
+   struct {
+      GLuint post_blend_clamp_enable:1;
+      GLuint pre_blend_clamp_enable:1;
+      GLuint clamp_range:2;
+      GLuint pad0:4;
+      GLuint x_dither_offset:2;
+      GLuint y_dither_offset:2;
+      GLuint dither_enable:1;
+      GLuint alpha_test_func:3;
+      GLuint alpha_test_enable:1;
+      GLuint pad1:1;
+      GLuint logic_op_func:4;
+      GLuint logic_op_enable:1;
+      GLuint pad2:1;
+      GLuint write_disable_b:1;
+      GLuint write_disable_g:1;
+      GLuint write_disable_r:1;
+      GLuint write_disable_a:1;
+      GLuint pad3:1;
+      GLuint alpha_to_coverage_dither:1;
+      GLuint alpha_to_one:1;
+      GLuint alpha_to_coverage:1;
+   } blend1;
+};
+
+struct gen6_color_calc_state
+{
+   struct {
+      GLuint alpha_test_format:1;
+      GLuint pad0:14;
+      GLuint round_disable:1;
+      GLuint bf_stencil_ref:8;
+      GLuint stencil_ref:8;
+   } cc0;
 
+   union {
+      GLfloat alpha_ref_f;
+      struct {
+	 GLuint ui:8;
+	 GLuint pad0:24;
+      } alpha_ref_fi;
+   } cc1;
+
+   GLfloat constant_r;
+   GLfloat constant_g;
+   GLfloat constant_b;
+   GLfloat constant_a;
+};
+
+struct gen6_depth_stencil_state
+{
+   struct {
+      GLuint pad0:3;
+      GLuint bf_stencil_pass_depth_pass_op:3;
+      GLuint bf_stencil_pass_depth_fail_op:3;
+      GLuint bf_stencil_fail_op:3;
+      GLuint bf_stencil_func:3;
+      GLuint bf_stencil_enable:1;
+      GLuint pad1:2;
+      GLuint stencil_write_enable:1;
+      GLuint stencil_pass_depth_pass_op:3;
+      GLuint stencil_pass_depth_fail_op:3;
+      GLuint stencil_fail_op:3;
+      GLuint stencil_func:3;
+      GLuint stencil_enable:1;
+   } ds0;
+
+   struct {
+      GLuint bf_stencil_write_mask:8;
+      GLuint bf_stencil_test_mask:8;
+      GLuint stencil_write_mask:8;
+      GLuint stencil_test_mask:8;
+   } ds1;
+
+   struct {
+      GLuint pad0:26;
+      GLuint depth_write_enable:1;
+      GLuint depth_test_func:3;
+      GLuint pad1:1;
+      GLuint depth_test_enable:1;
+   } ds2;
+};
 
 struct brw_cc_unit_state
 {
@@ -617,7 +344,7 @@ struct brw_cc_unit_state
    struct
    {
       GLuint pad0:5; 
-      GLuint cc_viewport_state_offset:27; 
+      GLuint cc_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
    } cc4;
    
    struct
@@ -653,8 +380,6 @@ struct brw_cc_unit_state
    } cc7;
 };
 
-
-
 struct brw_sf_unit_state
 {
    struct thread0 thread0;
@@ -679,7 +404,7 @@ struct brw_sf_unit_state
       GLuint front_winding:1; 
       GLuint viewport_transform:1; 
       GLuint pad0:3;
-      GLuint sf_viewport_state_offset:27; 
+      GLuint sf_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
    } sf5;
    
    struct
@@ -704,7 +429,8 @@ struct brw_sf_unit_state
       GLuint use_point_size_state:1; 
       GLuint subpixel_precision:1; 
       GLuint sprite_point:1; 
-      GLuint pad0:11;
+      GLuint pad0:10;
+      GLuint aa_line_distance_mode:1;
       GLuint trifan_pv:2; 
       GLuint linestrip_pv:2; 
       GLuint tristrip_pv:2; 
@@ -713,6 +439,13 @@ struct brw_sf_unit_state
 
 };
 
+struct gen6_scissor_rect
+{
+   GLuint xmin:16;
+   GLuint ymin:16;
+   GLuint xmax:16;
+   GLuint ymax:16;
+};
 
 struct brw_gs_unit_state
 {
@@ -723,14 +456,16 @@ struct brw_gs_unit_state
 
    struct
    {
-      GLuint pad0:10;
+      GLuint pad0:8;
+      GLuint rendering_enable:1; /* for Ironlake */
+      GLuint pad4:1;
       GLuint stats_enable:1; 
       GLuint nr_urb_entries:7; 
       GLuint pad1:1;
       GLuint urb_entry_allocation_size:5; 
       GLuint pad2:1;
-      GLuint max_threads:1; 
-      GLuint pad3:6;
+      GLuint max_threads:5; 
+      GLuint pad3:2;
    } thread4;   
       
    struct
@@ -744,9 +479,14 @@ struct brw_gs_unit_state
    struct
    {
       GLuint max_vp_index:4; 
-      GLuint pad0:26;
-      GLuint reorder_enable:1; 
+      GLuint pad0:12;
+      GLuint svbi_post_inc_value:10;
       GLuint pad1:1;
+      GLuint svbi_post_inc_enable:1;
+      GLuint svbi_payload:1;
+      GLuint discard_adjaceny:1;
+      GLuint reorder_enable:1; 
+      GLuint pad2:1;
    } gs6;
 };
 
@@ -766,8 +506,8 @@ struct brw_vs_unit_state
       GLuint pad1:1;
       GLuint urb_entry_allocation_size:5; 
       GLuint pad2:1;
-      GLuint max_threads:4; 
-      GLuint pad3:3;
+      GLuint max_threads:6; 
+      GLuint pad3:1;
    } thread4;   
 
    struct
@@ -795,7 +535,7 @@ struct brw_wm_unit_state
    
    struct {
       GLuint stats_enable:1; 
-      GLuint pad0:1;
+      GLuint depth_buffer_clear:1;
       GLuint sampler_count:3; 
       GLuint sampler_state_pointer:27; 
    } wm4;
@@ -805,7 +545,16 @@ struct brw_wm_unit_state
       GLuint enable_8_pix:1; 
       GLuint enable_16_pix:1; 
       GLuint enable_32_pix:1; 
-      GLuint pad0:7;
+      GLuint enable_con_32_pix:1;
+      GLuint enable_con_64_pix:1;
+      GLuint pad0:1;
+
+      /* These next four bits are for Ironlake+ */
+      GLuint fast_span_coverage_enable:1;
+      GLuint depth_buffer_clear:1;
+      GLuint depth_buffer_resolve_enable:1;
+      GLuint hierarchical_depth_buffer_resolve_enable:1;
+
       GLuint legacy_global_depth_bias:1; 
       GLuint line_stipple:1; 
       GLuint depth_offset:1; 
@@ -818,19 +567,49 @@ struct brw_wm_unit_state
       GLuint program_computes_depth:1; 
       GLuint program_uses_killpixel:1; 
       GLuint legacy_line_rast: 1; 
-      GLuint pad1:1; 
-      GLuint max_threads:6; 
-      GLuint pad2:1;
+      GLuint transposed_urb_read_enable:1; 
+      GLuint max_threads:7; 
    } wm5;
    
    GLfloat global_depth_offset_constant;  
    GLfloat global_depth_offset_scale;   
+   
+   /* for Ironlake only */
+   struct {
+      GLuint pad0:1;
+      GLuint grf_reg_count_1:3; 
+      GLuint pad1:2;
+      GLuint kernel_start_pointer_1:26;
+   } wm8;       
+
+   struct {
+      GLuint pad0:1;
+      GLuint grf_reg_count_2:3; 
+      GLuint pad1:2;
+      GLuint kernel_start_pointer_2:26;
+   } wm9;       
+
+   struct {
+      GLuint pad0:1;
+      GLuint grf_reg_count_3:3; 
+      GLuint pad1:2;
+      GLuint kernel_start_pointer_3:26;
+   } wm10;       
 };
 
 struct brw_sampler_default_color {
    GLfloat color[4];
 };
 
+struct gen5_sampler_default_color {
+   uint8_t ub[4];
+   float f[4];
+   uint16_t hf[4];
+   uint16_t us[4];
+   int16_t s[4];
+   uint8_t b[4];
+};
+
 struct brw_sampler_state
 {
    
@@ -842,7 +621,7 @@ struct brw_sampler_state
       GLuint mag_filter:3; 
       GLuint mip_filter:2; 
       GLuint base_level:5; 
-      GLuint pad:1;
+      GLuint min_mag_neq:1;
       GLuint lod_preclamp:1; 
       GLuint default_color_mode:1; 
       GLuint pad0:1;
@@ -854,7 +633,8 @@ struct brw_sampler_state
       GLuint r_wrap_mode:3; 
       GLuint t_wrap_mode:3; 
       GLuint s_wrap_mode:3; 
-      GLuint pad:3;
+      GLuint cube_control_mode:1;
+      GLuint pad:2;
       GLuint max_lod:10; 
       GLuint min_lod:10; 
    } ss1;
@@ -868,7 +648,9 @@ struct brw_sampler_state
    
    struct
    {
-      GLuint pad:19;
+      GLuint non_normalized_coord:1;
+      GLuint pad:12;
+      GLuint address_round:6;
       GLuint max_aniso:3; 
       GLuint chroma_key_mode:1; 
       GLuint chroma_key_index:2; 
@@ -878,6 +660,54 @@ struct brw_sampler_state
    } ss3;
 };
 
+struct gen7_sampler_state
+{
+   struct
+   {
+      GLuint aniso_algorithm:1;
+      GLuint lod_bias:13;
+      GLuint min_filter:3;
+      GLuint mag_filter:3;
+      GLuint mip_filter:2;
+      GLuint base_level:5;
+      GLuint pad1:1;
+      GLuint lod_preclamp:1;
+      GLuint default_color_mode:1;
+      GLuint pad0:1;
+      GLuint disable:1;
+   } ss0;
+
+   struct
+   {
+      GLuint cube_control_mode:1;
+      GLuint shadow_function:3;
+      GLuint pad:4;
+      GLuint max_lod:12;
+      GLuint min_lod:12;
+   } ss1;
+
+   struct
+   {
+      GLuint pad:5;
+      GLuint default_color_pointer:27;
+   } ss2;
+
+   struct
+   {
+      GLuint r_wrap_mode:3;
+      GLuint t_wrap_mode:3;
+      GLuint s_wrap_mode:3;
+      GLuint pad:1;
+      GLuint non_normalized_coord:1;
+      GLuint trilinear_quality:2;
+      GLuint address_round:6;
+      GLuint max_aniso:3;
+      GLuint chroma_key_mode:1;
+      GLuint chroma_key_index:2;
+      GLuint chroma_key_enable:1;
+      GLuint pad0:6;
+   } ss3;
+};
 
 struct brw_clipper_viewport
 {
@@ -901,94 +731,48 @@ struct brw_sf_viewport
       GLfloat m22;  
       GLfloat m30;  
       GLfloat m31;  
-      GLfloat m32;  
-   } viewport;
-
-   struct {
-      GLshort xmin;
-      GLshort ymin;
-      GLshort xmax;
-      GLshort ymax;
-   } scissor;
-};
-
-/* Documented in the subsystem/shared-functions/sampler chapter...
- */
-struct brw_surface_state
-{
-   struct {
-      GLuint cube_pos_z:1; 
-      GLuint cube_neg_z:1; 
-      GLuint cube_pos_y:1; 
-      GLuint cube_neg_y:1; 
-      GLuint cube_pos_x:1; 
-      GLuint cube_neg_x:1; 
-      GLuint pad:4;
-      GLuint mipmap_layout_mode:1; 
-      GLuint vert_line_stride_ofs:1; 
-      GLuint vert_line_stride:1; 
-      GLuint color_blend:1; 
-      GLuint writedisable_blue:1; 
-      GLuint writedisable_green:1; 
-      GLuint writedisable_red:1; 
-      GLuint writedisable_alpha:1; 
-      GLuint surface_format:9; 
-      GLuint data_return_format:1; 
-      GLuint pad0:1;
-      GLuint surface_type:3; 
-   } ss0;
-   
-   struct {
-      GLuint base_addr;  
-   } ss1;
-   
-   struct {
-      GLuint pad:2;
-      GLuint mip_count:4; 
-      GLuint width:13; 
-      GLuint height:13; 
-   } ss2;
+      GLfloat m32;  
+   } viewport;
 
+   /* scissor coordinates are inclusive */
    struct {
-      GLuint tile_walk:1; 
-      GLuint tiled_surface:1; 
-      GLuint pad:1; 
-      GLuint pitch:18; 
-      GLuint depth:11; 
-   } ss3;
-   
-   struct {
-      GLuint pad:19;
-      GLuint min_array_elt:9; 
-      GLuint min_lod:4; 
-   } ss4;
+      GLshort xmin;
+      GLshort ymin;
+      GLshort xmax;
+      GLshort ymax;
+   } scissor;
 };
 
+struct gen6_sf_viewport {
+   GLfloat m00;
+   GLfloat m11;
+   GLfloat m22;
+   GLfloat m30;
+   GLfloat m31;
+   GLfloat m32;
+};
 
-
-struct brw_vertex_buffer_state
-{
+struct gen7_sf_clip_viewport {
    struct {
-      GLuint pitch:11; 
-      GLuint pad:15;
-      GLuint access_type:1; 
-      GLuint vb_index:5; 
-   } vb0;
-   
-   GLuint start_addr; 
-   GLuint max_index;   
-#if 1
-   GLuint instance_data_step_rate; /* not included for sequential/random vertices? */
-#endif
-};
+      GLfloat m00;
+      GLfloat m11;
+      GLfloat m22;
+      GLfloat m30;
+      GLfloat m31;
+      GLfloat m32;
+   } viewport;
 
-#define BRW_VBP_MAX 17
+   GLuint pad0[2];
 
-struct brw_vb_array_state {
-   struct header header;
-   struct brw_vertex_buffer_state vb[BRW_VBP_MAX];
-};
+   struct {
+      GLfloat xmin;
+      GLfloat xmax;
+      GLfloat ymin;
+      GLfloat ymax;
+   } guardband;
 
+   GLfloat pad1[4];
+};
 
 struct brw_vertex_element_state
 {
@@ -1013,14 +797,6 @@ struct brw_vertex_element_state
    } ve1;
 };
 
-#define BRW_VEP_MAX 18
-
-struct brw_vertex_element_packet {
-   struct header header;
-   struct brw_vertex_element_state ve[BRW_VEP_MAX]; /* note: less than _TNL_ATTRIB_MAX */
-};
-
-
 struct brw_urb_immediate {
    GLuint opcode:4;
    GLuint offset:6;
@@ -1067,18 +843,18 @@ struct brw_instruction
    union {
       struct
       {
-	 GLuint dest_reg_file:2;	/* 0x00000003 */
-	 GLuint dest_reg_type:3;	/* 0x0000001c */
-	 GLuint src0_reg_file:2;	/* 0x00000060 */
-	 GLuint src0_reg_type:3;	/* 0x00000380 */
-	 GLuint src1_reg_file:2;	/* 0x00000c00 */
-	 GLuint src1_reg_type:3;	/* 0x00007000 */
-	 GLuint pad:1;			/* 0x00008000 */
-	 GLuint dest_subreg_nr:5;	/* 0x001f0000 */
-	 GLuint dest_reg_nr:8;		/* 0x1f700000 */
-	 GLuint dest_horiz_stride:2;	/* 0x60000000 */
-	 GLuint dest_address_mode:1;	/* 0x80000000 */
-      } da1; /* direct align1 */
+	 GLuint dest_reg_file:2;
+	 GLuint dest_reg_type:3;
+	 GLuint src0_reg_file:2;
+	 GLuint src0_reg_type:3;
+	 GLuint src1_reg_file:2;
+	 GLuint src1_reg_type:3;
+	 GLuint pad:1;
+	 GLuint dest_subreg_nr:5;
+	 GLuint dest_reg_nr:8;
+	 GLuint dest_horiz_stride:2;
+	 GLuint dest_address_mode:1;
+      } da1;
 
       struct
       {
@@ -1086,14 +862,14 @@ struct brw_instruction
 	 GLuint dest_reg_type:3;
 	 GLuint src0_reg_file:2;
 	 GLuint src0_reg_type:3;
-	 GLuint src1_reg_file:2;	/* 0x00000c00 */
-	 GLuint src1_reg_type:3;	/* 0x00007000 */
+	 GLuint src1_reg_file:2;        /* 0x00000c00 */
+	 GLuint src1_reg_type:3;        /* 0x00007000 */
 	 GLuint pad:1;
 	 GLint dest_indirect_offset:10;	/* offset against the deref'd address reg */
 	 GLuint dest_subreg_nr:3; /* subnr for the address reg a0.x */
 	 GLuint dest_horiz_stride:2;
 	 GLuint dest_address_mode:1;
-      } ia1; /* indirect align1 */
+      } ia1;
 
       struct
       {
@@ -1103,13 +879,13 @@ struct brw_instruction
 	 GLuint src0_reg_type:3;
 	 GLuint src1_reg_file:2;
 	 GLuint src1_reg_type:3;
-	 GLuint pad0:1;
+	 GLuint pad:1;
 	 GLuint dest_writemask:4;
 	 GLuint dest_subreg_nr:1;
 	 GLuint dest_reg_nr:8;
 	 GLuint dest_horiz_stride:2;
 	 GLuint dest_address_mode:1;
-      } da16; /* direct align16 */
+      } da16;
 
       struct
       {
@@ -1123,7 +899,7 @@ struct brw_instruction
 	 GLuint dest_subreg_nr:3;
 	 GLuint dest_horiz_stride:2;
 	 GLuint dest_address_mode:1;
-      } ia16; /* indirect align16 */
+      } ia16;
 
       struct {
 	 GLuint dest_reg_file:2;
@@ -1137,43 +913,46 @@ struct brw_instruction
 	 GLint jump_count:16;
       } branch_gen6;
 
-      struct
-      {
-	 GLuint dest_reg_file:1; /* used in Gen6, deleted in Gen7 */
+      struct {
+	 GLuint dest_reg_file:1;
 	 GLuint flag_subreg_nr:1;
-	 GLuint flag_reg_nr:1;   /* not in Gen6. Add in Gen7 */
-	 GLuint pad1:1; /* reserved */
-	 GLuint src0_modifier:2;
-	 GLuint src1_modifier:2;
-	 GLuint src2_modifier:2;
+	 GLuint flag_reg_nr:1;
+	 GLuint pad0:1;
+	 GLuint src0_abs:1;
+	 GLuint src0_negate:1;
+	 GLuint src1_abs:1;
+	 GLuint src1_negate:1;
+	 GLuint src2_abs:1;
+	 GLuint src2_negate:1;
 	 GLuint src_reg_type:2;
 	 GLuint dest_reg_type:2;
-	 GLuint pad2:1; /* reserved */
+	 GLuint pad1:1;
 	 GLuint nib_ctrl:1;
-	 GLuint pad3:1; /* reserved */
+	 GLuint pad2:1;
 	 GLuint dest_writemask:4;
 	 GLuint dest_subreg_nr:3;
 	 GLuint dest_reg_nr:8;
       } da3src;
 
+      uint32_t ud;
    } bits1;
 
 
    union {
       struct
       {
-	 GLuint src0_subreg_nr:5;	/* 0x0000001f */
-	 GLuint src0_reg_nr:8;		/* 0x00001fe0 */
-	 GLuint src0_abs:1;		/* 0x00002000 */
-	 GLuint src0_negate:1;		/* 0x00004000 */
-	 GLuint src0_address_mode:1;	/* 0x00008000 */
-	 GLuint src0_horiz_stride:2;	/* 0x00030000 */
-	 GLuint src0_width:3;		/* 0x001c0000 */
-	 GLuint src0_vert_stride:4;	/* 0x01e00000 */
-	 GLuint flag_subreg_nr:1;	/* 0x02000000 */
-	 GLuint flag_reg_nr:1;		/* 0x04000000 */
-	 GLuint pad:5;			/* 0xf8000000 */
-      } da1; /* direct align1 */
+	 GLuint src0_subreg_nr:5;
+	 GLuint src0_reg_nr:8;
+	 GLuint src0_abs:1;
+	 GLuint src0_negate:1;
+	 GLuint src0_address_mode:1;
+	 GLuint src0_horiz_stride:2;
+	 GLuint src0_width:3;
+	 GLuint src0_vert_stride:4;
+	 GLuint flag_subreg_nr:1;
+	 GLuint flag_reg_nr:1;
+	 GLuint pad:5;
+      } da1;
 
       struct
       {
@@ -1187,8 +966,8 @@ struct brw_instruction
 	 GLuint src0_vert_stride:4;
 	 GLuint flag_subreg_nr:1;
 	 GLuint flag_reg_nr:1;
-	 GLuint pad:5;	
-      } ia1; /* indirect align1 */
+	 GLuint pad:5;
+      } ia1;
 
       struct
       {
@@ -1206,7 +985,7 @@ struct brw_instruction
 	 GLuint flag_subreg_nr:1;
 	 GLuint flag_reg_nr:1;
 	 GLuint pad1:5;
-      } da16; /* direct align16 */
+      } da16;
 
       struct
       {
@@ -1224,32 +1003,33 @@ struct brw_instruction
 	 GLuint flag_subreg_nr:1;
 	 GLuint flag_reg_nr:1;
 	 GLuint pad1:5;
-      } ia16; /* indirect align16 */
+      } ia16;
 
-      struct
-      {
+      /* Extended Message Descriptor for Ironlake (Gen5) SEND instruction.
+       *
+       * Does not apply to Gen6+.  The SFID/message target moved to bits
+       * 27:24 of the header (destreg__conditionalmod); EOT is in bits3.
+       */
+       struct 
+       {
+           GLuint pad:26;
+           GLuint end_of_thread:1;
+           GLuint pad1:1;
+           GLuint sfid:4;
+       } send_gen5;  /* for Ironlake only */
+
+      struct {
 	 GLuint src0_rep_ctrl:1;
 	 GLuint src0_swizzle:8;
 	 GLuint src0_subreg_nr:3;
 	 GLuint src0_reg_nr:8;
-	 GLuint pad0:1; /* reserved */
+	 GLuint pad0:1;
 	 GLuint src1_rep_ctrl:1;
 	 GLuint src1_swizzle:8;
-	 GLuint src1_subreg_nr_low:2; /* src1_subreg_nr spans on two DWORDs */
+	 GLuint src1_subreg_nr_low:2;
       } da3src;
 
-       struct 
-       {
-           GLuint pad:26;
-           GLuint end_of_thread:1;
-           GLuint pad1:1;
-           GLuint sfid:4;
-       } send_gen5;  /* for GEN5 only */
-       struct 
-       {
-           GLuint pad:26;
-           GLuint msg_ext:6;
-       } msg_ext;
+      uint32_t ud;
    } bits2;
 
    union
@@ -1265,7 +1045,7 @@ struct brw_instruction
 	 GLuint src1_width:3;
 	 GLuint src1_vert_stride:4;
 	 GLuint pad0:7;
-      } da1; /* direct align1 */
+      } da1;
 
       struct
       {
@@ -1281,7 +1061,7 @@ struct brw_instruction
 	 GLuint pad1:1;
 	 GLuint src1_vert_stride:4;
 	 GLuint pad2:7;
-      } da16; /* direct align16 */
+      } da16;
 
       struct
       {
@@ -1293,8 +1073,8 @@ struct brw_instruction
 	 GLuint src1_horiz_stride:2;
 	 GLuint src1_width:3;
 	 GLuint src1_vert_stride:4;
-	 GLuint pad1:7;	
-      } ia1; /* indirect align1 */
+	 GLuint pad1:7;
+      } ia1;
 
       struct
       {
@@ -1310,19 +1090,15 @@ struct brw_instruction
 	 GLuint pad1:1;
 	 GLuint src1_vert_stride:4;
 	 GLuint pad2:7;
-      } ia16; /* indirect align16 */
+      } ia16;
+
 
       struct
       {
-	 GLuint src1_subreg_nr_high:1; /* src1_subreg_nr spans on two DWORDs */
-	 GLuint src1_reg_nr:8;
-	 GLuint pad0:1; /* reserved */
-	 GLuint src2_rep_ctrl:1;
-	 GLuint src2_swizzle:8;
-	 GLuint src2_subreg_nr:3;
-	 GLuint src2_reg_nr:8;
-	 GLuint pad1:2; /* reserved */
-      } da3src;
+	 GLint  jump_count:16;	/* note: signed */
+	 GLuint  pop_count:4;
+	 GLuint  pad0:12;
+      } if_else;
 
       /* This is also used for gen7 IF/ELSE instructions */
       struct
@@ -1342,6 +1118,76 @@ struct brw_instruction
 
       GLint JIP; /* used by Gen6 CALL instructions; Gen7 JMPI */
 
+      /**
+       * \defgroup SEND instructions / Message Descriptors
+       *
+       * @{
+       */
+
+      /**
+       * Generic Message Descriptor for Gen4 SEND instructions.  The structs
+       * below expand function_control to something specific for their
+       * message.  Due to struct packing issues, they duplicate these bits.
+       *
+       * See the G45 PRM, Volume 4, Table 14-15.
+       */
+      struct {
+	 GLuint function_control:16;
+	 GLuint response_length:4;
+	 GLuint msg_length:4;
+	 GLuint msg_target:4;
+	 GLuint pad1:3;
+	 GLuint end_of_thread:1;
+      } generic;
+
+      /**
+       * Generic Message Descriptor for Gen5-7 SEND instructions.
+       *
+       * See the Sandybridge PRM, Volume 2 Part 2, Table 8-15.  (Sadly, most
+       * of the information on the SEND instruction is missing from the public
+       * Ironlake PRM.)
+       *
+       * The table claims that bit 31 is reserved/MBZ on Gen6+, but it lies.
+       * According to the SEND instruction description:
+       * "The MSb of the message description, the EOT field, always comes from
+       *  bit 127 of the instruction word"...which is bit 31 of this field.
+       */
+      struct {
+	 GLuint function_control:19;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } generic_gen5;
+
+      struct {
+	 GLuint opcode:1;
+	 GLuint requester_type:1;
+	 GLuint pad:2;
+	 GLuint resource_select:1;
+	 GLuint pad1:11;
+	 GLuint response_length:4;
+	 GLuint msg_length:4;
+	 GLuint msg_target:4;
+	 GLuint pad2:3;
+	 GLuint end_of_thread:1;
+      } thread_spawner;
+
+       struct {
+	 GLuint opcode:1;
+	 GLuint requester_type:1;
+	 GLuint pad0:2;
+	 GLuint resource_select:1;
+	 GLuint pad1:14;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad2:2;
+	 GLuint end_of_thread:1;
+      } thread_spawner_gen5;
+
+      /** G45 PRM, Volume 4, Section 6.1.1.1 */
       struct {
 	 GLuint function:4;
 	 GLuint int_type:1;
@@ -1356,6 +1202,23 @@ struct brw_instruction
 	 GLuint end_of_thread:1;
       } math;
 
+      /** Ironlake PRM, Volume 4 Part 1, Section 6.1.1.1 */
+      struct {
+	 GLuint function:4;
+	 GLuint int_type:1;
+	 GLuint precision:1;
+	 GLuint saturate:1;
+	 GLuint data_type:1;
+	 GLuint snapshot:1;
+	 GLuint pad0:10;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } math_gen5;
+
+      /** G45 PRM, Volume 4, Section 4.8.1.1.1 [DevBW] and [DevCL] */
       struct {
 	 GLuint binding_table_index:8;
 	 GLuint sampler:4;
@@ -1368,9 +1231,95 @@ struct brw_instruction
 	 GLuint end_of_thread:1;
       } sampler;
 
+      /** G45 PRM, Volume 4, Section 4.8.1.1.2 [DevCTG] */
+      struct {
+         GLuint binding_table_index:8;
+         GLuint sampler:4;
+         GLuint msg_type:4;
+         GLuint response_length:4;
+         GLuint msg_length:4;
+         GLuint msg_target:4;
+         GLuint pad1:3;
+         GLuint end_of_thread:1;
+      } sampler_g4x;
+
+      /** Ironlake PRM, Volume 4 Part 1, Section 4.11.1.1.3 */
+      struct {
+	 GLuint binding_table_index:8;
+	 GLuint sampler:4;
+	 GLuint msg_type:4;
+	 GLuint simd_mode:2;
+	 GLuint pad0:1;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } sampler_gen5;
+
+      struct {
+	 GLuint binding_table_index:8;
+	 GLuint sampler:4;
+	 GLuint msg_type:5;
+	 GLuint simd_mode:2;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } sampler_gen7;
+
       struct brw_urb_immediate urb;
 
       struct {
+	 GLuint opcode:4;
+	 GLuint offset:6;
+	 GLuint swizzle_control:2; 
+	 GLuint pad:1;
+	 GLuint allocate:1;
+	 GLuint used:1;
+	 GLuint complete:1;
+	 GLuint pad0:3;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } urb_gen5;
+
+      struct {
+	 GLuint opcode:3;
+	 GLuint offset:11;
+	 GLuint swizzle_control:1;
+	 GLuint complete:1;
+	 GLuint per_slot_offset:1;
+	 GLuint pad0:2;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } urb_gen7;
+
+      struct {
+	 GLuint binding_table_index:8;
+	 GLuint search_path_index:3;
+	 GLuint lut_subindex:2;
+	 GLuint message_type:2;
+	 GLuint pad0:4;
+	 GLuint header_present:1;
+      } vme_gen6;
+
+      struct {
+	 GLuint binding_table_index:8;
+	 GLuint pad0:5;
+	 GLuint message_type:2;
+	 GLuint pad1:4;
+	 GLuint header_present:1;
+      } cre_gen75;
+
+      /** 965 PRM, Volume 4, Section 5.10.1.1: Message Descriptor */
+      struct {
 	 GLuint binding_table_index:8;
 	 GLuint msg_control:4;  
 	 GLuint msg_type:2;  
@@ -1382,109 +1331,61 @@ struct brw_instruction
 	 GLuint end_of_thread:1;
       } dp_read;
 
+      /** G45 PRM, Volume 4, Section 5.10.1.1.2 */
       struct {
 	 GLuint binding_table_index:8;
 	 GLuint msg_control:3;
-	 GLuint last_render_target:1;
-	 GLuint msg_type:3;    
-	 GLuint send_commit_msg:1;
+	 GLuint msg_type:3;
+	 GLuint target_cache:2;
 	 GLuint response_length:4;
 	 GLuint msg_length:4;
 	 GLuint msg_target:4;
 	 GLuint pad1:3;
 	 GLuint end_of_thread:1;
-      } dp_write;
+      } dp_read_g4x;
 
+      /** Ironlake PRM, Volume 4 Part 1, Section 5.10.2.1.2. */
       struct {
-	  GLuint opcode:1;
-          GLuint requester_type:1;
-          GLuint pad:2;
-          GLuint resource_select:1;
-          GLuint pad1:11;
-          GLuint response_length:4;
-          GLuint msg_length:4;
-          GLuint msg_target:4;
-          GLuint pad2:3;
-          GLuint end_of_thread:1;
-      } thread_spawner;
+	 GLuint binding_table_index:8;
+	 GLuint msg_control:4;  
+	 GLuint msg_type:2;  
+	 GLuint target_cache:2;    
+	 GLuint pad0:3;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } dp_read_gen5;
 
+      /** G45 PRM, Volume 4, Section 5.10.1.1.2.  For both Gen4 and G45. */
       struct {
-	 GLuint pad:16;
+	 GLuint binding_table_index:8;
+	 GLuint msg_control:3;
+	 GLuint last_render_target:1;
+	 GLuint msg_type:3;    
+	 GLuint send_commit_msg:1;
 	 GLuint response_length:4;
 	 GLuint msg_length:4;
 	 GLuint msg_target:4;
 	 GLuint pad1:3;
 	 GLuint end_of_thread:1;
-      } generic;
-
-       struct {
-           GLuint function:4;
-           GLuint int_type:1;
-           GLuint precision:1;
-           GLuint saturate:1;
-           GLuint data_type:1;
-           GLuint snapshot:1;
-           GLuint pad0:10;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } math_gen5;
-
-       struct {
-           GLuint opcode:4;
-           GLuint offset:6;
-           GLuint swizzle_control:2; 
-           GLuint pad:1;
-           GLuint allocate:1;
-           GLuint used:1;
-           GLuint complete:1;
-           GLuint pad0:3;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } urb_gen5;
-
-       struct {
-           GLuint binding_table_index:8;
-           GLuint sampler:4;
-           GLuint msg_type:4;
-           GLuint simd_mode:2;
-           GLuint pad0:1;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } sampler_gen5;
-
-       struct {
-           GLuint binding_table_index:8;
-           GLuint sampler:4;
-           GLuint msg_type:5;
-           GLuint simd_mode:2;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } sampler_gen7;
+      } dp_write;
 
-       struct {
-           GLuint binding_table_index:8;
-           GLuint msg_control:4;  
-           GLuint msg_type:2;  
-           GLuint target_cache:2;    
-           GLuint pad0:3;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } dp_read_gen5;
+      /** Ironlake PRM, Volume 4 Part 1, Section 5.10.2.1.2. */
+      struct {
+	 GLuint binding_table_index:8;
+	 GLuint msg_control:3;
+	 GLuint last_render_target:1;
+	 GLuint msg_type:3;    
+	 GLuint send_commit_msg:1;
+	 GLuint pad0:3;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } dp_write_gen5;
 
       /**
        * Message for the Sandybridge Sampler Cache or Constant Cache Data Port.
@@ -1503,20 +1404,6 @@ struct brw_instruction
 	 GLuint end_of_thread:1;
       } gen6_dp_sampler_const_cache;
 
-       struct {
-           GLuint binding_table_index:8;
-           GLuint msg_control:3;
-           GLuint last_render_target:1;
-           GLuint msg_type:3;    
-           GLuint send_commit_msg:1;
-           GLuint pad0:3;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } dp_write_gen5;
-
       /**
        * Message for the Sandybridge Render Cache Data Port.
        *
@@ -1561,51 +1448,47 @@ struct brw_instruction
       } gen7_dp;
       /** @} */
 
-       struct {
-           GLuint opcode:1;
-           GLuint requester_type:1;
-           GLuint pad0:2;
-           GLuint resource_select:1;
-           GLuint pad1:14;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad2:2;
-           GLuint end_of_thread:1;
-       } thread_spawner_gen5;
-
-       struct {
-           GLuint binding_table_index:8;
-           GLuint search_path_index:3;
-           GLuint lut_subindex:2;
-           GLuint message_type:2;
-           GLuint pad0:4;
-           GLuint header_present:1;
-       } vme_gen6;
-       struct {
-           GLuint binding_table_index:8;
-	   GLuint pad0:5;
-           GLuint message_type:2;
-           GLuint pad1:4;
-           GLuint header_present:1;
-       } cre_gen75;
-       struct {
-           GLuint pad:19;
-           GLuint header_present:1;
-           GLuint response_length:5;
-           GLuint msg_length:4;
-           GLuint pad1:2;
-           GLuint end_of_thread:1;
-       } generic_gen5;
+      struct {
+	 GLuint src1_subreg_nr_high:1;
+	 GLuint src1_reg_nr:8;
+	 GLuint pad0:1;
+	 GLuint src2_rep_ctrl:1;
+	 GLuint src2_swizzle:8;
+	 GLuint src2_subreg_nr:3;
+	 GLuint src2_reg_nr:8;
+	 GLuint pad1:2;
+      } da3src;
 
-      GLuint ud;
       GLint d;
-      GLfloat f;
+      GLuint ud;
+      float f;
    } bits3;
 
    char *first_reloc_target, *second_reloc_target; // first for JIP, second for UIP
    GLint first_reloc_offset, second_reloc_offset; // in number of instructions
 };
 
+struct brw_compact_instruction {
+   struct {
+      unsigned opcode:7;          /*  0- 6 */
+      unsigned debug_control:1;   /*  7- 7 */
+      unsigned control_index:5;   /*  8-12 */
+      unsigned data_type_index:5; /* 13-17 */
+      unsigned sub_reg_index:5;   /* 18-22 */
+      unsigned acc_wr_control:1;  /* 23-23 */
+      unsigned conditionalmod:4;  /* 24-27 */
+      unsigned flag_subreg_nr:1;     /* 28-28 */
+      unsigned cmpt_ctrl:1;       /* 29-29 */
+      unsigned src0_index:2;      /* 30-31 */
+   } dw0;
+
+   struct {
+      unsigned src0_index:3;  /* 32-34 */
+      unsigned src1_index:5;  /* 35-39 */
+      unsigned dst_reg_nr:8;  /* 40-47 */
+      unsigned src0_reg_nr:8; /* 48-55 */
+      unsigned src1_reg_nr:8; /* 56-63 */
+   } dw1;
+};
 
 #endif
-- 
1.7.7.5
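
A note on the compact instruction layout added at the end of this patch: the
bit ranges in the comments imply that each dword is exactly 32 bits wide and
that the 5-bit src0 index straddles the dword boundary, with two bits in dw0
(30-31) and three bits in dw1 (32-34). The sketch below is not part of the
patch. It copies the struct and adds two hypothetical helpers,
compact_set_src0_index() and compact_get_src0_index(). It assumes the dw0
bits are the low bits of the index and that the compiler packs bit-fields
LSB-first into 32-bit units, as GCC does on x86.

```c
#include <assert.h>

/* Copied from the patch; bit ranges are relative to the 64-bit instruction. */
struct brw_compact_instruction {
   struct {
      unsigned opcode:7;          /*  0- 6 */
      unsigned debug_control:1;   /*  7- 7 */
      unsigned control_index:5;   /*  8-12 */
      unsigned data_type_index:5; /* 13-17 */
      unsigned sub_reg_index:5;   /* 18-22 */
      unsigned acc_wr_control:1;  /* 23-23 */
      unsigned conditionalmod:4;  /* 24-27 */
      unsigned flag_subreg_nr:1;  /* 28-28 */
      unsigned cmpt_ctrl:1;       /* 29-29 */
      unsigned src0_index:2;      /* 30-31 */
   } dw0;

   struct {
      unsigned src0_index:3;  /* 32-34 */
      unsigned src1_index:5;  /* 35-39 */
      unsigned dst_reg_nr:8;  /* 40-47 */
      unsigned src0_reg_nr:8; /* 48-55 */
      unsigned src1_reg_nr:8; /* 56-63 */
   } dw1;
};

/* Bit-field packing is implementation-defined; this holds for the assumed ABI. */
_Static_assert(sizeof(struct brw_compact_instruction) == 8,
               "compact instruction is expected to be 64 bits");

/* Hypothetical helper: store a 5-bit index across the dw0/dw1 boundary. */
static void
compact_set_src0_index(struct brw_compact_instruction *insn, unsigned index)
{
   assert(index < 32);
   insn->dw0.src0_index = index & 0x3;        /* assumed low bits, 30-31 */
   insn->dw1.src0_index = (index >> 2) & 0x7; /* assumed high bits, 32-34 */
}

/* Hypothetical helper: reassemble the 5-bit index from both halves. */
static unsigned
compact_get_src0_index(const struct brw_compact_instruction *insn)
{
   return insn->dw0.src0_index | (insn->dw1.src0_index << 2);
}
```

Giving both halves the same field name keeps the struct close to the hardware
layout, but callers then have to remember the split, which is why accessors
along these lines may be worth having.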


* [PATCH 17/90] assembler: Remove trailing white spaces from brw_structs.h
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |  488 +++++++++++++++++++++++-----------------------
 1 files changed, 244 insertions(+), 244 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index 218f11e..db7a9be 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -2,7 +2,7 @@
  Copyright (C) Intel Corp.  2006.  All Rights Reserved.
  Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
  develop this 3D driver.
- 
+
  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
@@ -10,11 +10,11 @@
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:
- 
+
  The above copyright notice and this permission notice (including the
  next paragraph) shall be included in all copies or substantial
  portions of the Software.
- 
+
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -22,13 +22,13 @@
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- 
+
  **********************************************************************/
  /*
   * Authors:
   *   Keith Whitwell <keith@tungstengraphics.com>
   */
-        
+
 
 #ifndef BRW_STRUCTS_H
 #define BRW_STRUCTS_H
@@ -45,30 +45,30 @@ struct brw_urb_fence
 {
    struct
    {
-      GLuint length:8;   
-      GLuint vs_realloc:1;   
-      GLuint gs_realloc:1;   
-      GLuint clp_realloc:1;   
-      GLuint sf_realloc:1;   
-      GLuint vfe_realloc:1;   
-      GLuint cs_realloc:1;   
+      GLuint length:8;
+      GLuint vs_realloc:1;
+      GLuint gs_realloc:1;
+      GLuint clp_realloc:1;
+      GLuint sf_realloc:1;
+      GLuint vfe_realloc:1;
+      GLuint cs_realloc:1;
       GLuint pad:2;
-      GLuint opcode:16;   
+      GLuint opcode:16;
    } header;
 
    struct
    {
-      GLuint vs_fence:10;  
-      GLuint gs_fence:10;  
-      GLuint clp_fence:10;  
+      GLuint vs_fence:10;
+      GLuint gs_fence:10;
+      GLuint clp_fence:10;
       GLuint pad:2;
    } bits0;
 
    struct
    {
-      GLuint sf_fence:10;  
-      GLuint vf_fence:10;  
-      GLuint cs_fence:11;  
+      GLuint sf_fence:10;
+      GLuint vf_fence:10;
+      GLuint cs_fence:11;
       GLuint pad:1;
    } bits1;
 };
@@ -80,46 +80,46 @@ struct brw_urb_fence
 struct thread0
 {
    GLuint pad0:1;
-   GLuint grf_reg_count:3; 
+   GLuint grf_reg_count:3;
    GLuint pad1:2;
    GLuint kernel_start_pointer:26; /* Offset from GENERAL_STATE_BASE */
 };
 
 struct thread1
 {
-   GLuint ext_halt_exception_enable:1; 
-   GLuint sw_exception_enable:1; 
-   GLuint mask_stack_exception_enable:1; 
-   GLuint timeout_exception_enable:1; 
-   GLuint illegal_op_exception_enable:1; 
+   GLuint ext_halt_exception_enable:1;
+   GLuint sw_exception_enable:1;
+   GLuint mask_stack_exception_enable:1;
+   GLuint timeout_exception_enable:1;
+   GLuint illegal_op_exception_enable:1;
    GLuint pad0:3;
    GLuint depth_coef_urb_read_offset:6;	/* WM only */
    GLuint pad1:2;
-   GLuint floating_point_mode:1; 
-   GLuint thread_priority:1; 
-   GLuint binding_table_entry_count:8; 
+   GLuint floating_point_mode:1;
+   GLuint thread_priority:1;
+   GLuint binding_table_entry_count:8;
    GLuint pad3:5;
-   GLuint single_program_flow:1; 
+   GLuint single_program_flow:1;
 };
 
 struct thread2
 {
-   GLuint per_thread_scratch_space:4; 
+   GLuint per_thread_scratch_space:4;
    GLuint pad0:6;
-   GLuint scratch_space_base_pointer:22; 
+   GLuint scratch_space_base_pointer:22;
 };
 
-   
+
 struct thread3
 {
-   GLuint dispatch_grf_start_reg:4; 
-   GLuint urb_entry_read_offset:6; 
+   GLuint dispatch_grf_start_reg:4;
+   GLuint urb_entry_read_offset:6;
    GLuint pad0:1;
-   GLuint urb_entry_read_length:6; 
+   GLuint urb_entry_read_length:6;
    GLuint pad1:1;
-   GLuint const_urb_entry_read_offset:6; 
+   GLuint const_urb_entry_read_offset:6;
    GLuint pad2:1;
-   GLuint const_urb_entry_read_length:6; 
+   GLuint const_urb_entry_read_length:6;
    GLuint pad3:1;
 };
 
@@ -151,41 +151,41 @@ struct brw_clip_unit_state
    {
       GLuint pad0:9;
       GLuint gs_output_stats:1; /* not always */
-      GLuint stats_enable:1; 
-      GLuint nr_urb_entries:7; 
+      GLuint stats_enable:1;
+      GLuint nr_urb_entries:7;
       GLuint pad1:1;
-      GLuint urb_entry_allocation_size:5; 
+      GLuint urb_entry_allocation_size:5;
       GLuint pad2:1;
       GLuint max_threads:5; 	/* may be less */
       GLuint pad3:2;
-   } thread4;   
-      
+   } thread4;
+
    struct
    {
       GLuint pad0:13;
-      GLuint clip_mode:3; 
-      GLuint userclip_enable_flags:8; 
-      GLuint userclip_must_clip:1; 
+      GLuint clip_mode:3;
+      GLuint userclip_enable_flags:8;
+      GLuint userclip_must_clip:1;
       GLuint negative_w_clip_test:1;
-      GLuint guard_band_enable:1; 
-      GLuint viewport_z_clip_enable:1; 
-      GLuint viewport_xy_clip_enable:1; 
-      GLuint vertex_position_space:1; 
-      GLuint api_mode:1; 
+      GLuint guard_band_enable:1;
+      GLuint viewport_z_clip_enable:1;
+      GLuint viewport_xy_clip_enable:1;
+      GLuint vertex_position_space:1;
+      GLuint api_mode:1;
       GLuint pad2:1;
    } clip5;
-   
+
    struct
    {
       GLuint pad0:5;
-      GLuint clipper_viewport_state_ptr:27; 
+      GLuint clipper_viewport_state_ptr:27;
    } clip6;
 
-   
-   GLfloat viewport_xmin;  
-   GLfloat viewport_xmax;  
-   GLfloat viewport_ymin;  
-   GLfloat viewport_ymax;  
+
+   GLfloat viewport_xmin;
+   GLfloat viewport_xmax;
+   GLfloat viewport_ymin;
+   GLfloat viewport_ymax;
 };
 
 struct gen6_blend_state
@@ -293,88 +293,88 @@ struct brw_cc_unit_state
    struct
    {
       GLuint pad0:3;
-      GLuint bf_stencil_pass_depth_pass_op:3; 
-      GLuint bf_stencil_pass_depth_fail_op:3; 
-      GLuint bf_stencil_fail_op:3; 
-      GLuint bf_stencil_func:3; 
-      GLuint bf_stencil_enable:1; 
+      GLuint bf_stencil_pass_depth_pass_op:3;
+      GLuint bf_stencil_pass_depth_fail_op:3;
+      GLuint bf_stencil_fail_op:3;
+      GLuint bf_stencil_func:3;
+      GLuint bf_stencil_enable:1;
       GLuint pad1:2;
-      GLuint stencil_write_enable:1; 
-      GLuint stencil_pass_depth_pass_op:3; 
-      GLuint stencil_pass_depth_fail_op:3; 
-      GLuint stencil_fail_op:3; 
-      GLuint stencil_func:3; 
-      GLuint stencil_enable:1; 
+      GLuint stencil_write_enable:1;
+      GLuint stencil_pass_depth_pass_op:3;
+      GLuint stencil_pass_depth_fail_op:3;
+      GLuint stencil_fail_op:3;
+      GLuint stencil_func:3;
+      GLuint stencil_enable:1;
    } cc0;
 
-   
+
    struct
    {
-      GLuint bf_stencil_ref:8; 
-      GLuint stencil_write_mask:8; 
-      GLuint stencil_test_mask:8; 
-      GLuint stencil_ref:8; 
+      GLuint bf_stencil_ref:8;
+      GLuint stencil_write_mask:8;
+      GLuint stencil_test_mask:8;
+      GLuint stencil_ref:8;
    } cc1;
 
-   
+
    struct
    {
-      GLuint logicop_enable:1; 
+      GLuint logicop_enable:1;
       GLuint pad0:10;
-      GLuint depth_write_enable:1; 
-      GLuint depth_test_function:3; 
-      GLuint depth_test:1; 
-      GLuint bf_stencil_write_mask:8; 
-      GLuint bf_stencil_test_mask:8; 
+      GLuint depth_write_enable:1;
+      GLuint depth_test_function:3;
+      GLuint depth_test:1;
+      GLuint bf_stencil_write_mask:8;
+      GLuint bf_stencil_test_mask:8;
    } cc2;
 
-   
+
    struct
    {
       GLuint pad0:8;
-      GLuint alpha_test_func:3; 
-      GLuint alpha_test:1; 
-      GLuint blend_enable:1; 
-      GLuint ia_blend_enable:1; 
+      GLuint alpha_test_func:3;
+      GLuint alpha_test:1;
+      GLuint blend_enable:1;
+      GLuint ia_blend_enable:1;
       GLuint pad1:1;
       GLuint alpha_test_format:1;
       GLuint pad2:16;
    } cc3;
-   
+
    struct
    {
-      GLuint pad0:5; 
+      GLuint pad0:5;
       GLuint cc_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
    } cc4;
-   
+
    struct
    {
       GLuint pad0:2;
-      GLuint ia_dest_blend_factor:5; 
-      GLuint ia_src_blend_factor:5; 
-      GLuint ia_blend_function:3; 
-      GLuint statistics_enable:1; 
-      GLuint logicop_func:4; 
+      GLuint ia_dest_blend_factor:5;
+      GLuint ia_src_blend_factor:5;
+      GLuint ia_blend_function:3;
+      GLuint statistics_enable:1;
+      GLuint logicop_func:4;
       GLuint pad1:11;
-      GLuint dither_enable:1; 
+      GLuint dither_enable:1;
    } cc5;
 
    struct
    {
-      GLuint clamp_post_alpha_blend:1; 
-      GLuint clamp_pre_alpha_blend:1; 
-      GLuint clamp_range:2; 
+      GLuint clamp_post_alpha_blend:1;
+      GLuint clamp_pre_alpha_blend:1;
+      GLuint clamp_range:2;
       GLuint pad0:11;
-      GLuint y_dither_offset:2; 
-      GLuint x_dither_offset:2; 
-      GLuint dest_blend_factor:5; 
-      GLuint src_blend_factor:5; 
-      GLuint blend_function:3; 
+      GLuint y_dither_offset:2;
+      GLuint x_dither_offset:2;
+      GLuint dest_blend_factor:5;
+      GLuint src_blend_factor:5;
+      GLuint blend_function:3;
    } cc6;
 
    struct {
       union {
-	 GLfloat f;  
+	 GLfloat f;
 	 GLubyte ub[4];
       } alpha_ref;
    } cc7;
@@ -390,51 +390,51 @@ struct brw_sf_unit_state
    struct
    {
       GLuint pad0:10;
-      GLuint stats_enable:1; 
-      GLuint nr_urb_entries:7; 
+      GLuint stats_enable:1;
+      GLuint nr_urb_entries:7;
       GLuint pad1:1;
-      GLuint urb_entry_allocation_size:5; 
+      GLuint urb_entry_allocation_size:5;
       GLuint pad2:1;
-      GLuint max_threads:6; 
+      GLuint max_threads:6;
       GLuint pad3:1;
-   } thread4;   
+   } thread4;
 
    struct
    {
-      GLuint front_winding:1; 
-      GLuint viewport_transform:1; 
+      GLuint front_winding:1;
+      GLuint viewport_transform:1;
       GLuint pad0:3;
       GLuint sf_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
    } sf5;
-   
+
    struct
    {
       GLuint pad0:9;
-      GLuint dest_org_vbias:4; 
-      GLuint dest_org_hbias:4; 
-      GLuint scissor:1; 
-      GLuint disable_2x2_trifilter:1; 
-      GLuint disable_zero_pix_trifilter:1; 
-      GLuint point_rast_rule:2; 
-      GLuint line_endcap_aa_region_width:2; 
-      GLuint line_width:4; 
-      GLuint fast_scissor_disable:1; 
-      GLuint cull_mode:2; 
-      GLuint aa_enable:1; 
+      GLuint dest_org_vbias:4;
+      GLuint dest_org_hbias:4;
+      GLuint scissor:1;
+      GLuint disable_2x2_trifilter:1;
+      GLuint disable_zero_pix_trifilter:1;
+      GLuint point_rast_rule:2;
+      GLuint line_endcap_aa_region_width:2;
+      GLuint line_width:4;
+      GLuint fast_scissor_disable:1;
+      GLuint cull_mode:2;
+      GLuint aa_enable:1;
    } sf6;
 
    struct
    {
-      GLuint point_size:11; 
-      GLuint use_point_size_state:1; 
-      GLuint subpixel_precision:1; 
-      GLuint sprite_point:1; 
+      GLuint point_size:11;
+      GLuint use_point_size_state:1;
+      GLuint subpixel_precision:1;
+      GLuint sprite_point:1;
       GLuint pad0:10;
       GLuint aa_line_distance_mode:1;
-      GLuint trifan_pv:2; 
-      GLuint linestrip_pv:2; 
-      GLuint tristrip_pv:2; 
-      GLuint line_last_pixel_enable:1; 
+      GLuint trifan_pv:2;
+      GLuint linestrip_pv:2;
+      GLuint tristrip_pv:2;
+      GLuint line_last_pixel_enable:1;
    } sf7;
 
 };
@@ -459,33 +459,33 @@ struct brw_gs_unit_state
       GLuint pad0:8;
       GLuint rendering_enable:1; /* for Ironlake */
       GLuint pad4:1;
-      GLuint stats_enable:1; 
-      GLuint nr_urb_entries:7; 
+      GLuint stats_enable:1;
+      GLuint nr_urb_entries:7;
       GLuint pad1:1;
-      GLuint urb_entry_allocation_size:5; 
+      GLuint urb_entry_allocation_size:5;
       GLuint pad2:1;
-      GLuint max_threads:5; 
+      GLuint max_threads:5;
       GLuint pad3:2;
-   } thread4;   
-      
+   } thread4;
+
    struct
    {
-      GLuint sampler_count:3; 
+      GLuint sampler_count:3;
       GLuint pad0:2;
-      GLuint sampler_state_pointer:27; 
+      GLuint sampler_state_pointer:27;
    } gs5;
 
-   
+
    struct
    {
-      GLuint max_vp_index:4; 
+      GLuint max_vp_index:4;
       GLuint pad0:12;
       GLuint svbi_post_inc_value:10;
       GLuint pad1:1;
       GLuint svbi_post_inc_enable:1;
       GLuint svbi_payload:1;
       GLuint discard_adjaceny:1;
-      GLuint reorder_enable:1; 
+      GLuint reorder_enable:1;
       GLuint pad2:1;
    } gs6;
 };
@@ -497,30 +497,30 @@ struct brw_vs_unit_state
    struct thread1 thread1;
    struct thread2 thread2;
    struct thread3 thread3;
-   
+
    struct
    {
       GLuint pad0:10;
-      GLuint stats_enable:1; 
-      GLuint nr_urb_entries:7; 
+      GLuint stats_enable:1;
+      GLuint nr_urb_entries:7;
       GLuint pad1:1;
-      GLuint urb_entry_allocation_size:5; 
+      GLuint urb_entry_allocation_size:5;
       GLuint pad2:1;
-      GLuint max_threads:6; 
+      GLuint max_threads:6;
       GLuint pad3:1;
-   } thread4;   
+   } thread4;
 
    struct
    {
-      GLuint sampler_count:3; 
+      GLuint sampler_count:3;
       GLuint pad0:2;
-      GLuint sampler_state_pointer:27; 
+      GLuint sampler_state_pointer:27;
    } vs5;
 
    struct
    {
-      GLuint vs_enable:1; 
-      GLuint vert_cache_disable:1; 
+      GLuint vs_enable:1;
+      GLuint vert_cache_disable:1;
       GLuint pad0:30;
    } vs6;
 };
@@ -532,19 +532,19 @@ struct brw_wm_unit_state
    struct thread1 thread1;
    struct thread2 thread2;
    struct thread3 thread3;
-   
+
    struct {
-      GLuint stats_enable:1; 
+      GLuint stats_enable:1;
       GLuint depth_buffer_clear:1;
-      GLuint sampler_count:3; 
-      GLuint sampler_state_pointer:27; 
+      GLuint sampler_count:3;
+      GLuint sampler_state_pointer:27;
    } wm4;
-   
+
    struct
    {
-      GLuint enable_8_pix:1; 
-      GLuint enable_16_pix:1; 
-      GLuint enable_32_pix:1; 
+      GLuint enable_8_pix:1;
+      GLuint enable_16_pix:1;
+      GLuint enable_32_pix:1;
       GLuint enable_con_32_pix:1;
       GLuint enable_con_64_pix:1;
       GLuint pad0:1;
@@ -555,46 +555,46 @@ struct brw_wm_unit_state
       GLuint depth_buffer_resolve_enable:1;
       GLuint hierarchical_depth_buffer_resolve_enable:1;
 
-      GLuint legacy_global_depth_bias:1; 
-      GLuint line_stipple:1; 
-      GLuint depth_offset:1; 
-      GLuint polygon_stipple:1; 
-      GLuint line_aa_region_width:2; 
-      GLuint line_endcap_aa_region_width:2; 
-      GLuint early_depth_test:1; 
-      GLuint thread_dispatch_enable:1; 
-      GLuint program_uses_depth:1; 
-      GLuint program_computes_depth:1; 
-      GLuint program_uses_killpixel:1; 
-      GLuint legacy_line_rast: 1; 
-      GLuint transposed_urb_read_enable:1; 
-      GLuint max_threads:7; 
+      GLuint legacy_global_depth_bias:1;
+      GLuint line_stipple:1;
+      GLuint depth_offset:1;
+      GLuint polygon_stipple:1;
+      GLuint line_aa_region_width:2;
+      GLuint line_endcap_aa_region_width:2;
+      GLuint early_depth_test:1;
+      GLuint thread_dispatch_enable:1;
+      GLuint program_uses_depth:1;
+      GLuint program_computes_depth:1;
+      GLuint program_uses_killpixel:1;
+      GLuint legacy_line_rast: 1;
+      GLuint transposed_urb_read_enable:1;
+      GLuint max_threads:7;
    } wm5;
-   
-   GLfloat global_depth_offset_constant;  
-   GLfloat global_depth_offset_scale;   
-   
+
+   GLfloat global_depth_offset_constant;
+   GLfloat global_depth_offset_scale;
+
    /* for Ironlake only */
    struct {
       GLuint pad0:1;
-      GLuint grf_reg_count_1:3; 
+      GLuint grf_reg_count_1:3;
       GLuint pad1:2;
       GLuint kernel_start_pointer_1:26;
-   } wm8;       
+   } wm8;
 
    struct {
       GLuint pad0:1;
-      GLuint grf_reg_count_2:3; 
+      GLuint grf_reg_count_2:3;
       GLuint pad1:2;
       GLuint kernel_start_pointer_2:26;
-   } wm9;       
+   } wm9;
 
    struct {
       GLuint pad0:1;
-      GLuint grf_reg_count_3:3; 
+      GLuint grf_reg_count_3:3;
       GLuint pad1:2;
       GLuint kernel_start_pointer_3:26;
-   } wm10;       
+   } wm10;
 };
 
 struct brw_sampler_default_color {
@@ -612,51 +612,51 @@ struct gen5_sampler_default_color {
 
 struct brw_sampler_state
 {
-   
+
    struct
    {
-      GLuint shadow_function:3; 
-      GLuint lod_bias:11; 
-      GLuint min_filter:3; 
-      GLuint mag_filter:3; 
-      GLuint mip_filter:2; 
-      GLuint base_level:5; 
+      GLuint shadow_function:3;
+      GLuint lod_bias:11;
+      GLuint min_filter:3;
+      GLuint mag_filter:3;
+      GLuint mip_filter:2;
+      GLuint base_level:5;
       GLuint min_mag_neq:1;
-      GLuint lod_preclamp:1; 
-      GLuint default_color_mode:1; 
+      GLuint lod_preclamp:1;
+      GLuint default_color_mode:1;
       GLuint pad0:1;
-      GLuint disable:1; 
+      GLuint disable:1;
    } ss0;
 
    struct
    {
-      GLuint r_wrap_mode:3; 
-      GLuint t_wrap_mode:3; 
-      GLuint s_wrap_mode:3; 
+      GLuint r_wrap_mode:3;
+      GLuint t_wrap_mode:3;
+      GLuint s_wrap_mode:3;
       GLuint cube_control_mode:1;
       GLuint pad:2;
-      GLuint max_lod:10; 
-      GLuint min_lod:10; 
+      GLuint max_lod:10;
+      GLuint min_lod:10;
    } ss1;
 
-   
+
    struct
    {
       GLuint pad:5;
-      GLuint default_color_pointer:27; 
+      GLuint default_color_pointer:27;
    } ss2;
-   
+
    struct
    {
       GLuint non_normalized_coord:1;
       GLuint pad:12;
       GLuint address_round:6;
-      GLuint max_aniso:3; 
-      GLuint chroma_key_mode:1; 
-      GLuint chroma_key_index:2; 
-      GLuint chroma_key_enable:1; 
-      GLuint monochrome_filter_width:3; 
-      GLuint monochrome_filter_height:3; 
+      GLuint max_aniso:3;
+      GLuint chroma_key_mode:1;
+      GLuint chroma_key_index:2;
+      GLuint chroma_key_enable:1;
+      GLuint monochrome_filter_width:3;
+      GLuint monochrome_filter_height:3;
    } ss3;
 };
 
@@ -711,27 +711,27 @@ struct gen7_sampler_state
 
 struct brw_clipper_viewport
 {
-   GLfloat xmin;  
-   GLfloat xmax;  
-   GLfloat ymin;  
-   GLfloat ymax;  
+   GLfloat xmin;
+   GLfloat xmax;
+   GLfloat ymin;
+   GLfloat ymax;
 };
 
 struct brw_cc_viewport
 {
-   GLfloat min_depth;  
-   GLfloat max_depth;  
+   GLfloat min_depth;
+   GLfloat max_depth;
 };
 
 struct brw_sf_viewport
 {
    struct {
-      GLfloat m00;  
-      GLfloat m11;  
-      GLfloat m22;  
-      GLfloat m30;  
-      GLfloat m31;  
-      GLfloat m32;  
+      GLfloat m00;
+      GLfloat m11;
+      GLfloat m22;
+      GLfloat m30;
+      GLfloat m31;
+      GLfloat m32;
    } viewport;
 
    /* scissor coordinates are inclusive */
@@ -778,29 +778,29 @@ struct brw_vertex_element_state
 {
    struct
    {
-      GLuint src_offset:11; 
+      GLuint src_offset:11;
       GLuint pad:5;
-      GLuint src_format:9; 
+      GLuint src_format:9;
       GLuint pad0:1;
-      GLuint valid:1; 
-      GLuint vertex_buffer_index:5; 
+      GLuint valid:1;
+      GLuint vertex_buffer_index:5;
    } ve0;
-   
+
    struct
    {
-      GLuint dst_offset:8; 
+      GLuint dst_offset:8;
       GLuint pad:8;
-      GLuint vfcomponent3:4; 
-      GLuint vfcomponent2:4; 
-      GLuint vfcomponent1:4; 
-      GLuint vfcomponent0:4; 
+      GLuint vfcomponent3:4;
+      GLuint vfcomponent2:4;
+      GLuint vfcomponent1:4;
+      GLuint vfcomponent0:4;
    } ve1;
 };
 
 struct brw_urb_immediate {
    GLuint opcode:4;
    GLuint offset:6;
-   GLuint swizzle_control:2; 
+   GLuint swizzle_control:2;
    GLuint pad:1;
    GLuint allocate:1;
    GLuint used:1;
@@ -814,10 +814,10 @@ struct brw_urb_immediate {
 
 /* Instruction format for the execution units:
  */
- 
+
 struct brw_instruction
 {
-   struct 
+   struct
    {
       GLuint opcode:7;
       GLuint pad:1;
@@ -1010,7 +1010,7 @@ struct brw_instruction
        * Does not apply to Gen6+.  The SFID/message target moved to bits
        * 27:24 of the header (destreg__conditionalmod); EOT is in bits3.
        */
-       struct 
+       struct
        {
            GLuint pad:26;
            GLuint end_of_thread:1;
@@ -1222,8 +1222,8 @@ struct brw_instruction
       struct {
 	 GLuint binding_table_index:8;
 	 GLuint sampler:4;
-	 GLuint return_format:2; 
-	 GLuint msg_type:2;   
+	 GLuint return_format:2;
+	 GLuint msg_type:2;
 	 GLuint response_length:4;
 	 GLuint msg_length:4;
 	 GLuint msg_target:4;
@@ -1274,7 +1274,7 @@ struct brw_instruction
       struct {
 	 GLuint opcode:4;
 	 GLuint offset:6;
-	 GLuint swizzle_control:2; 
+	 GLuint swizzle_control:2;
 	 GLuint pad:1;
 	 GLuint allocate:1;
 	 GLuint used:1;
@@ -1321,9 +1321,9 @@ struct brw_instruction
       /** 965 PRM, Volume 4, Section 5.10.1.1: Message Descriptor */
       struct {
 	 GLuint binding_table_index:8;
-	 GLuint msg_control:4;  
-	 GLuint msg_type:2;  
-	 GLuint target_cache:2;    
+	 GLuint msg_control:4;
+	 GLuint msg_type:2;
+	 GLuint target_cache:2;
 	 GLuint response_length:4;
 	 GLuint msg_length:4;
 	 GLuint msg_target:4;
@@ -1347,9 +1347,9 @@ struct brw_instruction
       /** Ironlake PRM, Volume 4 Part 1, Section 5.10.2.1.2. */
       struct {
 	 GLuint binding_table_index:8;
-	 GLuint msg_control:4;  
-	 GLuint msg_type:2;  
-	 GLuint target_cache:2;    
+	 GLuint msg_control:4;
+	 GLuint msg_type:2;
+	 GLuint target_cache:2;
 	 GLuint pad0:3;
 	 GLuint header_present:1;
 	 GLuint response_length:5;
@@ -1363,7 +1363,7 @@ struct brw_instruction
 	 GLuint binding_table_index:8;
 	 GLuint msg_control:3;
 	 GLuint last_render_target:1;
-	 GLuint msg_type:3;    
+	 GLuint msg_type:3;
 	 GLuint send_commit_msg:1;
 	 GLuint response_length:4;
 	 GLuint msg_length:4;
@@ -1377,7 +1377,7 @@ struct brw_instruction
 	 GLuint binding_table_index:8;
 	 GLuint msg_control:3;
 	 GLuint last_render_target:1;
-	 GLuint msg_type:3;    
+	 GLuint msg_type:3;
 	 GLuint send_commit_msg:1;
 	 GLuint pad0:3;
 	 GLuint header_present:1;
-- 
1.7.7.5


* [PATCH 18/90] assembler: Adopt enum brw_message_target from mesa
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_defines.h |   44 +++++++++++++++++--------
 assembler/disasm.c      |   26 +++++++-------
 assembler/gram.y        |   84 +++++++++++++++++++++-------------------------
 3 files changed, 81 insertions(+), 73 deletions(-)
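
One property of the enum adopted here is that several enumerators share a
value: SFID 4 is BRW_SFID_DATAPORT_READ on Gen4-5 and
GEN6_SFID_DATAPORT_SAMPLER_CACHE on Gen6+, and likewise for SFID 5. A lookup
keyed only on the value (such as the target_function[] table in disasm.c)
cannot tell the two apart. The sketch below is not part of the patch. It shows
one way a generation-aware lookup could look; sfid_name(), its gen parameter
and the Gen6+ strings are assumptions made for illustration, while the
Gen4-5 strings are the ones disasm.c already uses.

```c
#include <stddef.h>

/* Copied from the brw_defines.h hunk in this patch. */
enum brw_message_target {
   BRW_SFID_NULL                     = 0,
   BRW_SFID_MATH                     = 1, /* Only valid on Gen4-5 */
   BRW_SFID_SAMPLER                  = 2,
   BRW_SFID_MESSAGE_GATEWAY          = 3,
   BRW_SFID_DATAPORT_READ            = 4,
   BRW_SFID_DATAPORT_WRITE           = 5,
   BRW_SFID_URB                      = 6,
   BRW_SFID_THREAD_SPAWNER           = 7,

   GEN6_SFID_DATAPORT_SAMPLER_CACHE  = 4,
   GEN6_SFID_DATAPORT_RENDER_CACHE   = 5,
   GEN6_SFID_VME                     = 8,
   GEN6_SFID_DATAPORT_CONSTANT_CACHE = 9,

   GEN7_SFID_DATAPORT_DATA_CACHE     = 10,

   HSW_SFID_CRE                      = 0x0d,
};

/*
 * Hypothetical helper: map an SFID to a printable name for a given hardware
 * generation. Returns NULL when the value is unknown or not valid for gen.
 * Only one alias per value may appear as a case label, since duplicate case
 * values do not compile.
 */
static const char *
sfid_name(int gen, enum brw_message_target sfid)
{
   switch (sfid) {
   case BRW_SFID_NULL:            return "null";
   case BRW_SFID_MATH:            return gen < 6 ? "math" : NULL;
   case BRW_SFID_SAMPLER:         return "sampler";
   case BRW_SFID_MESSAGE_GATEWAY: return "gateway";
   case BRW_SFID_DATAPORT_READ:   /* == GEN6_SFID_DATAPORT_SAMPLER_CACHE */
      return gen >= 6 ? "sampler_cache" : "read";
   case BRW_SFID_DATAPORT_WRITE:  /* == GEN6_SFID_DATAPORT_RENDER_CACHE */
      return gen >= 6 ? "render_cache" : "write";
   case BRW_SFID_URB:             return "urb";
   case BRW_SFID_THREAD_SPAWNER:  return "thread_spawner";
   case GEN6_SFID_VME:            return "vme";
   case GEN6_SFID_DATAPORT_CONSTANT_CACHE: return "constant_cache";
   case GEN7_SFID_DATAPORT_DATA_CACHE:     return "data_cache";
   case HSW_SFID_CRE:             return "cre";
   default:                       return NULL;
   }
}
```

For example, sfid_name(5, BRW_SFID_DATAPORT_READ) yields "read" while
sfid_name(6, GEN6_SFID_DATAPORT_SAMPLER_CACHE) yields "sampler_cache", even
though both arguments are the value 4.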

diff --git a/assembler/brw_defines.h b/assembler/brw_defines.h
index 2ec8050..bd72abb 100644
--- a/assembler/brw_defines.h
+++ b/assembler/brw_defines.h
@@ -717,20 +717,36 @@
 #define BRW_POLYGON_FACING_FRONT      0
 #define BRW_POLYGON_FACING_BACK       1
 
-#define BRW_MESSAGE_TARGET_NULL               0
-#define BRW_MESSAGE_TARGET_MATH               1
-#define BRW_MESSAGE_TARGET_SAMPLER            2
-#define BRW_MESSAGE_TARGET_GATEWAY            3
-#define BRW_MESSAGE_TARGET_DATAPORT_READ      4
-#define BRW_MESSAGE_TARGET_DP_SC              4  /* data port sampler cache */
-#define BRW_MESSAGE_TARGET_DATAPORT_WRITE     5  
-#define BRW_MESSAGE_TARGET_DP_RC              5  /* data port render cache */
-#define BRW_MESSAGE_TARGET_URB                6
-#define BRW_MESSAGE_TARGET_THREAD_SPAWNER     7
-#define BRW_MESSAGE_TARGET_VME                8
-#define BRW_MESSAGE_TARGET_DP_CC              9  /* data port constant cache */
-#define BRW_MESSAGE_TARGET_DP_DC              10 /* data port data cache */
-#define BRW_MESSAGE_TARGET_CRE                0x0d /* check & refinement enginee */ 
+
+/**
+ * Message target: Shared Function ID for where to SEND a message.
+ *
+ * These are enumerated in the ISA reference under "send - Send Message".
+ * In particular, see the following tables:
+ * - G45 PRM, Volume 4, Table 14-15 "Message Descriptor Definition"
+ * - Sandybridge PRM, Volume 4 Part 2, Table 8-16 "Extended Message Descriptor"
+ * - BSpec, Volume 1a (GPU Overview) / Graphics Processing Engine (GPE) /
+ *   Overview / GPE Function IDs
+ */
+enum brw_message_target {
+   BRW_SFID_NULL                     = 0,
+   BRW_SFID_MATH                     = 1, /* Only valid on Gen4-5 */
+   BRW_SFID_SAMPLER                  = 2,
+   BRW_SFID_MESSAGE_GATEWAY          = 3,
+   BRW_SFID_DATAPORT_READ            = 4,
+   BRW_SFID_DATAPORT_WRITE           = 5,
+   BRW_SFID_URB                      = 6,
+   BRW_SFID_THREAD_SPAWNER           = 7,
+
+   GEN6_SFID_DATAPORT_SAMPLER_CACHE  = 4,
+   GEN6_SFID_DATAPORT_RENDER_CACHE   = 5,
+   GEN6_SFID_VME                     = 8,
+   GEN6_SFID_DATAPORT_CONSTANT_CACHE = 9,
+
+   GEN7_SFID_DATAPORT_DATA_CACHE     = 10,
+
+   HSW_SFID_CRE                      = 0x0d,
+};
 
 #define BRW_SAMPLER_RETURN_FORMAT_FLOAT32     0
 #define BRW_SAMPLER_RETURN_FORMAT_UINT32      2
diff --git a/assembler/disasm.c b/assembler/disasm.c
index b6fdc2e..9ebbeab 100644
--- a/assembler/disasm.c
+++ b/assembler/disasm.c
@@ -275,14 +275,14 @@ char *end_of_thread[2] = {
 };
 
 char *target_function[16] = {
-    [BRW_MESSAGE_TARGET_NULL] = "null",
-    [BRW_MESSAGE_TARGET_MATH] = "math",
-    [BRW_MESSAGE_TARGET_SAMPLER] = "sampler",
-    [BRW_MESSAGE_TARGET_GATEWAY] = "gateway",
-    [BRW_MESSAGE_TARGET_DATAPORT_READ] = "read",
-    [BRW_MESSAGE_TARGET_DATAPORT_WRITE] = "write",
-    [BRW_MESSAGE_TARGET_URB] = "urb",
-    [BRW_MESSAGE_TARGET_THREAD_SPAWNER] = "thread_spawner"
+    [BRW_SFID_NULL] = "null",
+    [BRW_SFID_MATH] = "math",
+    [BRW_SFID_SAMPLER] = "sampler",
+    [BRW_SFID_MESSAGE_GATEWAY] = "gateway",
+    [BRW_SFID_DATAPORT_READ] = "read",
+    [BRW_SFID_DATAPORT_WRITE] = "write",
+    [BRW_SFID_URB] = "urb",
+    [BRW_SFID_THREAD_SPAWNER] = "thread_spawner"
 };
 
 char *math_function[16] = {
@@ -831,7 +831,7 @@ int disasm (FILE *file, struct brw_instruction *inst)
 	err |= control (file, "target function", target_function,
 			inst->header.destreg__conditionalmod, &space);
 	switch (inst->header.destreg__conditionalmod) {
-	case BRW_MESSAGE_TARGET_MATH:
+	case BRW_SFID_MATH:
 	    err |= control (file, "math function", math_function,
 			    inst->bits3.math.function, &space);
 	    err |= control (file, "math saturate", math_saturate,
@@ -843,7 +843,7 @@ int disasm (FILE *file, struct brw_instruction *inst)
 	    err |= control (file, "math precision", math_precision,
 			    inst->bits3.math.precision, &space);
 	    break;
-	case BRW_MESSAGE_TARGET_SAMPLER:
+	case BRW_SFID_SAMPLER:
 	    format (file, " (%d, %d, ",
 		    inst->bits3.sampler.binding_table_index,
 		    inst->bits3.sampler.sampler);
@@ -851,7 +851,7 @@ int disasm (FILE *file, struct brw_instruction *inst)
 			    inst->bits3.sampler.return_format, NULL);
 	    string (file, ")");
 	    break;
-	case BRW_MESSAGE_TARGET_DATAPORT_WRITE:
+	case BRW_SFID_DATAPORT_WRITE:
 	    format (file, " (%d, %d, %d, %d)",
 		    inst->bits3.dp_write.binding_table_index,
 		    (inst->bits3.dp_write.last_render_target << 3) |
@@ -859,7 +859,7 @@ int disasm (FILE *file, struct brw_instruction *inst)
 		    inst->bits3.dp_write.msg_type,
 		    inst->bits3.dp_write.send_commit_msg);
 	    break;
-	case BRW_MESSAGE_TARGET_URB:
+	case BRW_SFID_URB:
 	    format (file, " %d", inst->bits3.urb.offset);
 	    space = 1;
 	    err |= control (file, "urb swizzle", urb_swizzle,
@@ -871,7 +871,7 @@ int disasm (FILE *file, struct brw_instruction *inst)
 	    err |= control (file, "urb complete", urb_complete,
 			    inst->bits3.urb.complete, &space);
 	    break;
-	case BRW_MESSAGE_TARGET_THREAD_SPAWNER:
+	case BRW_SFID_THREAD_SPAWNER:
 	    break;
 	default:
 	    format (file, "unsupported target %d", inst->bits3.generic.msg_target);
diff --git a/assembler/gram.y b/assembler/gram.y
index f71f960..faba2eb 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1168,29 +1168,29 @@ post_dst:	dst
 msgtarget:	NULL_TOKEN
 		{
 		  if (IS_GENp(5)) {
-                      $$.bits2.send_gen5.sfid= BRW_MESSAGE_TARGET_NULL;
+                      $$.bits2.send_gen5.sfid= BRW_SFID_NULL;
                       $$.bits3.generic_gen5.header_present = 0;  /* ??? */
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_MESSAGE_TARGET_NULL;
+                      $$.bits3.generic.msg_target = BRW_SFID_NULL;
 		  }
 		}
 		| SAMPLER LPAREN INTEGER COMMA INTEGER COMMA
 		sampler_datatype RPAREN
 		{
 		  if (IS_GENp(7)) {
-                      $$.bits2.send_gen5.sfid = BRW_MESSAGE_TARGET_SAMPLER;
+                      $$.bits2.send_gen5.sfid = BRW_SFID_SAMPLER;
                       $$.bits3.generic_gen5.header_present = 1;   /* ??? */
                       $$.bits3.sampler_gen7.binding_table_index = $3;
                       $$.bits3.sampler_gen7.sampler = $5;
                       $$.bits3.sampler_gen7.simd_mode = 2; /* SIMD16, maybe we should add a new parameter */
 		  } else if (IS_GENp(5)) {
-                      $$.bits2.send_gen5.sfid = BRW_MESSAGE_TARGET_SAMPLER;
+                      $$.bits2.send_gen5.sfid = BRW_SFID_SAMPLER;
                       $$.bits3.generic_gen5.header_present = 1;   /* ??? */
                       $$.bits3.sampler_gen5.binding_table_index = $3;
                       $$.bits3.sampler_gen5.sampler = $5;
                       $$.bits3.sampler_gen5.simd_mode = 2; /* SIMD16, maybe we should add a new parameter */
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_MESSAGE_TARGET_SAMPLER;	
+                      $$.bits3.generic.msg_target = BRW_SFID_SAMPLER;	
                       $$.bits3.sampler.binding_table_index = $3;
                       $$.bits3.sampler.sampler = $5;
                       switch ($7) {
@@ -1215,7 +1215,7 @@ msgtarget:	NULL_TOKEN
                       fprintf (stderr, "Gen6+ doesn't have math function\n");
                       YYERROR;
 		  } else if (IS_GENx(5)) {
-                      $$.bits2.send_gen5.sfid = BRW_MESSAGE_TARGET_MATH;
+                      $$.bits2.send_gen5.sfid = BRW_SFID_MATH;
                       $$.bits3.generic_gen5.header_present = 0;
                       $$.bits3.math_gen5.function = $2;
                       if ($3 == BRW_INSTRUCTION_SATURATE)
@@ -1226,7 +1226,7 @@ msgtarget:	NULL_TOKEN
                       $$.bits3.math_gen5.precision = BRW_MATH_PRECISION_FULL;
                       $$.bits3.math_gen5.data_type = $5;
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_MESSAGE_TARGET_MATH;
+                      $$.bits3.generic.msg_target = BRW_SFID_MATH;
                       $$.bits3.math.function = $2;
                       if ($3 == BRW_INSTRUCTION_SATURATE)
                           $$.bits3.math.saturate = 1;
@@ -1240,10 +1240,10 @@ msgtarget:	NULL_TOKEN
 		| GATEWAY
 		{
 		  if (IS_GENp(5)) {
-                      $$.bits2.send_gen5.sfid = BRW_MESSAGE_TARGET_GATEWAY;
+                      $$.bits2.send_gen5.sfid = BRW_SFID_MESSAGE_GATEWAY;
                       $$.bits3.generic_gen5.header_present = 0;  /* ??? */
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_MESSAGE_TARGET_GATEWAY;
+                      $$.bits3.generic.msg_target = BRW_SFID_MESSAGE_GATEWAY;
 		  }
 		}
 		| READ  LPAREN INTEGER COMMA INTEGER COMMA INTEGER COMMA
@@ -1251,21 +1251,21 @@ msgtarget:	NULL_TOKEN
 		{
 		  if (IS_GENx(7)) {
                       $$.bits2.send_gen5.sfid = 
-                          BRW_MESSAGE_TARGET_DP_SC;
+                          GEN6_SFID_DATAPORT_SAMPLER_CACHE;
                       $$.bits3.generic_gen5.header_present = 1;
                       $$.bits3.gen7_dp.binding_table_index = $3;
                       $$.bits3.gen7_dp.msg_control = $7;
                       $$.bits3.gen7_dp.msg_type = $9;
 		  } else if (IS_GENx(6)) {
                       $$.bits2.send_gen5.sfid = 
-                          BRW_MESSAGE_TARGET_DP_SC;
+                          GEN6_SFID_DATAPORT_SAMPLER_CACHE;
                       $$.bits3.generic_gen5.header_present = 1;
                       $$.bits3.gen6_dp_sampler_const_cache.binding_table_index = $3;
                       $$.bits3.gen6_dp_sampler_const_cache.msg_control = $7;
                       $$.bits3.gen6_dp_sampler_const_cache.msg_type = $9;
 		  } else if (IS_GENx(5)) {
                       $$.bits2.send_gen5.sfid = 
-                          BRW_MESSAGE_TARGET_DATAPORT_READ;
+                          BRW_SFID_DATAPORT_READ;
                       $$.bits3.generic_gen5.header_present = 1;
                       $$.bits3.dp_read_gen5.binding_table_index = $3;
                       $$.bits3.dp_read_gen5.target_cache = $5;
@@ -1273,7 +1273,7 @@ msgtarget:	NULL_TOKEN
                       $$.bits3.dp_read_gen5.msg_type = $9;
 		  } else {
                       $$.bits3.generic.msg_target =
-                          BRW_MESSAGE_TARGET_DATAPORT_READ;
+                          BRW_SFID_DATAPORT_READ;
                       $$.bits3.dp_read.binding_table_index = $3;
                       $$.bits3.dp_read.target_cache = $5;
                       $$.bits3.dp_read.msg_control = $7;
@@ -1284,15 +1284,13 @@ msgtarget:	NULL_TOKEN
 		INTEGER RPAREN
 		{
 		  if (IS_GENx(7)) {
-                      $$.bits2.send_gen5.sfid =
-                          BRW_MESSAGE_TARGET_DP_RC;
+                      $$.bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
                       $$.bits3.generic_gen5.header_present = 1;
                       $$.bits3.gen7_dp.binding_table_index = $3;
                       $$.bits3.gen7_dp.msg_control = $5;
                       $$.bits3.gen7_dp.msg_type = $7;
                   } else if (IS_GENx(6)) {
-                      $$.bits2.send_gen5.sfid =
-                          BRW_MESSAGE_TARGET_DP_RC;
+                      $$.bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
                       /* Sandybridge supports headerlesss message for render target write.
                        * Currently the GFX assembler doesn't support it. so the program must provide 
                        * message header
@@ -1304,7 +1302,7 @@ msgtarget:	NULL_TOKEN
                       $$.bits3.gen6_dp.send_commit_msg = $9;
 		  } else if (IS_GENx(5)) {
                       $$.bits2.send_gen5.sfid =
-                          BRW_MESSAGE_TARGET_DATAPORT_WRITE;
+                          BRW_SFID_DATAPORT_WRITE;
                       $$.bits3.generic_gen5.header_present = 1;
                       $$.bits3.dp_write_gen5.binding_table_index = $3;
                       $$.bits3.dp_write_gen5.last_render_target = ($5 & 0x8) >> 3;
@@ -1313,7 +1311,7 @@ msgtarget:	NULL_TOKEN
                       $$.bits3.dp_write_gen5.send_commit_msg = $9;
 		  } else {
                       $$.bits3.generic.msg_target =
-                          BRW_MESSAGE_TARGET_DATAPORT_WRITE;
+                          BRW_SFID_DATAPORT_WRITE;
                       $$.bits3.dp_write.binding_table_index = $3;
                       /* The msg control field of brw_struct.h is split into
                        * msg control and last_render_target, even though
@@ -1329,15 +1327,13 @@ msgtarget:	NULL_TOKEN
 		INTEGER COMMA INTEGER RPAREN
 		{
 		  if (IS_GENx(7)) {
-                      $$.bits2.send_gen5.sfid =
-                          BRW_MESSAGE_TARGET_DP_RC;
+                      $$.bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
                       $$.bits3.generic_gen5.header_present = ($11 != 0);
                       $$.bits3.gen7_dp.binding_table_index = $3;
                       $$.bits3.gen7_dp.msg_control = $5;
                       $$.bits3.gen7_dp.msg_type = $7;
 		  } else if (IS_GENx(6)) {
-                      $$.bits2.send_gen5.sfid =
-                          BRW_MESSAGE_TARGET_DP_RC;
+                      $$.bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
                       $$.bits3.generic_gen5.header_present = ($11 != 0);
                       $$.bits3.gen6_dp.binding_table_index = $3;
                       $$.bits3.gen6_dp.msg_control = $5;
@@ -1345,7 +1341,7 @@ msgtarget:	NULL_TOKEN
                       $$.bits3.gen6_dp.send_commit_msg = $9;
 		  } else if (IS_GENx(5)) {
                       $$.bits2.send_gen5.sfid =
-                          BRW_MESSAGE_TARGET_DATAPORT_WRITE;
+                          BRW_SFID_DATAPORT_WRITE;
                       $$.bits3.generic_gen5.header_present = ($11 != 0);
                       $$.bits3.dp_write_gen5.binding_table_index = $3;
                       $$.bits3.dp_write_gen5.last_render_target = ($5 & 0x8) >> 3;
@@ -1354,7 +1350,7 @@ msgtarget:	NULL_TOKEN
                       $$.bits3.dp_write_gen5.send_commit_msg = $9;
 		  } else {
                       $$.bits3.generic.msg_target =
-                          BRW_MESSAGE_TARGET_DATAPORT_WRITE;
+                          BRW_SFID_DATAPORT_WRITE;
                       $$.bits3.dp_write.binding_table_index = $3;
                       /* The msg control field of brw_struct.h is split into
                        * msg control and last_render_target, even though
@@ -1368,9 +1364,9 @@ msgtarget:	NULL_TOKEN
 		}
 		| URB INTEGER urb_swizzle urb_allocate urb_used urb_complete
 		{
-		  $$.bits3.generic.msg_target = BRW_MESSAGE_TARGET_URB;
+		  $$.bits3.generic.msg_target = BRW_SFID_URB;
 		  if (IS_GENp(5)) {
-                      $$.bits2.send_gen5.sfid = BRW_MESSAGE_TARGET_URB;
+                      $$.bits2.send_gen5.sfid = BRW_SFID_URB;
                       $$.bits3.generic_gen5.header_present = 1;
                       $$.bits3.urb_gen5.opcode = BRW_URB_OPCODE_WRITE;
                       $$.bits3.urb_gen5.offset = $2;
@@ -1380,7 +1376,7 @@ msgtarget:	NULL_TOKEN
                       $$.bits3.urb_gen5.used = $5;
                       $$.bits3.urb_gen5.complete = $6;
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_MESSAGE_TARGET_URB;
+                      $$.bits3.generic.msg_target = BRW_SFID_URB;
                       $$.bits3.urb.opcode = BRW_URB_OPCODE_WRITE;
                       $$.bits3.urb.offset = $2;
                       $$.bits3.urb.swizzle_control = $3;
@@ -1394,17 +1390,17 @@ msgtarget:	NULL_TOKEN
                         INTEGER RPAREN
 		{
 		  $$.bits3.generic.msg_target =
-		    BRW_MESSAGE_TARGET_THREAD_SPAWNER;
+		    BRW_SFID_THREAD_SPAWNER;
 		  if (IS_GENp(5)) {
                       $$.bits2.send_gen5.sfid = 
-                          BRW_MESSAGE_TARGET_THREAD_SPAWNER;
+                          BRW_SFID_THREAD_SPAWNER;
                       $$.bits3.generic_gen5.header_present = 0;
                       $$.bits3.thread_spawner_gen5.opcode = $3;
                       $$.bits3.thread_spawner_gen5.requester_type  = $5;
                       $$.bits3.thread_spawner_gen5.resource_select = $7;
 		  } else {
                       $$.bits3.generic.msg_target =
-                          BRW_MESSAGE_TARGET_THREAD_SPAWNER;
+                          BRW_SFID_THREAD_SPAWNER;
                       $$.bits3.thread_spawner.opcode = $3;
                       $$.bits3.thread_spawner.requester_type  = $5;
                       $$.bits3.thread_spawner.resource_select = $7;
@@ -1412,12 +1408,10 @@ msgtarget:	NULL_TOKEN
 		}
 		| VME  LPAREN INTEGER COMMA INTEGER COMMA INTEGER COMMA INTEGER RPAREN
 		{
-		  $$.bits3.generic.msg_target =
-                      BRW_MESSAGE_TARGET_VME;
+		  $$.bits3.generic.msg_target = GEN6_SFID_VME;
 
 		  if (IS_GENp(6)) { 
-                      $$.bits2.send_gen5.sfid =
-                          BRW_MESSAGE_TARGET_VME;
+                      $$.bits2.send_gen5.sfid = GEN6_SFID_VME;
                       $$.bits3.vme_gen6.binding_table_index = $3;
                       $$.bits3.vme_gen6.search_path_index = $5;
                       $$.bits3.vme_gen6.lut_subindex = $7;
@@ -1434,11 +1428,9 @@ msgtarget:	NULL_TOKEN
                       fprintf (stderr, "Below Gen7.5 doesn't have CRE function\n");
                       YYERROR;
 		    }
-		   $$.bits3.generic.msg_target =
-                      BRW_MESSAGE_TARGET_CRE;
+		   $$.bits3.generic.msg_target = HSW_SFID_CRE;
 
-                   $$.bits2.send_gen5.sfid =
-                          BRW_MESSAGE_TARGET_CRE;
+                   $$.bits2.send_gen5.sfid = HSW_SFID_CRE;
                    $$.bits3.cre_gen75.binding_table_index = $3;
                    $$.bits3.cre_gen75.message_type = $5;
                    $$.bits3.generic_gen5.header_present = 1; 
@@ -1451,10 +1443,10 @@ msgtarget:	NULL_TOKEN
                     $$.bits3.generic_gen5.header_present = ($13 != 0);
 
                     if (IS_GENp(7)) {
-                        if ($3 != BRW_MESSAGE_TARGET_DP_SC &&
-                            $3 != BRW_MESSAGE_TARGET_DP_RC &&
-                            $3 != BRW_MESSAGE_TARGET_DP_CC &&
-                            $3 != BRW_MESSAGE_TARGET_DP_DC) {
+                        if ($3 != GEN6_SFID_DATAPORT_SAMPLER_CACHE &&
+                            $3 != GEN6_SFID_DATAPORT_RENDER_CACHE &&
+                            $3 != GEN6_SFID_DATAPORT_CONSTANT_CACHE &&
+                            $3 != GEN7_SFID_DATAPORT_DATA_CACHE) {
                             fprintf (stderr, "error: wrong cache type\n");
                             YYERROR;
                         }
@@ -1464,9 +1456,9 @@ msgtarget:	NULL_TOKEN
                         $$.bits3.gen7_dp.msg_control = $7;
                         $$.bits3.gen7_dp.msg_type = $5;
                     } else if (IS_GENx(6)) {
-                        if ($3 != BRW_MESSAGE_TARGET_DP_SC &&
-                            $3 != BRW_MESSAGE_TARGET_DP_RC &&
-                            $3 != BRW_MESSAGE_TARGET_DP_CC) {
+                        if ($3 != GEN6_SFID_DATAPORT_SAMPLER_CACHE &&
+                            $3 != GEN6_SFID_DATAPORT_RENDER_CACHE &&
+                            $3 != GEN6_SFID_DATAPORT_CONSTANT_CACHE) {
                             fprintf (stderr, "error: wrong cache type\n");
                             YYERROR;
                         }
-- 
1.7.7.5


* [PATCH 19/90] assembler: Rename BRW_ACCWRCTRL_ACCWRCTRL
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (17 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 18/90] assembler: Adopt enum brw_message_target from mesa Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 20/90] assembler: Import brw_defines.h from Mesa Damien Lespiau
                   ` (71 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

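Rename BRW_ACCWRCTRL_NONE and BRW_ACCWRCTRL_ACCWRCTRL to the more
self-describing BRW_ACCUMULATOR_WRITE_DISABLE and
BRW_ACCUMULATOR_WRITE_ENABLE. This will hopefully ease their inclusion
into Mesa.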

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_defines.h |    4 ++--
 assembler/gram.y        |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)
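
For readers skimming the series, here is a minimal sketch of how the renamed
defines read at a call site. The two define names and values come from this
patch; struct toy_header and toy_set_accwrctrl() are hypothetical stand-ins
for the real instruction header and the ACCWRCTRL case in gram.y, not code
from the assembler.

```c
/* Names and values as introduced by this patch. */
#define BRW_ACCUMULATOR_WRITE_DISABLE 0
#define BRW_ACCUMULATOR_WRITE_ENABLE  1

/* Hypothetical, reduced stand-in for the instruction header; the real
 * structure carries many more fields. */
struct toy_header {
    unsigned int acc_wr_control:1;
};

/* Hypothetical helper mirroring what the ACCWRCTRL instruction option
 * does in the grammar: turn the accumulator write control bit on. */
static inline void toy_set_accwrctrl(struct toy_header *header)
{
    header->acc_wr_control = BRW_ACCUMULATOR_WRITE_ENABLE;
}
```

The point of the rename is only readability: the value written into the
bit is unchanged.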

diff --git a/assembler/brw_defines.h b/assembler/brw_defines.h
index bd72abb..c4ffe9b 100644
--- a/assembler/brw_defines.h
+++ b/assembler/brw_defines.h
@@ -552,8 +552,8 @@
 #define BRW_MASK_ENABLE   0
 #define BRW_MASK_DISABLE  1
 
-#define BRW_ACCWRCTRL_NONE      0
-#define BRW_ACCWRCTRL_ACCWRCTRL 1
+#define BRW_ACCUMULATOR_WRITE_DISABLE 0
+#define BRW_ACCUMULATOR_WRITE_ENABLE  1
 
 #define BRW_OPCODE_MOV        1
 #define BRW_OPCODE_SEL        2
diff --git a/assembler/gram.y b/assembler/gram.y
index faba2eb..55708ca 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -2605,7 +2605,7 @@ instoption_list:instoption_list COMMA instoption
 		    $$.header.debug_control = BRW_DEBUG_BREAKPOINT;
 		    break;
 		  case ACCWRCTRL:
-		    $$.header.acc_wr_control = BRW_ACCWRCTRL_ACCWRCTRL;
+		    $$.header.acc_wr_control = BRW_ACCUMULATOR_WRITE_ENABLE;
 		  }
 		}
 		| instoption_list instoption
-- 
1.7.7.5


* [PATCH 20/90] assembler: Import brw_defines.h from Mesa
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (18 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 19/90] assembler: Rename BRW_ACCWRCTRL_ACCWRCTRL Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 21/90] assembler: Remove trailing white space from brw_defines.h Damien Lespiau
                   ` (70 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The two files are now almost identical. The remaining diff against Mesa's
copy is:

-#include "intel_chipset.h"
+#define EX_DESC_SFID_MASK 0xF
+#define EX_DESC_EOT_MASK  0x20
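
As a rough illustration of what the two extra defines are for, the sketch
below pulls the two fields out of a descriptor word. Only the mask names and
values come from this patch; the helper names are hypothetical, and the
reading of the masks (SFID in bits 3:0, end-of-thread in bit 5 of an extended
message descriptor) is an assumption based on the names.

```c
#include <stdint.h>

/* Mask values as kept on top of Mesa's copy by this patch. */
#define EX_DESC_SFID_MASK 0xF
#define EX_DESC_EOT_MASK  0x20

/* Hypothetical helper: shared function ID, assumed to sit in bits 3:0. */
static inline unsigned int toy_ex_desc_sfid(uint32_t ex_desc)
{
    return ex_desc & EX_DESC_SFID_MASK;
}

/* Hypothetical helper: 1 if the assumed end-of-thread bit (bit 5) is set. */
static inline int toy_ex_desc_eot(uint32_t ex_desc)
{
    return (ex_desc & EX_DESC_EOT_MASK) != 0;
}
```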

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_defines.h | 1223 +++++++++++++++++++++++++++++++++++++---------
 1 files changed, 983 insertions(+), 240 deletions(-)
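
The imported header also brings in Mesa's INTEL_MASK/SET_FIELD/GET_FIELD
helpers together with _SHIFT/_MASK pairs. A minimal sketch of how they
combine, using the surface format field defined in this patch; the macros and
defines are copied from the diff below, while toy_pack_surface_format() and
toy_unpack_surface_format() are hypothetical wrappers added for illustration
only.

```c
#include <stdint.h>

/* Copied from the imported brw_defines.h. */
#define INTEL_MASK(high, low) (((1<<((high)-(low)+1))-1)<<(low))
#define SET_FIELD(value, field) (((value) << field ## _SHIFT) & field ## _MASK)
#define GET_FIELD(word, field) (((word)  & field ## _MASK) >> field ## _SHIFT)

#define BRW_SURFACEFORMAT_B8G8R8A8_UNORM 0x0C0
#define BRW_SURFACE_FORMAT_SHIFT         18
#define BRW_SURFACE_FORMAT_MASK          INTEL_MASK(26, 18)

/* Hypothetical wrapper: replace the format field (bits 26:18) of a
 * dword, leaving every other bit untouched. */
static inline uint32_t toy_pack_surface_format(uint32_t dw0, uint32_t format)
{
    dw0 &= ~(uint32_t)BRW_SURFACE_FORMAT_MASK;
    dw0 |= SET_FIELD(format, BRW_SURFACE_FORMAT);
    return dw0;
}

/* Hypothetical wrapper: read the format field back out. */
static inline uint32_t toy_unpack_surface_format(uint32_t dw0)
{
    return GET_FIELD(dw0, BRW_SURFACE_FORMAT);
}
```

The token pasting in SET_FIELD/GET_FIELD is why every field needs a
matching FOO_SHIFT and FOO_MASK pair under the same prefix.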

diff --git a/assembler/brw_defines.h b/assembler/brw_defines.h
index c4ffe9b..f0b358e 100644
--- a/assembler/brw_defines.h
+++ b/assembler/brw_defines.h
@@ -1,121 +1,43 @@
- /**************************************************************************
- * 
- * Copyright 2005 Tungsten Graphics, Inc., Cedar Park, Texas.
- * All Rights Reserved.
- * 
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation the rights to use, copy, modify, merge, publish,
- * distribute, sub license, and/or sell copies of the Software, and to
- * permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- * 
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial portions
- * of the Software.
- * 
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
- * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
- * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
- * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
- * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
- * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- * 
- **************************************************************************/
+/*
+ Copyright (C) Intel Corp.  2006.  All Rights Reserved.
+ Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
+ develop this 3D driver.
+ 
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+ 
+ The above copyright notice and this permission notice (including the
+ next paragraph) shall be included in all copies or substantial
+ portions of the Software.
+ 
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ 
+ **********************************************************************/
+ /*
+  * Authors:
+  *   Keith Whitwell <keith@tungstengraphics.com>
+  */
+
+#define INTEL_MASK(high, low) (((1<<((high)-(low)+1))-1)<<(low))
+#define SET_FIELD(value, field) (((value) << field ## _SHIFT) & field ## _MASK)
+#define GET_FIELD(word, field) (((word)  & field ## _MASK) >> field ## _SHIFT)
 
 #ifndef BRW_DEFINES_H
 #define BRW_DEFINES_H
 
-/*
- */
-#define MI_NOOP                              0x00
-#define MI_USER_INTERRUPT                    0x02
-#define MI_WAIT_FOR_EVENT                    0x03
-#define MI_FLUSH                             0x04
-#define MI_REPORT_HEAD                       0x07
-#define MI_ARB_ON_OFF                        0x08
-#define MI_BATCH_BUFFER_END                  0x0A
-#define MI_OVERLAY_FLIP                      0x11
-#define MI_LOAD_SCAN_LINES_INCL              0x12
-#define MI_LOAD_SCAN_LINES_EXCL              0x13
-#define MI_DISPLAY_BUFFER_INFO               0x14
-#define MI_SET_CONTEXT                       0x18
-#define MI_STORE_DATA_IMM                    0x20
-#define MI_STORE_DATA_INDEX                  0x21
-#define MI_LOAD_REGISTER_IMM                 0x22
-#define MI_STORE_REGISTER_MEM                0x24
-#define MI_BATCH_BUFFER_START                0x31
-
-#define MI_SYNCHRONOUS_FLIP                  0x0 
-#define MI_ASYNCHRONOUS_FLIP                 0x1
-
-#define MI_BUFFER_SECURE                     0x0 
-#define MI_BUFFER_NONSECURE                  0x1
-
-#define MI_ARBITRATE_AT_CHAIN_POINTS         0x0 
-#define MI_ARBITRATE_BETWEEN_INSTS           0x1
-#define MI_NO_ARBITRATION                    0x3 
-
-#define MI_CONDITION_CODE_WAIT_DISABLED      0x0
-#define MI_CONDITION_CODE_WAIT_0             0x1
-#define MI_CONDITION_CODE_WAIT_1             0x2
-#define MI_CONDITION_CODE_WAIT_2             0x3
-#define MI_CONDITION_CODE_WAIT_3             0x4
-#define MI_CONDITION_CODE_WAIT_4             0x5
-
-#define MI_DISPLAY_PIPE_A                    0x0
-#define MI_DISPLAY_PIPE_B                    0x1
-
-#define MI_DISPLAY_PLANE_A                   0x0 
-#define MI_DISPLAY_PLANE_B                   0x1
-#define MI_DISPLAY_PLANE_C                   0x2
-
-#define MI_STANDARD_FLIP                                 0x0
-#define MI_ENQUEUE_FLIP_PERFORM_BASE_FRAME_NUMBER_LOAD   0x1
-#define MI_ENQUEUE_FLIP_TARGET_FRAME_NUMBER_RELATIVE     0x2
-#define MI_ENQUEUE_FLIP_ABSOLUTE_TARGET_FRAME_NUMBER     0x3
-
-#define MI_PHYSICAL_ADDRESS                  0x0
-#define MI_VIRTUAL_ADDRESS                   0x1
-
-#define MI_BUFFER_MEMORY_MAIN                0x0 
-#define MI_BUFFER_MEMORY_GTT                 0x2
-#define MI_BUFFER_MEMORY_PER_PROCESS_GTT     0x3 
-
-#define MI_FLIP_CONTINUE                     0x0
-#define MI_FLIP_ON                           0x1
-#define MI_FLIP_OFF                          0x2
-
-#define MI_UNTRUSTED_REGISTER_SPACE          0x0
-#define MI_TRUSTED_REGISTER_SPACE            0x1
-
 /* 3D state:
  */
-#define _3DOP_3DSTATE_PIPELINED       0x0
-#define _3DOP_3DSTATE_NONPIPELINED    0x1
-#define _3DOP_3DCONTROL               0x2
-#define _3DOP_3DPRIMITIVE             0x3
-
-#define _3DSTATE_PIPELINED_POINTERS       0x00
-#define _3DSTATE_BINDING_TABLE_POINTERS   0x01
-#define _3DSTATE_VERTEX_BUFFERS           0x08
-#define _3DSTATE_VERTEX_ELEMENTS          0x09
-#define _3DSTATE_INDEX_BUFFER             0x0A
-#define _3DSTATE_VF_STATISTICS            0x0B
-#define _3DSTATE_DRAWING_RECTANGLE            0x00
-#define _3DSTATE_CONSTANT_COLOR               0x01
-#define _3DSTATE_SAMPLER_PALETTE_LOAD         0x02
-#define _3DSTATE_CHROMA_KEY                   0x04
-#define _3DSTATE_DEPTH_BUFFER                 0x05
-#define _3DSTATE_POLY_STIPPLE_OFFSET          0x06
-#define _3DSTATE_POLY_STIPPLE_PATTERN         0x07
-#define _3DSTATE_LINE_STIPPLE                 0x08
-#define _3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP    0x09
-#define _3DCONTROL    0x00
-#define _3DPRIMITIVE  0x00
-
 #define PIPE_CONTROL_NOWRITE          0x00
 #define PIPE_CONTROL_WRITEIMMEDIATE   0x01
 #define PIPE_CONTROL_WRITEDEPTH       0x02
@@ -124,6 +46,15 @@
 #define PIPE_CONTROL_GTTWRITE_PROCESS_LOCAL 0x00
 #define PIPE_CONTROL_GTTWRITE_GLOBAL        0x01
 
+#define CMD_3D_PRIM                                 0x7b00 /* 3DPRIMITIVE */
+/* DW0 */
+# define GEN4_3DPRIM_TOPOLOGY_TYPE_SHIFT            10
+# define GEN4_3DPRIM_VERTEXBUFFER_ACCESS_SEQUENTIAL (0 << 15)
+# define GEN4_3DPRIM_VERTEXBUFFER_ACCESS_RANDOM     (1 << 15)
+/* DW1 */
+# define GEN7_3DPRIM_VERTEXBUFFER_ACCESS_SEQUENTIAL (0 << 8)
+# define GEN7_3DPRIM_VERTEXBUFFER_ACCESS_RANDOM     (1 << 8)
+
 #define _3DPRIM_POINTLIST         0x01
 #define _3DPRIM_LINELIST          0x02
 #define _3DPRIM_LINESTRIP         0x03
@@ -146,9 +77,6 @@
 #define _3DPRIM_LINESTRIP_CONT_BF 0x14
 #define _3DPRIM_TRIFAN_NOSTIPPLE  0x15
 
-#define _3DPRIM_VERTEXBUFFER_ACCESS_SEQUENTIAL 0
-#define _3DPRIM_VERTEXBUFFER_ACCESS_RANDOM     1
-
 #define BRW_ANISORATIO_2     0 
 #define BRW_ANISORATIO_4     1 
 #define BRW_ANISORATIO_6     2 
@@ -198,6 +126,7 @@
 #define BRW_CLIPMODE_CLIP_NON_REJECTED   2
 #define BRW_CLIPMODE_REJECT_ALL          3
 #define BRW_CLIPMODE_ACCEPT_ALL          4
+#define BRW_CLIPMODE_KERNEL_CLIP         5
 
 #define BRW_CLIP_NDCSPACE     0
 #define BRW_CLIP_SCREENSPACE  1
@@ -227,6 +156,7 @@
 #define BRW_DEPTHFORMAT_D32_FLOAT_S8X24_UINT     0
 #define BRW_DEPTHFORMAT_D32_FLOAT                1
 #define BRW_DEPTHFORMAT_D24_UNORM_S8_UINT        2
+#define BRW_DEPTHFORMAT_D24_UNORM_X8_UINT        3 /* GEN5 */
 #define BRW_DEPTHFORMAT_D16_UNORM                5
 
 #define BRW_FLOATING_POINT_IEEE_754        0
@@ -235,6 +165,10 @@
 #define BRW_FRONTWINDING_CW      0
 #define BRW_FRONTWINDING_CCW     1
 
+#define BRW_SPRITE_POINT_ENABLE  16
+
+#define BRW_CUT_INDEX_ENABLE     (1 << 10)
+
 #define BRW_INDEX_BYTE     0
 #define BRW_INDEX_WORD     1
 #define BRW_INDEX_DWORD    2
@@ -264,6 +198,13 @@
 #define BRW_MIPFILTER_NEAREST     1   
 #define BRW_MIPFILTER_LINEAR      3
 
+#define BRW_ADDRESS_ROUNDING_ENABLE_U_MAG	0x20
+#define BRW_ADDRESS_ROUNDING_ENABLE_U_MIN	0x10
+#define BRW_ADDRESS_ROUNDING_ENABLE_V_MAG	0x08
+#define BRW_ADDRESS_ROUNDING_ENABLE_V_MIN	0x04
+#define BRW_ADDRESS_ROUNDING_ENABLE_R_MAG	0x02
+#define BRW_ADDRESS_ROUNDING_ENABLE_R_MIN	0x01
+
 #define BRW_POLYGON_FRONT_FACING     0
 #define BRW_POLYGON_BACK_FACING      1
 
@@ -282,6 +223,24 @@
 
 #define BRW_RASTRULE_UPPER_LEFT  0    
 #define BRW_RASTRULE_UPPER_RIGHT 1
+/* These are listed as "Reserved, but not seen as useful"
+ * in Intel documentation (page 212, "Point Rasterization Rule",
+ * section 7.4 "SF Pipeline State Summary", of document
+ * "Intel® 965 Express Chipset Family and Intel® G35 Express
+ * Chipset Graphics Controller Programmer's Reference Manual,
+ * Volume 2: 3D/Media", Revision 1.0b as of January 2008,
+ * available at 
+ *     http://intellinuxgraphics.org/documentation.html
+ * at the time of this writing).
+ *
+ * These appear to be supported on at least some
+ * i965-family devices, and the BRW_RASTRULE_LOWER_RIGHT
+ * is useful when using OpenGL to render to a FBO
+ * (which has the pixel coordinate Y orientation inverted
+ * with respect to the normal OpenGL pixel coordinate system).
+ */
+#define BRW_RASTRULE_LOWER_LEFT  2
+#define BRW_RASTRULE_LOWER_RIGHT 3
 
 #define BRW_RENDERTARGET_CLAMPRANGE_UNORM    0
 #define BRW_RENDERTARGET_CLAMPRANGE_SNORM    1
@@ -296,8 +255,17 @@
 #define BRW_STENCILOP_DECR               6
 #define BRW_STENCILOP_INVERT             7
 
+/* Surface state DW0 */
+#define BRW_SURFACE_RC_READ_WRITE	(1 << 8)
+#define BRW_SURFACE_MIPLAYOUT_SHIFT	10
 #define BRW_SURFACE_MIPMAPLAYOUT_BELOW   0
 #define BRW_SURFACE_MIPMAPLAYOUT_RIGHT   1
+#define BRW_SURFACE_CUBEFACE_ENABLES	0x3f
+#define BRW_SURFACE_BLEND_ENABLED	(1 << 13)
+#define BRW_SURFACE_WRITEDISABLE_B_SHIFT	14
+#define BRW_SURFACE_WRITEDISABLE_G_SHIFT	15
+#define BRW_SURFACE_WRITEDISABLE_R_SHIFT	16
+#define BRW_SURFACE_WRITEDISABLE_A_SHIFT	17
 
 #define BRW_SURFACEFORMAT_R32G32B32A32_FLOAT             0x000 
 #define BRW_SURFACEFORMAT_R32G32B32A32_SINT              0x001 
@@ -308,6 +276,7 @@
 #define BRW_SURFACEFORMAT_R32G32B32X32_FLOAT             0x006 
 #define BRW_SURFACEFORMAT_R32G32B32A32_SSCALED           0x007
 #define BRW_SURFACEFORMAT_R32G32B32A32_USCALED           0x008
+#define BRW_SURFACEFORMAT_R32G32B32A32_SFIXED            0x020
 #define BRW_SURFACEFORMAT_R32G32B32_FLOAT                0x040 
 #define BRW_SURFACEFORMAT_R32G32B32_SINT                 0x041 
 #define BRW_SURFACEFORMAT_R32G32B32_UINT                 0x042 
@@ -315,6 +284,7 @@
 #define BRW_SURFACEFORMAT_R32G32B32_SNORM                0x044 
 #define BRW_SURFACEFORMAT_R32G32B32_SSCALED              0x045 
 #define BRW_SURFACEFORMAT_R32G32B32_USCALED              0x046 
+#define BRW_SURFACEFORMAT_R32G32B32_SFIXED               0x050
 #define BRW_SURFACEFORMAT_R16G16B16A16_UNORM             0x080 
 #define BRW_SURFACEFORMAT_R16G16B16A16_SNORM             0x081 
 #define BRW_SURFACEFORMAT_R16G16B16A16_SINT              0x082 
@@ -338,6 +308,7 @@
 #define BRW_SURFACEFORMAT_R16G16B16A16_USCALED           0x094
 #define BRW_SURFACEFORMAT_R32G32_SSCALED                 0x095
 #define BRW_SURFACEFORMAT_R32G32_USCALED                 0x096
+#define BRW_SURFACEFORMAT_R32G32_SFIXED                  0x0A0
 #define BRW_SURFACEFORMAT_B8G8R8A8_UNORM                 0x0C0 
 #define BRW_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB            0x0C1 
 #define BRW_SURFACEFORMAT_R10G10B10A2_UNORM              0x0C2 
@@ -406,9 +377,10 @@
 #define BRW_SURFACEFORMAT_L8A8_UNORM                     0x114 
 #define BRW_SURFACEFORMAT_I16_FLOAT                      0x115
 #define BRW_SURFACEFORMAT_L16_FLOAT                      0x116
-#define BRW_SURFACEFORMAT_A16_FLOAT                      0x117 
-#define BRW_SURFACEFORMAT_R5G5_SNORM_B6_UNORM            0x119 
-#define BRW_SURFACEFORMAT_B5G5R5X1_UNORM                 0x11A 
+#define BRW_SURFACEFORMAT_A16_FLOAT                      0x117
+#define BRW_SURFACEFORMAT_L8A8_UNORM_SRGB                0x118
+#define BRW_SURFACEFORMAT_R5G5_SNORM_B6_UNORM            0x119
+#define BRW_SURFACEFORMAT_B5G5R5X1_UNORM                 0x11A
 #define BRW_SURFACEFORMAT_B5G5R5X1_UNORM_SRGB            0x11B
 #define BRW_SURFACEFORMAT_R8G8_SSCALED                   0x11C
 #define BRW_SURFACEFORMAT_R8G8_USCALED                   0x11D
@@ -425,6 +397,8 @@
 #define BRW_SURFACEFORMAT_A4P4_UNORM                     0x148
 #define BRW_SURFACEFORMAT_R8_SSCALED                     0x149
 #define BRW_SURFACEFORMAT_R8_USCALED                     0x14A
+#define BRW_SURFACEFORMAT_L8_UNORM_SRGB                  0x14C
+#define BRW_SURFACEFORMAT_DXT1_RGB_SRGB                  0x180
 #define BRW_SURFACEFORMAT_R1_UINT                        0x181 
 #define BRW_SURFACEFORMAT_YCRCB_NORMAL                   0x182 
 #define BRW_SURFACEFORMAT_YCRCB_SWAPUVY                  0x183 
@@ -453,10 +427,24 @@
 #define BRW_SURFACEFORMAT_R16G16B16_SNORM                0x19D 
 #define BRW_SURFACEFORMAT_R16G16B16_SSCALED              0x19E 
 #define BRW_SURFACEFORMAT_R16G16B16_USCALED              0x19F
+#define BRW_SURFACEFORMAT_R32_SFIXED                     0x1B2
+#define BRW_SURFACEFORMAT_R10G10B10A2_SNORM              0x1B3
+#define BRW_SURFACEFORMAT_R10G10B10A2_USCALED            0x1B4
+#define BRW_SURFACEFORMAT_R10G10B10A2_SSCALED            0x1B5
+#define BRW_SURFACEFORMAT_R10G10B10A2_SINT               0x1B6
+#define BRW_SURFACEFORMAT_B10G10R10A2_SNORM              0x1B7
+#define BRW_SURFACEFORMAT_B10G10R10A2_USCALED            0x1B8
+#define BRW_SURFACEFORMAT_B10G10R10A2_SSCALED            0x1B9
+#define BRW_SURFACEFORMAT_B10G10R10A2_UINT               0x1BA
+#define BRW_SURFACEFORMAT_B10G10R10A2_SINT               0x1BB
+#define BRW_SURFACE_FORMAT_SHIFT	18
+#define BRW_SURFACE_FORMAT_MASK		INTEL_MASK(26, 18)
 
 #define BRW_SURFACERETURNFORMAT_FLOAT32  0
 #define BRW_SURFACERETURNFORMAT_S1       1
 
+#define BRW_SURFACE_TYPE_SHIFT		29
+#define BRW_SURFACE_TYPE_MASK		INTEL_MASK(31, 29)
 #define BRW_SURFACE_1D      0
 #define BRW_SURFACE_2D      1
 #define BRW_SURFACE_3D      2
@@ -464,6 +452,80 @@
 #define BRW_SURFACE_BUFFER  4
 #define BRW_SURFACE_NULL    7
 
+#define GEN7_SURFACE_IS_ARRAY           (1 << 28)
+#define GEN7_SURFACE_VALIGN_2           (0 << 16)
+#define GEN7_SURFACE_VALIGN_4           (1 << 16)
+#define GEN7_SURFACE_HALIGN_4           (0 << 15)
+#define GEN7_SURFACE_HALIGN_8           (1 << 15)
+#define GEN7_SURFACE_TILING_NONE        (0 << 13)
+#define GEN7_SURFACE_TILING_X           (2 << 13)
+#define GEN7_SURFACE_TILING_Y           (3 << 13)
+#define GEN7_SURFACE_ARYSPC_FULL	(0 << 10)
+#define GEN7_SURFACE_ARYSPC_LOD0	(1 << 10)
+
+/* Surface state DW2 */
+#define BRW_SURFACE_HEIGHT_SHIFT	19
+#define BRW_SURFACE_HEIGHT_MASK		INTEL_MASK(31, 19)
+#define BRW_SURFACE_WIDTH_SHIFT		6
+#define BRW_SURFACE_WIDTH_MASK		INTEL_MASK(18, 6)
+#define BRW_SURFACE_LOD_SHIFT		2
+#define BRW_SURFACE_LOD_MASK		INTEL_MASK(5, 2)
+#define GEN7_SURFACE_HEIGHT_SHIFT       16
+#define GEN7_SURFACE_HEIGHT_MASK        INTEL_MASK(29, 16)
+#define GEN7_SURFACE_WIDTH_SHIFT        0
+#define GEN7_SURFACE_WIDTH_MASK         INTEL_MASK(13, 0)
+
+/* Surface state DW3 */
+#define BRW_SURFACE_DEPTH_SHIFT		21
+#define BRW_SURFACE_DEPTH_MASK		INTEL_MASK(31, 21)
+#define BRW_SURFACE_PITCH_SHIFT		3
+#define BRW_SURFACE_PITCH_MASK		INTEL_MASK(19, 3)
+#define BRW_SURFACE_TILED		(1 << 1)
+#define BRW_SURFACE_TILED_Y		(1 << 0)
+
+/* Surface state DW4 */
+#define BRW_SURFACE_MIN_LOD_SHIFT	28
+#define BRW_SURFACE_MIN_LOD_MASK	INTEL_MASK(31, 28)
+#define BRW_SURFACE_MULTISAMPLECOUNT_1  (0 << 4)
+#define BRW_SURFACE_MULTISAMPLECOUNT_4  (2 << 4)
+#define GEN7_SURFACE_MULTISAMPLECOUNT_1         (0 << 3)
+#define GEN7_SURFACE_MULTISAMPLECOUNT_4         (2 << 3)
+#define GEN7_SURFACE_MULTISAMPLECOUNT_8         (3 << 3)
+#define GEN7_SURFACE_MSFMT_MSS                  (0 << 6)
+#define GEN7_SURFACE_MSFMT_DEPTH_STENCIL        (1 << 6)
+
+/* Surface state DW5 */
+#define BRW_SURFACE_X_OFFSET_SHIFT		25
+#define BRW_SURFACE_X_OFFSET_MASK		INTEL_MASK(31, 25)
+#define BRW_SURFACE_VERTICAL_ALIGN_ENABLE	(1 << 24)
+#define BRW_SURFACE_Y_OFFSET_SHIFT		20
+#define BRW_SURFACE_Y_OFFSET_MASK		INTEL_MASK(23, 20)
+#define GEN7_SURFACE_MIN_LOD_SHIFT              4
+#define GEN7_SURFACE_MIN_LOD_MASK               INTEL_MASK(7, 4)
+
+/* Surface state DW6 */
+#define GEN7_SURFACE_MCS_ENABLE                 (1 << 0)
+#define GEN7_SURFACE_MCS_PITCH_SHIFT            3
+#define GEN7_SURFACE_MCS_PITCH_MASK             INTEL_MASK(11, 3)
+
+/* Surface state DW7 */
+#define GEN7_SURFACE_SCS_R_SHIFT                25
+#define GEN7_SURFACE_SCS_R_MASK                 INTEL_MASK(27, 25)
+#define GEN7_SURFACE_SCS_G_SHIFT                22
+#define GEN7_SURFACE_SCS_G_MASK                 INTEL_MASK(24, 22)
+#define GEN7_SURFACE_SCS_B_SHIFT                19
+#define GEN7_SURFACE_SCS_B_MASK                 INTEL_MASK(21, 19)
+#define GEN7_SURFACE_SCS_A_SHIFT                16
+#define GEN7_SURFACE_SCS_A_MASK                 INTEL_MASK(18, 16)
+
+/* The actual swizzle values/what channel to use */
+#define HSW_SCS_ZERO                     0
+#define HSW_SCS_ONE                      1
+#define HSW_SCS_RED                      4
+#define HSW_SCS_GREEN                    5
+#define HSW_SCS_BLUE                     6
+#define HSW_SCS_ALPHA                    7
+
 #define BRW_TEXCOORDMODE_WRAP            0
 #define BRW_TEXCOORDMODE_MIRROR          1
 #define BRW_TEXCOORDMODE_CLAMP           2
@@ -480,20 +542,6 @@
 #define BRW_VERTEX_SUBPIXEL_PRECISION_8BITS  0
 #define BRW_VERTEX_SUBPIXEL_PRECISION_4BITS  1
 
-#define BRW_VERTEXBUFFER_ACCESS_VERTEXDATA     0
-#define BRW_VERTEXBUFFER_ACCESS_INSTANCEDATA   1
-
-#define BRW_VFCOMPONENT_NOSTORE      0
-#define BRW_VFCOMPONENT_STORE_SRC    1
-#define BRW_VFCOMPONENT_STORE_0      2
-#define BRW_VFCOMPONENT_STORE_1_FLT  3
-#define BRW_VFCOMPONENT_STORE_1_INT  4
-#define BRW_VFCOMPONENT_STORE_VID    5
-#define BRW_VFCOMPONENT_STORE_IID    6
-#define BRW_VFCOMPONENT_STORE_PID    7
-
-
-
 /* Execution Unit (EU) defines
  */
 
@@ -508,9 +556,18 @@
 #define BRW_CHANNEL_Z     2
 #define BRW_CHANNEL_W     3
 
-#define BRW_COMPRESSION_NONE          0
-#define BRW_COMPRESSION_2NDHALF       1
-#define BRW_COMPRESSION_COMPRESSED    2
+enum brw_compression {
+   BRW_COMPRESSION_NONE       = 0,
+   BRW_COMPRESSION_2NDHALF    = 1,
+   BRW_COMPRESSION_COMPRESSED = 2,
+};
+
+#define GEN6_COMPRESSION_1Q		0
+#define GEN6_COMPRESSION_2Q		1
+#define GEN6_COMPRESSION_3Q		2
+#define GEN6_COMPRESSION_4Q		3
+#define GEN6_COMPRESSION_1H		0
+#define GEN6_COMPRESSION_2H		2
 
 #define BRW_CONDITIONAL_NONE  0
 #define BRW_CONDITIONAL_Z     1
@@ -521,10 +578,9 @@
 #define BRW_CONDITIONAL_GE    4
 #define BRW_CONDITIONAL_L     5
 #define BRW_CONDITIONAL_LE    6
-#define BRW_CONDITIONAL_C     7
-#define BRW_CONDITIONAL_R     7	/* round increment */
-#define BRW_CONDITIONAL_O     8	/* overflow */
-#define BRW_CONDITIONAL_U     9	/* unordered */
+#define BRW_CONDITIONAL_R     7
+#define BRW_CONDITIONAL_O     8
+#define BRW_CONDITIONAL_U     9
 
 #define BRW_DEBUG_NONE        0
 #define BRW_DEBUG_BREAKPOINT  1
@@ -555,78 +611,147 @@
 #define BRW_ACCUMULATOR_WRITE_DISABLE 0
 #define BRW_ACCUMULATOR_WRITE_ENABLE  1
 
-#define BRW_OPCODE_MOV        1
-#define BRW_OPCODE_SEL        2
-#define BRW_OPCODE_NOT        4
-#define BRW_OPCODE_AND        5
-#define BRW_OPCODE_OR         6
-#define BRW_OPCODE_XOR        7
-#define BRW_OPCODE_SHR        8
-#define BRW_OPCODE_SHL        9
-#define BRW_OPCODE_RSR        10
-#define BRW_OPCODE_RSL        11
-#define BRW_OPCODE_ASR        12
-#define BRW_OPCODE_CMP        16
-#define BRW_OPCODE_CMPN       17
-#define BRW_OPCODE_F32TO16    19
-#define BRW_OPCODE_F16TO32    20
-#define BRW_OPCODE_BFREV      23
-#define BRW_OPCODE_BFE        24
-#define BRW_OPCODE_BFI1       25
-#define BRW_OPCODE_BFI2       26
-#define BRW_OPCODE_JMPI       32
-#define BRW_OPCODE_BRD        33
-#define BRW_OPCODE_IF         34
-#define BRW_OPCODE_BRC        35
-#define BRW_OPCODE_IFF        35
-#define BRW_OPCODE_ELSE       36
-#define BRW_OPCODE_ENDIF      37
-#define BRW_OPCODE_DO         38
-#define BRW_OPCODE_WHILE      39
-#define BRW_OPCODE_BREAK      40
-#define BRW_OPCODE_CONTINUE   41
-#define BRW_OPCODE_HALT       42
-#define BRW_OPCODE_MSAVE      44
-#define BRW_OPCODE_CALL       44
-#define BRW_OPCODE_MRESTORE   45
-#define BRW_OPCODE_RET        45
-#define BRW_OPCODE_PUSH       46
-#define BRW_OPCODE_POP        47
-#define BRW_OPCODE_WAIT       48
-#define BRW_OPCODE_SEND       49
-#define BRW_OPCODE_SENDC      50
-#define BRW_OPCODE_MATH       56
-#define BRW_OPCODE_ADD        64
-#define BRW_OPCODE_MUL        65
-#define BRW_OPCODE_AVG        66
-#define BRW_OPCODE_FRC        67
-#define BRW_OPCODE_RNDU       68
-#define BRW_OPCODE_RNDD       69
-#define BRW_OPCODE_RNDE       70
-#define BRW_OPCODE_RNDZ       71
-#define BRW_OPCODE_MAC        72
-#define BRW_OPCODE_MACH       73
-#define BRW_OPCODE_LZD        74
-#define BRW_OPCODE_FBH        75
-#define BRW_OPCODE_FBL        76
-#define BRW_OPCODE_CBIT       77
-#define BRW_OPCODE_ADDC       78
-#define BRW_OPCODE_SUBB       79
-#define BRW_OPCODE_SAD2       80
-#define BRW_OPCODE_SADA2      81
-#define BRW_OPCODE_DP4        84
-#define BRW_OPCODE_DPH        85
-#define BRW_OPCODE_DP3        86
-#define BRW_OPCODE_DP2        87
-#define BRW_OPCODE_DPA2       88
-#define BRW_OPCODE_LINE       89
-#define BRW_OPCODE_PLN        90
-#define BRW_OPCODE_MAD        91
-#define BRW_OPCODE_LRP        92
-#define BRW_OPCODE_NOP        126
-
-#define BRW_PREDICATE_NONE		      0
-#define BRW_PREDICATE_NORMAL		      1
+/** @{
+ *
+ * Gen6 has replaced "mask enable/disable" with WECtrl, which is
+ * effectively the same but much simpler to think about.  Now, there
+ * are two contributors ANDed together to determine whether channels are
+ * executed: the predication on the instruction, and the channel write
+ * enable.
+ */
+/**
+ * This is the default value.  It means that a channel's write enable is set
+ * if the per-channel IP is pointing at this instruction.
+ */
+#define BRW_WE_NORMAL		0
+/**
+ * This is used like BRW_MASK_DISABLE, and causes all channels to have
+ * their write enable set.  Note that predication still contributes to
+ * whether the channel actually gets written.
+ */
+#define BRW_WE_ALL		1
+/** @} */
+
+enum opcode {
+   /* These are the actual hardware opcodes. */
+   BRW_OPCODE_MOV =	1,
+   BRW_OPCODE_SEL =	2,
+   BRW_OPCODE_NOT =	4,
+   BRW_OPCODE_AND =	5,
+   BRW_OPCODE_OR =	6,
+   BRW_OPCODE_XOR =	7,
+   BRW_OPCODE_SHR =	8,
+   BRW_OPCODE_SHL =	9,
+   BRW_OPCODE_RSR =	10,
+   BRW_OPCODE_RSL =	11,
+   BRW_OPCODE_ASR =	12,
+   BRW_OPCODE_CMP =	16,
+   BRW_OPCODE_CMPN =	17,
+   BRW_OPCODE_F32TO16 = 19,
+   BRW_OPCODE_F16TO32 = 20,
+   BRW_OPCODE_BFREV =	23,
+   BRW_OPCODE_BFE =	24,
+   BRW_OPCODE_BFI1 =	25,
+   BRW_OPCODE_BFI2 =	26,
+   BRW_OPCODE_JMPI =	32,
+   BRW_OPCODE_BRD =	33,
+   BRW_OPCODE_IF =	34,
+   BRW_OPCODE_IFF =	35,
+   BRW_OPCODE_BRC =	35,
+   BRW_OPCODE_ELSE =	36,
+   BRW_OPCODE_ENDIF =	37,
+   BRW_OPCODE_DO =	38,
+   BRW_OPCODE_WHILE =	39,
+   BRW_OPCODE_BREAK =	40,
+   BRW_OPCODE_CONTINUE = 41,
+   BRW_OPCODE_HALT =	42,
+   BRW_OPCODE_MSAVE =	44,
+   BRW_OPCODE_CALL =	44,
+   BRW_OPCODE_MRESTORE = 45,
+   BRW_OPCODE_RET =	45,
+   BRW_OPCODE_PUSH =	46,
+   BRW_OPCODE_POP =	47,
+   BRW_OPCODE_WAIT =	48,
+   BRW_OPCODE_SEND =	49,
+   BRW_OPCODE_SENDC =	50,
+   BRW_OPCODE_MATH =	56,
+   BRW_OPCODE_ADD =	64,
+   BRW_OPCODE_MUL =	65,
+   BRW_OPCODE_AVG =	66,
+   BRW_OPCODE_FRC =	67,
+   BRW_OPCODE_RNDU =	68,
+   BRW_OPCODE_RNDD =	69,
+   BRW_OPCODE_RNDE =	70,
+   BRW_OPCODE_RNDZ =	71,
+   BRW_OPCODE_MAC =	72,
+   BRW_OPCODE_MACH =	73,
+   BRW_OPCODE_LZD =	74,
+   BRW_OPCODE_FBH =	75,
+   BRW_OPCODE_FBL =	76,
+   BRW_OPCODE_CBIT =	77,
+   BRW_OPCODE_ADDC =	78,
+   BRW_OPCODE_SUBB =	79,
+   BRW_OPCODE_SAD2 =	80,
+   BRW_OPCODE_SADA2 =	81,
+   BRW_OPCODE_DP4 =	84,
+   BRW_OPCODE_DPH =	85,
+   BRW_OPCODE_DP3 =	86,
+   BRW_OPCODE_DP2 =	87,
+   BRW_OPCODE_DPA2 =	88,
+   BRW_OPCODE_LINE =	89,
+   BRW_OPCODE_PLN =	90,
+   BRW_OPCODE_MAD =	91,
+   BRW_OPCODE_LRP =	92,
+   BRW_OPCODE_NOP =	126,
+
+   /* These are compiler backend opcodes that get translated into other
+    * instructions.
+    */
+   FS_OPCODE_FB_WRITE = 128,
+   SHADER_OPCODE_RCP,
+   SHADER_OPCODE_RSQ,
+   SHADER_OPCODE_SQRT,
+   SHADER_OPCODE_EXP2,
+   SHADER_OPCODE_LOG2,
+   SHADER_OPCODE_POW,
+   SHADER_OPCODE_INT_QUOTIENT,
+   SHADER_OPCODE_INT_REMAINDER,
+   SHADER_OPCODE_SIN,
+   SHADER_OPCODE_COS,
+
+   SHADER_OPCODE_TEX,
+   SHADER_OPCODE_TXD,
+   SHADER_OPCODE_TXF,
+   SHADER_OPCODE_TXL,
+   SHADER_OPCODE_TXS,
+   FS_OPCODE_TXB,
+
+   SHADER_OPCODE_SHADER_TIME_ADD,
+
+   FS_OPCODE_DDX,
+   FS_OPCODE_DDY,
+   FS_OPCODE_PIXEL_X,
+   FS_OPCODE_PIXEL_Y,
+   FS_OPCODE_CINTERP,
+   FS_OPCODE_LINTERP,
+   FS_OPCODE_SPILL,
+   FS_OPCODE_UNSPILL,
+   FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
+   FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7,
+   FS_OPCODE_VARYING_PULL_CONSTANT_LOAD,
+   FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7,
+   FS_OPCODE_MOV_DISPATCH_TO_FLAGS,
+   FS_OPCODE_DISCARD_JUMP,
+   FS_OPCODE_SET_GLOBAL_OFFSET,
+
+   VS_OPCODE_URB_WRITE,
+   VS_OPCODE_SCRATCH_READ,
+   VS_OPCODE_SCRATCH_WRITE,
+   VS_OPCODE_PULL_CONSTANT_LOAD,
+};
+
+#define BRW_PREDICATE_NONE             0
+#define BRW_PREDICATE_NORMAL           1
 #define BRW_PREDICATE_ALIGN1_ANYV             2
 #define BRW_PREDICATE_ALIGN1_ALLV             3
 #define BRW_PREDICATE_ALIGN1_ANY2H            4
@@ -671,6 +796,10 @@
 #define BRW_ARF_CONTROL               0x80
 #define BRW_ARF_NOTIFICATION_COUNT    0x90
 #define BRW_ARF_IP                    0xA0
+#define BRW_ARF_TDR                   0xB0
+#define BRW_ARF_TIMESTAMP             0xC0
+
+#define BRW_MRF_COMPR4			(1 << 7)
 
 #define BRW_AMASK   0
 #define BRW_IMASK   1
@@ -717,7 +846,6 @@
 #define BRW_POLYGON_FACING_FRONT      0
 #define BRW_POLYGON_FACING_BACK       1
 
-
 /**
  * Message target: Shared Function ID for where to SEND a message.
  *
@@ -762,13 +890,35 @@ enum brw_message_target {
 #define BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_GRADIENTS    2
 #define BRW_SAMPLER_MESSAGE_SIMD4X2_SAMPLE_COMPARE    0
 #define BRW_SAMPLER_MESSAGE_SIMD16_SAMPLE_COMPARE     2
+#define BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_BIAS_COMPARE 0
+#define BRW_SAMPLER_MESSAGE_SIMD4X2_SAMPLE_LOD_COMPARE 1
+#define BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_LOD_COMPARE  1
 #define BRW_SAMPLER_MESSAGE_SIMD4X2_RESINFO           2
-#define BRW_SAMPLER_MESSAGE_SIMD8_RESINFO             2
 #define BRW_SAMPLER_MESSAGE_SIMD16_RESINFO            2
 #define BRW_SAMPLER_MESSAGE_SIMD4X2_LD                3
 #define BRW_SAMPLER_MESSAGE_SIMD8_LD                  3
 #define BRW_SAMPLER_MESSAGE_SIMD16_LD                 3
 
+#define GEN5_SAMPLER_MESSAGE_SAMPLE              0
+#define GEN5_SAMPLER_MESSAGE_SAMPLE_BIAS         1
+#define GEN5_SAMPLER_MESSAGE_SAMPLE_LOD          2
+#define GEN5_SAMPLER_MESSAGE_SAMPLE_COMPARE      3
+#define GEN5_SAMPLER_MESSAGE_SAMPLE_DERIVS       4
+#define GEN5_SAMPLER_MESSAGE_SAMPLE_BIAS_COMPARE 5
+#define GEN5_SAMPLER_MESSAGE_SAMPLE_LOD_COMPARE  6
+#define GEN5_SAMPLER_MESSAGE_SAMPLE_LD           7
+#define GEN5_SAMPLER_MESSAGE_SAMPLE_RESINFO      10
+#define HSW_SAMPLER_MESSAGE_SAMPLE_DERIV_COMPARE 20
+#define GEN7_SAMPLER_MESSAGE_SAMPLE_LD_MCS       29
+#define GEN7_SAMPLER_MESSAGE_SAMPLE_LD2DMS       30
+#define GEN7_SAMPLER_MESSAGE_SAMPLE_LD2DSS       31
+
+/* for GEN5 only */
+#define BRW_SAMPLER_SIMD_MODE_SIMD4X2                   0
+#define BRW_SAMPLER_SIMD_MODE_SIMD8                     1
+#define BRW_SAMPLER_SIMD_MODE_SIMD16                    2
+#define BRW_SAMPLER_SIMD_MODE_SIMD32_64                 3
+
 #define BRW_DATAPORT_OWORD_BLOCK_1_OWORDLOW   0
 #define BRW_DATAPORT_OWORD_BLOCK_1_OWORDHIGH  1
 #define BRW_DATAPORT_OWORD_BLOCK_2_OWORDS     2
@@ -781,10 +931,24 @@ enum brw_message_target {
 #define BRW_DATAPORT_DWORD_SCATTERED_BLOCK_8DWORDS   2
 #define BRW_DATAPORT_DWORD_SCATTERED_BLOCK_16DWORDS  3
 
+/* This one stays the same across generations. */
 #define BRW_DATAPORT_READ_MESSAGE_OWORD_BLOCK_READ          0
+/* GEN4 */
 #define BRW_DATAPORT_READ_MESSAGE_OWORD_DUAL_BLOCK_READ     1
-#define BRW_DATAPORT_READ_MESSAGE_DWORD_BLOCK_READ          2
+#define BRW_DATAPORT_READ_MESSAGE_MEDIA_BLOCK_READ          2
 #define BRW_DATAPORT_READ_MESSAGE_DWORD_SCATTERED_READ      3
+/* G45, GEN5 */
+#define G45_DATAPORT_READ_MESSAGE_RENDER_UNORM_READ	    1
+#define G45_DATAPORT_READ_MESSAGE_OWORD_DUAL_BLOCK_READ     2
+#define G45_DATAPORT_READ_MESSAGE_AVC_LOOP_FILTER_READ	    3
+#define G45_DATAPORT_READ_MESSAGE_MEDIA_BLOCK_READ          4
+#define G45_DATAPORT_READ_MESSAGE_DWORD_SCATTERED_READ      6
+/* GEN6 */
+#define GEN6_DATAPORT_READ_MESSAGE_RENDER_UNORM_READ	    1
+#define GEN6_DATAPORT_READ_MESSAGE_OWORD_DUAL_BLOCK_READ     2
+#define GEN6_DATAPORT_READ_MESSAGE_MEDIA_BLOCK_READ          4
+#define GEN6_DATAPORT_READ_MESSAGE_OWORD_UNALIGN_BLOCK_READ  5
+#define GEN6_DATAPORT_READ_MESSAGE_DWORD_SCATTERED_READ      6
 
 #define BRW_DATAPORT_READ_TARGET_DATA_CACHE      0
 #define BRW_DATAPORT_READ_TARGET_RENDER_CACHE    1
@@ -798,12 +962,43 @@ enum brw_message_target {
 
 #define BRW_DATAPORT_WRITE_MESSAGE_OWORD_BLOCK_WRITE                0
 #define BRW_DATAPORT_WRITE_MESSAGE_OWORD_DUAL_BLOCK_WRITE           1
-#define BRW_DATAPORT_WRITE_MESSAGE_DWORD_BLOCK_WRITE                2
+#define BRW_DATAPORT_WRITE_MESSAGE_MEDIA_BLOCK_WRITE                2
 #define BRW_DATAPORT_WRITE_MESSAGE_DWORD_SCATTERED_WRITE            3
 #define BRW_DATAPORT_WRITE_MESSAGE_RENDER_TARGET_WRITE              4
 #define BRW_DATAPORT_WRITE_MESSAGE_STREAMED_VERTEX_BUFFER_WRITE     5
 #define BRW_DATAPORT_WRITE_MESSAGE_FLUSH_RENDER_CACHE               7
 
+/* GEN6 */
+#define GEN6_DATAPORT_WRITE_MESSAGE_DWORD_ATOMIC_WRITE              7
+#define GEN6_DATAPORT_WRITE_MESSAGE_OWORD_BLOCK_WRITE               8
+#define GEN6_DATAPORT_WRITE_MESSAGE_OWORD_DUAL_BLOCK_WRITE          9
+#define GEN6_DATAPORT_WRITE_MESSAGE_MEDIA_BLOCK_WRITE               10
+#define GEN6_DATAPORT_WRITE_MESSAGE_DWORD_SCATTERED_WRITE           11
+#define GEN6_DATAPORT_WRITE_MESSAGE_RENDER_TARGET_WRITE             12
+#define GEN6_DATAPORT_WRITE_MESSAGE_STREAMED_VB_WRITE               13
+#define GEN6_DATAPORT_WRITE_MESSAGE_RENDER_TARGET_UNORM_WRITE       14
+
+/* GEN7 */
+#define GEN7_DATAPORT_WRITE_MESSAGE_OWORD_DUAL_BLOCK_WRITE          10
+#define GEN7_DATAPORT_DC_DWORD_SCATTERED_READ                       3
+
+/* dataport atomic operations. */
+#define BRW_AOP_AND                   1
+#define BRW_AOP_OR                    2
+#define BRW_AOP_XOR                   3
+#define BRW_AOP_MOV                   4
+#define BRW_AOP_INC                   5
+#define BRW_AOP_DEC                   6
+#define BRW_AOP_ADD                   7
+#define BRW_AOP_SUB                   8
+#define BRW_AOP_REVSUB                9
+#define BRW_AOP_IMAX                  10
+#define BRW_AOP_IMIN                  11
+#define BRW_AOP_UMAX                  12
+#define BRW_AOP_UMIN                  13
+#define BRW_AOP_CMPWR                 14
+#define BRW_AOP_PREDEC                15
+
 #define BRW_MATH_FUNCTION_INV                              1
 #define BRW_MATH_FUNCTION_LOG                              2
 #define BRW_MATH_FUNCTION_EXP                              3
@@ -812,7 +1007,8 @@ enum brw_message_target {
 #define BRW_MATH_FUNCTION_SIN                              6 /* was 7 */
 #define BRW_MATH_FUNCTION_COS                              7 /* was 8 */
 #define BRW_MATH_FUNCTION_SINCOS                           8 /* was 6 */
-#define BRW_MATH_FUNCTION_TAN                              9
+#define BRW_MATH_FUNCTION_TAN                              9 /* gen4 */
+#define BRW_MATH_FUNCTION_FDIV                             9 /* gen6+ */
 #define BRW_MATH_FUNCTION_POW                              10
 #define BRW_MATH_FUNCTION_INT_DIV_QUOTIENT_AND_REMAINDER   11
 #define BRW_MATH_FUNCTION_INT_DIV_QUOTIENT                 12
@@ -850,43 +1046,590 @@ enum brw_message_target {
 #define BRW_SCRATCH_SPACE_SIZE_2M     11
 
 
-
-
 #define CMD_URB_FENCE                 0x6000
-#define CMD_CONST_BUFFER_STATE        0x6001
+#define CMD_CS_URB_STATE              0x6001
 #define CMD_CONST_BUFFER              0x6002
 
 #define CMD_STATE_BASE_ADDRESS        0x6101
-#define CMD_STATE_INSN_POINTER        0x6102
-#define CMD_PIPELINE_SELECT           0x6104
+#define CMD_STATE_SIP                 0x6102
+#define CMD_PIPELINE_SELECT_965       0x6104
+#define CMD_PIPELINE_SELECT_GM45      0x6904
+
+#define _3DSTATE_PIPELINED_POINTERS		0x7800
+#define _3DSTATE_BINDING_TABLE_POINTERS		0x7801
+# define GEN6_BINDING_TABLE_MODIFY_VS	(1 << 8)
+# define GEN6_BINDING_TABLE_MODIFY_GS	(1 << 9)
+# define GEN6_BINDING_TABLE_MODIFY_PS	(1 << 12)
+
+#define _3DSTATE_BINDING_TABLE_POINTERS_VS	0x7826 /* GEN7+ */
+#define _3DSTATE_BINDING_TABLE_POINTERS_HS	0x7827 /* GEN7+ */
+#define _3DSTATE_BINDING_TABLE_POINTERS_DS	0x7828 /* GEN7+ */
+#define _3DSTATE_BINDING_TABLE_POINTERS_GS	0x7829 /* GEN7+ */
+#define _3DSTATE_BINDING_TABLE_POINTERS_PS	0x782A /* GEN7+ */
+
+#define _3DSTATE_SAMPLER_STATE_POINTERS		0x7802 /* GEN6+ */
+# define PS_SAMPLER_STATE_CHANGE				(1 << 12)
+# define GS_SAMPLER_STATE_CHANGE				(1 << 9)
+# define VS_SAMPLER_STATE_CHANGE				(1 << 8)
+/* DW1: VS */
+/* DW2: GS */
+/* DW3: PS */
+
+#define _3DSTATE_SAMPLER_STATE_POINTERS_VS	0x782B /* GEN7+ */
+#define _3DSTATE_SAMPLER_STATE_POINTERS_GS	0x782E /* GEN7+ */
+#define _3DSTATE_SAMPLER_STATE_POINTERS_PS	0x782F /* GEN7+ */
+
+#define _3DSTATE_VERTEX_BUFFERS       0x7808
+# define BRW_VB0_INDEX_SHIFT		27
+# define GEN6_VB0_INDEX_SHIFT		26
+# define BRW_VB0_ACCESS_VERTEXDATA	(0 << 26)
+# define BRW_VB0_ACCESS_INSTANCEDATA	(1 << 26)
+# define GEN6_VB0_ACCESS_VERTEXDATA	(0 << 20)
+# define GEN6_VB0_ACCESS_INSTANCEDATA	(1 << 20)
+# define GEN7_VB0_ADDRESS_MODIFYENABLE  (1 << 14)
+# define BRW_VB0_PITCH_SHIFT		0
+
+#define _3DSTATE_VERTEX_ELEMENTS      0x7809
+# define BRW_VE0_INDEX_SHIFT		27
+# define GEN6_VE0_INDEX_SHIFT		26
+# define BRW_VE0_FORMAT_SHIFT		16
+# define BRW_VE0_VALID			(1 << 26)
+# define GEN6_VE0_VALID			(1 << 25)
+# define GEN6_VE0_EDGE_FLAG_ENABLE	(1 << 15)
+# define BRW_VE0_SRC_OFFSET_SHIFT	0
+# define BRW_VE1_COMPONENT_NOSTORE	0
+# define BRW_VE1_COMPONENT_STORE_SRC	1
+# define BRW_VE1_COMPONENT_STORE_0	2
+# define BRW_VE1_COMPONENT_STORE_1_FLT	3
+# define BRW_VE1_COMPONENT_STORE_1_INT	4
+# define BRW_VE1_COMPONENT_STORE_VID	5
+# define BRW_VE1_COMPONENT_STORE_IID	6
+# define BRW_VE1_COMPONENT_STORE_PID	7
+# define BRW_VE1_COMPONENT_0_SHIFT	28
+# define BRW_VE1_COMPONENT_1_SHIFT	24
+# define BRW_VE1_COMPONENT_2_SHIFT	20
+# define BRW_VE1_COMPONENT_3_SHIFT	16
+# define BRW_VE1_DST_OFFSET_SHIFT	0
 
-#define CMD_PIPELINED_STATE_POINTERS  0x7800
-#define CMD_BINDING_TABLE_PTRS        0x7801
-#define CMD_VERTEX_BUFFER             0x7808
-#define CMD_VERTEX_ELEMENT            0x7809
 #define CMD_INDEX_BUFFER              0x780a
-#define CMD_VF_STATISTICS             0x780b
-
-#define CMD_DRAW_RECT                 0x7900
-#define CMD_BLEND_CONSTANT_COLOR      0x7901
-#define CMD_CHROMA_KEY                0x7904
-#define CMD_DEPTH_BUFFER              0x7905
-#define CMD_POLY_STIPPLE_OFFSET       0x7906
-#define CMD_POLY_STIPPLE_PATTERN      0x7907
-#define CMD_LINE_STIPPLE_PATTERN      0x7908
-#define CMD_GLOBAL_DEPTH_OFFSET_CLAMP 0x7908
+#define GEN4_3DSTATE_VF_STATISTICS		0x780b
+#define GM45_3DSTATE_VF_STATISTICS		0x680b
+#define _3DSTATE_CC_STATE_POINTERS		0x780e /* GEN6+ */
+#define _3DSTATE_BLEND_STATE_POINTERS		0x7824 /* GEN7+ */
+#define _3DSTATE_DEPTH_STENCIL_STATE_POINTERS	0x7825 /* GEN7+ */
+
+#define _3DSTATE_URB				0x7805 /* GEN6 */
+# define GEN6_URB_VS_SIZE_SHIFT				16
+# define GEN6_URB_VS_ENTRIES_SHIFT			0
+# define GEN6_URB_GS_ENTRIES_SHIFT			8
+# define GEN6_URB_GS_SIZE_SHIFT				0
+
+#define _3DSTATE_VF                             0x780c /* GEN7.5+ */
+#define HSW_CUT_INDEX_ENABLE                            (1 << 8)
+
+#define _3DSTATE_URB_VS                         0x7830 /* GEN7+ */
+#define _3DSTATE_URB_HS                         0x7831 /* GEN7+ */
+#define _3DSTATE_URB_DS                         0x7832 /* GEN7+ */
+#define _3DSTATE_URB_GS                         0x7833 /* GEN7+ */
+# define GEN7_URB_ENTRY_SIZE_SHIFT                      16
+# define GEN7_URB_STARTING_ADDRESS_SHIFT                25
+
+#define _3DSTATE_PUSH_CONSTANT_ALLOC_VS         0x7912 /* GEN7+ */
+#define _3DSTATE_PUSH_CONSTANT_ALLOC_PS         0x7916 /* GEN7+ */
+# define GEN7_PUSH_CONSTANT_BUFFER_OFFSET_SHIFT         16
+
+#define _3DSTATE_VIEWPORT_STATE_POINTERS	0x780d /* GEN6+ */
+# define GEN6_CC_VIEWPORT_MODIFY			(1 << 12)
+# define GEN6_SF_VIEWPORT_MODIFY			(1 << 11)
+# define GEN6_CLIP_VIEWPORT_MODIFY			(1 << 10)
+
+#define _3DSTATE_VIEWPORT_STATE_POINTERS_CC	0x7823 /* GEN7+ */
+#define _3DSTATE_VIEWPORT_STATE_POINTERS_SF_CL	0x7821 /* GEN7+ */
+
+#define _3DSTATE_SCISSOR_STATE_POINTERS		0x780f /* GEN6+ */
+
+#define _3DSTATE_VS				0x7810 /* GEN6+ */
+/* DW2 */
+# define GEN6_VS_SPF_MODE				(1 << 31)
+# define GEN6_VS_VECTOR_MASK_ENABLE			(1 << 30)
+# define GEN6_VS_SAMPLER_COUNT_SHIFT			27
+# define GEN6_VS_BINDING_TABLE_ENTRY_COUNT_SHIFT	18
+# define GEN6_VS_FLOATING_POINT_MODE_IEEE_754		(0 << 16)
+# define GEN6_VS_FLOATING_POINT_MODE_ALT		(1 << 16)
+/* DW4 */
+# define GEN6_VS_DISPATCH_START_GRF_SHIFT		20
+# define GEN6_VS_URB_READ_LENGTH_SHIFT			11
+# define GEN6_VS_URB_ENTRY_READ_OFFSET_SHIFT		4
+/* DW5 */
+# define GEN6_VS_MAX_THREADS_SHIFT			25
+# define HSW_VS_MAX_THREADS_SHIFT			23
+# define GEN6_VS_STATISTICS_ENABLE			(1 << 10)
+# define GEN6_VS_CACHE_DISABLE				(1 << 1)
+# define GEN6_VS_ENABLE					(1 << 0)
+
+#define _3DSTATE_GS		      		0x7811 /* GEN6+ */
+/* DW2 */
+# define GEN6_GS_SPF_MODE				(1 << 31)
+# define GEN6_GS_VECTOR_MASK_ENABLE			(1 << 30)
+# define GEN6_GS_SAMPLER_COUNT_SHIFT			27
+# define GEN6_GS_BINDING_TABLE_ENTRY_COUNT_SHIFT	18
+# define GEN6_GS_FLOATING_POINT_MODE_IEEE_754		(0 << 16)
+# define GEN6_GS_FLOATING_POINT_MODE_ALT		(1 << 16)
+/* DW4 */
+# define GEN6_GS_URB_READ_LENGTH_SHIFT			11
+# define GEN7_GS_INCLUDE_VERTEX_HANDLES		        (1 << 10)
+# define GEN6_GS_URB_ENTRY_READ_OFFSET_SHIFT		4
+# define GEN6_GS_DISPATCH_START_GRF_SHIFT		0
+/* DW5 */
+# define GEN6_GS_MAX_THREADS_SHIFT			25
+# define GEN6_GS_STATISTICS_ENABLE			(1 << 10)
+# define GEN6_GS_SO_STATISTICS_ENABLE			(1 << 9)
+# define GEN6_GS_RENDERING_ENABLE			(1 << 8)
+# define GEN7_GS_ENABLE					(1 << 0)
+/* DW6 */
+# define GEN6_GS_REORDER				(1 << 30)
+# define GEN6_GS_DISCARD_ADJACENCY			(1 << 29)
+# define GEN6_GS_SVBI_PAYLOAD_ENABLE			(1 << 28)
+# define GEN6_GS_SVBI_POSTINCREMENT_ENABLE		(1 << 27)
+# define GEN6_GS_SVBI_POSTINCREMENT_VALUE_SHIFT		16
+# define GEN6_GS_SVBI_POSTINCREMENT_VALUE_MASK		INTEL_MASK(25, 16)
+# define GEN6_GS_ENABLE					(1 << 15)
+
+# define BRW_GS_EDGE_INDICATOR_0			(1 << 8)
+# define BRW_GS_EDGE_INDICATOR_1			(1 << 9)
+
+#define _3DSTATE_HS                             0x781B /* GEN7+ */
+#define _3DSTATE_TE                             0x781C /* GEN7+ */
+#define _3DSTATE_DS                             0x781D /* GEN7+ */
+
+#define _3DSTATE_CLIP				0x7812 /* GEN6+ */
+/* DW1 */
+# define GEN7_CLIP_WINDING_CW                           (0 << 20)
+# define GEN7_CLIP_WINDING_CCW                          (1 << 20)
+# define GEN7_CLIP_VERTEX_SUBPIXEL_PRECISION_8          (0 << 19)
+# define GEN7_CLIP_VERTEX_SUBPIXEL_PRECISION_4          (1 << 19)
+# define GEN7_CLIP_EARLY_CULL                           (1 << 18)
+# define GEN7_CLIP_CULLMODE_BOTH                        (0 << 16)
+# define GEN7_CLIP_CULLMODE_NONE                        (1 << 16)
+# define GEN7_CLIP_CULLMODE_FRONT                       (2 << 16)
+# define GEN7_CLIP_CULLMODE_BACK                        (3 << 16)
+# define GEN6_CLIP_STATISTICS_ENABLE			(1 << 10)
+/**
+ * Just does cheap culling based on the clip distance.  Bits must be
+ * disjoint with USER_CLIP_CLIP_DISTANCE bits.
+ */
+# define GEN6_USER_CLIP_CULL_DISTANCES_SHIFT		0
+/* DW2 */
+# define GEN6_CLIP_ENABLE				(1 << 31)
+# define GEN6_CLIP_API_OGL				(0 << 30)
+# define GEN6_CLIP_API_D3D				(1 << 30)
+# define GEN6_CLIP_XY_TEST				(1 << 28)
+# define GEN6_CLIP_Z_TEST				(1 << 27)
+# define GEN6_CLIP_GB_TEST				(1 << 26)
+/** 8-bit field of which user clip distances to clip against. */
+# define GEN6_USER_CLIP_CLIP_DISTANCES_SHIFT		16
+# define GEN6_CLIP_MODE_NORMAL				(0 << 13)
+# define GEN6_CLIP_MODE_REJECT_ALL			(3 << 13)
+# define GEN6_CLIP_MODE_ACCEPT_ALL			(4 << 13)
+# define GEN6_CLIP_PERSPECTIVE_DIVIDE_DISABLE		(1 << 9)
+# define GEN6_CLIP_NON_PERSPECTIVE_BARYCENTRIC_ENABLE	(1 << 8)
+# define GEN6_CLIP_TRI_PROVOKE_SHIFT			4
+# define GEN6_CLIP_LINE_PROVOKE_SHIFT			2
+# define GEN6_CLIP_TRIFAN_PROVOKE_SHIFT			0
+/* DW3 */
+# define GEN6_CLIP_MIN_POINT_WIDTH_SHIFT		17
+# define GEN6_CLIP_MAX_POINT_WIDTH_SHIFT		6
+# define GEN6_CLIP_FORCE_ZERO_RTAINDEX			(1 << 5)
+
+#define _3DSTATE_SF				0x7813 /* GEN6+ */
+/* DW1 (for gen6) */
+# define GEN6_SF_NUM_OUTPUTS_SHIFT			22
+# define GEN6_SF_SWIZZLE_ENABLE				(1 << 21)
+# define GEN6_SF_POINT_SPRITE_UPPERLEFT			(0 << 20)
+# define GEN6_SF_POINT_SPRITE_LOWERLEFT			(1 << 20)
+# define GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT		11
+# define GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT		4
+/* DW2 */
+# define GEN6_SF_LEGACY_GLOBAL_DEPTH_BIAS		(1 << 11)
+# define GEN6_SF_STATISTICS_ENABLE			(1 << 10)
+# define GEN6_SF_GLOBAL_DEPTH_OFFSET_SOLID		(1 << 9)
+# define GEN6_SF_GLOBAL_DEPTH_OFFSET_WIREFRAME		(1 << 8)
+# define GEN6_SF_GLOBAL_DEPTH_OFFSET_POINT		(1 << 7)
+# define GEN6_SF_FRONT_SOLID				(0 << 5)
+# define GEN6_SF_FRONT_WIREFRAME			(1 << 5)
+# define GEN6_SF_FRONT_POINT				(2 << 5)
+# define GEN6_SF_BACK_SOLID				(0 << 3)
+# define GEN6_SF_BACK_WIREFRAME				(1 << 3)
+# define GEN6_SF_BACK_POINT				(2 << 3)
+# define GEN6_SF_VIEWPORT_TRANSFORM_ENABLE		(1 << 1)
+# define GEN6_SF_WINDING_CCW				(1 << 0)
+/* DW3 */
+# define GEN6_SF_LINE_AA_ENABLE				(1 << 31)
+# define GEN6_SF_CULL_BOTH				(0 << 29)
+# define GEN6_SF_CULL_NONE				(1 << 29)
+# define GEN6_SF_CULL_FRONT				(2 << 29)
+# define GEN6_SF_CULL_BACK				(3 << 29)
+# define GEN6_SF_LINE_WIDTH_SHIFT			18 /* U3.7 */
+# define GEN6_SF_LINE_END_CAP_WIDTH_0_5			(0 << 16)
+# define GEN6_SF_LINE_END_CAP_WIDTH_1_0			(1 << 16)
+# define GEN6_SF_LINE_END_CAP_WIDTH_2_0			(2 << 16)
+# define GEN6_SF_LINE_END_CAP_WIDTH_4_0			(3 << 16)
+# define GEN6_SF_SCISSOR_ENABLE				(1 << 11)
+# define GEN6_SF_MSRAST_OFF_PIXEL			(0 << 8)
+# define GEN6_SF_MSRAST_OFF_PATTERN			(1 << 8)
+# define GEN6_SF_MSRAST_ON_PIXEL			(2 << 8)
+# define GEN6_SF_MSRAST_ON_PATTERN			(3 << 8)
+/* DW4 */
+# define GEN6_SF_TRI_PROVOKE_SHIFT			29
+# define GEN6_SF_LINE_PROVOKE_SHIFT			27
+# define GEN6_SF_TRIFAN_PROVOKE_SHIFT			25
+# define GEN6_SF_LINE_AA_MODE_MANHATTAN			(0 << 14)
+# define GEN6_SF_LINE_AA_MODE_TRUE			(1 << 14)
+# define GEN6_SF_VERTEX_SUBPIXEL_8BITS			(0 << 12)
+# define GEN6_SF_VERTEX_SUBPIXEL_4BITS			(1 << 12)
+# define GEN6_SF_USE_STATE_POINT_WIDTH			(1 << 11)
+# define GEN6_SF_POINT_WIDTH_SHIFT			0 /* U8.3 */
+/* DW5: depth offset constant */
+/* DW6: depth offset scale */
+/* DW7: depth offset clamp */
+/* DW8 */
+# define ATTRIBUTE_1_OVERRIDE_W				(1 << 31)
+# define ATTRIBUTE_1_OVERRIDE_Z				(1 << 30)
+# define ATTRIBUTE_1_OVERRIDE_Y				(1 << 29)
+# define ATTRIBUTE_1_OVERRIDE_X				(1 << 28)
+# define ATTRIBUTE_1_CONST_SOURCE_SHIFT			25
+# define ATTRIBUTE_1_SWIZZLE_SHIFT			22
+# define ATTRIBUTE_1_SOURCE_SHIFT			16
+# define ATTRIBUTE_0_OVERRIDE_W				(1 << 15)
+# define ATTRIBUTE_0_OVERRIDE_Z				(1 << 14)
+# define ATTRIBUTE_0_OVERRIDE_Y				(1 << 13)
+# define ATTRIBUTE_0_OVERRIDE_X				(1 << 12)
+# define ATTRIBUTE_0_CONST_SOURCE_SHIFT			9
+# define ATTRIBUTE_0_SWIZZLE_SHIFT			6
+# define ATTRIBUTE_0_SOURCE_SHIFT			0
+
+# define ATTRIBUTE_SWIZZLE_INPUTATTR                    0
+# define ATTRIBUTE_SWIZZLE_INPUTATTR_FACING             1
+# define ATTRIBUTE_SWIZZLE_INPUTATTR_W                  2
+# define ATTRIBUTE_SWIZZLE_INPUTATTR_FACING_W           3
+# define ATTRIBUTE_SWIZZLE_SHIFT                        6
+
+/* DW16: Point sprite texture coordinate enables */
+/* DW17: Constant interpolation enables */
+/* DW18: attr 0-7 wrap shortest enables */
+/* DW19: attr 8-15 wrap shortest enables */
+
+/* On GEN7, many fields of 3DSTATE_SF were split out into a new command:
+ * 3DSTATE_SBE.  The remaining fields live in different DWords, but retain
+ * the same bit-offset.  The only new field:
+ */
+/* GEN7/DW1: */
+# define GEN7_SF_DEPTH_BUFFER_SURFACE_FORMAT_SHIFT	12
+/* GEN7/DW2: */
+# define HSW_SF_LINE_STIPPLE_ENABLE			14
+
+#define _3DSTATE_SBE				0x781F /* GEN7+ */
+/* DW1 */
+# define GEN7_SBE_SWIZZLE_CONTROL_MODE			(1 << 28)
+# define GEN7_SBE_NUM_OUTPUTS_SHIFT			22
+# define GEN7_SBE_SWIZZLE_ENABLE			(1 << 21)
+# define GEN7_SBE_POINT_SPRITE_LOWERLEFT		(1 << 20)
+# define GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT		11
+# define GEN7_SBE_URB_ENTRY_READ_OFFSET_SHIFT		4
+/* DW2-9: Attribute setup (same as DW8-15 of gen6 _3DSTATE_SF) */
+/* DW10: Point sprite texture coordinate enables */
+/* DW11: Constant interpolation enables */
+/* DW12: attr 0-7 wrap shortest enables */
+/* DW13: attr 8-15 wrap shortest enables */
+
+enum brw_wm_barycentric_interp_mode {
+   BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC		= 0,
+   BRW_WM_PERSPECTIVE_CENTROID_BARYCENTRIC	= 1,
+   BRW_WM_PERSPECTIVE_SAMPLE_BARYCENTRIC	= 2,
+   BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC	= 3,
+   BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC	= 4,
+   BRW_WM_NONPERSPECTIVE_SAMPLE_BARYCENTRIC	= 5,
+   BRW_WM_BARYCENTRIC_INTERP_MODE_COUNT  = 6
+};
+#define BRW_WM_NONPERSPECTIVE_BARYCENTRIC_BITS \
+   ((1 << BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC) | \
+    (1 << BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC) | \
+    (1 << BRW_WM_NONPERSPECTIVE_SAMPLE_BARYCENTRIC))
+
+#define _3DSTATE_WM				0x7814 /* GEN6+ */
+/* DW1: kernel pointer */
+/* DW2 */
+# define GEN6_WM_SPF_MODE				(1 << 31)
+# define GEN6_WM_VECTOR_MASK_ENABLE			(1 << 30)
+# define GEN6_WM_SAMPLER_COUNT_SHIFT			27
+# define GEN6_WM_BINDING_TABLE_ENTRY_COUNT_SHIFT	18
+# define GEN6_WM_FLOATING_POINT_MODE_IEEE_754		(0 << 16)
+# define GEN6_WM_FLOATING_POINT_MODE_ALT		(1 << 16)
+/* DW3: scratch space */
+/* DW4 */
+# define GEN6_WM_STATISTICS_ENABLE			(1 << 31)
+# define GEN6_WM_DEPTH_CLEAR				(1 << 30)
+# define GEN6_WM_DEPTH_RESOLVE				(1 << 28)
+# define GEN6_WM_HIERARCHICAL_DEPTH_RESOLVE		(1 << 27)
+# define GEN6_WM_DISPATCH_START_GRF_SHIFT_0		16
+# define GEN6_WM_DISPATCH_START_GRF_SHIFT_1		8
+# define GEN6_WM_DISPATCH_START_GRF_SHIFT_2		0
+/* DW5 */
+# define GEN6_WM_MAX_THREADS_SHIFT			25
+# define GEN6_WM_KILL_ENABLE				(1 << 22)
+# define GEN6_WM_COMPUTED_DEPTH				(1 << 21)
+# define GEN6_WM_USES_SOURCE_DEPTH			(1 << 20)
+# define GEN6_WM_DISPATCH_ENABLE			(1 << 19)
+# define GEN6_WM_LINE_END_CAP_AA_WIDTH_0_5		(0 << 16)
+# define GEN6_WM_LINE_END_CAP_AA_WIDTH_1_0		(1 << 16)
+# define GEN6_WM_LINE_END_CAP_AA_WIDTH_2_0		(2 << 16)
+# define GEN6_WM_LINE_END_CAP_AA_WIDTH_4_0		(3 << 16)
+# define GEN6_WM_LINE_AA_WIDTH_0_5			(0 << 14)
+# define GEN6_WM_LINE_AA_WIDTH_1_0			(1 << 14)
+# define GEN6_WM_LINE_AA_WIDTH_2_0			(2 << 14)
+# define GEN6_WM_LINE_AA_WIDTH_4_0			(3 << 14)
+# define GEN6_WM_POLYGON_STIPPLE_ENABLE			(1 << 13)
+# define GEN6_WM_LINE_STIPPLE_ENABLE			(1 << 11)
+# define GEN6_WM_OMASK_TO_RENDER_TARGET			(1 << 9)
+# define GEN6_WM_USES_SOURCE_W				(1 << 8)
+# define GEN6_WM_DUAL_SOURCE_BLEND_ENABLE		(1 << 7)
+# define GEN6_WM_32_DISPATCH_ENABLE			(1 << 2)
+# define GEN6_WM_16_DISPATCH_ENABLE			(1 << 1)
+# define GEN6_WM_8_DISPATCH_ENABLE			(1 << 0)
+/* DW6 */
+# define GEN6_WM_NUM_SF_OUTPUTS_SHIFT			20
+# define GEN6_WM_POSOFFSET_NONE				(0 << 18)
+# define GEN6_WM_POSOFFSET_CENTROID			(2 << 18)
+# define GEN6_WM_POSOFFSET_SAMPLE			(3 << 18)
+# define GEN6_WM_POSITION_ZW_PIXEL			(0 << 16)
+# define GEN6_WM_POSITION_ZW_CENTROID			(2 << 16)
+# define GEN6_WM_POSITION_ZW_SAMPLE			(3 << 16)
+# define GEN6_WM_NONPERSPECTIVE_SAMPLE_BARYCENTRIC	(1 << 15)
+# define GEN6_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC	(1 << 14)
+# define GEN6_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC	(1 << 13)
+# define GEN6_WM_PERSPECTIVE_SAMPLE_BARYCENTRIC		(1 << 12)
+# define GEN6_WM_PERSPECTIVE_CENTROID_BARYCENTRIC	(1 << 11)
+# define GEN6_WM_PERSPECTIVE_PIXEL_BARYCENTRIC		(1 << 10)
+# define GEN6_WM_BARYCENTRIC_INTERPOLATION_MODE_SHIFT   10
+# define GEN6_WM_POINT_RASTRULE_UPPER_RIGHT		(1 << 9)
+# define GEN6_WM_MSRAST_OFF_PIXEL			(0 << 1)
+# define GEN6_WM_MSRAST_OFF_PATTERN			(1 << 1)
+# define GEN6_WM_MSRAST_ON_PIXEL			(2 << 1)
+# define GEN6_WM_MSRAST_ON_PATTERN			(3 << 1)
+# define GEN6_WM_MSDISPMODE_PERSAMPLE			(0 << 0)
+# define GEN6_WM_MSDISPMODE_PERPIXEL			(1 << 0)
+/* DW7: kernel 1 pointer */
+/* DW8: kernel 2 pointer */
+
+#define _3DSTATE_CONSTANT_VS		      0x7815 /* GEN6+ */
+#define _3DSTATE_CONSTANT_GS		      0x7816 /* GEN6+ */
+#define _3DSTATE_CONSTANT_PS		      0x7817 /* GEN6+ */
+# define GEN6_CONSTANT_BUFFER_3_ENABLE			(1 << 15)
+# define GEN6_CONSTANT_BUFFER_2_ENABLE			(1 << 14)
+# define GEN6_CONSTANT_BUFFER_1_ENABLE			(1 << 13)
+# define GEN6_CONSTANT_BUFFER_0_ENABLE			(1 << 12)
+
+#define _3DSTATE_CONSTANT_HS                  0x7819 /* GEN7+ */
+#define _3DSTATE_CONSTANT_DS                  0x781A /* GEN7+ */
+
+#define _3DSTATE_STREAMOUT                    0x781e /* GEN7+ */
+/* DW1 */
+# define SO_FUNCTION_ENABLE				(1 << 31)
+# define SO_RENDERING_DISABLE				(1 << 30)
+/* This selects which incoming rendering stream goes down the pipeline.  The
+ * rendering stream is 0 if not defined by special cases in the GS state.
+ */
+# define SO_RENDER_STREAM_SELECT_SHIFT			27
+# define SO_RENDER_STREAM_SELECT_MASK			INTEL_MASK(28, 27)
+/* Controls reordering of TRISTRIP_* elements in stream output (not rendering).
+ */
+# define SO_REORDER_TRAILING				(1 << 26)
+/* Controls SO_NUM_PRIMS_WRITTEN_* and SO_PRIM_STORAGE_* */
+# define SO_STATISTICS_ENABLE				(1 << 25)
+# define SO_BUFFER_ENABLE(n)				(1 << (8 + (n)))
+/* DW2 */
+# define SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT		29
+# define SO_STREAM_3_VERTEX_READ_OFFSET_MASK		INTEL_MASK(29, 29)
+# define SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT		24
+# define SO_STREAM_3_VERTEX_READ_LENGTH_MASK		INTEL_MASK(28, 24)
+# define SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT		21
+# define SO_STREAM_2_VERTEX_READ_OFFSET_MASK		INTEL_MASK(21, 21)
+# define SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT		16
+# define SO_STREAM_2_VERTEX_READ_LENGTH_MASK		INTEL_MASK(20, 16)
+# define SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT		13
+# define SO_STREAM_1_VERTEX_READ_OFFSET_MASK		INTEL_MASK(13, 13)
+# define SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT		8
+# define SO_STREAM_1_VERTEX_READ_LENGTH_MASK		INTEL_MASK(12, 8)
+# define SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT		5
+# define SO_STREAM_0_VERTEX_READ_OFFSET_MASK		INTEL_MASK(5, 5)
+# define SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT		0
+# define SO_STREAM_0_VERTEX_READ_LENGTH_MASK		INTEL_MASK(4, 0)
+
+/* 3DSTATE_WM for Gen7 */
+/* DW1 */
+# define GEN7_WM_STATISTICS_ENABLE			(1 << 31)
+# define GEN7_WM_DEPTH_CLEAR				(1 << 30)
+# define GEN7_WM_DISPATCH_ENABLE			(1 << 29)
+# define GEN7_WM_DEPTH_RESOLVE				(1 << 28)
+# define GEN7_WM_HIERARCHICAL_DEPTH_RESOLVE		(1 << 27)
+# define GEN7_WM_KILL_ENABLE				(1 << 25)
+# define GEN7_WM_PSCDEPTH_OFF			        (0 << 23)
+# define GEN7_WM_PSCDEPTH_ON			        (1 << 23)
+# define GEN7_WM_PSCDEPTH_ON_GE			        (2 << 23)
+# define GEN7_WM_PSCDEPTH_ON_LE			        (3 << 23)
+# define GEN7_WM_USES_SOURCE_DEPTH			(1 << 20)
+# define GEN7_WM_USES_SOURCE_W			        (1 << 19)
+# define GEN7_WM_POSITION_ZW_PIXEL			(0 << 17)
+# define GEN7_WM_POSITION_ZW_CENTROID			(2 << 17)
+# define GEN7_WM_POSITION_ZW_SAMPLE			(3 << 17)
+# define GEN7_WM_BARYCENTRIC_INTERPOLATION_MODE_SHIFT   11
+# define GEN7_WM_USES_INPUT_COVERAGE_MASK	        (1 << 10)
+# define GEN7_WM_LINE_END_CAP_AA_WIDTH_0_5		(0 << 8)
+# define GEN7_WM_LINE_END_CAP_AA_WIDTH_1_0		(1 << 8)
+# define GEN7_WM_LINE_END_CAP_AA_WIDTH_2_0		(2 << 8)
+# define GEN7_WM_LINE_END_CAP_AA_WIDTH_4_0		(3 << 8)
+# define GEN7_WM_LINE_AA_WIDTH_0_5			(0 << 6)
+# define GEN7_WM_LINE_AA_WIDTH_1_0			(1 << 6)
+# define GEN7_WM_LINE_AA_WIDTH_2_0			(2 << 6)
+# define GEN7_WM_LINE_AA_WIDTH_4_0			(3 << 6)
+# define GEN7_WM_POLYGON_STIPPLE_ENABLE			(1 << 4)
+# define GEN7_WM_LINE_STIPPLE_ENABLE			(1 << 3)
+# define GEN7_WM_POINT_RASTRULE_UPPER_RIGHT		(1 << 2)
+# define GEN7_WM_MSRAST_OFF_PIXEL			(0 << 0)
+# define GEN7_WM_MSRAST_OFF_PATTERN			(1 << 0)
+# define GEN7_WM_MSRAST_ON_PIXEL			(2 << 0)
+# define GEN7_WM_MSRAST_ON_PATTERN			(3 << 0)
+/* DW2 */
+# define GEN7_WM_MSDISPMODE_PERSAMPLE			(0 << 31)
+# define GEN7_WM_MSDISPMODE_PERPIXEL			(1 << 31)
+
+#define _3DSTATE_PS				0x7820 /* GEN7+ */
+/* DW1: kernel pointer */
+/* DW2 */
+# define GEN7_PS_SPF_MODE				(1 << 31)
+# define GEN7_PS_VECTOR_MASK_ENABLE			(1 << 30)
+# define GEN7_PS_SAMPLER_COUNT_SHIFT			27
+# define GEN7_PS_BINDING_TABLE_ENTRY_COUNT_SHIFT	18
+# define GEN7_PS_FLOATING_POINT_MODE_IEEE_754		(0 << 16)
+# define GEN7_PS_FLOATING_POINT_MODE_ALT		(1 << 16)
+/* DW3: scratch space */
+/* DW4 */
+# define IVB_PS_MAX_THREADS_SHIFT			24
+# define HSW_PS_MAX_THREADS_SHIFT			23
+# define HSW_PS_SAMPLE_MASK_SHIFT		        12
+# define HSW_PS_SAMPLE_MASK_MASK			INTEL_MASK(19, 12)
+# define GEN7_PS_PUSH_CONSTANT_ENABLE		        (1 << 11)
+# define GEN7_PS_ATTRIBUTE_ENABLE		        (1 << 10)
+# define GEN7_PS_OMASK_TO_RENDER_TARGET			(1 << 9)
+# define GEN7_PS_DUAL_SOURCE_BLEND_ENABLE		(1 << 7)
+# define GEN7_PS_POSOFFSET_NONE				(0 << 3)
+# define GEN7_PS_POSOFFSET_CENTROID			(2 << 3)
+# define GEN7_PS_POSOFFSET_SAMPLE			(3 << 3)
+# define GEN7_PS_32_DISPATCH_ENABLE			(1 << 2)
+# define GEN7_PS_16_DISPATCH_ENABLE			(1 << 1)
+# define GEN7_PS_8_DISPATCH_ENABLE			(1 << 0)
+/* DW5 */
+# define GEN7_PS_DISPATCH_START_GRF_SHIFT_0		16
+# define GEN7_PS_DISPATCH_START_GRF_SHIFT_1		8
+# define GEN7_PS_DISPATCH_START_GRF_SHIFT_2		0
+/* DW6: kernel 1 pointer */
+/* DW7: kernel 2 pointer */
+
+#define _3DSTATE_SAMPLE_MASK			0x7818 /* GEN6+ */
+
+#define _3DSTATE_DRAWING_RECTANGLE		0x7900
+#define _3DSTATE_BLEND_CONSTANT_COLOR		0x7901
+#define _3DSTATE_CHROMA_KEY			0x7904
+#define _3DSTATE_DEPTH_BUFFER			0x7905 /* GEN4-6 */
+#define _3DSTATE_POLY_STIPPLE_OFFSET		0x7906
+#define _3DSTATE_POLY_STIPPLE_PATTERN		0x7907
+#define _3DSTATE_LINE_STIPPLE_PATTERN		0x7908
+#define _3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP	0x7909
+#define _3DSTATE_AA_LINE_PARAMETERS		0x790a /* G45+ */
+
+#define _3DSTATE_GS_SVB_INDEX			0x790b /* CTG+ */
+/* DW1 */
+# define SVB_INDEX_SHIFT				29
+# define SVB_LOAD_INTERNAL_VERTEX_COUNT			(1 << 0) /* SNB+ */
+/* DW2: SVB index */
+/* DW3: SVB maximum index */
+
+#define _3DSTATE_MULTISAMPLE			0x790d /* GEN6+ */
+/* DW1 */
+# define MS_PIXEL_LOCATION_CENTER			(0 << 4)
+# define MS_PIXEL_LOCATION_UPPER_LEFT			(1 << 4)
+# define MS_NUMSAMPLES_1				(0 << 1)
+# define MS_NUMSAMPLES_4				(2 << 1)
+# define MS_NUMSAMPLES_8				(3 << 1)
+
+#define _3DSTATE_STENCIL_BUFFER			0x790e /* ILK, SNB */
+#define _3DSTATE_HIER_DEPTH_BUFFER		0x790f /* ILK, SNB */
+
+#define GEN7_3DSTATE_CLEAR_PARAMS		0x7804
+#define GEN7_3DSTATE_DEPTH_BUFFER		0x7805
+#define GEN7_3DSTATE_STENCIL_BUFFER		0x7806
+# define HSW_STENCIL_ENABLED                            (1 << 31)
+#define GEN7_3DSTATE_HIER_DEPTH_BUFFER		0x7807
+
+#define _3DSTATE_CLEAR_PARAMS			0x7910 /* ILK, SNB */
+# define GEN5_DEPTH_CLEAR_VALID				(1 << 15)
+/* DW1: depth clear value */
+/* DW2 */
+# define GEN7_DEPTH_CLEAR_VALID				(1 << 0)
+
+#define _3DSTATE_SO_DECL_LIST			0x7917 /* GEN7+ */
+/* DW1 */
+# define SO_STREAM_TO_BUFFER_SELECTS_3_SHIFT		12
+# define SO_STREAM_TO_BUFFER_SELECTS_3_MASK		INTEL_MASK(15, 12)
+# define SO_STREAM_TO_BUFFER_SELECTS_2_SHIFT		8
+# define SO_STREAM_TO_BUFFER_SELECTS_2_MASK		INTEL_MASK(11, 8)
+# define SO_STREAM_TO_BUFFER_SELECTS_1_SHIFT		4
+# define SO_STREAM_TO_BUFFER_SELECTS_1_MASK		INTEL_MASK(7, 4)
+# define SO_STREAM_TO_BUFFER_SELECTS_0_SHIFT		0
+# define SO_STREAM_TO_BUFFER_SELECTS_0_MASK		INTEL_MASK(3, 0)
+/* DW2 */
+# define SO_NUM_ENTRIES_3_SHIFT				24
+# define SO_NUM_ENTRIES_3_MASK				INTEL_MASK(31, 24)
+# define SO_NUM_ENTRIES_2_SHIFT				16
+# define SO_NUM_ENTRIES_2_MASK				INTEL_MASK(23, 16)
+# define SO_NUM_ENTRIES_1_SHIFT				8
+# define SO_NUM_ENTRIES_1_MASK				INTEL_MASK(15, 8)
+# define SO_NUM_ENTRIES_0_SHIFT				0
+# define SO_NUM_ENTRIES_0_MASK				INTEL_MASK(7, 0)
+
+/* SO_DECL DW0 */
+# define SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT		12
+# define SO_DECL_OUTPUT_BUFFER_SLOT_MASK		INTEL_MASK(13, 12)
+# define SO_DECL_HOLE_FLAG				(1 << 11)
+# define SO_DECL_REGISTER_INDEX_SHIFT			4
+# define SO_DECL_REGISTER_INDEX_MASK			INTEL_MASK(9, 4)
+# define SO_DECL_COMPONENT_MASK_SHIFT			0
+# define SO_DECL_COMPONENT_MASK_MASK			INTEL_MASK(3, 0)
+
+#define _3DSTATE_SO_BUFFER                    0x7918 /* GEN7+ */
+/* DW1 */
+# define SO_BUFFER_INDEX_SHIFT				29
+# define SO_BUFFER_INDEX_MASK				INTEL_MASK(30, 29)
+# define SO_BUFFER_PITCH_SHIFT				0
+# define SO_BUFFER_PITCH_MASK				INTEL_MASK(11, 0)
+/* DW2: start address */
+/* DW3: end address. */
 
 #define CMD_PIPE_CONTROL              0x7a00
 
-#define CMD_3D_PRIM                   0x7b00
-
 #define CMD_MI_FLUSH                  0x0200
 
 
-/* Various values from the R0 vertex header:
+/* Bitfields for the URB_WRITE message, DW2 of message header: */
+#define URB_WRITE_PRIM_END		0x1
+#define URB_WRITE_PRIM_START		0x2
+#define URB_WRITE_PRIM_TYPE_SHIFT	2
+
+
+/* Maximum number of entries that can be addressed using a binding table
+ * pointer of type SURFTYPE_BUFFER
  */
-#define R02_PRIM_END    0x1
-#define R02_PRIM_START  0x2
+#define BRW_MAX_NUM_BUFFER_ENTRIES	(1 << 27)
 
 #define EX_DESC_SFID_MASK 0xF
 #define EX_DESC_EOT_MASK  0x20
-- 
1.7.7.5
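
Note on the *_SHIFT/*_MASK pairs imported above: each pair describes one bit
field of a command DWord. What follows is a minimal sketch of how such a pair
could be used to pack and unpack a field. It assumes INTEL_MASK(high, low)
sets bits high..low inclusive (its definition is not part of the quoted hunk).
SKETCH_MASK, sketch_set_field and sketch_get_field are hypothetical names used
for illustration only; they are not part of brw_defines.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for INTEL_MASK(high, low), bits high..low set.
 * The 64-bit intermediate avoids overflow when the field reaches bit 31. */
#define SKETCH_MASK(high, low) \
   ((uint32_t)((((uint64_t)1 << ((high) - (low) + 1)) - 1) << (low)))

/* Values copied from the 3DSTATE_SO_DECL_LIST DW2 defines in the patch. */
#define SO_NUM_ENTRIES_3_SHIFT 24
#define SO_NUM_ENTRIES_3_MASK  SKETCH_MASK(31, 24)
#define SO_NUM_ENTRIES_1_SHIFT 8
#define SO_NUM_ENTRIES_1_MASK  SKETCH_MASK(15, 8)

/* Hypothetical helper: write 'value' into the field described by
 * shift/mask, leaving the other bits of 'dw' untouched. */
static inline uint32_t
sketch_set_field(uint32_t dw, uint32_t value, unsigned shift, uint32_t mask)
{
   assert(value <= (mask >> shift));   /* value must fit in the field */
   return (dw & ~mask) | ((value << shift) & mask);
}

/* Hypothetical helper: read the field described by shift/mask back out. */
static inline uint32_t
sketch_get_field(uint32_t dw, unsigned shift, uint32_t mask)
{
   return (dw & mask) >> shift;
}
```

For example, sketch_set_field(0, 5, SO_NUM_ENTRIES_1_SHIFT,
SO_NUM_ENTRIES_1_MASK) yields 0x500, and reading the same field back with
sketch_get_field() returns 5.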



* [PATCH 21/90] assembler: Remove trailing white space from brw_defines.h
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (19 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 20/90] assembler: Import brw_defines.h from Mesa Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 22/90] assembler: Update the disassembler code Damien Lespiau
                   ` (69 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
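The change below is purely mechanical. The patch does not say which tool was
used to strip the whitespace, so the following is only a sketch of the
per-line transformation, assuming "trailing white space" means spaces and tabs
before the end of the line. sketch_strip_trailing_ws is a hypothetical name,
not something in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove trailing spaces and tabs from a
 * NUL-terminated line in place, keeping a final '\n' if there was one.
 * Returns the new length of the line. */
static size_t
sketch_strip_trailing_ws(char *line)
{
   assert(line != NULL);

   size_t len = strlen(line);
   int had_newline = (len > 0 && line[len - 1] == '\n');

   if (had_newline)
      len--;

   /* Walk back over the blanks that precede the end of the line. */
   while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
      len--;

   if (had_newline)
      line[len++] = '\n';
   line[len] = '\0';

   return len;
}
```

Applied to "#define BRW_ANISORATIO_2     0 \n" this gives
"#define BRW_ANISORATIO_2     0\n", which matches the -/+ pairs in the hunks
that follow.
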
 assembler/brw_defines.h |  296 +++++++++++++++++++++++-----------------------
 1 files changed, 148 insertions(+), 148 deletions(-)

diff --git a/assembler/brw_defines.h b/assembler/brw_defines.h
index f0b358e..23402e3 100644
--- a/assembler/brw_defines.h
+++ b/assembler/brw_defines.h
@@ -2,7 +2,7 @@
  Copyright (C) Intel Corp.  2006.  All Rights Reserved.
  Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
  develop this 3D driver.
- 
+
  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
@@ -10,11 +10,11 @@
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:
- 
+
  The above copyright notice and this permission notice (including the
  next paragraph) shall be included in all copies or substantial
  portions of the Software.
- 
+
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -22,7 +22,7 @@
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- 
+
  **********************************************************************/
  /*
   * Authors:
@@ -77,13 +77,13 @@
 #define _3DPRIM_LINESTRIP_CONT_BF 0x14
 #define _3DPRIM_TRIFAN_NOSTIPPLE  0x15
 
-#define BRW_ANISORATIO_2     0 
-#define BRW_ANISORATIO_4     1 
-#define BRW_ANISORATIO_6     2 
-#define BRW_ANISORATIO_8     3 
-#define BRW_ANISORATIO_10    4 
-#define BRW_ANISORATIO_12    5 
-#define BRW_ANISORATIO_14    6 
+#define BRW_ANISORATIO_2     0
+#define BRW_ANISORATIO_4     1
+#define BRW_ANISORATIO_6     2
+#define BRW_ANISORATIO_8     3
+#define BRW_ANISORATIO_10    4
+#define BRW_ANISORATIO_12    5
+#define BRW_ANISORATIO_14    6
 #define BRW_ANISORATIO_16    7
 
 #define BRW_BLENDFACTOR_ONE                 0x1
@@ -188,14 +188,14 @@
 #define BRW_LOGICOPFUNCTION_COPY             12
 #define BRW_LOGICOPFUNCTION_OR_REVERSE       13
 #define BRW_LOGICOPFUNCTION_OR               14
-#define BRW_LOGICOPFUNCTION_SET              15  
+#define BRW_LOGICOPFUNCTION_SET              15
 
-#define BRW_MAPFILTER_NEAREST        0x0 
-#define BRW_MAPFILTER_LINEAR         0x1 
+#define BRW_MAPFILTER_NEAREST        0x0
+#define BRW_MAPFILTER_LINEAR         0x1
 #define BRW_MAPFILTER_ANISOTROPIC    0x2
 
-#define BRW_MIPFILTER_NONE        0   
-#define BRW_MIPFILTER_NEAREST     1   
+#define BRW_MIPFILTER_NONE        0
+#define BRW_MIPFILTER_NEAREST     1
 #define BRW_MIPFILTER_LINEAR      3
 
 #define BRW_ADDRESS_ROUNDING_ENABLE_U_MAG	0x20
@@ -208,7 +208,7 @@
 #define BRW_POLYGON_FRONT_FACING     0
 #define BRW_POLYGON_BACK_FACING      1
 
-#define BRW_PREFILTER_ALWAYS     0x0 
+#define BRW_PREFILTER_ALWAYS     0x0
 #define BRW_PREFILTER_NEVER      0x1
 #define BRW_PREFILTER_LESS       0x2
 #define BRW_PREFILTER_EQUAL      0x3
@@ -218,10 +218,10 @@
 #define BRW_PREFILTER_GEQUAL     0x7
 
 #define BRW_PROVOKING_VERTEX_0    0
-#define BRW_PROVOKING_VERTEX_1    1 
+#define BRW_PROVOKING_VERTEX_1    1
 #define BRW_PROVOKING_VERTEX_2    2
 
-#define BRW_RASTRULE_UPPER_LEFT  0    
+#define BRW_RASTRULE_UPPER_LEFT  0
 #define BRW_RASTRULE_UPPER_RIGHT 1
 /* These are listed as "Reserved, but not seen as useful"
  * in Intel documentation (page 212, "Point Rasterization Rule",
@@ -229,7 +229,7 @@
  * "Intel® 965 Express Chipset Family and Intel® G35 Express
  * Chipset Graphics Controller Programmer's Reference Manual,
  * Volume 2: 3D/Media", Revision 1.0b as of January 2008,
- * available at 
+ * available at
  *     http://intellinuxgraphics.org/documentation.html
  * at the time of this writing).
  *
@@ -267,88 +267,88 @@
 #define BRW_SURFACE_WRITEDISABLE_R_SHIFT	16
 #define BRW_SURFACE_WRITEDISABLE_A_SHIFT	17
 
-#define BRW_SURFACEFORMAT_R32G32B32A32_FLOAT             0x000 
-#define BRW_SURFACEFORMAT_R32G32B32A32_SINT              0x001 
-#define BRW_SURFACEFORMAT_R32G32B32A32_UINT              0x002 
-#define BRW_SURFACEFORMAT_R32G32B32A32_UNORM             0x003 
-#define BRW_SURFACEFORMAT_R32G32B32A32_SNORM             0x004 
-#define BRW_SURFACEFORMAT_R64G64_FLOAT                   0x005 
-#define BRW_SURFACEFORMAT_R32G32B32X32_FLOAT             0x006 
+#define BRW_SURFACEFORMAT_R32G32B32A32_FLOAT             0x000
+#define BRW_SURFACEFORMAT_R32G32B32A32_SINT              0x001
+#define BRW_SURFACEFORMAT_R32G32B32A32_UINT              0x002
+#define BRW_SURFACEFORMAT_R32G32B32A32_UNORM             0x003
+#define BRW_SURFACEFORMAT_R32G32B32A32_SNORM             0x004
+#define BRW_SURFACEFORMAT_R64G64_FLOAT                   0x005
+#define BRW_SURFACEFORMAT_R32G32B32X32_FLOAT             0x006
 #define BRW_SURFACEFORMAT_R32G32B32A32_SSCALED           0x007
 #define BRW_SURFACEFORMAT_R32G32B32A32_USCALED           0x008
 #define BRW_SURFACEFORMAT_R32G32B32A32_SFIXED            0x020
-#define BRW_SURFACEFORMAT_R32G32B32_FLOAT                0x040 
-#define BRW_SURFACEFORMAT_R32G32B32_SINT                 0x041 
-#define BRW_SURFACEFORMAT_R32G32B32_UINT                 0x042 
-#define BRW_SURFACEFORMAT_R32G32B32_UNORM                0x043 
-#define BRW_SURFACEFORMAT_R32G32B32_SNORM                0x044 
-#define BRW_SURFACEFORMAT_R32G32B32_SSCALED              0x045 
-#define BRW_SURFACEFORMAT_R32G32B32_USCALED              0x046 
+#define BRW_SURFACEFORMAT_R32G32B32_FLOAT                0x040
+#define BRW_SURFACEFORMAT_R32G32B32_SINT                 0x041
+#define BRW_SURFACEFORMAT_R32G32B32_UINT                 0x042
+#define BRW_SURFACEFORMAT_R32G32B32_UNORM                0x043
+#define BRW_SURFACEFORMAT_R32G32B32_SNORM                0x044
+#define BRW_SURFACEFORMAT_R32G32B32_SSCALED              0x045
+#define BRW_SURFACEFORMAT_R32G32B32_USCALED              0x046
 #define BRW_SURFACEFORMAT_R32G32B32_SFIXED               0x050
-#define BRW_SURFACEFORMAT_R16G16B16A16_UNORM             0x080 
-#define BRW_SURFACEFORMAT_R16G16B16A16_SNORM             0x081 
-#define BRW_SURFACEFORMAT_R16G16B16A16_SINT              0x082 
-#define BRW_SURFACEFORMAT_R16G16B16A16_UINT              0x083 
-#define BRW_SURFACEFORMAT_R16G16B16A16_FLOAT             0x084 
-#define BRW_SURFACEFORMAT_R32G32_FLOAT                   0x085 
-#define BRW_SURFACEFORMAT_R32G32_SINT                    0x086 
-#define BRW_SURFACEFORMAT_R32G32_UINT                    0x087 
-#define BRW_SURFACEFORMAT_R32_FLOAT_X8X24_TYPELESS       0x088 
-#define BRW_SURFACEFORMAT_X32_TYPELESS_G8X24_UINT        0x089 
-#define BRW_SURFACEFORMAT_L32A32_FLOAT                   0x08A 
-#define BRW_SURFACEFORMAT_R32G32_UNORM                   0x08B 
-#define BRW_SURFACEFORMAT_R32G32_SNORM                   0x08C 
-#define BRW_SURFACEFORMAT_R64_FLOAT                      0x08D 
-#define BRW_SURFACEFORMAT_R16G16B16X16_UNORM             0x08E 
-#define BRW_SURFACEFORMAT_R16G16B16X16_FLOAT             0x08F 
-#define BRW_SURFACEFORMAT_A32X32_FLOAT                   0x090 
-#define BRW_SURFACEFORMAT_L32X32_FLOAT                   0x091 
-#define BRW_SURFACEFORMAT_I32X32_FLOAT                   0x092 
+#define BRW_SURFACEFORMAT_R16G16B16A16_UNORM             0x080
+#define BRW_SURFACEFORMAT_R16G16B16A16_SNORM             0x081
+#define BRW_SURFACEFORMAT_R16G16B16A16_SINT              0x082
+#define BRW_SURFACEFORMAT_R16G16B16A16_UINT              0x083
+#define BRW_SURFACEFORMAT_R16G16B16A16_FLOAT             0x084
+#define BRW_SURFACEFORMAT_R32G32_FLOAT                   0x085
+#define BRW_SURFACEFORMAT_R32G32_SINT                    0x086
+#define BRW_SURFACEFORMAT_R32G32_UINT                    0x087
+#define BRW_SURFACEFORMAT_R32_FLOAT_X8X24_TYPELESS       0x088
+#define BRW_SURFACEFORMAT_X32_TYPELESS_G8X24_UINT        0x089
+#define BRW_SURFACEFORMAT_L32A32_FLOAT                   0x08A
+#define BRW_SURFACEFORMAT_R32G32_UNORM                   0x08B
+#define BRW_SURFACEFORMAT_R32G32_SNORM                   0x08C
+#define BRW_SURFACEFORMAT_R64_FLOAT                      0x08D
+#define BRW_SURFACEFORMAT_R16G16B16X16_UNORM             0x08E
+#define BRW_SURFACEFORMAT_R16G16B16X16_FLOAT             0x08F
+#define BRW_SURFACEFORMAT_A32X32_FLOAT                   0x090
+#define BRW_SURFACEFORMAT_L32X32_FLOAT                   0x091
+#define BRW_SURFACEFORMAT_I32X32_FLOAT                   0x092
 #define BRW_SURFACEFORMAT_R16G16B16A16_SSCALED           0x093
 #define BRW_SURFACEFORMAT_R16G16B16A16_USCALED           0x094
 #define BRW_SURFACEFORMAT_R32G32_SSCALED                 0x095
 #define BRW_SURFACEFORMAT_R32G32_USCALED                 0x096
 #define BRW_SURFACEFORMAT_R32G32_SFIXED                  0x0A0
-#define BRW_SURFACEFORMAT_B8G8R8A8_UNORM                 0x0C0 
-#define BRW_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB            0x0C1 
-#define BRW_SURFACEFORMAT_R10G10B10A2_UNORM              0x0C2 
-#define BRW_SURFACEFORMAT_R10G10B10A2_UNORM_SRGB         0x0C3 
-#define BRW_SURFACEFORMAT_R10G10B10A2_UINT               0x0C4 
-#define BRW_SURFACEFORMAT_R10G10B10_SNORM_A2_UNORM       0x0C5 
-#define BRW_SURFACEFORMAT_R8G8B8A8_UNORM                 0x0C7 
-#define BRW_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB            0x0C8 
-#define BRW_SURFACEFORMAT_R8G8B8A8_SNORM                 0x0C9 
-#define BRW_SURFACEFORMAT_R8G8B8A8_SINT                  0x0CA 
-#define BRW_SURFACEFORMAT_R8G8B8A8_UINT                  0x0CB 
-#define BRW_SURFACEFORMAT_R16G16_UNORM                   0x0CC 
-#define BRW_SURFACEFORMAT_R16G16_SNORM                   0x0CD 
-#define BRW_SURFACEFORMAT_R16G16_SINT                    0x0CE 
-#define BRW_SURFACEFORMAT_R16G16_UINT                    0x0CF 
-#define BRW_SURFACEFORMAT_R16G16_FLOAT                   0x0D0 
-#define BRW_SURFACEFORMAT_B10G10R10A2_UNORM              0x0D1 
-#define BRW_SURFACEFORMAT_B10G10R10A2_UNORM_SRGB         0x0D2 
-#define BRW_SURFACEFORMAT_R11G11B10_FLOAT                0x0D3 
-#define BRW_SURFACEFORMAT_R32_SINT                       0x0D6 
-#define BRW_SURFACEFORMAT_R32_UINT                       0x0D7 
-#define BRW_SURFACEFORMAT_R32_FLOAT                      0x0D8 
-#define BRW_SURFACEFORMAT_R24_UNORM_X8_TYPELESS          0x0D9 
-#define BRW_SURFACEFORMAT_X24_TYPELESS_G8_UINT           0x0DA 
-#define BRW_SURFACEFORMAT_L16A16_UNORM                   0x0DF 
-#define BRW_SURFACEFORMAT_I24X8_UNORM                    0x0E0 
-#define BRW_SURFACEFORMAT_L24X8_UNORM                    0x0E1 
-#define BRW_SURFACEFORMAT_A24X8_UNORM                    0x0E2 
-#define BRW_SURFACEFORMAT_I32_FLOAT                      0x0E3 
-#define BRW_SURFACEFORMAT_L32_FLOAT                      0x0E4 
-#define BRW_SURFACEFORMAT_A32_FLOAT                      0x0E5 
-#define BRW_SURFACEFORMAT_B8G8R8X8_UNORM                 0x0E9 
-#define BRW_SURFACEFORMAT_B8G8R8X8_UNORM_SRGB            0x0EA 
-#define BRW_SURFACEFORMAT_R8G8B8X8_UNORM                 0x0EB 
-#define BRW_SURFACEFORMAT_R8G8B8X8_UNORM_SRGB            0x0EC 
-#define BRW_SURFACEFORMAT_R9G9B9E5_SHAREDEXP             0x0ED 
-#define BRW_SURFACEFORMAT_B10G10R10X2_UNORM              0x0EE 
-#define BRW_SURFACEFORMAT_L16A16_FLOAT                   0x0F0 
-#define BRW_SURFACEFORMAT_R32_UNORM                      0x0F1 
-#define BRW_SURFACEFORMAT_R32_SNORM                      0x0F2 
+#define BRW_SURFACEFORMAT_B8G8R8A8_UNORM                 0x0C0
+#define BRW_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB            0x0C1
+#define BRW_SURFACEFORMAT_R10G10B10A2_UNORM              0x0C2
+#define BRW_SURFACEFORMAT_R10G10B10A2_UNORM_SRGB         0x0C3
+#define BRW_SURFACEFORMAT_R10G10B10A2_UINT               0x0C4
+#define BRW_SURFACEFORMAT_R10G10B10_SNORM_A2_UNORM       0x0C5
+#define BRW_SURFACEFORMAT_R8G8B8A8_UNORM                 0x0C7
+#define BRW_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB            0x0C8
+#define BRW_SURFACEFORMAT_R8G8B8A8_SNORM                 0x0C9
+#define BRW_SURFACEFORMAT_R8G8B8A8_SINT                  0x0CA
+#define BRW_SURFACEFORMAT_R8G8B8A8_UINT                  0x0CB
+#define BRW_SURFACEFORMAT_R16G16_UNORM                   0x0CC
+#define BRW_SURFACEFORMAT_R16G16_SNORM                   0x0CD
+#define BRW_SURFACEFORMAT_R16G16_SINT                    0x0CE
+#define BRW_SURFACEFORMAT_R16G16_UINT                    0x0CF
+#define BRW_SURFACEFORMAT_R16G16_FLOAT                   0x0D0
+#define BRW_SURFACEFORMAT_B10G10R10A2_UNORM              0x0D1
+#define BRW_SURFACEFORMAT_B10G10R10A2_UNORM_SRGB         0x0D2
+#define BRW_SURFACEFORMAT_R11G11B10_FLOAT                0x0D3
+#define BRW_SURFACEFORMAT_R32_SINT                       0x0D6
+#define BRW_SURFACEFORMAT_R32_UINT                       0x0D7
+#define BRW_SURFACEFORMAT_R32_FLOAT                      0x0D8
+#define BRW_SURFACEFORMAT_R24_UNORM_X8_TYPELESS          0x0D9
+#define BRW_SURFACEFORMAT_X24_TYPELESS_G8_UINT           0x0DA
+#define BRW_SURFACEFORMAT_L16A16_UNORM                   0x0DF
+#define BRW_SURFACEFORMAT_I24X8_UNORM                    0x0E0
+#define BRW_SURFACEFORMAT_L24X8_UNORM                    0x0E1
+#define BRW_SURFACEFORMAT_A24X8_UNORM                    0x0E2
+#define BRW_SURFACEFORMAT_I32_FLOAT                      0x0E3
+#define BRW_SURFACEFORMAT_L32_FLOAT                      0x0E4
+#define BRW_SURFACEFORMAT_A32_FLOAT                      0x0E5
+#define BRW_SURFACEFORMAT_B8G8R8X8_UNORM                 0x0E9
+#define BRW_SURFACEFORMAT_B8G8R8X8_UNORM_SRGB            0x0EA
+#define BRW_SURFACEFORMAT_R8G8B8X8_UNORM                 0x0EB
+#define BRW_SURFACEFORMAT_R8G8B8X8_UNORM_SRGB            0x0EC
+#define BRW_SURFACEFORMAT_R9G9B9E5_SHAREDEXP             0x0ED
+#define BRW_SURFACEFORMAT_B10G10R10X2_UNORM              0x0EE
+#define BRW_SURFACEFORMAT_L16A16_FLOAT                   0x0F0
+#define BRW_SURFACEFORMAT_R32_UNORM                      0x0F1
+#define BRW_SURFACEFORMAT_R32_SNORM                      0x0F2
 #define BRW_SURFACEFORMAT_R10G10B10X2_USCALED            0x0F3
 #define BRW_SURFACEFORMAT_R8G8B8A8_SSCALED               0x0F4
 #define BRW_SURFACEFORMAT_R8G8B8A8_USCALED               0x0F5
@@ -356,25 +356,25 @@
 #define BRW_SURFACEFORMAT_R16G16_USCALED                 0x0F7
 #define BRW_SURFACEFORMAT_R32_SSCALED                    0x0F8
 #define BRW_SURFACEFORMAT_R32_USCALED                    0x0F9
-#define BRW_SURFACEFORMAT_B5G6R5_UNORM                   0x100 
-#define BRW_SURFACEFORMAT_B5G6R5_UNORM_SRGB              0x101 
-#define BRW_SURFACEFORMAT_B5G5R5A1_UNORM                 0x102 
-#define BRW_SURFACEFORMAT_B5G5R5A1_UNORM_SRGB            0x103 
-#define BRW_SURFACEFORMAT_B4G4R4A4_UNORM                 0x104 
-#define BRW_SURFACEFORMAT_B4G4R4A4_UNORM_SRGB            0x105 
-#define BRW_SURFACEFORMAT_R8G8_UNORM                     0x106 
-#define BRW_SURFACEFORMAT_R8G8_SNORM                     0x107 
-#define BRW_SURFACEFORMAT_R8G8_SINT                      0x108 
-#define BRW_SURFACEFORMAT_R8G8_UINT                      0x109 
-#define BRW_SURFACEFORMAT_R16_UNORM                      0x10A 
-#define BRW_SURFACEFORMAT_R16_SNORM                      0x10B 
-#define BRW_SURFACEFORMAT_R16_SINT                       0x10C 
-#define BRW_SURFACEFORMAT_R16_UINT                       0x10D 
-#define BRW_SURFACEFORMAT_R16_FLOAT                      0x10E 
-#define BRW_SURFACEFORMAT_I16_UNORM                      0x111 
-#define BRW_SURFACEFORMAT_L16_UNORM                      0x112 
-#define BRW_SURFACEFORMAT_A16_UNORM                      0x113 
-#define BRW_SURFACEFORMAT_L8A8_UNORM                     0x114 
+#define BRW_SURFACEFORMAT_B5G6R5_UNORM                   0x100
+#define BRW_SURFACEFORMAT_B5G6R5_UNORM_SRGB              0x101
+#define BRW_SURFACEFORMAT_B5G5R5A1_UNORM                 0x102
+#define BRW_SURFACEFORMAT_B5G5R5A1_UNORM_SRGB            0x103
+#define BRW_SURFACEFORMAT_B4G4R4A4_UNORM                 0x104
+#define BRW_SURFACEFORMAT_B4G4R4A4_UNORM_SRGB            0x105
+#define BRW_SURFACEFORMAT_R8G8_UNORM                     0x106
+#define BRW_SURFACEFORMAT_R8G8_SNORM                     0x107
+#define BRW_SURFACEFORMAT_R8G8_SINT                      0x108
+#define BRW_SURFACEFORMAT_R8G8_UINT                      0x109
+#define BRW_SURFACEFORMAT_R16_UNORM                      0x10A
+#define BRW_SURFACEFORMAT_R16_SNORM                      0x10B
+#define BRW_SURFACEFORMAT_R16_SINT                       0x10C
+#define BRW_SURFACEFORMAT_R16_UINT                       0x10D
+#define BRW_SURFACEFORMAT_R16_FLOAT                      0x10E
+#define BRW_SURFACEFORMAT_I16_UNORM                      0x111
+#define BRW_SURFACEFORMAT_L16_UNORM                      0x112
+#define BRW_SURFACEFORMAT_A16_UNORM                      0x113
+#define BRW_SURFACEFORMAT_L8A8_UNORM                     0x114
 #define BRW_SURFACEFORMAT_I16_FLOAT                      0x115
 #define BRW_SURFACEFORMAT_L16_FLOAT                      0x116
 #define BRW_SURFACEFORMAT_A16_FLOAT                      0x117
@@ -386,46 +386,46 @@
 #define BRW_SURFACEFORMAT_R8G8_USCALED                   0x11D
 #define BRW_SURFACEFORMAT_R16_SSCALED                    0x11E
 #define BRW_SURFACEFORMAT_R16_USCALED                    0x11F
-#define BRW_SURFACEFORMAT_R8_UNORM                       0x140 
-#define BRW_SURFACEFORMAT_R8_SNORM                       0x141 
-#define BRW_SURFACEFORMAT_R8_SINT                        0x142 
-#define BRW_SURFACEFORMAT_R8_UINT                        0x143 
-#define BRW_SURFACEFORMAT_A8_UNORM                       0x144 
-#define BRW_SURFACEFORMAT_I8_UNORM                       0x145 
-#define BRW_SURFACEFORMAT_L8_UNORM                       0x146 
-#define BRW_SURFACEFORMAT_P4A4_UNORM                     0x147 
+#define BRW_SURFACEFORMAT_R8_UNORM                       0x140
+#define BRW_SURFACEFORMAT_R8_SNORM                       0x141
+#define BRW_SURFACEFORMAT_R8_SINT                        0x142
+#define BRW_SURFACEFORMAT_R8_UINT                        0x143
+#define BRW_SURFACEFORMAT_A8_UNORM                       0x144
+#define BRW_SURFACEFORMAT_I8_UNORM                       0x145
+#define BRW_SURFACEFORMAT_L8_UNORM                       0x146
+#define BRW_SURFACEFORMAT_P4A4_UNORM                     0x147
 #define BRW_SURFACEFORMAT_A4P4_UNORM                     0x148
 #define BRW_SURFACEFORMAT_R8_SSCALED                     0x149
 #define BRW_SURFACEFORMAT_R8_USCALED                     0x14A
 #define BRW_SURFACEFORMAT_L8_UNORM_SRGB                  0x14C
 #define BRW_SURFACEFORMAT_DXT1_RGB_SRGB                  0x180
-#define BRW_SURFACEFORMAT_R1_UINT                        0x181 
-#define BRW_SURFACEFORMAT_YCRCB_NORMAL                   0x182 
-#define BRW_SURFACEFORMAT_YCRCB_SWAPUVY                  0x183 
-#define BRW_SURFACEFORMAT_BC1_UNORM                      0x186 
-#define BRW_SURFACEFORMAT_BC2_UNORM                      0x187 
-#define BRW_SURFACEFORMAT_BC3_UNORM                      0x188 
-#define BRW_SURFACEFORMAT_BC4_UNORM                      0x189 
-#define BRW_SURFACEFORMAT_BC5_UNORM                      0x18A 
-#define BRW_SURFACEFORMAT_BC1_UNORM_SRGB                 0x18B 
-#define BRW_SURFACEFORMAT_BC2_UNORM_SRGB                 0x18C 
-#define BRW_SURFACEFORMAT_BC3_UNORM_SRGB                 0x18D 
-#define BRW_SURFACEFORMAT_MONO8                          0x18E 
-#define BRW_SURFACEFORMAT_YCRCB_SWAPUV                   0x18F 
-#define BRW_SURFACEFORMAT_YCRCB_SWAPY                    0x190 
-#define BRW_SURFACEFORMAT_DXT1_RGB                       0x191 
-#define BRW_SURFACEFORMAT_FXT1                           0x192 
-#define BRW_SURFACEFORMAT_R8G8B8_UNORM                   0x193 
-#define BRW_SURFACEFORMAT_R8G8B8_SNORM                   0x194 
-#define BRW_SURFACEFORMAT_R8G8B8_SSCALED                 0x195 
-#define BRW_SURFACEFORMAT_R8G8B8_USCALED                 0x196 
-#define BRW_SURFACEFORMAT_R64G64B64A64_FLOAT             0x197 
-#define BRW_SURFACEFORMAT_R64G64B64_FLOAT                0x198 
-#define BRW_SURFACEFORMAT_BC4_SNORM                      0x199 
-#define BRW_SURFACEFORMAT_BC5_SNORM                      0x19A 
-#define BRW_SURFACEFORMAT_R16G16B16_UNORM                0x19C 
-#define BRW_SURFACEFORMAT_R16G16B16_SNORM                0x19D 
-#define BRW_SURFACEFORMAT_R16G16B16_SSCALED              0x19E 
+#define BRW_SURFACEFORMAT_R1_UINT                        0x181
+#define BRW_SURFACEFORMAT_YCRCB_NORMAL                   0x182
+#define BRW_SURFACEFORMAT_YCRCB_SWAPUVY                  0x183
+#define BRW_SURFACEFORMAT_BC1_UNORM                      0x186
+#define BRW_SURFACEFORMAT_BC2_UNORM                      0x187
+#define BRW_SURFACEFORMAT_BC3_UNORM                      0x188
+#define BRW_SURFACEFORMAT_BC4_UNORM                      0x189
+#define BRW_SURFACEFORMAT_BC5_UNORM                      0x18A
+#define BRW_SURFACEFORMAT_BC1_UNORM_SRGB                 0x18B
+#define BRW_SURFACEFORMAT_BC2_UNORM_SRGB                 0x18C
+#define BRW_SURFACEFORMAT_BC3_UNORM_SRGB                 0x18D
+#define BRW_SURFACEFORMAT_MONO8                          0x18E
+#define BRW_SURFACEFORMAT_YCRCB_SWAPUV                   0x18F
+#define BRW_SURFACEFORMAT_YCRCB_SWAPY                    0x190
+#define BRW_SURFACEFORMAT_DXT1_RGB                       0x191
+#define BRW_SURFACEFORMAT_FXT1                           0x192
+#define BRW_SURFACEFORMAT_R8G8B8_UNORM                   0x193
+#define BRW_SURFACEFORMAT_R8G8B8_SNORM                   0x194
+#define BRW_SURFACEFORMAT_R8G8B8_SSCALED                 0x195
+#define BRW_SURFACEFORMAT_R8G8B8_USCALED                 0x196
+#define BRW_SURFACEFORMAT_R64G64B64A64_FLOAT             0x197
+#define BRW_SURFACEFORMAT_R64G64B64_FLOAT                0x198
+#define BRW_SURFACEFORMAT_BC4_SNORM                      0x199
+#define BRW_SURFACEFORMAT_BC5_SNORM                      0x19A
+#define BRW_SURFACEFORMAT_R16G16B16_UNORM                0x19C
+#define BRW_SURFACEFORMAT_R16G16B16_SNORM                0x19D
+#define BRW_SURFACEFORMAT_R16G16B16_SSCALED              0x19E
 #define BRW_SURFACEFORMAT_R16G16B16_USCALED              0x19F
 #define BRW_SURFACEFORMAT_R32_SFIXED                     0x1B2
 #define BRW_SURFACEFORMAT_R10G10B10A2_SNORM              0x1B3
@@ -787,7 +787,7 @@ enum opcode {
 
 #define BRW_ARF_NULL                  0x00
 #define BRW_ARF_ADDRESS               0x10
-#define BRW_ARF_ACCUMULATOR           0x20   
+#define BRW_ARF_ACCUMULATOR           0x20
 #define BRW_ARF_FLAG                  0x30
 #define BRW_ARF_MASK                  0x40
 #define BRW_ARF_MASK_STACK            0x50
-- 
1.7.7.5



* [PATCH 22/90] assembler: Update the disassembler code
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (20 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 21/90] assembler: Remove trailing white space from brw_defines.h Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 23/90] assembler: Import ralloc from Mesa Damien Lespiau
                   ` (68 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

From Mesa. This imports a bit more of the brw_eu* infrastructure from
Mesa (a step in the right direction!), and the update is a significant
improvement over what we had.

I also verified that the changes made to the assembler's old version of
brw_disasm.c were already supported by the Mesa version, and they
were.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/Makefile.am                |    4 +-
 assembler/{disasm.c => brw_disasm.c} |  660 +++++++++++++++++++++++-----
 assembler/brw_eu.h                   |  405 +++++++++++++++++
 assembler/brw_reg.h                  |  808 ++++++++++++++++++++++++++++++++++
 assembler/disasm-main.c              |   20 +-
 assembler/gen4asm.h                  |    3 -
 6 files changed, 1783 insertions(+), 117 deletions(-)
 rename assembler/{disasm.c => brw_disasm.c} (52%)
 create mode 100644 assembler/brw_eu.h
 create mode 100644 assembler/brw_reg.h
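
As a reading aid for the diff below: the align16 and 3-source operand
printers all share one swizzle display rule (identity prints nothing, a
replicated channel prints once, anything else prints all four
selectors). Here is a minimal sketch of that rule. It assumes
BRW_CHANNEL_X..W have the values 0..3, and swizzle_to_string() is a
hypothetical helper name used only for illustration; it is not part of
this patch or of Mesa.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Channel names, mirroring chan_sel[] in brw_disasm.c.
 * Assumption: BRW_CHANNEL_X/Y/Z/W are 0/1/2/3. */
static const char chan_name[4] = { 'x', 'y', 'z', 'w' };

/*
 * Hypothetical helper (not in the patch): format an 8-bit swizzle field,
 * two bits per channel with x in the low bits, into buf.
 * Returns the number of characters written, excluding the terminator.
 */
static int swizzle_to_string(unsigned swizzle, char *buf, size_t len)
{
    unsigned x = (swizzle >> 0) & 0x3;
    unsigned y = (swizzle >> 2) & 0x3;
    unsigned z = (swizzle >> 4) & 0x3;
    unsigned w = (swizzle >> 6) & 0x3;

    /* The longest output is ".xyzw" plus the terminating NUL. */
    assert(buf != NULL && len >= 6);

    /* identity: nothing printed */
    if (x == 0 && y == 1 && z == 2 && w == 3) {
        buf[0] = '\0';
        return 0;
    }

    /* 1->all: print the single replicated channel */
    if (x == y && x == z && x == w)
        return snprintf(buf, len, ".%c", chan_name[x]);

    /* 1->1: print the full mapping */
    return snprintf(buf, len, ".%c%c%c%c",
                    chan_name[x], chan_name[y], chan_name[z], chan_name[w]);
}
```

For instance, a swizzle field of 0xE4 (x, y, z, w in order) formats as
an empty string, and 0x55 (y replicated) as ".y".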

diff --git a/assembler/Makefile.am b/assembler/Makefile.am
index 5914356..8843e1a 100644
--- a/assembler/Makefile.am
+++ b/assembler/Makefile.am
@@ -11,6 +11,8 @@ gram.h: gram.c
 
 intel_gen4asm_SOURCES =	\
 	brw_defines.h	\
+	brw_eu.h	\
+	brw_reg.h	\
 	brw_structs.h	\
 	gen4asm.h	\
 	gram.y		\
@@ -19,7 +21,7 @@ intel_gen4asm_SOURCES =	\
 	$(NULL)
 
 intel_gen4disasm_SOURCES =  \
-	disasm.c disasm-main.c
+	brw_disasm.c disasm-main.c
 
 pkgconfigdir = $(libdir)/pkgconfig
 pkgconfig_DATA = intel-gen4asm.pc
diff --git a/assembler/disasm.c b/assembler/brw_disasm.c
similarity index 52%
rename from assembler/disasm.c
rename to assembler/brw_disasm.c
index 9ebbeab..8eaeb78 100644
--- a/assembler/disasm.c
+++ b/assembler/brw_disasm.c
@@ -28,13 +28,9 @@
 #include <stdarg.h>
 
 #include "gen4asm.h"
-#include "brw_defines.h"
+#include "brw_eu.h"
 
-struct {
-    char    *name;
-    int	    nsrc;
-    int	    ndst;
-} opcode[128] = {
+const struct opcode_desc opcode_descs[128] = {
     [BRW_OPCODE_MOV] = { .name = "mov", .nsrc = 1, .ndst = 1 },
     [BRW_OPCODE_FRC] = { .name = "frc", .nsrc = 1, .ndst = 1 },
     [BRW_OPCODE_RNDU] = { .name = "rndu", .nsrc = 1, .ndst = 1 },
@@ -48,13 +44,15 @@ struct {
     [BRW_OPCODE_MAC] = { .name = "mac", .nsrc = 2, .ndst = 1 },
     [BRW_OPCODE_MACH] = { .name = "mach", .nsrc = 2, .ndst = 1 },
     [BRW_OPCODE_LINE] = { .name = "line", .nsrc = 2, .ndst = 1 },
+    [BRW_OPCODE_PLN] = { .name = "pln", .nsrc = 2, .ndst = 1 },
+    [BRW_OPCODE_MAD] = { .name = "mad", .nsrc = 3, .ndst = 1 },
     [BRW_OPCODE_SAD2] = { .name = "sad2", .nsrc = 2, .ndst = 1 },
     [BRW_OPCODE_SADA2] = { .name = "sada2", .nsrc = 2, .ndst = 1 },
     [BRW_OPCODE_DP4] = { .name = "dp4", .nsrc = 2, .ndst = 1 },
     [BRW_OPCODE_DPH] = { .name = "dph", .nsrc = 2, .ndst = 1 },
     [BRW_OPCODE_DP3] = { .name = "dp3", .nsrc = 2, .ndst = 1 },
     [BRW_OPCODE_DP2] = { .name = "dp2", .nsrc = 2, .ndst = 1 },
-    [BRW_OPCODE_PLN] = { .name = "pln", .nsrc = 2, .ndst = 1},
+    [BRW_OPCODE_MATH] = { .name = "math", .nsrc = 2, .ndst = 1 },
 
     [BRW_OPCODE_AVG] = { .name = "avg", .nsrc = 2, .ndst = 1 },
     [BRW_OPCODE_ADD] = { .name = "add", .nsrc = 2, .ndst = 1 },
@@ -71,12 +69,12 @@ struct {
     [BRW_OPCODE_SEND] = { .name = "send", .nsrc = 1, .ndst = 1 },
     [BRW_OPCODE_SENDC] = { .name = "sendc", .nsrc = 1, .ndst = 1 },
     [BRW_OPCODE_NOP] = { .name = "nop", .nsrc = 0, .ndst = 0 },
-    [BRW_OPCODE_JMPI] = { .name = "jmpi", .nsrc = 1, .ndst = 0 },
+    [BRW_OPCODE_JMPI] = { .name = "jmpi", .nsrc = 0, .ndst = 0 },
     [BRW_OPCODE_IF] = { .name = "if", .nsrc = 2, .ndst = 0 },
-    [BRW_OPCODE_IFF] = { .name = "iff", .nsrc = 1, .ndst = 01 },
-    [BRW_OPCODE_WHILE] = { .name = "while", .nsrc = 1, .ndst = 0 },
+    [BRW_OPCODE_IFF] = { .name = "iff", .nsrc = 2, .ndst = 1 },
+    [BRW_OPCODE_WHILE] = { .name = "while", .nsrc = 2, .ndst = 0 },
     [BRW_OPCODE_ELSE] = { .name = "else", .nsrc = 2, .ndst = 0 },
-    [BRW_OPCODE_BREAK] = { .name = "break", .nsrc = 1, .ndst = 0 },
+    [BRW_OPCODE_BREAK] = { .name = "break", .nsrc = 2, .ndst = 0 },
     [BRW_OPCODE_CONTINUE] = { .name = "cont", .nsrc = 1, .ndst = 0 },
     [BRW_OPCODE_HALT] = { .name = "halt", .nsrc = 1, .ndst = 0 },
     [BRW_OPCODE_MSAVE] = { .name = "msave", .nsrc = 1, .ndst = 1 },
@@ -87,8 +85,9 @@ struct {
     [BRW_OPCODE_DO] = { .name = "do", .nsrc = 0, .ndst = 0 },
     [BRW_OPCODE_ENDIF] = { .name = "endif", .nsrc = 2, .ndst = 0 },
 };
+static const struct opcode_desc *opcode = opcode_descs;
 
-char *conditional_modifier[16] = {
+static const char * const conditional_modifier[16] = {
     [BRW_CONDITIONAL_NONE] = "",
     [BRW_CONDITIONAL_Z] = ".e",
     [BRW_CONDITIONAL_NZ] = ".ne",
@@ -101,17 +100,17 @@ char *conditional_modifier[16] = {
     [BRW_CONDITIONAL_U] = ".u",
 };
 
-char *negate[2] = {
+static const char * const negate_op[2] = {
     [0] = "",
     [1] = "-",
 };
 
-char *_abs[2] = {
+static const char * const _abs[2] = {
     [0] = "",
     [1] = "(abs)",
 };
 
-char *vert_stride[16] = {
+static const char * const vert_stride[16] = {
     [0] = "0",
     [1] = "1",
     [2] = "2",
@@ -122,7 +121,7 @@ char *vert_stride[16] = {
     [15] = "VxH",
 };
 
-char *width[8] = {
+static const char * const width[8] = {
     [0] = "1",
     [1] = "2",
     [2] = "4",
@@ -130,34 +129,41 @@ char *width[8] = {
     [4] = "16",
 };
 
-char *horiz_stride[4] = {
+static const char * const horiz_stride[4] = {
     [0] = "0",
     [1] = "1",
     [2] = "2",
     [3] = "4"
 };
 
-char *chan_sel[4] = {
+static const char * const chan_sel[4] = {
     [0] = "x",
     [1] = "y",
     [2] = "z",
     [3] = "w",
 };
 
-char *dest_condmod[16] = {
-};
-
-char *debug_ctrl[2] = {
+static const char * const debug_ctrl[2] = {
     [0] = "",
     [1] = ".breakpoint"
 };
 
-char *saturate[2] = {
+static const char * const saturate[2] = {
     [0] = "",
     [1] = ".sat"
 };
 
-char *exec_size[8] = {
+static const char * const accwr[2] = {
+    [0] = "",
+    [1] = "AccWrEnable"
+};
+
+static const char * const wectrl[2] = {
+    [0] = "WE_normal",
+    [1] = "WE_all"
+};
+
+static const char * const exec_size[8] = {
     [0] = "1",
     [1] = "2",
     [2] = "4",
@@ -166,12 +172,12 @@ char *exec_size[8] = {
     [5] = "32"
 };
 
-char *pred_inv[2] = {
+static const char * const pred_inv[2] = {
     [0] = "+",
     [1] = "-"
 };
 
-char *pred_ctrl_align16[16] = {
+static const char * const pred_ctrl_align16[16] = {
     [1] = "",
     [2] = ".x",
     [3] = ".y",
@@ -181,7 +187,7 @@ char *pred_ctrl_align16[16] = {
     [7] = ".all4h",
 };
 
-char *pred_ctrl_align1[16] = {
+static const char * const pred_ctrl_align1[16] = {
     [1] = "",
     [2] = ".anyv",
     [3] = ".allv",
@@ -195,35 +201,36 @@ char *pred_ctrl_align1[16] = {
     [11] = ".all16h",
 };
 
-char *thread_ctrl[4] = {
+static const char * const thread_ctrl[4] = {
     [0] = "",
     [2] = "switch"
 };
 
-char *compr_ctrl[4] = {
+static const char * const compr_ctrl[4] = {
     [0] = "",
     [1] = "sechalf",
     [2] = "compr",
+    [3] = "compr4",
 };
 
-char *dep_ctrl[4] = {
+static const char * const dep_ctrl[4] = {
     [0] = "",
     [1] = "NoDDClr",
     [2] = "NoDDChk",
     [3] = "NoDDClr,NoDDChk",
 };
 
-char *mask_ctrl[4] = {
+static const char * const mask_ctrl[4] = {
     [0] = "",
     [1] = "nomask",
 };
 
-char *access_mode[2] = {
+static const char * const access_mode[2] = {
     [0] = "align1",
     [1] = "align16",
 };
 
-char *reg_encoding[8] = {
+static const char * const reg_encoding[8] = {
     [0] = "UD",
     [1] = "D",
     [2] = "UW",
@@ -233,24 +240,24 @@ char *reg_encoding[8] = {
     [7] = "F"
 };
 
-char *imm_encoding[8] = {
-    [0] = "UD",
-    [1] = "D",
-    [2] = "UW",
-    [3] = "W",
-    [5] = "VF",
-    [6] = "V",
-    [7] = "F"
+const int reg_type_size[8] = {
+    [0] = 4,
+    [1] = 4,
+    [2] = 2,
+    [3] = 2,
+    [4] = 1,
+    [5] = 1,
+    [7] = 4
 };
 
-char *reg_file[4] = {
+static const char * const reg_file[4] = {
     [0] = "A",
     [1] = "g",
     [2] = "m",
     [3] = "imm",
 };
 
-char *writemask[16] = {
+static const char * const writemask[16] = {
     [0x0] = ".",
     [0x1] = ".x",
     [0x2] = ".y",
@@ -269,12 +276,12 @@ char *writemask[16] = {
     [0xf] = "",
 };
 
-char *end_of_thread[2] = {
+static const char * const end_of_thread[2] = {
     [0] = "",
     [1] = "EOT"
 };
 
-char *target_function[16] = {
+static const char * const target_function[16] = {
     [BRW_SFID_NULL] = "null",
     [BRW_SFID_MATH] = "math",
     [BRW_SFID_SAMPLER] = "sampler",
@@ -285,7 +292,37 @@ char *target_function[16] = {
     [BRW_SFID_THREAD_SPAWNER] = "thread_spawner"
 };
 
-char *math_function[16] = {
+static const char * const target_function_gen6[16] = {
+    [BRW_SFID_NULL] = "null",
+    [BRW_SFID_MATH] = "math",
+    [BRW_SFID_SAMPLER] = "sampler",
+    [BRW_SFID_MESSAGE_GATEWAY] = "gateway",
+    [BRW_SFID_URB] = "urb",
+    [BRW_SFID_THREAD_SPAWNER] = "thread_spawner",
+    [GEN6_SFID_DATAPORT_SAMPLER_CACHE] = "sampler",
+    [GEN6_SFID_DATAPORT_RENDER_CACHE] = "render",
+    [GEN6_SFID_DATAPORT_CONSTANT_CACHE] = "const",
+    [GEN7_SFID_DATAPORT_DATA_CACHE] = "data"
+};
+
+static const char * const dp_rc_msg_type_gen6[16] = {
+    [BRW_DATAPORT_READ_MESSAGE_OWORD_BLOCK_READ] = "OWORD block read",
+    [GEN6_DATAPORT_READ_MESSAGE_RENDER_UNORM_READ] = "RT UNORM read",
+    [GEN6_DATAPORT_READ_MESSAGE_OWORD_DUAL_BLOCK_READ] = "OWORD dual block read",
+    [GEN6_DATAPORT_READ_MESSAGE_MEDIA_BLOCK_READ] = "media block read",
+    [GEN6_DATAPORT_READ_MESSAGE_OWORD_UNALIGN_BLOCK_READ] = "OWORD unaligned block read",
+    [GEN6_DATAPORT_READ_MESSAGE_DWORD_SCATTERED_READ] = "DWORD scattered read",
+    [GEN6_DATAPORT_WRITE_MESSAGE_DWORD_ATOMIC_WRITE] = "DWORD atomic write",
+    [GEN6_DATAPORT_WRITE_MESSAGE_OWORD_BLOCK_WRITE] = "OWORD block write",
+    [GEN6_DATAPORT_WRITE_MESSAGE_OWORD_DUAL_BLOCK_WRITE] = "OWORD dual block write",
+    [GEN6_DATAPORT_WRITE_MESSAGE_MEDIA_BLOCK_WRITE] = "media block write",
+    [GEN6_DATAPORT_WRITE_MESSAGE_DWORD_SCATTERED_WRITE] = "DWORD scattered write",
+    [GEN6_DATAPORT_WRITE_MESSAGE_RENDER_TARGET_WRITE] = "RT write",
+    [GEN6_DATAPORT_WRITE_MESSAGE_STREAMED_VB_WRITE] = "streamed VB write",
+    [GEN6_DATAPORT_WRITE_MESSAGE_RENDER_TARGET_UNORM_WRITE] = "RT UNORMc write",
+};
+
+static const char * const math_function[16] = {
     [BRW_MATH_FUNCTION_INV] = "inv",
     [BRW_MATH_FUNCTION_LOG] = "log",
     [BRW_MATH_FUNCTION_EXP] = "exp",
@@ -297,52 +334,57 @@ char *math_function[16] = {
     [BRW_MATH_FUNCTION_TAN] = "tan",
     [BRW_MATH_FUNCTION_POW] = "pow",
     [BRW_MATH_FUNCTION_INT_DIV_QUOTIENT_AND_REMAINDER] = "intdivmod",
-    [BRW_MATH_FUNCTION_INT_DIV_QUOTIENT] = "intmod",
-    [BRW_MATH_FUNCTION_INT_DIV_REMAINDER] = "intdiv",
+    [BRW_MATH_FUNCTION_INT_DIV_QUOTIENT] = "intdiv",
+    [BRW_MATH_FUNCTION_INT_DIV_REMAINDER] = "intmod",
 };
 
-char *math_saturate[2] = {
+static const char * const math_saturate[2] = {
     [0] = "",
     [1] = "sat"
 };
 
-char *math_signed[2] = {
+static const char * const math_signed[2] = {
     [0] = "",
     [1] = "signed"
 };
 
-char *math_scalar[2] = {
+static const char * const math_scalar[2] = {
     [0] = "",
     [1] = "scalar"
 };
 
-char *math_precision[2] = {
+static const char * const math_precision[2] = {
     [0] = "",
     [1] = "partial_precision"
 };
 
-char *urb_swizzle[4] = {
+static const char * const urb_opcode[2] = {
+    [0] = "urb_write",
+    [1] = "ff_sync",
+};
+
+static const char * const urb_swizzle[4] = {
     [BRW_URB_SWIZZLE_NONE] = "",
     [BRW_URB_SWIZZLE_INTERLEAVE] = "interleave",
     [BRW_URB_SWIZZLE_TRANSPOSE] = "transpose",
 };
 
-char *urb_allocate[2] = {
+static const char * const urb_allocate[2] = {
     [0] = "",
     [1] = "allocate"
 };
 
-char *urb_used[2] = {
+static const char * const urb_used[2] = {
     [0] = "",
     [1] = "used"
 };
 
-char *urb_complete[2] = {
+static const char * const urb_complete[2] = {
     [0] = "",
     [1] = "complete"
 };
 
-char *sampler_target_format[4] = {
+static const char * const sampler_target_format[4] = {
     [0] = "F",
     [2] = "UD",
     [3] = "D"
@@ -351,20 +393,21 @@ char *sampler_target_format[4] = {
 
 static int column;
 
-static int string (FILE *file, char *string)
+static int string (FILE *file, const char *string)
 {
     fputs (string, file);
     column += strlen (string);
     return 0;
 }
 
-static int format (FILE *f, char *format, ...)
+static int format (FILE *f, const char *format, ...)
 {
     char    buf[1024];
     va_list	args;
     va_start (args, format);
 
     vsnprintf (buf, sizeof (buf) - 1, format, args);
+    va_end (args);
     string (f, buf);
     return 0;
 }
@@ -384,7 +427,8 @@ static int pad (FILE *f, int c)
     return 0;
 }
 
-static int control (FILE *file, char *name, char *ctrl[], GLuint id, int *space)
+static int control (FILE *file, const char *name, const char * const ctrl[],
+                    GLuint id, int *space)
 {
     if (!ctrl[id]) {
 	fprintf (file, "*** invalid %s value %d ",
@@ -415,6 +459,11 @@ static int print_opcode (FILE *file, int id)
 static int reg (FILE *file, GLuint _reg_file, GLuint _reg_nr)
 {
     int	err = 0;
+
+    /* Clear the Compr4 instruction compression bit. */
+    if (_reg_file == BRW_MESSAGE_REGISTER_FILE)
+       _reg_nr &= ~(1 << 7);
+
     if (_reg_file == BRW_ARCHITECTURE_REGISTER_FILE) {
 	switch (_reg_nr & 0xf0) {
 	case BRW_ARF_NULL:
@@ -426,6 +475,9 @@ static int reg (FILE *file, GLuint _reg_file, GLuint _reg_nr)
 	case BRW_ARF_ACCUMULATOR:
 	    format (file, "acc%d", _reg_nr & 0x0f);
 	    break;
+	case BRW_ARF_FLAG:
+	    format (file, "f%d", _reg_nr & 0x0f);
+	    break;
 	case BRW_ARF_MASK:
 	    format (file, "mask%d", _reg_nr & 0x0f);
 	    break;
@@ -468,7 +520,8 @@ static int dest (FILE *file, struct brw_instruction *inst)
 	    if (err == -1)
 		return 0;
 	    if (inst->bits1.da1.dest_subreg_nr)
-		format (file, ".%d", inst->bits1.da1.dest_subreg_nr);
+		format (file, ".%d", inst->bits1.da1.dest_subreg_nr /
+				     reg_type_size[inst->bits1.da1.dest_reg_type]);
 	    format (file, "<%d>", inst->bits1.da1.dest_horiz_stride);
 	    err |= control (file, "dest reg encoding", reg_encoding, inst->bits1.da1.dest_reg_type, NULL);
 	}
@@ -476,7 +529,8 @@ static int dest (FILE *file, struct brw_instruction *inst)
 	{
 	    string (file, "g[a0");
 	    if (inst->bits1.ia1.dest_subreg_nr)
-		format (file, ".%d", inst->bits1.ia1.dest_subreg_nr);
+		format (file, ".%d", inst->bits1.ia1.dest_subreg_nr /
+					reg_type_size[inst->bits1.ia1.dest_reg_type]);
 	    if (inst->bits1.ia1.dest_indirect_offset)
 		format (file, " %d", inst->bits1.ia1.dest_indirect_offset);
 	    string (file, "]");
@@ -492,7 +546,8 @@ static int dest (FILE *file, struct brw_instruction *inst)
 	    if (err == -1)
 		return 0;
 	    if (inst->bits1.da16.dest_subreg_nr)
-		format (file, ".%d", inst->bits1.da16.dest_subreg_nr);
+		format (file, ".%d", inst->bits1.da16.dest_subreg_nr /
+				     reg_type_size[inst->bits1.da16.dest_reg_type]);
 	    string (file, "<1>");
 	    err |= control (file, "writemask", writemask, inst->bits1.da16.dest_writemask, NULL);
 	    err |= control (file, "dest reg encoding", reg_encoding, inst->bits1.da16.dest_reg_type, NULL);
@@ -507,6 +562,28 @@ static int dest (FILE *file, struct brw_instruction *inst)
     return 0;
 }
 
+static int dest_3src (FILE *file, struct brw_instruction *inst)
+{
+    int	err = 0;
+    uint32_t reg_file;
+
+    if (inst->bits1.da3src.dest_reg_file)
+       reg_file = BRW_MESSAGE_REGISTER_FILE;
+    else
+       reg_file = BRW_GENERAL_REGISTER_FILE;
+
+    err |= reg (file, reg_file, inst->bits1.da3src.dest_reg_nr);
+    if (err == -1)
+       return 0;
+    if (inst->bits1.da3src.dest_subreg_nr)
+       format (file, ".%d", inst->bits1.da3src.dest_subreg_nr);
+    string (file, "<1>");
+    err |= control (file, "writemask", writemask, inst->bits1.da3src.dest_writemask, NULL);
+    err |= control (file, "dest reg encoding", reg_encoding, BRW_REGISTER_TYPE_F, NULL);
+
+    return 0;
+}
+
 static int src_align1_region (FILE *file,
 			      GLuint _vert_stride, GLuint _width, GLuint _horiz_stride)
 {
@@ -526,14 +603,14 @@ static int src_da1 (FILE *file, GLuint type, GLuint _reg_file,
 		    GLuint reg_num, GLuint sub_reg_num, GLuint __abs, GLuint _negate)
 {
     int err = 0;
-    err |= control (file, "negate", negate, _negate, NULL);
+    err |= control (file, "negate", negate_op, _negate, NULL);
     err |= control (file, "abs", _abs, __abs, NULL);
 
     err |= reg (file, _reg_file, reg_num);
     if (err == -1)
 	return 0;
     if (sub_reg_num)
-	format (file, ".%d", sub_reg_num);
+	format (file, ".%d", sub_reg_num / reg_type_size[type]); /* use formal style like spec */
     src_align1_region (file, _vert_stride, _width, _horiz_stride);
     err |= control (file, "src reg encoding", reg_encoding, type, NULL);
     return err;
@@ -552,7 +629,7 @@ static int src_ia1 (FILE *file,
 		    GLuint _vert_stride)
 {
     int err = 0;
-    err |= control (file, "negate", negate, _negate, NULL);
+    err |= control (file, "negate", negate_op, _negate, NULL);
     err |= control (file, "abs", _abs, __abs, NULL);
 
     string (file, "g[a0");
@@ -580,18 +657,120 @@ static int src_da16 (FILE *file,
 		     GLuint swz_w)
 {
     int err = 0;
-    err |= control (file, "negate", negate, _negate, NULL);
+    err |= control (file, "negate", negate_op, _negate, NULL);
     err |= control (file, "abs", _abs, __abs, NULL);
 
     err |= reg (file, _reg_file, _reg_nr);
     if (err == -1)
 	return 0;
     if (_subreg_nr)
-	format (file, ".%d", _subreg_nr);
+	/* bit4 for subreg number byte addressing. Make this same meaning as
+	   in da1 case, so output looks consistent. */
+	format (file, ".%d", 16 / reg_type_size[_reg_type]);
     string (file, "<");
     err |= control (file, "vert stride", vert_stride, _vert_stride, NULL);
-    string (file, ",1,1>");
+    string (file, ",4,1>");
+    /*
+     * Three kinds of swizzle display:
+     *  identity - nothing printed
+     *  1->all	 - print the single channel
+     *  1->1     - print the mapping
+     */
+    if (swz_x == BRW_CHANNEL_X &&
+	swz_y == BRW_CHANNEL_Y &&
+	swz_z == BRW_CHANNEL_Z &&
+	swz_w == BRW_CHANNEL_W)
+    {
+	;
+    }
+    else if (swz_x == swz_y && swz_x == swz_z && swz_x == swz_w)
+    {
+	string (file, ".");
+	err |= control (file, "channel select", chan_sel, swz_x, NULL);
+    }
+    else
+    {
+	string (file, ".");
+	err |= control (file, "channel select", chan_sel, swz_x, NULL);
+	err |= control (file, "channel select", chan_sel, swz_y, NULL);
+	err |= control (file, "channel select", chan_sel, swz_z, NULL);
+	err |= control (file, "channel select", chan_sel, swz_w, NULL);
+    }
     err |= control (file, "src da16 reg type", reg_encoding, _reg_type, NULL);
+    return err;
+}
+
+static int src0_3src (FILE *file, struct brw_instruction *inst)
+{
+    int err = 0;
+    GLuint swz_x = (inst->bits2.da3src.src0_swizzle >> 0) & 0x3;
+    GLuint swz_y = (inst->bits2.da3src.src0_swizzle >> 2) & 0x3;
+    GLuint swz_z = (inst->bits2.da3src.src0_swizzle >> 4) & 0x3;
+    GLuint swz_w = (inst->bits2.da3src.src0_swizzle >> 6) & 0x3;
+
+    err |= control (file, "negate", negate_op, inst->bits1.da3src.src0_negate, NULL);
+    err |= control (file, "abs", _abs, inst->bits1.da3src.src0_abs, NULL);
+
+    err |= reg (file, BRW_GENERAL_REGISTER_FILE, inst->bits2.da3src.src0_reg_nr);
+    if (err == -1)
+	return 0;
+    if (inst->bits2.da3src.src0_subreg_nr)
+	format (file, ".%d", inst->bits2.da3src.src0_subreg_nr);
+    string (file, "<4,1,1>");
+    err |= control (file, "src da16 reg type", reg_encoding,
+		    BRW_REGISTER_TYPE_F, NULL);
+    /*
+     * Three kinds of swizzle display:
+     *  identity - nothing printed
+     *  1->all	 - print the single channel
+     *  1->1     - print the mapping
+     */
+    if (swz_x == BRW_CHANNEL_X &&
+	swz_y == BRW_CHANNEL_Y &&
+	swz_z == BRW_CHANNEL_Z &&
+	swz_w == BRW_CHANNEL_W)
+    {
+	;
+    }
+    else if (swz_x == swz_y && swz_x == swz_z && swz_x == swz_w)
+    {
+	string (file, ".");
+	err |= control (file, "channel select", chan_sel, swz_x, NULL);
+    }
+    else
+    {
+	string (file, ".");
+	err |= control (file, "channel select", chan_sel, swz_x, NULL);
+	err |= control (file, "channel select", chan_sel, swz_y, NULL);
+	err |= control (file, "channel select", chan_sel, swz_z, NULL);
+	err |= control (file, "channel select", chan_sel, swz_w, NULL);
+    }
+    return err;
+}
+
+static int src1_3src (FILE *file, struct brw_instruction *inst)
+{
+    int err = 0;
+    GLuint swz_x = (inst->bits2.da3src.src1_swizzle >> 0) & 0x3;
+    GLuint swz_y = (inst->bits2.da3src.src1_swizzle >> 2) & 0x3;
+    GLuint swz_z = (inst->bits2.da3src.src1_swizzle >> 4) & 0x3;
+    GLuint swz_w = (inst->bits2.da3src.src1_swizzle >> 6) & 0x3;
+    GLuint src1_subreg_nr = (inst->bits2.da3src.src1_subreg_nr_low |
+			     (inst->bits3.da3src.src1_subreg_nr_high << 2));
+
+    err |= control (file, "negate", negate_op, inst->bits1.da3src.src1_negate,
+		    NULL);
+    err |= control (file, "abs", _abs, inst->bits1.da3src.src1_abs, NULL);
+
+    err |= reg (file, BRW_GENERAL_REGISTER_FILE,
+		inst->bits3.da3src.src1_reg_nr);
+    if (err == -1)
+	return 0;
+    if (src1_subreg_nr)
+	format (file, ".%d", src1_subreg_nr);
+    string (file, "<4,1,1>");
+    err |= control (file, "src da16 reg type", reg_encoding,
+		    BRW_REGISTER_TYPE_F, NULL);
     /*
      * Three kinds of swizzle display:
      *  identity - nothing printed
@@ -622,6 +801,56 @@ static int src_da16 (FILE *file,
 }
 
 
+static int src2_3src (FILE *file, struct brw_instruction *inst)
+{
+    int err = 0;
+    GLuint swz_x = (inst->bits3.da3src.src2_swizzle >> 0) & 0x3;
+    GLuint swz_y = (inst->bits3.da3src.src2_swizzle >> 2) & 0x3;
+    GLuint swz_z = (inst->bits3.da3src.src2_swizzle >> 4) & 0x3;
+    GLuint swz_w = (inst->bits3.da3src.src2_swizzle >> 6) & 0x3;
+
+    err |= control (file, "negate", negate_op, inst->bits1.da3src.src2_negate,
+		    NULL);
+    err |= control (file, "abs", _abs, inst->bits1.da3src.src2_abs, NULL);
+
+    err |= reg (file, BRW_GENERAL_REGISTER_FILE,
+		inst->bits3.da3src.src2_reg_nr);
+    if (err == -1)
+	return 0;
+    if (inst->bits3.da3src.src2_subreg_nr)
+	format (file, ".%d", inst->bits3.da3src.src2_subreg_nr);
+    string (file, "<4,1,1>");
+    err |= control (file, "src da16 reg type", reg_encoding,
+		    BRW_REGISTER_TYPE_F, NULL);
+    /*
+     * Three kinds of swizzle display:
+     *  identity - nothing printed
+     *  1->all	 - print the single channel
+     *  1->1     - print the mapping
+     */
+    if (swz_x == BRW_CHANNEL_X &&
+	swz_y == BRW_CHANNEL_Y &&
+	swz_z == BRW_CHANNEL_Z &&
+	swz_w == BRW_CHANNEL_W)
+    {
+	;
+    }
+    else if (swz_x == swz_y && swz_x == swz_z && swz_x == swz_w)
+    {
+	string (file, ".");
+	err |= control (file, "channel select", chan_sel, swz_x, NULL);
+    }
+    else
+    {
+	string (file, ".");
+	err |= control (file, "channel select", chan_sel, swz_x, NULL);
+	err |= control (file, "channel select", chan_sel, swz_y, NULL);
+	err |= control (file, "channel select", chan_sel, swz_z, NULL);
+	err |= control (file, "channel select", chan_sel, swz_w, NULL);
+    }
+    return err;
+}
+
 static int imm (FILE *file, GLuint type, struct brw_instruction *inst) {
     switch (type) {
     case BRW_REGISTER_TYPE_UD:
@@ -771,7 +1000,45 @@ static int src1 (FILE *file, struct brw_instruction *inst)
     }
 }
 
-int disasm (FILE *file, struct brw_instruction *inst)
+int esize[6] = {
+	[0] = 1,
+	[1] = 2,
+	[2] = 4,
+	[3] = 8,
+	[4] = 16,
+	[5] = 32,
+};
+
+static int qtr_ctrl(FILE *file, struct brw_instruction *inst)
+{
+    int qtr_ctl = inst->header.compression_control;
+    int exec_size = esize[inst->header.execution_size];
+
+    if (exec_size == 8) {
+	switch (qtr_ctl) {
+	case 0:
+	    string (file, " 1Q");
+	    break;
+	case 1:
+	    string (file, " 2Q");
+	    break;
+	case 2:
+	    string (file, " 3Q");
+	    break;
+	case 3:
+	    string (file, " 4Q");
+	    break;
+	}
+    } else if (exec_size == 16){
+	if (qtr_ctl < 2)
+	    string (file, " 1H");
+	else
+	    string (file, " 2H");
+    }
+    return 0;
+}
+
+int brw_disasm (FILE *file, struct brw_instruction *inst, int gen)
 {
     int	err = 0;
     int space = 0;
@@ -779,7 +1046,7 @@ int disasm (FILE *file, struct brw_instruction *inst)
     if (inst->header.predicate_control) {
 	string (file, "(");
 	err |= control (file, "predicate inverse", pred_inv, inst->header.predicate_inverse, NULL);
-	format (file, "f%d", inst->bits2.da1.flag_reg_nr);
+	format (file, "f%d", gen >= 7 ? inst->bits2.da1.flag_reg_nr : 0);
 	if (inst->bits2.da1.flag_subreg_nr)
 	    format (file, ".%d", inst->bits2.da1.flag_subreg_nr);
 	if (inst->header.access_mode == BRW_ALIGN_1)
@@ -795,42 +1062,106 @@ int disasm (FILE *file, struct brw_instruction *inst)
     err |= control (file, "saturate", saturate, inst->header.saturate, NULL);
     err |= control (file, "debug control", debug_ctrl, inst->header.debug_control, NULL);
 
-    if (inst->header.opcode != BRW_OPCODE_SEND &&
-	inst->header.opcode != BRW_OPCODE_SENDC)
+    if (inst->header.opcode == BRW_OPCODE_MATH) {
+	string (file, " ");
+	err |= control (file, "function", math_function,
+			inst->header.destreg__conditionalmod, NULL);
+    } else if (inst->header.opcode != BRW_OPCODE_SEND &&
+	       inst->header.opcode != BRW_OPCODE_SENDC) {
 	err |= control (file, "conditional modifier", conditional_modifier,
 			inst->header.destreg__conditionalmod, NULL);
 
+        /* If we're using the conditional modifier, print which flags reg is
+         * used for it.  Note that on gen6+, the embedded-condition SEL and
+         * control flow doesn't update flags.
+         */
+	if (inst->header.destreg__conditionalmod &&
+            (gen < 6 || (inst->header.opcode != BRW_OPCODE_SEL &&
+                         inst->header.opcode != BRW_OPCODE_IF &&
+                         inst->header.opcode != BRW_OPCODE_WHILE))) {
+	    format (file, ".f%d", gen >= 7 ? inst->bits2.da1.flag_reg_nr : 0);
+	    if (inst->bits2.da1.flag_subreg_nr)
+		format (file, ".%d", inst->bits2.da1.flag_subreg_nr);
+        }
+    }
+
     if (inst->header.opcode != BRW_OPCODE_NOP) {
 	string (file, "(");
 	err |= control (file, "execution size", exec_size, inst->header.execution_size, NULL);
 	string (file, ")");
     }
 
-    if (inst->header.opcode == BRW_OPCODE_SEND ||
-	inst->header.opcode == BRW_OPCODE_SENDC)
+    if (inst->header.opcode == BRW_OPCODE_SEND && gen < 6)
 	format (file, " %d", inst->header.destreg__conditionalmod);
 
-    if (opcode[inst->header.opcode].ndst > 0) {
-	pad (file, 16);
-	err |= dest (file, inst);
-    }
-    if (opcode[inst->header.opcode].nsrc > 0) {
-	pad (file, 32);
-	err |= src0 (file, inst);
-    }
-    if (opcode[inst->header.opcode].nsrc > 1) {
-	pad (file, 48);
-	err |= src1 (file, inst);
+    if (opcode[inst->header.opcode].nsrc == 3) {
+       pad (file, 16);
+       err |= dest_3src (file, inst);
+
+       pad (file, 32);
+       err |= src0_3src (file, inst);
+
+       pad (file, 48);
+       err |= src1_3src (file, inst);
+
+       pad (file, 64);
+       err |= src2_3src (file, inst);
+    } else {
+       if (opcode[inst->header.opcode].ndst > 0) {
+	  pad (file, 16);
+	  err |= dest (file, inst);
+       } else if (gen == 7 && (inst->header.opcode == BRW_OPCODE_ELSE ||
+			       inst->header.opcode == BRW_OPCODE_ENDIF ||
+			       inst->header.opcode == BRW_OPCODE_WHILE)) {
+	  format (file, " %d", inst->bits3.break_cont.jip);
+       } else if (gen == 6 && (inst->header.opcode == BRW_OPCODE_IF ||
+			       inst->header.opcode == BRW_OPCODE_ELSE ||
+			       inst->header.opcode == BRW_OPCODE_ENDIF ||
+			       inst->header.opcode == BRW_OPCODE_WHILE)) {
+	  format (file, " %d", inst->bits1.branch_gen6.jump_count);
+       } else if ((gen >= 6 && (inst->header.opcode == BRW_OPCODE_BREAK ||
+                                inst->header.opcode == BRW_OPCODE_CONTINUE ||
+                                inst->header.opcode == BRW_OPCODE_HALT)) ||
+                  (gen == 7 && inst->header.opcode == BRW_OPCODE_IF)) {
+	  format (file, " %d %d", inst->bits3.break_cont.uip, inst->bits3.break_cont.jip);
+       } else if (inst->header.opcode == BRW_OPCODE_JMPI) {
+	  format (file, " %d", inst->bits3.d);
+       }
+
+       if (opcode[inst->header.opcode].nsrc > 0) {
+	  pad (file, 32);
+	  err |= src0 (file, inst);
+       }
+       if (opcode[inst->header.opcode].nsrc > 1) {
+	  pad (file, 48);
+	  err |= src1 (file, inst);
+       }
     }
 
     if (inst->header.opcode == BRW_OPCODE_SEND ||
 	inst->header.opcode == BRW_OPCODE_SENDC) {
+	enum brw_message_target target;
+
+	if (gen >= 6)
+	    target = inst->header.destreg__conditionalmod;
+	else if (gen == 5)
+	    target = inst->bits2.send_gen5.sfid;
+	else
+	    target = inst->bits3.generic.msg_target;
+
 	newline (file);
 	pad (file, 16);
 	space = 0;
-	err |= control (file, "target function", target_function,
-			inst->header.destreg__conditionalmod, &space);
-	switch (inst->header.destreg__conditionalmod) {
+
+	if (gen >= 6) {
+	   err |= control (file, "target function", target_function_gen6,
+			   target, &space);
+	} else {
+	   err |= control (file, "target function", target_function,
+			   target, &space);
+	}
+
+	switch (target) {
 	case BRW_SFID_MATH:
 	    err |= control (file, "math function", math_function,
 			    inst->bits3.math.function, &space);
@@ -844,24 +1175,98 @@ int disasm (FILE *file, struct brw_instruction *inst)
 			    inst->bits3.math.precision, &space);
 	    break;
 	case BRW_SFID_SAMPLER:
-	    format (file, " (%d, %d, ",
-		    inst->bits3.sampler.binding_table_index,
-		    inst->bits3.sampler.sampler);
-	    err |= control (file, "sampler target format", sampler_target_format,
-			    inst->bits3.sampler.return_format, NULL);
-	    string (file, ")");
+	    if (gen >= 7) {
+		format (file, " (%d, %d, %d, %d)",
+			inst->bits3.sampler_gen7.binding_table_index,
+			inst->bits3.sampler_gen7.sampler,
+			inst->bits3.sampler_gen7.msg_type,
+			inst->bits3.sampler_gen7.simd_mode);
+	    } else if (gen >= 5) {
+		format (file, " (%d, %d, %d, %d)",
+			inst->bits3.sampler_gen5.binding_table_index,
+			inst->bits3.sampler_gen5.sampler,
+			inst->bits3.sampler_gen5.msg_type,
+			inst->bits3.sampler_gen5.simd_mode);
+	    } else if (0 /* FINISHME: is_g4x */) {
+		format (file, " (%d, %d)",
+			inst->bits3.sampler_g4x.binding_table_index,
+			inst->bits3.sampler_g4x.sampler);
+	    } else {
+		format (file, " (%d, %d, ",
+			inst->bits3.sampler.binding_table_index,
+			inst->bits3.sampler.sampler);
+		err |= control (file, "sampler target format",
+				sampler_target_format,
+				inst->bits3.sampler.return_format, NULL);
+		string (file, ")");
+	    }
+	    break;
+	case BRW_SFID_DATAPORT_READ:
+	    if (gen >= 6) {
+		format (file, " (%d, %d, %d, %d)",
+			inst->bits3.gen6_dp.binding_table_index,
+			inst->bits3.gen6_dp.msg_control,
+			inst->bits3.gen6_dp.msg_type,
+			inst->bits3.gen6_dp.send_commit_msg);
+	    } else if (gen >= 5 /* FINISHME: || is_g4x */) {
+		format (file, " (%d, %d, %d)",
+			inst->bits3.dp_read_gen5.binding_table_index,
+			inst->bits3.dp_read_gen5.msg_control,
+			inst->bits3.dp_read_gen5.msg_type);
+	    } else {
+		format (file, " (%d, %d, %d)",
+			inst->bits3.dp_read.binding_table_index,
+			inst->bits3.dp_read.msg_control,
+			inst->bits3.dp_read.msg_type);
+	    }
 	    break;
+
 	case BRW_SFID_DATAPORT_WRITE:
-	    format (file, " (%d, %d, %d, %d)",
-		    inst->bits3.dp_write.binding_table_index,
-		    (inst->bits3.dp_write.last_render_target << 3) |
-		    inst->bits3.dp_write.msg_control,
-		    inst->bits3.dp_write.msg_type,
-		    inst->bits3.dp_write.send_commit_msg);
+	    if (gen >= 7) {
+		format (file, " (");
+
+		err |= control (file, "DP rc message type",
+				dp_rc_msg_type_gen6,
+				inst->bits3.gen7_dp.msg_type, &space);
+
+		format (file, ", %d, %d, %d)",
+			inst->bits3.gen7_dp.binding_table_index,
+			inst->bits3.gen7_dp.msg_control,
+			inst->bits3.gen7_dp.msg_type);
+	    } else if (gen == 6) {
+		format (file, " (");
+
+		err |= control (file, "DP rc message type",
+				dp_rc_msg_type_gen6,
+				inst->bits3.gen6_dp.msg_type, &space);
+
+		format (file, ", %d, %d, %d, %d)",
+			inst->bits3.gen6_dp.binding_table_index,
+			inst->bits3.gen6_dp.msg_control,
+			inst->bits3.gen6_dp.msg_type,
+			inst->bits3.gen6_dp.send_commit_msg);
+	    } else {
+		format (file, " (%d, %d, %d, %d)",
+			inst->bits3.dp_write.binding_table_index,
+			(inst->bits3.dp_write.last_render_target << 3) |
+			inst->bits3.dp_write.msg_control,
+			inst->bits3.dp_write.msg_type,
+			inst->bits3.dp_write.send_commit_msg);
+	    }
 	    break;
+
 	case BRW_SFID_URB:
-	    format (file, " %d", inst->bits3.urb.offset);
+	    if (gen >= 5) {
+		format (file, " %d", inst->bits3.urb_gen5.offset);
+	    } else {
+		format (file, " %d", inst->bits3.urb.offset);
+	    }
+
 	    space = 1;
+	    if (gen >= 5) {
+		err |= control (file, "urb opcode", urb_opcode,
+				inst->bits3.urb_gen5.opcode, &space);
+	    }
 	    err |= control (file, "urb swizzle", urb_swizzle,
 			    inst->bits3.urb.swizzle_control, &space);
 	    err |= control (file, "urb allocate", urb_allocate,
@@ -873,27 +1278,62 @@ int disasm (FILE *file, struct brw_instruction *inst)
 	    break;
 	case BRW_SFID_THREAD_SPAWNER:
 	    break;
+	case GEN7_SFID_DATAPORT_DATA_CACHE:
+	    format (file, " (%d, %d, %d)",
+		    inst->bits3.gen7_dp.binding_table_index,
+		    inst->bits3.gen7_dp.msg_control,
+		    inst->bits3.gen7_dp.msg_type);
+	    break;
+
+
 	default:
-	    format (file, "unsupported target %d", inst->bits3.generic.msg_target);
+	    format (file, "unsupported target %d", target);
 	    break;
 	}
 	if (space)
 	    string (file, " ");
-	format (file, "mlen %d",
-		inst->bits3.generic.msg_length);
-	format (file, " rlen %d",
-		inst->bits3.generic.response_length);
+	if (gen >= 5) {
+	   format (file, "mlen %d",
+		   inst->bits3.generic_gen5.msg_length);
+	   format (file, " rlen %d",
+		   inst->bits3.generic_gen5.response_length);
+	} else {
+	   format (file, "mlen %d",
+		   inst->bits3.generic.msg_length);
+	   format (file, " rlen %d",
+		   inst->bits3.generic.response_length);
+	}
     }
     pad (file, 64);
     if (inst->header.opcode != BRW_OPCODE_NOP) {
 	string (file, "{");
 	space = 1;
 	err |= control(file, "access mode", access_mode, inst->header.access_mode, &space);
-	err |= control (file, "mask control", mask_ctrl, inst->header.mask_control, &space);
+	if (gen >= 6)
+	    err |= control (file, "write enable control", wectrl, inst->header.mask_control, &space);
+	else
+	    err |= control (file, "mask control", mask_ctrl, inst->header.mask_control, &space);
 	err |= control (file, "dependency control", dep_ctrl, inst->header.dependency_control, &space);
-	err |= control (file, "compression control", compr_ctrl, inst->header.compression_control, &space);
+
+	if (gen >= 6)
+	    err |= qtr_ctrl (file, inst);
+	else {
+	    if (inst->header.compression_control == BRW_COMPRESSION_COMPRESSED &&
+		opcode[inst->header.opcode].ndst > 0 &&
+		inst->bits1.da1.dest_reg_file == BRW_MESSAGE_REGISTER_FILE &&
+		inst->bits1.da1.dest_reg_nr & (1 << 7)) {
+		format (file, " compr4");
+	    } else {
+		err |= control (file, "compression control", compr_ctrl,
+				inst->header.compression_control, &space);
+	    }
+	}
+
 	err |= control (file, "thread control", thread_ctrl, inst->header.thread_control, &space);
-	if (inst->header.opcode == BRW_OPCODE_SEND)
+	if (gen >= 6)
+	    err |= control (file, "acc write control", accwr, inst->header.acc_wr_control, &space);
+	if (inst->header.opcode == BRW_OPCODE_SEND ||
+	    inst->header.opcode == BRW_OPCODE_SENDC)
 	    err |= control (file, "end of thread", end_of_thread,
 			    inst->bits3.generic.end_of_thread, &space);
 	if (space)
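
The pre-gen6 branch above prints "compr4" when bit 7 of a message register destination number is set, and brw_reg() in brw_reg.h masks the same bit before checking the number against BRW_MAX_MRF. A minimal sketch of that convention follows. The sketch_* names are hypothetical and are not part of this patch; the only things taken from the diff are the bit position (1 << 7) and the limit of 16 message registers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not identifiers from the patch. */
#define SKETCH_MRF_COMPR4_BIT (1u << 7)
#define SKETCH_MAX_MRF        16u

/* True when bit 7 of the destination register number is set, which is
 * the condition the disassembler tests before printing " compr4". */
static bool sketch_mrf_is_compr4(unsigned reg_nr)
{
    return (reg_nr & SKETCH_MRF_COMPR4_BIT) != 0;
}

/* Message register index with the flag bit masked off, mirroring the
 * (nr & ~(1 << 7)) < BRW_MAX_MRF assertion in brw_reg(). */
static unsigned sketch_mrf_index(unsigned reg_nr)
{
    return reg_nr & ~SKETCH_MRF_COMPR4_BIT;
}
```

Keeping the flag in the register number means the range check has to mask it first, which is why both the disassembler and the constructor handle bit 7 explicitly.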
diff --git a/assembler/brw_eu.h b/assembler/brw_eu.h
new file mode 100644
index 0000000..b7009ff
--- /dev/null
+++ b/assembler/brw_eu.h
@@ -0,0 +1,405 @@
+/*
+ Copyright (C) Intel Corp.  2006.  All Rights Reserved.
+ Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
+ develop this 3D driver.
+ 
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+ 
+ The above copyright notice and this permission notice (including the
+ next paragraph) shall be included in all copies or substantial
+ portions of the Software.
+ 
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ 
+ **********************************************************************/
+ /*
+  * Authors:
+  *   Keith Whitwell <keith@tungstengraphics.com>
+  */
+   
+
+#ifndef BRW_EU_H
+#define BRW_EU_H
+
+#include <stdbool.h>
+#include "brw_structs.h"
+#include "brw_defines.h"
+#include "brw_reg.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define BRW_EU_MAX_INSN_STACK 5
+
+struct brw_compile {
+   struct brw_instruction *store;
+   int store_size;
+   GLuint nr_insn;
+   unsigned int next_insn_offset;
+
+   void *mem_ctx;
+
+   /* Allow clients to push/pop instruction state:
+    */
+   struct brw_instruction stack[BRW_EU_MAX_INSN_STACK];
+   bool compressed_stack[BRW_EU_MAX_INSN_STACK];
+   struct brw_instruction *current;
+
+   GLuint flag_value;
+   bool single_program_flow;
+   bool compressed;
+   struct brw_context *brw;
+
+   /* Control flow stacks:
+    * - if_stack contains IF and ELSE instructions which must be patched
+    *   (and popped) once the matching ENDIF instruction is encountered.
+    *
+    *   Just store the instruction pointer (an index).
+    */
+   int *if_stack;
+   int if_stack_depth;
+   int if_stack_array_size;
+
+   /**
+    * loop_stack contains the instruction pointers of the starts of loops which
+    * must be patched (and popped) once the matching WHILE instruction is
+    * encountered.
+    */
+   int *loop_stack;
+   /**
+    * pre-gen6, the BREAK and CONT instructions had to tell how many IF/ENDIF
+    * blocks they were popping out of, to fix up the mask stack.  This tracks
+    * the IF/ENDIF nesting in each current nested loop level.
+    */
+   int *if_depth_in_loop;
+   int loop_stack_depth;
+   int loop_stack_array_size;
+};
+
+static inline struct brw_instruction *current_insn( struct brw_compile *p)
+{
+   return &p->store[p->nr_insn];
+}
+
+void brw_pop_insn_state( struct brw_compile *p );
+void brw_push_insn_state( struct brw_compile *p );
+void brw_set_mask_control( struct brw_compile *p, GLuint value );
+void brw_set_saturate( struct brw_compile *p, bool enable );
+void brw_set_access_mode( struct brw_compile *p, GLuint access_mode );
+void brw_set_compression_control(struct brw_compile *p, enum brw_compression c);
+void brw_set_predicate_control_flag_value( struct brw_compile *p, GLuint value );
+void brw_set_predicate_control( struct brw_compile *p, GLuint pc );
+void brw_set_predicate_inverse(struct brw_compile *p, bool predicate_inverse);
+void brw_set_conditionalmod( struct brw_compile *p, GLuint conditional );
+void brw_set_flag_reg(struct brw_compile *p, int reg, int subreg);
+void brw_set_acc_write_control(struct brw_compile *p, GLuint value);
+
+void brw_init_compile(struct brw_context *, struct brw_compile *p,
+		      void *mem_ctx);
+void brw_dump_compile(struct brw_compile *p, FILE *out, int start, int end);
+const GLuint *brw_get_program( struct brw_compile *p, GLuint *sz );
+
+struct brw_instruction *brw_next_insn(struct brw_compile *p, GLuint opcode);
+void brw_set_dest(struct brw_compile *p, struct brw_instruction *insn,
+		  struct brw_reg dest);
+void brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
+		  struct brw_reg reg);
+
+void gen6_resolve_implied_move(struct brw_compile *p,
+			       struct brw_reg *src,
+			       GLuint msg_reg_nr);
+
+/* Helpers for regular instructions:
+ */
+#define ALU1(OP)					\
+struct brw_instruction *brw_##OP(struct brw_compile *p,	\
+	      struct brw_reg dest,			\
+	      struct brw_reg src0);
+
+#define ALU2(OP)					\
+struct brw_instruction *brw_##OP(struct brw_compile *p,	\
+	      struct brw_reg dest,			\
+	      struct brw_reg src0,			\
+	      struct brw_reg src1);
+
+#define ALU3(OP)					\
+struct brw_instruction *brw_##OP(struct brw_compile *p,	\
+	      struct brw_reg dest,			\
+	      struct brw_reg src0,			\
+	      struct brw_reg src1,			\
+	      struct brw_reg src2);
+
+#define ROUND(OP) \
+void brw_##OP(struct brw_compile *p, struct brw_reg dest, struct brw_reg src0);
+
+ALU1(MOV)
+ALU2(SEL)
+ALU1(NOT)
+ALU2(AND)
+ALU2(OR)
+ALU2(XOR)
+ALU2(SHR)
+ALU2(SHL)
+ALU2(RSR)
+ALU2(RSL)
+ALU2(ASR)
+ALU2(JMPI)
+ALU2(ADD)
+ALU2(AVG)
+ALU2(MUL)
+ALU1(FRC)
+ALU1(RNDD)
+ALU2(MAC)
+ALU2(MACH)
+ALU1(LZD)
+ALU2(DP4)
+ALU2(DPH)
+ALU2(DP3)
+ALU2(DP2)
+ALU2(LINE)
+ALU2(PLN)
+ALU3(MAD)
+
+ROUND(RNDZ)
+ROUND(RNDE)
+
+#undef ALU1
+#undef ALU2
+#undef ALU3
+#undef ROUND
+
+
+/* Helpers for SEND instruction:
+ */
+void brw_set_sampler_message(struct brw_compile *p,
+                             struct brw_instruction *insn,
+                             GLuint binding_table_index,
+                             GLuint sampler,
+                             GLuint msg_type,
+                             GLuint response_length,
+                             GLuint msg_length,
+                             GLuint header_present,
+                             GLuint simd_mode,
+                             GLuint return_format);
+
+void brw_set_dp_read_message(struct brw_compile *p,
+			     struct brw_instruction *insn,
+			     GLuint binding_table_index,
+			     GLuint msg_control,
+			     GLuint msg_type,
+			     GLuint target_cache,
+			     GLuint msg_length,
+                             bool header_present,
+			     GLuint response_length);
+
+void brw_set_dp_write_message(struct brw_compile *p,
+			      struct brw_instruction *insn,
+			      GLuint binding_table_index,
+			      GLuint msg_control,
+			      GLuint msg_type,
+			      GLuint msg_length,
+			      bool header_present,
+			      GLuint last_render_target,
+			      GLuint response_length,
+			      GLuint end_of_thread,
+			      GLuint send_commit_msg);
+
+void brw_urb_WRITE(struct brw_compile *p,
+		   struct brw_reg dest,
+		   GLuint msg_reg_nr,
+		   struct brw_reg src0,
+		   bool allocate,
+		   bool used,
+		   GLuint msg_length,
+		   GLuint response_length,
+		   bool eot,
+		   bool writes_complete,
+		   GLuint offset,
+		   GLuint swizzle);
+
+void brw_ff_sync(struct brw_compile *p,
+		   struct brw_reg dest,
+		   GLuint msg_reg_nr,
+		   struct brw_reg src0,
+		   bool allocate,
+		   GLuint response_length,
+		   bool eot);
+
+void brw_svb_write(struct brw_compile *p,
+                   struct brw_reg dest,
+                   GLuint msg_reg_nr,
+                   struct brw_reg src0,
+                   GLuint binding_table_index,
+                   bool   send_commit_msg);
+
+void brw_fb_WRITE(struct brw_compile *p,
+		  int dispatch_width,
+		   GLuint msg_reg_nr,
+		   struct brw_reg src0,
+		   GLuint msg_control,
+		   GLuint binding_table_index,
+		   GLuint msg_length,
+		   GLuint response_length,
+		   bool eot,
+		   bool header_present);
+
+void brw_SAMPLE(struct brw_compile *p,
+		struct brw_reg dest,
+		GLuint msg_reg_nr,
+		struct brw_reg src0,
+		GLuint binding_table_index,
+		GLuint sampler,
+		GLuint writemask,
+		GLuint msg_type,
+		GLuint response_length,
+		GLuint msg_length,
+		GLuint header_present,
+		GLuint simd_mode,
+		GLuint return_format);
+
+void brw_math( struct brw_compile *p,
+	       struct brw_reg dest,
+	       GLuint function,
+	       GLuint msg_reg_nr,
+	       struct brw_reg src,
+	       GLuint data_type,
+	       GLuint precision );
+
+void brw_math2(struct brw_compile *p,
+	       struct brw_reg dest,
+	       GLuint function,
+	       struct brw_reg src0,
+	       struct brw_reg src1);
+
+void brw_oword_block_read(struct brw_compile *p,
+			  struct brw_reg dest,
+			  struct brw_reg mrf,
+			  uint32_t offset,
+			  uint32_t bind_table_index);
+
+void brw_oword_block_read_scratch(struct brw_compile *p,
+				  struct brw_reg dest,
+				  struct brw_reg mrf,
+				  int num_regs,
+				  GLuint offset);
+
+void brw_oword_block_write_scratch(struct brw_compile *p,
+				   struct brw_reg mrf,
+				   int num_regs,
+				   GLuint offset);
+
+void brw_shader_time_add(struct brw_compile *p,
+                         int mrf,
+                         uint32_t surf_index);
+
+/* If/else/endif.  Works by manipulating the execution flags on each
+ * channel.
+ */
+struct brw_instruction *brw_IF(struct brw_compile *p, 
+			       GLuint execute_size);
+struct brw_instruction *gen6_IF(struct brw_compile *p, uint32_t conditional,
+				struct brw_reg src0, struct brw_reg src1);
+
+void brw_ELSE(struct brw_compile *p);
+void brw_ENDIF(struct brw_compile *p);
+
+/* DO/WHILE loops:
+ */
+struct brw_instruction *brw_DO(struct brw_compile *p,
+			       GLuint execute_size);
+
+struct brw_instruction *brw_WHILE(struct brw_compile *p);
+
+struct brw_instruction *brw_BREAK(struct brw_compile *p);
+struct brw_instruction *brw_CONT(struct brw_compile *p);
+struct brw_instruction *gen6_CONT(struct brw_compile *p);
+struct brw_instruction *gen6_HALT(struct brw_compile *p);
+/* Forward jumps:
+ */
+void brw_land_fwd_jump(struct brw_compile *p, int jmp_insn_idx);
+
+
+
+void brw_NOP(struct brw_compile *p);
+
+void brw_WAIT(struct brw_compile *p);
+
+/* Special case: there is never a destination, execution size will be
+ * taken from src0:
+ */
+void brw_CMP(struct brw_compile *p,
+	     struct brw_reg dest,
+	     GLuint conditional,
+	     struct brw_reg src0,
+	     struct brw_reg src1);
+
+/*********************************************************************** 
+ * brw_eu_util.c:
+ */
+
+void brw_copy_indirect_to_indirect(struct brw_compile *p,
+				   struct brw_indirect dst_ptr,
+				   struct brw_indirect src_ptr,
+				   GLuint count);
+
+void brw_copy_from_indirect(struct brw_compile *p,
+			    struct brw_reg dst,
+			    struct brw_indirect ptr,
+			    GLuint count);
+
+void brw_copy4(struct brw_compile *p,
+	       struct brw_reg dst,
+	       struct brw_reg src,
+	       GLuint count);
+
+void brw_copy8(struct brw_compile *p,
+	       struct brw_reg dst,
+	       struct brw_reg src,
+	       GLuint count);
+
+void brw_math_invert( struct brw_compile *p, 
+		      struct brw_reg dst,
+		      struct brw_reg src);
+
+void brw_set_src1(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg reg);
+
+void brw_set_uip_jip(struct brw_compile *p);
+
+uint32_t brw_swap_cmod(uint32_t cmod);
+
+/* brw_optimize.c */
+void brw_optimize(struct brw_compile *p);
+void brw_remove_duplicate_mrf_moves(struct brw_compile *p);
+void brw_remove_grf_to_mrf_moves(struct brw_compile *p);
+
+/* brw_disasm.c */
+struct opcode_desc {
+    char    *name;
+    int	    nsrc;
+    int	    ndst;
+};
+
+extern const struct opcode_desc opcode_descs[128];
+
+int brw_disasm (FILE *file, struct brw_instruction *inst, int gen);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
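
The if_stack comment in struct brw_compile describes a stack of instruction indexes that is pushed on IF/ELSE and popped at the matching ENDIF. The sketch below shows one way such a stack could behave, assuming a doubling growth policy. The sketch_* names and the initial size of 16 are assumptions for illustration and are not the code in brw_eu_emit.c. Storing indexes rather than pointers is presumably what keeps the entries valid if the instruction store is reallocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the if_stack fields of struct brw_compile. */
struct sketch_if_stack {
    int *items;       /* instruction indexes, not pointers */
    int depth;        /* number of entries currently pushed */
    int array_size;   /* allocated capacity, in entries */
};

/* Push the index of an IF or ELSE instruction; grows the array on demand.
 * Returns 0 on success, -1 if the allocation fails. */
static int sketch_push_if(struct sketch_if_stack *s, int insn_index)
{
    if (s->depth == s->array_size) {
        int new_size = s->array_size ? s->array_size * 2 : 16;
        int *grown = realloc(s->items, new_size * sizeof(*grown));
        if (!grown)
            return -1;
        s->items = grown;
        s->array_size = new_size;
    }
    s->items[s->depth++] = insn_index;
    return 0;
}

/* Pop the most recently pushed index, as ENDIF handling would. */
static int sketch_pop_if(struct sketch_if_stack *s)
{
    assert(s->depth > 0);
    return s->items[--s->depth];
}
```

A caller would zero-initialize the struct, push on each IF or ELSE, pop when emitting ENDIF to find the instructions to patch, and free items when done.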
diff --git a/assembler/brw_reg.h b/assembler/brw_reg.h
new file mode 100644
index 0000000..f225915
--- /dev/null
+++ b/assembler/brw_reg.h
@@ -0,0 +1,808 @@
+/*
+ Copyright (C) Intel Corp.  2006.  All Rights Reserved.
+ Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
+ develop this 3D driver.
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice (including the
+ next paragraph) shall be included in all copies or substantial
+ portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+ **********************************************************************/
+ /*
+  * Authors:
+  *   Keith Whitwell <keith@tungstengraphics.com>
+  */
+
+/** @file brw_reg.h
+ *
+ * This file defines struct brw_reg, which is our representation for EU
+ * registers.  They're not a hardware-specific format, just an abstraction
+ * that intends to capture the full flexibility of the hardware registers.
+ *
+ * The brw_eu_emit.c layer's brw_set_dest/brw_set_src[01] functions encode
+ * the abstract brw_reg type into the actual hardware instruction encoding.
+ */
+
+#ifndef BRW_REG_H
+#define BRW_REG_H
+
+#include <stdbool.h>
+#include <assert.h>
+#include "brw_defines.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/** Number of general purpose registers (VS, WM, etc) */
+#define BRW_MAX_GRF 128
+
+/**
+ * First GRF used for the MRF hack.
+ *
+ * On gen7, MRFs are no longer used, and contiguous GRFs are used instead.  We
+ * haven't converted our compiler to be aware of this, so it asks for MRFs and
+ * brw_eu_emit.c quietly converts them to be accesses of the top GRFs.  The
+ * register allocators have to be careful of this to avoid corrupting the "MRF"s
+ * with actual GRF allocations.
+ */
+#define GEN7_MRF_HACK_START 112
+
+/** Number of message register file registers */
+#define BRW_MAX_MRF 16
+
+#define BRW_SWIZZLE4(a,b,c,d) (((a)<<0) | ((b)<<2) | ((c)<<4) | ((d)<<6))
+#define BRW_GET_SWZ(swz, idx) (((swz) >> ((idx)*2)) & 0x3)
+
+#define BRW_SWIZZLE_NOOP      BRW_SWIZZLE4(0,1,2,3)
+#define BRW_SWIZZLE_XYZW      BRW_SWIZZLE4(0,1,2,3)
+#define BRW_SWIZZLE_XXXX      BRW_SWIZZLE4(0,0,0,0)
+#define BRW_SWIZZLE_YYYY      BRW_SWIZZLE4(1,1,1,1)
+#define BRW_SWIZZLE_ZZZZ      BRW_SWIZZLE4(2,2,2,2)
+#define BRW_SWIZZLE_WWWW      BRW_SWIZZLE4(3,3,3,3)
+#define BRW_SWIZZLE_XYXY      BRW_SWIZZLE4(0,1,0,1)
+
+static inline bool
+brw_is_single_value_swizzle(int swiz)
+{
+   return (swiz == BRW_SWIZZLE_XXXX ||
+           swiz == BRW_SWIZZLE_YYYY ||
+           swiz == BRW_SWIZZLE_ZZZZ ||
+           swiz == BRW_SWIZZLE_WWWW);
+}
+
+#define BRW_WRITEMASK_X 0x1
+#define BRW_WRITEMASK_Y 0x2
+#define BRW_WRITEMASK_Z 0x4
+#define BRW_WRITEMASK_W 0x8
+
+#define BRW_WRITEMASK_XY (BRW_WRITEMASK_X | BRW_WRITEMASK_Y)
+#define BRW_WRITEMASK_XZ (BRW_WRITEMASK_X | BRW_WRITEMASK_Z)
+#define BRW_WRITEMASK_XW (BRW_WRITEMASK_X | BRW_WRITEMASK_W)
+#define BRW_WRITEMASK_YW (BRW_WRITEMASK_Y | BRW_WRITEMASK_W)
+#define BRW_WRITEMASK_ZW (BRW_WRITEMASK_Z | BRW_WRITEMASK_W)
+#define BRW_WRITEMASK_XYZ (BRW_WRITEMASK_X | BRW_WRITEMASK_Y | BRW_WRITEMASK_Z)
+#define BRW_WRITEMASK_XYZW (BRW_WRITEMASK_X | BRW_WRITEMASK_Y | \
+                            BRW_WRITEMASK_Z | BRW_WRITEMASK_W)
+
+#define REG_SIZE (8*4)
+
+/* These aren't hardware structs, just something useful for us to pass around:
+ *
+ * Align1 operation has a lot of control over input ranges.  Used in
+ * WM programs to implement shaders decomposed into "channel serial"
+ * or "structure of array" form:
+ */
+struct brw_reg {
+   unsigned type:4;
+   unsigned file:2;
+   unsigned nr:8;
+   unsigned subnr:5;              /* :1 in align16 */
+   unsigned negate:1;             /* source only */
+   unsigned abs:1;                /* source only */
+   unsigned vstride:4;            /* source only */
+   unsigned width:3;              /* src only, align1 only */
+   unsigned hstride:2;            /* align1 only */
+   unsigned address_mode:1;       /* relative addressing, hopefully! */
+   unsigned pad0:1;
+
+   union {
+      struct {
+         unsigned swizzle:8;      /* src only, align16 only */
+         unsigned writemask:4;    /* dest only, align16 only */
+         int  indirect_offset:10; /* relative addressing offset */
+         unsigned pad1:10;        /* two dwords total */
+      } bits;
+
+      float f;
+      int   d;
+      unsigned ud;
+   } dw1;
+};
+
+
+struct brw_indirect {
+   unsigned addr_subnr:4;
+   int addr_offset:10;
+   unsigned pad:18;
+};
+
+
+static inline int
+type_sz(unsigned type)
+{
+   switch(type) {
+   case BRW_REGISTER_TYPE_UD:
+   case BRW_REGISTER_TYPE_D:
+   case BRW_REGISTER_TYPE_F:
+      return 4;
+   case BRW_REGISTER_TYPE_HF:
+   case BRW_REGISTER_TYPE_UW:
+   case BRW_REGISTER_TYPE_W:
+      return 2;
+   case BRW_REGISTER_TYPE_UB:
+   case BRW_REGISTER_TYPE_B:
+      return 1;
+   default:
+      return 0;
+   }
+}
+
+/**
+ * Construct a brw_reg.
+ * \param file      one of the BRW_x_REGISTER_FILE values
+ * \param nr        register number/index
+ * \param subnr     register sub number
+ * \param type      one of BRW_REGISTER_TYPE_x
+ * \param vstride   one of BRW_VERTICAL_STRIDE_x
+ * \param width     one of BRW_WIDTH_x
+ * \param hstride   one of BRW_HORIZONTAL_STRIDE_x
+ * \param swizzle   one of BRW_SWIZZLE_x
+ * \param writemask BRW_WRITEMASK_X/Y/Z/W bitfield
+ */
+static inline struct brw_reg
+brw_reg(unsigned file,
+        unsigned nr,
+        unsigned subnr,
+        unsigned type,
+        unsigned vstride,
+        unsigned width,
+        unsigned hstride,
+        unsigned swizzle,
+        unsigned writemask)
+{
+   struct brw_reg reg;
+   if (file == BRW_GENERAL_REGISTER_FILE)
+      assert(nr < BRW_MAX_GRF);
+   else if (file == BRW_MESSAGE_REGISTER_FILE)
+      assert((nr & ~(1 << 7)) < BRW_MAX_MRF);
+   else if (file == BRW_ARCHITECTURE_REGISTER_FILE)
+      assert(nr <= BRW_ARF_TIMESTAMP);
+
+   reg.type = type;
+   reg.file = file;
+   reg.nr = nr;
+   reg.subnr = subnr * type_sz(type);
+   reg.negate = 0;
+   reg.abs = 0;
+   reg.vstride = vstride;
+   reg.width = width;
+   reg.hstride = hstride;
+   reg.address_mode = BRW_ADDRESS_DIRECT;
+   reg.pad0 = 0;
+
+   /* Could do better: If the reg is r5.3<0;1,0>, we probably want to
+    * set swizzle and writemask to W, as the lower bits of subnr will
+    * be lost when converted to align16.  This is probably too much to
+    * keep track of as you'd want it adjusted by suboffset(), etc.
+    * Perhaps fix up when converting to align16?
+    */
+   reg.dw1.bits.swizzle = swizzle;
+   reg.dw1.bits.writemask = writemask;
+   reg.dw1.bits.indirect_offset = 0;
+   reg.dw1.bits.pad1 = 0;
+   return reg;
+}
+
+/** Construct float[16] register */
+static inline struct brw_reg
+brw_vec16_reg(unsigned file, unsigned nr, unsigned subnr)
+{
+   return brw_reg(file,
+                  nr,
+                  subnr,
+                  BRW_REGISTER_TYPE_F,
+                  BRW_VERTICAL_STRIDE_16,
+                  BRW_WIDTH_16,
+                  BRW_HORIZONTAL_STRIDE_1,
+                  BRW_SWIZZLE_XYZW,
+                  BRW_WRITEMASK_XYZW);
+}
+
+/** Construct float[8] register */
+static inline struct brw_reg
+brw_vec8_reg(unsigned file, unsigned nr, unsigned subnr)
+{
+   return brw_reg(file,
+                  nr,
+                  subnr,
+                  BRW_REGISTER_TYPE_F,
+                  BRW_VERTICAL_STRIDE_8,
+                  BRW_WIDTH_8,
+                  BRW_HORIZONTAL_STRIDE_1,
+                  BRW_SWIZZLE_XYZW,
+                  BRW_WRITEMASK_XYZW);
+}
+
+/** Construct float[4] register */
+static inline struct brw_reg
+brw_vec4_reg(unsigned file, unsigned nr, unsigned subnr)
+{
+   return brw_reg(file,
+                  nr,
+                  subnr,
+                  BRW_REGISTER_TYPE_F,
+                  BRW_VERTICAL_STRIDE_4,
+                  BRW_WIDTH_4,
+                  BRW_HORIZONTAL_STRIDE_1,
+                  BRW_SWIZZLE_XYZW,
+                  BRW_WRITEMASK_XYZW);
+}
+
+/** Construct float[2] register */
+static inline struct brw_reg
+brw_vec2_reg(unsigned file, unsigned nr, unsigned subnr)
+{
+   return brw_reg(file,
+                  nr,
+                  subnr,
+                  BRW_REGISTER_TYPE_F,
+                  BRW_VERTICAL_STRIDE_2,
+                  BRW_WIDTH_2,
+                  BRW_HORIZONTAL_STRIDE_1,
+                  BRW_SWIZZLE_XYXY,
+                  BRW_WRITEMASK_XY);
+}
+
+/** Construct float[1] register */
+static inline struct brw_reg
+brw_vec1_reg(unsigned file, unsigned nr, unsigned subnr)
+{
+   return brw_reg(file,
+                  nr,
+                  subnr,
+                  BRW_REGISTER_TYPE_F,
+                  BRW_VERTICAL_STRIDE_0,
+                  BRW_WIDTH_1,
+                  BRW_HORIZONTAL_STRIDE_0,
+                  BRW_SWIZZLE_XXXX,
+                  BRW_WRITEMASK_X);
+}
+
+
+static inline struct brw_reg
+retype(struct brw_reg reg, unsigned type)
+{
+   reg.type = type;
+   return reg;
+}
+
+static inline struct brw_reg
+sechalf(struct brw_reg reg)
+{
+   if (reg.vstride)
+      reg.nr++;
+   return reg;
+}
+
+static inline struct brw_reg
+suboffset(struct brw_reg reg, unsigned delta)
+{
+   reg.subnr += delta * type_sz(reg.type);
+   return reg;
+}
+
+
+static inline struct brw_reg
+offset(struct brw_reg reg, unsigned delta)
+{
+   reg.nr += delta;
+   return reg;
+}
+
+
+static inline struct brw_reg
+byte_offset(struct brw_reg reg, unsigned bytes)
+{
+   unsigned newoffset = reg.nr * REG_SIZE + reg.subnr + bytes;
+   reg.nr = newoffset / REG_SIZE;
+   reg.subnr = newoffset % REG_SIZE;
+   return reg;
+}
+
+
+/** Construct unsigned word[16] register */
+static inline struct brw_reg
+brw_uw16_reg(unsigned file, unsigned nr, unsigned subnr)
+{
+   return suboffset(retype(brw_vec16_reg(file, nr, 0), BRW_REGISTER_TYPE_UW), subnr);
+}
+
+/** Construct unsigned word[8] register */
+static inline struct brw_reg
+brw_uw8_reg(unsigned file, unsigned nr, unsigned subnr)
+{
+   return suboffset(retype(brw_vec8_reg(file, nr, 0), BRW_REGISTER_TYPE_UW), subnr);
+}
+
+/** Construct unsigned word[1] register */
+static inline struct brw_reg
+brw_uw1_reg(unsigned file, unsigned nr, unsigned subnr)
+{
+   return suboffset(retype(brw_vec1_reg(file, nr, 0), BRW_REGISTER_TYPE_UW), subnr);
+}
+
+static inline struct brw_reg
+brw_imm_reg(unsigned type)
+{
+   return brw_reg(BRW_IMMEDIATE_VALUE,
+                  0,
+                  0,
+                  type,
+                  BRW_VERTICAL_STRIDE_0,
+                  BRW_WIDTH_1,
+                  BRW_HORIZONTAL_STRIDE_0,
+                  0,
+                  0);
+}
+
+/** Construct float immediate register */
+static inline struct brw_reg
+brw_imm_f(float f)
+{
+   struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_F);
+   imm.dw1.f = f;
+   return imm;
+}
+
+/** Construct integer immediate register */
+static inline struct brw_reg
+brw_imm_d(int d)
+{
+   struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_D);
+   imm.dw1.d = d;
+   return imm;
+}
+
+/** Construct uint immediate register */
+static inline struct brw_reg
+brw_imm_ud(unsigned ud)
+{
+   struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_UD);
+   imm.dw1.ud = ud;
+   return imm;
+}
+
+/** Construct ushort immediate register */
+static inline struct brw_reg
+brw_imm_uw(uint16_t uw)
+{
+   struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_UW);
+   imm.dw1.ud = uw | (uw << 16);
+   return imm;
+}
+
+/** Construct short immediate register */
+static inline struct brw_reg
+brw_imm_w(int16_t w)
+{
+   struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_W);
+   imm.dw1.d = w | (w << 16);
+   return imm;
+}
+
+/* brw_imm_b and brw_imm_ub aren't supported by hardware - the type
+ * numbers alias with _V and _VF below:
+ */
+
+/** Construct vector of eight signed half-byte values */
+static inline struct brw_reg
+brw_imm_v(unsigned v)
+{
+   struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_V);
+   imm.vstride = BRW_VERTICAL_STRIDE_0;
+   imm.width = BRW_WIDTH_8;
+   imm.hstride = BRW_HORIZONTAL_STRIDE_1;
+   imm.dw1.ud = v;
+   return imm;
+}
+
+/** Construct vector of four 8-bit float values */
+static inline struct brw_reg
+brw_imm_vf(unsigned v)
+{
+   struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_VF);
+   imm.vstride = BRW_VERTICAL_STRIDE_0;
+   imm.width = BRW_WIDTH_4;
+   imm.hstride = BRW_HORIZONTAL_STRIDE_1;
+   imm.dw1.ud = v;
+   return imm;
+}
+
+#define VF_ZERO 0x0
+#define VF_ONE  0x30
+#define VF_NEG  (1<<7)
+
+static inline struct brw_reg
+brw_imm_vf4(unsigned v0, unsigned v1, unsigned v2, unsigned v3)
+{
+   struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_VF);
+   imm.vstride = BRW_VERTICAL_STRIDE_0;
+   imm.width = BRW_WIDTH_4;
+   imm.hstride = BRW_HORIZONTAL_STRIDE_1;
+   imm.dw1.ud = ((v0 << 0) | (v1 << 8) | (v2 << 16) | (v3 << 24));
+   return imm;
+}
+
+
+static inline struct brw_reg
+brw_address(struct brw_reg reg)
+{
+   return brw_imm_uw(reg.nr * REG_SIZE + reg.subnr);
+}
+
+/** Construct float[1] general-purpose register */
+static inline struct brw_reg
+brw_vec1_grf(unsigned nr, unsigned subnr)
+{
+   return brw_vec1_reg(BRW_GENERAL_REGISTER_FILE, nr, subnr);
+}
+
+/** Construct float[2] general-purpose register */
+static inline struct brw_reg
+brw_vec2_grf(unsigned nr, unsigned subnr)
+{
+   return brw_vec2_reg(BRW_GENERAL_REGISTER_FILE, nr, subnr);
+}
+
+/** Construct float[4] general-purpose register */
+static inline struct brw_reg
+brw_vec4_grf(unsigned nr, unsigned subnr)
+{
+   return brw_vec4_reg(BRW_GENERAL_REGISTER_FILE, nr, subnr);
+}
+
+/** Construct float[8] general-purpose register */
+static inline struct brw_reg
+brw_vec8_grf(unsigned nr, unsigned subnr)
+{
+   return brw_vec8_reg(BRW_GENERAL_REGISTER_FILE, nr, subnr);
+}
+
+
+static inline struct brw_reg
+brw_uw8_grf(unsigned nr, unsigned subnr)
+{
+   return brw_uw8_reg(BRW_GENERAL_REGISTER_FILE, nr, subnr);
+}
+
+static inline struct brw_reg
+brw_uw16_grf(unsigned nr, unsigned subnr)
+{
+   return brw_uw16_reg(BRW_GENERAL_REGISTER_FILE, nr, subnr);
+}
+
+
+/** Construct null register (usually used for setting condition codes) */
+static inline struct brw_reg
+brw_null_reg(void)
+{
+   return brw_vec8_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_NULL, 0);
+}
+
+static inline struct brw_reg
+brw_address_reg(unsigned subnr)
+{
+   return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_ADDRESS, subnr);
+}
+
+/* If/else instructions break in align16 mode if writemask & swizzle
+ * aren't xyzw.  This goes against the convention for other scalar
+ * regs:
+ */
+static inline struct brw_reg
+brw_ip_reg(void)
+{
+   return brw_reg(BRW_ARCHITECTURE_REGISTER_FILE,
+                  BRW_ARF_IP,
+                  0,
+                  BRW_REGISTER_TYPE_UD,
+                  BRW_VERTICAL_STRIDE_4, /* ? */
+                  BRW_WIDTH_1,
+                  BRW_HORIZONTAL_STRIDE_0,
+                  BRW_SWIZZLE_XYZW, /* NOTE! */
+                  BRW_WRITEMASK_XYZW); /* NOTE! */
+}
+
+static inline struct brw_reg
+brw_acc_reg(void)
+{
+   return brw_vec8_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_ACCUMULATOR, 0);
+}
+
+static inline struct brw_reg
+brw_notification_1_reg(void)
+{
+
+   return brw_reg(BRW_ARCHITECTURE_REGISTER_FILE,
+                  BRW_ARF_NOTIFICATION_COUNT,
+                  1,
+                  BRW_REGISTER_TYPE_UD,
+                  BRW_VERTICAL_STRIDE_0,
+                  BRW_WIDTH_1,
+                  BRW_HORIZONTAL_STRIDE_0,
+                  BRW_SWIZZLE_XXXX,
+                  BRW_WRITEMASK_X);
+}
+
+
+static inline struct brw_reg
+brw_flag_reg(int reg, int subreg)
+{
+   return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE,
+                      BRW_ARF_FLAG + reg, subreg);
+}
+
+
+static inline struct brw_reg
+brw_mask_reg(unsigned subnr)
+{
+   return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_MASK, subnr);
+}
+
+static inline struct brw_reg
+brw_message_reg(unsigned nr)
+{
+   assert((nr & ~(1 << 7)) < BRW_MAX_MRF);
+   return brw_vec8_reg(BRW_MESSAGE_REGISTER_FILE, nr, 0);
+}
+
+
+/* This is almost always called with a numeric constant argument, so
+ * make things easy to evaluate at compile time:
+ */
+static inline unsigned cvt(unsigned val)
+{
+   switch (val) {
+   case 0: return 0;
+   case 1: return 1;
+   case 2: return 2;
+   case 4: return 3;
+   case 8: return 4;
+   case 16: return 5;
+   case 32: return 6;
+   }
+   return 0;
+}
+
+static inline struct brw_reg
+stride(struct brw_reg reg, unsigned vstride, unsigned width, unsigned hstride)
+{
+   reg.vstride = cvt(vstride);
+   reg.width = cvt(width) - 1;
+   reg.hstride = cvt(hstride);
+   return reg;
+}
+
+
+static inline struct brw_reg
+vec16(struct brw_reg reg)
+{
+   return stride(reg, 16,16,1);
+}
+
+static inline struct brw_reg
+vec8(struct brw_reg reg)
+{
+   return stride(reg, 8,8,1);
+}
+
+static inline struct brw_reg
+vec4(struct brw_reg reg)
+{
+   return stride(reg, 4,4,1);
+}
+
+static inline struct brw_reg
+vec2(struct brw_reg reg)
+{
+   return stride(reg, 2,2,1);
+}
+
+static inline struct brw_reg
+vec1(struct brw_reg reg)
+{
+   return stride(reg, 0,1,0);
+}
+
+
+static inline struct brw_reg
+get_element(struct brw_reg reg, unsigned elt)
+{
+   return vec1(suboffset(reg, elt));
+}
+
+static inline struct brw_reg
+get_element_ud(struct brw_reg reg, unsigned elt)
+{
+   return vec1(suboffset(retype(reg, BRW_REGISTER_TYPE_UD), elt));
+}
+
+static inline struct brw_reg
+get_element_d(struct brw_reg reg, unsigned elt)
+{
+   return vec1(suboffset(retype(reg, BRW_REGISTER_TYPE_D), elt));
+}
+
+
+static inline struct brw_reg
+brw_swizzle(struct brw_reg reg, unsigned x, unsigned y, unsigned z, unsigned w)
+{
+   assert(reg.file != BRW_IMMEDIATE_VALUE);
+
+   reg.dw1.bits.swizzle = BRW_SWIZZLE4(BRW_GET_SWZ(reg.dw1.bits.swizzle, x),
+                                       BRW_GET_SWZ(reg.dw1.bits.swizzle, y),
+                                       BRW_GET_SWZ(reg.dw1.bits.swizzle, z),
+                                       BRW_GET_SWZ(reg.dw1.bits.swizzle, w));
+   return reg;
+}
+
+
+static inline struct brw_reg
+brw_swizzle1(struct brw_reg reg, unsigned x)
+{
+   return brw_swizzle(reg, x, x, x, x);
+}
+
+static inline struct brw_reg
+brw_writemask(struct brw_reg reg, unsigned mask)
+{
+   assert(reg.file != BRW_IMMEDIATE_VALUE);
+   reg.dw1.bits.writemask &= mask;
+   return reg;
+}
+
+static inline struct brw_reg
+brw_set_writemask(struct brw_reg reg, unsigned mask)
+{
+   assert(reg.file != BRW_IMMEDIATE_VALUE);
+   reg.dw1.bits.writemask = mask;
+   return reg;
+}
+
+static inline struct brw_reg
+negate(struct brw_reg reg)
+{
+   reg.negate ^= 1;
+   return reg;
+}
+
+static inline struct brw_reg
+brw_abs(struct brw_reg reg)
+{
+   reg.abs = 1;
+   reg.negate = 0;
+   return reg;
+}
+
+/************************************************************************/
+
+static inline struct brw_reg
+brw_vec4_indirect(unsigned subnr, int offset)
+{
+   struct brw_reg reg =  brw_vec4_grf(0, 0);
+   reg.subnr = subnr;
+   reg.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
+   reg.dw1.bits.indirect_offset = offset;
+   return reg;
+}
+
+static inline struct brw_reg
+brw_vec1_indirect(unsigned subnr, int offset)
+{
+   struct brw_reg reg =  brw_vec1_grf(0, 0);
+   reg.subnr = subnr;
+   reg.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
+   reg.dw1.bits.indirect_offset = offset;
+   return reg;
+}
+
+static inline struct brw_reg
+deref_4f(struct brw_indirect ptr, int offset)
+{
+   return brw_vec4_indirect(ptr.addr_subnr, ptr.addr_offset + offset);
+}
+
+static inline struct brw_reg
+deref_1f(struct brw_indirect ptr, int offset)
+{
+   return brw_vec1_indirect(ptr.addr_subnr, ptr.addr_offset + offset);
+}
+
+static inline struct brw_reg
+deref_4b(struct brw_indirect ptr, int offset)
+{
+   return retype(deref_4f(ptr, offset), BRW_REGISTER_TYPE_B);
+}
+
+static inline struct brw_reg
+deref_1uw(struct brw_indirect ptr, int offset)
+{
+   return retype(deref_1f(ptr, offset), BRW_REGISTER_TYPE_UW);
+}
+
+static inline struct brw_reg
+deref_1d(struct brw_indirect ptr, int offset)
+{
+   return retype(deref_1f(ptr, offset), BRW_REGISTER_TYPE_D);
+}
+
+static inline struct brw_reg
+deref_1ud(struct brw_indirect ptr, int offset)
+{
+   return retype(deref_1f(ptr, offset), BRW_REGISTER_TYPE_UD);
+}
+
+static inline struct brw_reg
+get_addr_reg(struct brw_indirect ptr)
+{
+   return brw_address_reg(ptr.addr_subnr);
+}
+
+static inline struct brw_indirect
+brw_indirect_offset(struct brw_indirect ptr, int offset)
+{
+   ptr.addr_offset += offset;
+   return ptr;
+}
+
+static inline struct brw_indirect
+brw_indirect(unsigned addr_subnr, int offset)
+{
+   struct brw_indirect ptr;
+   ptr.addr_subnr = addr_subnr;
+   ptr.addr_offset = offset;
+   ptr.pad = 0;
+   return ptr;
+}
+
+/** Do two brw_regs refer to the same register? */
+static inline bool
+brw_same_reg(struct brw_reg r1, struct brw_reg r2)
+{
+   return r1.file == r2.file && r1.nr == r2.nr;
+}
+
+void brw_print_reg(struct brw_reg reg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/assembler/disasm-main.c b/assembler/disasm-main.c
index 5cc1e7d..b900e91 100644
--- a/assembler/disasm-main.c
+++ b/assembler/disasm-main.c
@@ -27,6 +27,7 @@
 #include <unistd.h>
 
 #include "gen4asm.h"
+#include "brw_eu.h"
 
 static const struct option longopts[] = {
 	{ NULL, 0, NULL, 0 }
@@ -95,7 +96,10 @@ read_program_binary (FILE *input)
 
 static void usage(void)
 {
-    fprintf(stderr, "usage: intel-gen4disasm [-o outputfile] [-b] inputfile\n");
+    fprintf(stderr, "usage: intel-gen4disasm [options] inputfile\n");
+    fprintf(stderr, "\t-b, --binary                         C style binary input\n");
+    fprintf(stderr, "\t-o, --output {outputfile}            Specify output file\n");
+    fprintf(stderr, "\t-g, --gen <4|5|6|7>                  Specify GPU generation\n");
 }
 
 int main(int argc, char **argv)
@@ -107,9 +111,10 @@ int main(int argc, char **argv)
     char		*output_file = NULL;
     int			byte_array_input = 0;
     int			o;
+    int			gen = 4;
     struct brw_program_instruction  *inst;
 
-    while ((o = getopt_long(argc, argv, "o:b", longopts, NULL)) != -1) {
+    while ((o = getopt_long(argc, argv, "o:bg:", longopts, NULL)) != -1) {
 	switch (o) {
 	case 'o':
 	    if (strcmp(optarg, "-") != 0)
@@ -118,6 +123,15 @@ int main(int argc, char **argv)
 	case 'b':
 	    byte_array_input = 1;
 	    break;
+	case 'g':
+	    gen = strtol(optarg, NULL, 10);
+
+	    if (gen < 4 || gen > 7) {
+		    usage();
+		    exit(1);
+	    }
+
+	    break;
 	default:
 	    usage();
 	    exit(1);
@@ -153,6 +167,6 @@ int main(int argc, char **argv)
     }
 	    
     for (inst = program->first; inst; inst = inst->next)
-	disasm (output, &inst->instruction);
+	brw_disasm (output, &inst->instruction, gen);
     exit (0);
 }
diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index f9ed161..e47e9e6 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -197,6 +197,3 @@ int yylex_destroy(void);
 
 char *
 lex_text(void);
-
-int
-disasm (FILE *output, struct brw_instruction *inst);
-- 
1.7.7.5


* [PATCH 23/90] assembler: Import ralloc from Mesa
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (21 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 22/90] assembler: Update the disassembler code Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 24/90] assembler: Remove white space from brw_eu.h Damien Lespiau
                   ` (67 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

This also adds a new brw_compat.h that should help keep the
diff between Mesa's version and ours as small as possible.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
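A note for readers who have not met ralloc before. The sketch below is a
hypothetical miniature of the parent/child bookkeeping that ralloc.c does
with its ralloc_header. The names mini_alloc, mini_free, mini_demo and
struct node are made up for illustration and are not part of ralloc. It
only tries to show why freeing a context also releases everything
allocated under it; destructors, the canary check and reralloc are left
out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for struct ralloc_header. */
struct node {
   struct node *parent;
   struct node *child;   /* head of the list of children */
   struct node *prev;    /* siblings */
   struct node *next;
};

static int live_blocks = 0;   /* bookkeeping for the demo only */

/* Allocate size bytes as a child of ctx (ctx may be NULL for a root). */
static void *
mini_alloc(void *ctx, size_t size)
{
   struct node *info = calloc(1, sizeof(*info) + size);
   if (info == NULL)
      return NULL;

   if (ctx != NULL) {
      /* The header sits right before the pointer handed to the caller. */
      struct node *parent = (struct node *) ctx - 1;
      info->parent = parent;
      info->next = parent->child;
      if (info->next != NULL)
         info->next->prev = info;
      parent->child = info;
   }
   live_blocks++;
   return info + 1;
}

/* Free a block and, recursively, every block allocated under it. */
static void
free_subtree(struct node *info)
{
   while (info->child != NULL) {
      struct node *temp = info->child;
      info->child = temp->next;
      free_subtree(temp);
   }
   free(info);
   live_blocks--;
}

static void
mini_free(void *ptr)
{
   struct node *info;

   if (ptr == NULL)
      return;

   info = (struct node *) ptr - 1;

   /* Unlink from the parent and siblings, then drop the whole subtree. */
   if (info->parent != NULL && info->parent->child == info)
      info->parent->child = info->next;
   if (info->prev != NULL)
      info->prev->next = info->next;
   if (info->next != NULL)
      info->next->prev = info->prev;

   free_subtree(info);
}

/* Builds ctx -> { a -> sub, s }, frees ctx, returns blocks still live. */
static int
mini_demo(void)
{
   void *ctx = mini_alloc(NULL, 0);
   int *a = mini_alloc(ctx, sizeof(int));
   char *s = mini_alloc(ctx, 16);
   void *sub = mini_alloc(a, 8);

   assert(ctx != NULL && a != NULL && s != NULL && sub != NULL);
   mini_free(ctx);
   return live_blocks;
}
```

With the real allocator the same shape would be written with
ralloc_context(), ralloc_size() and ralloc_free(); a single ralloc_free()
on the context takes the place of the four separate frees a plain malloc
version would need.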
 assembler/Makefile.am  |    3 +
 assembler/brw_compat.h |   64 +++++++
 assembler/ralloc.c     |  482 ++++++++++++++++++++++++++++++++++++++++++++++++
 assembler/ralloc.h     |  407 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 956 insertions(+), 0 deletions(-)
 create mode 100644 assembler/brw_compat.h
 create mode 100644 assembler/ralloc.c
 create mode 100644 assembler/ralloc.h
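
The array allocators in ralloc.c (ralloc_array_size and friends) refuse a
request when size * count would wrap around. A minimal stand-alone sketch
of that guard follows. array_bytes is a hypothetical helper name, not
something ralloc provides, and the extra size != 0 test is an addition of
this sketch (ralloc is always called with a sizeof, which is never 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute size * count into *bytes.
 * Returns false, leaving *bytes untouched, if the product would not fit
 * in a size_t.
 */
static bool
array_bytes(size_t size, size_t count, size_t *bytes)
{
   assert(bytes != NULL);

   /* count > SIZE_MAX / size  <=>  size * count > SIZE_MAX */
   if (size != 0 && count > SIZE_MAX / size)
      return false;

   *bytes = size * count;
   return true;
}
```

The check divides instead of multiplying so that the test itself cannot
overflow.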

diff --git a/assembler/Makefile.am b/assembler/Makefile.am
index 8843e1a..d4733d3 100644
--- a/assembler/Makefile.am
+++ b/assembler/Makefile.am
@@ -10,6 +10,9 @@ BUILT_SOURCES = gram.h gram.c lex.c
 gram.h: gram.c
 
 intel_gen4asm_SOURCES =	\
+	brw_compat.h	\
+	ralloc.c	\
+	ralloc.h	\
 	brw_defines.h	\
 	brw_eu.h	\
 	brw_reg.h	\
diff --git a/assembler/brw_compat.h b/assembler/brw_compat.h
new file mode 100644
index 0000000..9300190
--- /dev/null
+++ b/assembler/brw_compat.h
@@ -0,0 +1,64 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+/*
+ * To share code with mesa without having to do big modifications and still be
+ * able to sync files together at a later point, this file holds macros and
+ * types defined in mesa's core headers.
+ */
+
+#ifndef __BRW_COMPAT_H__
+#define __BRW_COMPAT_H__
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * __builtin_expect macros
+ */
+#if !defined(__GNUC__)
+#  define __builtin_expect(x, y) (x)
+#endif
+
+#ifndef likely
+#  ifdef __GNUC__
+#    define likely(x)   __builtin_expect(!!(x), 1)
+#    define unlikely(x) __builtin_expect(!!(x), 0)
+#  else
+#    define likely(x)   (x)
+#    define unlikely(x) (x)
+#  endif
+#endif
+
+#if (__GNUC__ >= 3)
+#define PRINTFLIKE(f, a) __attribute__ ((format(__printf__, f, a)))
+#else
+#define PRINTFLIKE(f, a)
+#endif
+
+#ifdef __cplusplus
+} /* end of extern "C" */
+#endif
+
+#endif /* __BRW_COMPAT_H__ */
diff --git a/assembler/ralloc.c b/assembler/ralloc.c
new file mode 100644
index 0000000..59e71c4
--- /dev/null
+++ b/assembler/ralloc.c
@@ -0,0 +1,482 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <assert.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+
+/* Android defines SIZE_MAX in limits.h, instead of the standard stdint.h */
+#ifdef ANDROID
+#include <limits.h>
+#endif
+
+/* Some versions of MinGW are missing _vscprintf's declaration, although they
+ * still provide the symbol in the import library. */
+#ifdef __MINGW32__
+_CRTIMP int _vscprintf(const char *format, va_list argptr);
+#endif
+
+#include "ralloc.h"
+
+#ifndef va_copy
+#ifdef __va_copy
+#define va_copy(dest, src) __va_copy((dest), (src))
+#else
+#define va_copy(dest, src) (dest) = (src)
+#endif
+#endif
+
+#define CANARY 0x5A1106
+
+struct ralloc_header
+{
+   /* A canary value used to determine whether a pointer is ralloc'd. */
+   unsigned canary;
+
+   struct ralloc_header *parent;
+
+   /* The first child (head of a linked list) */
+   struct ralloc_header *child;
+
+   /* Linked list of siblings */
+   struct ralloc_header *prev;
+   struct ralloc_header *next;
+
+   void (*destructor)(void *);
+};
+
+typedef struct ralloc_header ralloc_header;
+
+static void unlink_block(ralloc_header *info);
+static void unsafe_free(ralloc_header *info);
+
+static ralloc_header *
+get_header(const void *ptr)
+{
+   ralloc_header *info = (ralloc_header *) (((char *) ptr) -
+					    sizeof(ralloc_header));
+   assert(info->canary == CANARY);
+   return info;
+}
+
+#define PTR_FROM_HEADER(info) (((char *) info) + sizeof(ralloc_header))
+
+static void
+add_child(ralloc_header *parent, ralloc_header *info)
+{
+   if (parent != NULL) {
+      info->parent = parent;
+      info->next = parent->child;
+      parent->child = info;
+
+      if (info->next != NULL)
+	 info->next->prev = info;
+   }
+}
+
+void *
+ralloc_context(const void *ctx)
+{
+   return ralloc_size(ctx, 0);
+}
+
+void *
+ralloc_size(const void *ctx, size_t size)
+{
+   void *block = calloc(1, size + sizeof(ralloc_header));
+
+   ralloc_header *info = (ralloc_header *) block;
+   ralloc_header *parent = ctx != NULL ? get_header(ctx) : NULL;
+
+   add_child(parent, info);
+
+   info->canary = CANARY;
+
+   return PTR_FROM_HEADER(info);
+}
+
+void *
+rzalloc_size(const void *ctx, size_t size)
+{
+   void *ptr = ralloc_size(ctx, size);
+   if (likely(ptr != NULL))
+      memset(ptr, 0, size);
+   return ptr;
+}
+
+/* helper function - assumes ptr != NULL */
+static void *
+resize(void *ptr, size_t size)
+{
+   ralloc_header *child, *old, *info;
+
+   old = get_header(ptr);
+   info = realloc(old, size + sizeof(ralloc_header));
+
+   if (info == NULL)
+      return NULL;
+
+   /* Update parent and sibling's links to the reallocated node. */
+   if (info != old && info->parent != NULL) {
+      if (info->parent->child == old)
+	 info->parent->child = info;
+
+      if (info->prev != NULL)
+	 info->prev->next = info;
+
+      if (info->next != NULL)
+	 info->next->prev = info;
+   }
+
+   /* Update child->parent links for all children */
+   for (child = info->child; child != NULL; child = child->next)
+      child->parent = info;
+
+   return PTR_FROM_HEADER(info);
+}
+
+void *
+reralloc_size(const void *ctx, void *ptr, size_t size)
+{
+   if (unlikely(ptr == NULL))
+      return ralloc_size(ctx, size);
+
+   assert(ralloc_parent(ptr) == ctx);
+   return resize(ptr, size);
+}
+
+void *
+ralloc_array_size(const void *ctx, size_t size, unsigned count)
+{
+   if (count > SIZE_MAX/size)
+      return NULL;
+
+   return ralloc_size(ctx, size * count);
+}
+
+void *
+rzalloc_array_size(const void *ctx, size_t size, unsigned count)
+{
+   if (count > SIZE_MAX/size)
+      return NULL;
+
+   return rzalloc_size(ctx, size * count);
+}
+
+void *
+reralloc_array_size(const void *ctx, void *ptr, size_t size, unsigned count)
+{
+   if (count > SIZE_MAX/size)
+      return NULL;
+
+   return reralloc_size(ctx, ptr, size * count);
+}
+
+void
+ralloc_free(void *ptr)
+{
+   ralloc_header *info;
+
+   if (ptr == NULL)
+      return;
+
+   info = get_header(ptr);
+   unlink_block(info);
+   unsafe_free(info);
+}
+
+static void
+unlink_block(ralloc_header *info)
+{
+   /* Unlink from parent & siblings */
+   if (info->parent != NULL) {
+      if (info->parent->child == info)
+	 info->parent->child = info->next;
+
+      if (info->prev != NULL)
+	 info->prev->next = info->next;
+
+      if (info->next != NULL)
+	 info->next->prev = info->prev;
+   }
+   info->parent = NULL;
+   info->prev = NULL;
+   info->next = NULL;
+}
+
+static void
+unsafe_free(ralloc_header *info)
+{
+   /* Recursively free any children...don't waste time unlinking them. */
+   ralloc_header *temp;
+   while (info->child != NULL) {
+      temp = info->child;
+      info->child = temp->next;
+      unsafe_free(temp);
+   }
+
+   /* Free the block itself.  Call the destructor first, if any. */
+   if (info->destructor != NULL)
+      info->destructor(PTR_FROM_HEADER(info));
+
+   free(info);
+}
+
+void
+ralloc_steal(const void *new_ctx, void *ptr)
+{
+   ralloc_header *info, *parent;
+
+   if (unlikely(ptr == NULL))
+      return;
+
+   info = get_header(ptr);
+   parent = get_header(new_ctx);
+
+   unlink_block(info);
+
+   add_child(parent, info);
+}
+
+void *
+ralloc_parent(const void *ptr)
+{
+   ralloc_header *info;
+
+   if (unlikely(ptr == NULL))
+      return NULL;
+
+   info = get_header(ptr);
+   return info->parent ? PTR_FROM_HEADER(info->parent) : NULL;
+}
+
+static void *autofree_context = NULL;
+
+static void
+autofree(void)
+{
+   ralloc_free(autofree_context);
+}
+
+void *
+ralloc_autofree_context(void)
+{
+   if (unlikely(autofree_context == NULL)) {
+      autofree_context = ralloc_context(NULL);
+      atexit(autofree);
+   }
+   return autofree_context;
+}
+
+void
+ralloc_set_destructor(const void *ptr, void(*destructor)(void *))
+{
+   ralloc_header *info = get_header(ptr);
+   info->destructor = destructor;
+}
+
+char *
+ralloc_strdup(const void *ctx, const char *str)
+{
+   size_t n;
+   char *ptr;
+
+   if (unlikely(str == NULL))
+      return NULL;
+
+   n = strlen(str);
+   ptr = ralloc_array(ctx, char, n + 1);
+   memcpy(ptr, str, n);
+   ptr[n] = '\0';
+   return ptr;
+}
+
+char *
+ralloc_strndup(const void *ctx, const char *str, size_t max)
+{
+   size_t n;
+   char *ptr;
+
+   if (unlikely(str == NULL))
+      return NULL;
+
+   n = strlen(str);
+   if (n > max)
+      n = max;
+
+   ptr = ralloc_array(ctx, char, n + 1);
+   memcpy(ptr, str, n);
+   ptr[n] = '\0';
+   return ptr;
+}
+
+/* helper routine for strcat/strncat - n is the exact amount to copy */
+static bool
+cat(char **dest, const char *str, size_t n)
+{
+   char *both;
+   size_t existing_length;
+   assert(dest != NULL && *dest != NULL);
+
+   existing_length = strlen(*dest);
+   both = resize(*dest, existing_length + n + 1);
+   if (unlikely(both == NULL))
+      return false;
+
+   memcpy(both + existing_length, str, n);
+   both[existing_length + n] = '\0';
+
+   *dest = both;
+   return true;
+}
+
+
+bool
+ralloc_strcat(char **dest, const char *str)
+{
+   return cat(dest, str, strlen(str));
+}
+
+bool
+ralloc_strncat(char **dest, const char *str, size_t n)
+{
+   /* Clamp n to the string length */
+   size_t str_length = strlen(str);
+   if (str_length < n)
+      n = str_length;
+
+   return cat(dest, str, n);
+}
+
+char *
+ralloc_asprintf(const void *ctx, const char *fmt, ...)
+{
+   char *ptr;
+   va_list args;
+   va_start(args, fmt);
+   ptr = ralloc_vasprintf(ctx, fmt, args);
+   va_end(args);
+   return ptr;
+}
+
+/* Return the length of the string that would be generated by a printf-style
+ * format and argument list, not including the \0 byte.
+ */
+static size_t
+printf_length(const char *fmt, va_list untouched_args)
+{
+   int size;
+   char junk;
+
+   /* Make a copy of the va_list so the original caller can still use it */
+   va_list args;
+   va_copy(args, untouched_args);
+
+#ifdef _WIN32
+   /* We need to use _vscprintf to calculate the size as vsnprintf returns -1
+    * if the number of characters to write is greater than count.
+    */
+   size = _vscprintf(fmt, args);
+   (void)junk;
+#else
+   size = vsnprintf(&junk, 1, fmt, args);
+#endif
+   assert(size >= 0);
+
+   va_end(args);
+
+   return size;
+}
+
+char *
+ralloc_vasprintf(const void *ctx, const char *fmt, va_list args)
+{
+   size_t size = printf_length(fmt, args) + 1;
+
+   char *ptr = ralloc_size(ctx, size);
+   if (ptr != NULL)
+      vsnprintf(ptr, size, fmt, args);
+
+   return ptr;
+}
+
+bool
+ralloc_asprintf_append(char **str, const char *fmt, ...)
+{
+   bool success;
+   va_list args;
+   va_start(args, fmt);
+   success = ralloc_vasprintf_append(str, fmt, args);
+   va_end(args);
+   return success;
+}
+
+bool
+ralloc_vasprintf_append(char **str, const char *fmt, va_list args)
+{
+   size_t existing_length;
+   assert(str != NULL);
+   existing_length = *str ? strlen(*str) : 0;
+   return ralloc_vasprintf_rewrite_tail(str, &existing_length, fmt, args);
+}
+
+bool
+ralloc_asprintf_rewrite_tail(char **str, size_t *start, const char *fmt, ...)
+{
+   bool success;
+   va_list args;
+   va_start(args, fmt);
+   success = ralloc_vasprintf_rewrite_tail(str, start, fmt, args);
+   va_end(args);
+   return success;
+}
+
+bool
+ralloc_vasprintf_rewrite_tail(char **str, size_t *start, const char *fmt,
+			      va_list args)
+{
+   size_t new_length;
+   char *ptr;
+
+   assert(str != NULL);
+
+   if (unlikely(*str == NULL)) {
+      // Assuming a NULL context is probably bad, but it's expected behavior.
+      *str = ralloc_vasprintf(NULL, fmt, args);
+      return true;
+   }
+
+   new_length = printf_length(fmt, args);
+
+   ptr = resize(*str, *start + new_length + 1);
+   if (unlikely(ptr == NULL))
+      return false;
+
+   vsnprintf(ptr + *start, new_length + 1, fmt, args);
+   *str = ptr;
+   *start += new_length;
+   return true;
+}
diff --git a/assembler/ralloc.h b/assembler/ralloc.h
new file mode 100644
index 0000000..6228d5b
--- /dev/null
+++ b/assembler/ralloc.h
@@ -0,0 +1,407 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * \file ralloc.h
+ *
+ * ralloc: a recursive memory allocator
+ *
+ * The ralloc memory allocator creates a hierarchy of allocated
+ * objects. Every allocation is in reference to some parent, and
+ * every allocated object can in turn be used as the parent of a
+ * subsequent allocation. This allows for extremely convenient
+ * discarding of an entire tree/sub-tree of allocations by calling
+ * ralloc_free on any particular object to free it and all of its
+ * children.
+ *
+ * The conceptual working of ralloc was directly inspired by Andrew
+ * Tridgell's talloc, but ralloc is an independent implementation
+ * released under the MIT license and tuned for Mesa.
+ *
+ * The talloc implementation is available under the GNU Lesser
+ * General Public License (GNU LGPL), version 3 or later. It is
+ * more sophisticated than ralloc in that it includes reference
+ * counting and debugging features. See: http://talloc.samba.org/
+ */
+
+#ifndef RALLOC_H
+#define RALLOC_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stddef.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include "brw_compat.h"
+
+/**
+ * \def ralloc(ctx, type)
+ * Allocate a new object chained off of the given context.
+ *
+ * This is equivalent to:
+ * \code
+ * ((type *) ralloc_size(ctx, sizeof(type)))
+ * \endcode
+ */
+#define ralloc(ctx, type)  ((type *) ralloc_size(ctx, sizeof(type)))
+
+/**
+ * \def rzalloc(ctx, type)
+ * Allocate a new object out of the given context and initialize it to zero.
+ *
+ * This is equivalent to:
+ * \code
+ * ((type *) rzalloc_size(ctx, sizeof(type)))
+ * \endcode
+ */
+#define rzalloc(ctx, type) ((type *) rzalloc_size(ctx, sizeof(type)))
+
+/**
+ * Allocate a new ralloc context.
+ *
+ * While any ralloc'd pointer can be used as a context, sometimes it is useful
+ * to simply allocate a context with no associated memory.
+ *
+ * It is equivalent to:
+ * \code
+ * ralloc_size(ctx, 0)
+ * \endcode
+ */
+void *ralloc_context(const void *ctx);
+
+/**
+ * Allocate memory chained off of the given context.
+ *
+ * This is the core allocation routine which is used by all others.  It
+ * simply allocates storage for \p size bytes and returns the pointer,
+ * similar to \c malloc.
+ */
+void *ralloc_size(const void *ctx, size_t size);
+
+/**
+ * Allocate zero-initialized memory chained off of the given context.
+ *
+ * This is similar to \c calloc with a size of 1.
+ */
+void *rzalloc_size(const void *ctx, size_t size);
+
+/**
+ * Resize a piece of ralloc-managed memory, preserving data.
+ *
+ * Similar to \c realloc.  Unlike C89, passing 0 for \p size does not free the
+ * memory.  Instead, it resizes it to a 0-byte ralloc context, just like
+ * calling ralloc_size(ctx, 0).  This is different from talloc.
+ *
+ * \param ctx  The context to use for new allocation.  If \p ptr != NULL,
+ *             it must be the same as ralloc_parent(\p ptr).
+ * \param ptr  Pointer to the memory to be resized.  May be NULL.
+ * \param size The amount of memory to allocate, in bytes.
+ */
+void *reralloc_size(const void *ctx, void *ptr, size_t size);
+
+/// \defgroup array Array Allocators @{
+
+/**
+ * \def ralloc_array(ctx, type, count)
+ * Allocate an array of objects chained off the given context.
+ *
+ * Similar to \c calloc, but does not initialize the memory to zero.
+ *
+ * More than a convenience function, this also checks for integer overflow when
+ * multiplying \c sizeof(type) and \p count.  This is necessary for security.
+ *
+ * This is equivalent to:
+ * \code
+ * ((type *) ralloc_array_size(ctx, sizeof(type), count))
+ * \endcode
+ */
+#define ralloc_array(ctx, type, count) \
+   ((type *) ralloc_array_size(ctx, sizeof(type), count))
+
+/**
+ * \def rzalloc_array(ctx, type, count)
+ * Allocate a zero-initialized array chained off the given context.
+ *
+ * Similar to \c calloc.
+ *
+ * More than a convenience function, this also checks for integer overflow when
+ * multiplying \c sizeof(type) and \p count.  This is necessary for security.
+ *
+ * This is equivalent to:
+ * \code
+ * ((type *) rzalloc_array_size(ctx, sizeof(type), count))
+ * \endcode
+ */
+#define rzalloc_array(ctx, type, count) \
+   ((type *) rzalloc_array_size(ctx, sizeof(type), count))
+
+/**
+ * \def reralloc(ctx, ptr, type, count)
+ * Resize a ralloc-managed array, preserving data.
+ *
+ * Similar to \c realloc.  Unlike C89, passing 0 for \p size does not free the
+ * memory.  Instead, it resizes it to a 0-byte ralloc context, just like
+ * calling ralloc_size(ctx, 0).  This is different from talloc.
+ *
+ * More than a convenience function, this also checks for integer overflow when
+ * multiplying \c sizeof(type) and \p count.  This is necessary for security.
+ *
+ * \param ctx   The context to use for new allocation.  If \p ptr != NULL,
+ *              it must be the same as ralloc_parent(\p ptr).
+ * \param ptr   Pointer to the array to be resized.  May be NULL.
+ * \param type  The element type.
+ * \param count The number of elements to allocate.
+ */
+#define reralloc(ctx, ptr, type, count) \
+   ((type *) reralloc_array_size(ctx, ptr, sizeof(type), count))
+
+/**
+ * Allocate memory for an array chained off the given context.
+ *
+ * Similar to \c calloc, but does not initialize the memory to zero.
+ *
+ * More than a convenience function, this also checks for integer overflow when
+ * multiplying \p size and \p count.  This is necessary for security.
+ */
+void *ralloc_array_size(const void *ctx, size_t size, unsigned count);
+
+/**
+ * Allocate a zero-initialized array chained off the given context.
+ *
+ * Similar to \c calloc.
+ *
+ * More than a convenience function, this also checks for integer overflow when
+ * multiplying \p size and \p count.  This is necessary for security.
+ */
+void *rzalloc_array_size(const void *ctx, size_t size, unsigned count);
+
+/**
+ * Resize a ralloc-managed array, preserving data.
+ *
+ * Similar to \c realloc.  Unlike C89, passing 0 for \p size does not free the
+ * memory.  Instead, it resizes it to a 0-byte ralloc context, just like
+ * calling ralloc_size(ctx, 0).  This is different from talloc.
+ *
+ * More than a convenience function, this also checks for integer overflow when
+ * multiplying \p size and \p count.  This is necessary for security.
+ *
+ * \param ctx   The context to use for new allocation.  If \p ptr != NULL,
+ *              it must be the same as ralloc_parent(\p ptr).
+ * \param ptr   Pointer to the array to be resized.  May be NULL.
+ * \param size  The size of an individual element.
+ * \param count The number of elements to allocate.
+ *
+ * \return The resized array, or NULL if allocation failed.
+ */
+void *reralloc_array_size(const void *ctx, void *ptr, size_t size,
+			  unsigned count);
+/// @}
+
+/**
+ * Free a piece of ralloc-managed memory.
+ *
+ * This will also free the memory of any children allocated from this context.
+ */
+void ralloc_free(void *ptr);
+
+/**
+ * "Steal" memory from one context, changing it to another.
+ *
+ * This changes \p ptr's context to \p new_ctx.  This is quite useful if
+ * memory is allocated out of a temporary context.
+ */
+void ralloc_steal(const void *new_ctx, void *ptr);
+
+/**
+ * Return the given pointer's ralloc context.
+ */
+void *ralloc_parent(const void *ptr);
+
+/**
+ * Return a context whose memory will be automatically freed at program exit.
+ *
+ * The first call to this function creates a context and registers a handler
+ * to free it using \c atexit.  This may cause trouble if used in a library
+ * loaded with \c dlopen.
+ */
+void *ralloc_autofree_context(void);
+
+/**
+ * Set a callback to occur just before an object is freed.
+ */
+void ralloc_set_destructor(const void *ptr, void(*destructor)(void *));
+
+/// \defgroup string String Functions @{
+/**
+ * Duplicate a string, allocating the memory from the given context.
+ */
+char *ralloc_strdup(const void *ctx, const char *str);
+
+/**
+ * Duplicate a string, allocating the memory from the given context.
+ *
+ * Like \c strndup, at most \p n characters are copied.  If \p str is longer
+ * than \p n characters, \p n are copied, and a terminating \c '\0' byte is added.
+ */
+char *ralloc_strndup(const void *ctx, const char *str, size_t n);
+
+/**
+ * Concatenate two strings, allocating the necessary space.
+ *
+ * This appends \p str to \p *dest, similar to \c strcat, using ralloc_resize
+ * to expand \p *dest to the appropriate size.  \p dest will be updated to the
+ * new pointer unless allocation fails.
+ *
+ * The result will always be null-terminated.
+ *
+ * \return True unless allocation failed.
+ */
+bool ralloc_strcat(char **dest, const char *str);
+
+/**
+ * Concatenate two strings, allocating the necessary space.
+ *
+ * This appends at most \p n bytes of \p str to \p *dest, using ralloc_resize
+ * to expand \p *dest to the appropriate size.  \p dest will be updated to the
+ * new pointer unless allocation fails.
+ *
+ * The result will always be null-terminated; \p str does not need to be null
+ * terminated if it is longer than \p n.
+ *
+ * \return True unless allocation failed.
+ */
+bool ralloc_strncat(char **dest, const char *str, size_t n);
+
+/**
+ * Print to a string.
+ *
+ * This is analogous to \c sprintf, but allocates enough space (using \p ctx
+ * as the context) for the resulting string.
+ *
+ * \return The newly allocated string.
+ */
+char *ralloc_asprintf (const void *ctx, const char *fmt, ...) PRINTFLIKE(2, 3);
+
+/**
+ * Print to a string, given a va_list.
+ *
+ * This is analogous to \c vsprintf, but allocates enough space (using \p ctx
+ * as the context) for the resulting string.
+ *
+ * \return The newly allocated string.
+ */
+char *ralloc_vasprintf(const void *ctx, const char *fmt, va_list args);
+
+/**
+ * Rewrite the tail of an existing string, starting at a given index.
+ *
+ * Overwrites the contents of *str starting at \p start with newly formatted
+ * text, including a new null-terminator.  Allocates more memory as necessary.
+ *
+ * This can be used to append formatted text when the length of the existing
+ * string is already known, saving a strlen() call.
+ *
+ * \sa ralloc_asprintf_append
+ *
+ * \param str   The string to be updated.
+ * \param start The index to start appending new data at.
+ * \param fmt   A printf-style formatting string
+ *
+ * \p str will be updated to the new pointer unless allocation fails.
+ * \p start will be increased by the length of the newly formatted text.
+ *
+ * \return True unless allocation failed.
+ */
+bool ralloc_asprintf_rewrite_tail(char **str, size_t *start,
+				  const char *fmt, ...)
+				  PRINTFLIKE(3, 4);
+
+/**
+ * Rewrite the tail of an existing string, starting at a given index.
+ *
+ * Overwrites the contents of *str starting at \p start with newly formatted
+ * text, including a new null-terminator.  Allocates more memory as necessary.
+ *
+ * This can be used to append formatted text when the length of the existing
+ * string is already known, saving a strlen() call.
+ *
+ * \sa ralloc_vasprintf_append
+ *
+ * \param str   The string to be updated.
+ * \param start The index to start appending new data at.
+ * \param fmt   A printf-style formatting string
+ * \param args  A va_list containing the data to be formatted
+ *
+ * \p str will be updated to the new pointer unless allocation fails.
+ * \p start will be increased by the length of the newly formatted text.
+ *
+ * \return True unless allocation failed.
+ */
+bool ralloc_vasprintf_rewrite_tail(char **str, size_t *start, const char *fmt,
+				   va_list args);
+
+/**
+ * Append formatted text to the supplied string.
+ *
+ * This is equivalent to
+ * \code
+ * ralloc_asprintf_rewrite_tail(str, strlen(*str), fmt, ...)
+ * \endcode
+ *
+ * \sa ralloc_asprintf
+ * \sa ralloc_asprintf_rewrite_tail
+ * \sa ralloc_strcat
+ *
+ * \p str will be updated to the new pointer unless allocation fails.
+ *
+ * \return True unless allocation failed.
+ */
+bool ralloc_asprintf_append (char **str, const char *fmt, ...)
+			     PRINTFLIKE(2, 3);
+
+/**
+ * Append formatted text to the supplied string, given a va_list.
+ *
+ * This is equivalent to
+ * \code
+ * ralloc_vasprintf_rewrite_tail(str, strlen(*str), fmt, args)
+ * \endcode
+ *
+ * \sa ralloc_vasprintf
+ * \sa ralloc_vasprintf_rewrite_tail
+ * \sa ralloc_strcat
+ *
+ * \p str will be updated to the new pointer unless allocation fails.
+ *
+ * \return True unless allocation failed.
+ */
+bool ralloc_vasprintf_append(char **str, const char *fmt, va_list args);
+/// @}
+
+#ifdef __cplusplus
+} /* end of extern "C" */
+#endif
+
+#endif
-- 
1.7.7.5
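
To make the parent/child idea from the ralloc.h header comment concrete, here
is a minimal sketch of a hierarchical allocator. It is not ralloc's
implementation: the toy_* names, the header layout and the toy_live counter
are all hypothetical and exist only for illustration. It shows two points the
header documents: freeing an object also frees everything allocated below it,
and the array allocator refuses a size * count product that would overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping stored in front of every allocation.  This is
 * NOT ralloc's real layout, just enough to link parents and children. */
struct toy_header {
   struct toy_header *parent;
   struct toy_header *child;   /* first child */
   struct toy_header *prev;    /* previous sibling */
   struct toy_header *next;    /* next sibling */
};

/* Number of live allocations; only here so the behaviour can be observed. */
int toy_live;

static struct toy_header *toy_hdr(const void *ptr)
{
   return (struct toy_header *) ptr - 1;
}

/* Allocate size bytes chained off ctx (ctx may be NULL for a root). */
void *toy_alloc(const void *ctx, size_t size)
{
   struct toy_header *h;

   if (size > SIZE_MAX - sizeof(*h))
      return NULL;
   h = calloc(1, sizeof(*h) + size);
   if (h == NULL)
      return NULL;

   if (ctx != NULL) {
      struct toy_header *parent = toy_hdr(ctx);

      /* Push the new object at the head of the parent's child list. */
      h->parent = parent;
      h->next = parent->child;
      if (parent->child != NULL)
         parent->child->prev = h;
      parent->child = h;
   }
   toy_live++;
   return h + 1;
}

/* Array variant: reject a size * count product that would overflow. */
void *toy_array_alloc(const void *ctx, size_t size, unsigned count)
{
   if (size != 0 && count > SIZE_MAX / size)
      return NULL;
   return toy_alloc(ctx, size * count);
}

static void toy_free_subtree(struct toy_header *h)
{
   /* Free all descendants first, then the object itself. */
   while (h->child != NULL) {
      struct toy_header *c = h->child;

      h->child = c->next;
      toy_free_subtree(c);
   }
   toy_live--;
   free(h);
}

/* Free ptr and, recursively, every allocation chained off it. */
void toy_free(void *ptr)
{
   struct toy_header *h;

   if (ptr == NULL)
      return;
   h = toy_hdr(ptr);

   /* Unlink from the parent's child list before tearing down. */
   if (h->parent != NULL) {
      if (h->prev != NULL)
         h->prev->next = h->next;
      else
         h->parent->child = h->next;
      if (h->next != NULL)
         h->next->prev = h->prev;
   }
   toy_free_subtree(h);
}
```

With this sketch, allocating a context, an array chained off it and a buffer
chained off the array, then calling toy_free() on the array, releases the
array and the buffer but leaves the context alive.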

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 24/90] assembler: Remove white space from brw_eu.h
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (22 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 23/90] assembler: Import ralloc from Mesa Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 25/90] assembler: Introduce struct brw_context Damien Lespiau
                   ` (66 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_eu.h |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/assembler/brw_eu.h b/assembler/brw_eu.h
index b7009ff..262a40b 100644
--- a/assembler/brw_eu.h
+++ b/assembler/brw_eu.h
@@ -2,7 +2,7 @@
  Copyright (C) Intel Corp.  2006.  All Rights Reserved.
  Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
  develop this 3D driver.
- 
+
  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
@@ -10,11 +10,11 @@
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:
- 
+
  The above copyright notice and this permission notice (including the
  next paragraph) shall be included in all copies or substantial
  portions of the Software.
- 
+
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -22,13 +22,13 @@
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- 
+
  **********************************************************************/
  /*
   * Authors:
   *   Keith Whitwell <keith@tungstengraphics.com>
   */
-   
+
 
 #ifndef BRW_EU_H
 #define BRW_EU_H
@@ -308,7 +308,7 @@ void brw_shader_time_add(struct brw_compile *p,
 /* If/else/endif.  Works by manipulating the execution flags on each
  * channel.
  */
-struct brw_instruction *brw_IF(struct brw_compile *p, 
+struct brw_instruction *brw_IF(struct brw_compile *p,
 			       GLuint execute_size);
 struct brw_instruction *gen6_IF(struct brw_compile *p, uint32_t conditional,
 				struct brw_reg src0, struct brw_reg src1);
@@ -346,7 +346,7 @@ void brw_CMP(struct brw_compile *p,
 	     struct brw_reg src0,
 	     struct brw_reg src1);
 
-/*********************************************************************** 
+/***********************************************************************
  * brw_eu_util.c:
  */
 
@@ -370,7 +370,7 @@ void brw_copy8(struct brw_compile *p,
 	       struct brw_reg src,
 	       GLuint count);
 
-void brw_math_invert( struct brw_compile *p, 
+void brw_math_invert( struct brw_compile *p,
 		      struct brw_reg dst,
 		      struct brw_reg src);
 
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 25/90] assembler: Introduce struct brw_context
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (23 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 24/90] assembler: Remove white space from brw_eu.h Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 26/90] assembler: Make an libbrw library Damien Lespiau
                   ` (65 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

A lot of the mesa code uses struct brw_context to get the GPU generation
and various other information. Let's stub this structure and initialize it
ourselves to be able to reuse mesa's code untouched.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/Makefile.am   |    2 +
 assembler/brw_context.c |   44 ++++++++++++++++++++++++++++++++++
 assembler/brw_context.h |   60 +++++++++++++++++++++++++++++++++++++++++++++++
 assembler/brw_eu.h      |    1 +
 4 files changed, 107 insertions(+), 0 deletions(-)
 create mode 100644 assembler/brw_context.c
 create mode 100644 assembler/brw_context.h
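
For readers skimming the diff below: the init function takes the generation
as an integer scaled by ten (so 75 stands for gen 7.5, i.e. Haswell), as can
be seen from the "gen / 10" and "gen == 75" lines. Here is a self-contained
sketch of that decoding. The struct and function names (gen_info,
gen_info_init, gen_x10) are made up for the example and are not part of the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of the fields the stub fills in. */
struct gen_info {
   int gen;
   bool is_haswell;
   bool needs_ff_sync;
};

/* gen_x10 is the generation multiplied by ten, e.g. 40, 60, 70 or 75. */
void gen_info_init(struct gen_info *info, int gen_x10)
{
   memset(info, 0, sizeof(*info));
   info->gen = gen_x10 / 10;               /* 75 -> 7 */
   info->is_haswell = (gen_x10 == 75);     /* only gen 7.5 is Haswell */
   info->needs_ff_sync = (info->gen >= 5); /* same condition as the stub */
}
```

Note that the stub in this patch only fills in gen, is_haswell and
needs_ff_sync; the gt and is_g4x fields are left zeroed by the memset.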

diff --git a/assembler/Makefile.am b/assembler/Makefile.am
index d4733d3..9bd3289 100644
--- a/assembler/Makefile.am
+++ b/assembler/Makefile.am
@@ -11,6 +11,8 @@ gram.h: gram.c
 
 intel_gen4asm_SOURCES =	\
 	brw_compat.h	\
+	brw_context.c	\
+	brw_context.h	\
 	ralloc.c	\
 	ralloc.h	\
 	brw_defines.h	\
diff --git a/assembler/brw_context.c b/assembler/brw_context.c
new file mode 100644
index 0000000..6f2a964
--- /dev/null
+++ b/assembler/brw_context.c
@@ -0,0 +1,44 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <string.h>
+
+#include "brw_context.h"
+
+static bool
+intel_init_context(struct intel_context *intel, int gen)
+{
+   memset(intel, 0, sizeof(struct intel_context));
+   intel->gen = gen / 10;
+   intel->is_haswell = gen == 75;
+   if (intel->gen >= 5)
+      intel->needs_ff_sync = true;
+
+   return true;
+}
+
+bool
+brw_init_context(struct brw_context *brw, int gen)
+{
+   return intel_init_context(&brw->intel, gen);
+}
diff --git a/assembler/brw_context.h b/assembler/brw_context.h
new file mode 100644
index 0000000..f0e3a35
--- /dev/null
+++ b/assembler/brw_context.h
@@ -0,0 +1,60 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+/*
+ * To share code with mesa without having to do big modifications and still be
+ * able to sync files together at a later point, this file stubs the fields
+ * of struct brw_context used by the code we import.
+ */
+
+#ifndef __BRW_CONTEXT_H__
+#define __BRW_CONTEXT_H__
+
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct intel_context
+{
+   int gen;
+   int gt;
+   bool is_haswell;
+   bool is_g4x;
+   bool needs_ff_sync;
+};
+
+struct brw_context
+{
+   struct intel_context intel;
+};
+
+bool
+brw_init_context(struct brw_context *brw, int gen);
+
+#ifdef __cplusplus
+} /* end of extern "C" */
+#endif
+
+#endif /* __BRW_CONTEXT_H__ */
diff --git a/assembler/brw_eu.h b/assembler/brw_eu.h
index 262a40b..f3e99fa 100644
--- a/assembler/brw_eu.h
+++ b/assembler/brw_eu.h
@@ -34,6 +34,7 @@
 #define BRW_EU_H
 
 #include <stdbool.h>
+#include "brw_context.h"
 #include "brw_structs.h"
 #include "brw_defines.h"
 #include "brw_reg.h"
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 26/90] assembler: Make an libbrw library
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (24 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 25/90] assembler: Introduce struct brw_context Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 27/90] assembler: Protect gen4asm.h from multiple inclusions Damien Lespiau
                   ` (64 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

With the brw_* files imported from mesa.

There are still a few things in that library that need gen4asm.h, for
instance the GLuint and GLint types. The hope is that eventually libbrw
can be split out in its own directory and shared.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/Makefile.am |   30 +++++++++++++++++++-----------
 1 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/assembler/Makefile.am b/assembler/Makefile.am
index 9bd3289..d5affd0 100644
--- a/assembler/Makefile.am
+++ b/assembler/Makefile.am
@@ -1,7 +1,22 @@
 SUBDIRS = doc test
 
+noinst_LTLIBRARIES = libbrw.la
+
 bin_PROGRAMS = intel-gen4asm intel-gen4disasm
 
+libbrw_la_SOURCES =	\
+	brw_compat.h	\
+	brw_context.c	\
+	brw_context.h	\
+	brw_disasm.c	\
+	brw_defines.h	\
+	brw_eu.h	\
+	brw_reg.h	\
+	brw_structs.h	\
+	ralloc.c	\
+	ralloc.h	\
+	$(NULL)
+
 AM_YFLAGS = -d --warnings=all
 AM_CFLAGS= $(ASSEMBLER_WARN_CFLAGS)
 
@@ -10,23 +25,16 @@ BUILT_SOURCES = gram.h gram.c lex.c
 gram.h: gram.c
 
 intel_gen4asm_SOURCES =	\
-	brw_compat.h	\
-	brw_context.c	\
-	brw_context.h	\
-	ralloc.c	\
-	ralloc.h	\
-	brw_defines.h	\
-	brw_eu.h	\
-	brw_reg.h	\
-	brw_structs.h	\
 	gen4asm.h	\
 	gram.y		\
 	lex.l		\
 	main.c		\
 	$(NULL)
 
-intel_gen4disasm_SOURCES =  \
-	brw_disasm.c disasm-main.c
+intel_gen4asm_LDADD = libbrw.la
+
+intel_gen4disasm_SOURCES =  disasm-main.c
+intel_gen4disasm_LDADD = libbrw.la
 
 pkgconfigdir = $(libdir)/pkgconfig
 pkgconfig_DATA = intel-gen4asm.pc
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 27/90] assembler: Protect gen4asm.h from multiple inclusions
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (25 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 26/90] assembler: Make an libbrw library Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 28/90] assembler: Import brw_eu_compact.c Damien Lespiau
                   ` (63 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index e47e9e6..71b8a4d 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -26,6 +26,9 @@
  *
  */
 
+#ifndef __GEN4ASM_H__
+#define __GEN4ASM_H__
+
 #include <inttypes.h>
 
 typedef unsigned char GLubyte;
@@ -197,3 +200,5 @@ int yylex_destroy(void);
 
 char *
 lex_text(void);
+
+#endif /* __GEN4ASM_H__ */
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 28/90] assembler: Import brw_eu_compact.c
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (26 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 27/90] assembler: Protect gen4asm.h from multiple inclusions Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 29/90] assembler: Import brw_eu.c Damien Lespiau
                   ` (62 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

To be able to import brw_eu.c and brw_eu_emit.c later on. This could be
used to get the assembler to generate compact instructions at some point.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/Makefile.am      |   23 +-
 assembler/brw_compat.h     |    2 +
 assembler/brw_context.h    |    4 +
 assembler/brw_eu.h         |   16 +
 assembler/brw_eu_compact.c |  810 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 844 insertions(+), 11 deletions(-)
 create mode 100644 assembler/brw_eu_compact.c
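
As a rough illustration of the lookup-table idea described in the file
comment of brw_eu_compact.c: a wide field of the full instruction is replaced
by a small index into a table of commonly used bit patterns, and the
instruction can only be compacted if its pattern is in the table. The sketch
below assumes a plain linear search and uses the first four entries of
gen6_control_index_table written in hex (the 0b literals in the imported file
are a GNU extension before C23). The toy_* names are hypothetical, and how
the real code assembles the field from the instruction is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* First four entries of gen6_control_index_table, in hex. */
static const uint32_t toy_table[] = {
   0x00000, /* 0b00000000000000000 */
   0x08000, /* 0b01000000000000000 */
   0x06000, /* 0b00110000000000000 */
   0x00100, /* 0b00000000100000000 */
};

/* Try to replace a field by its table index.  Returns false when the
 * pattern is not in the table, meaning the field cannot be compacted. */
bool toy_compact_field(const uint32_t *table, size_t n,
                       uint32_t field, uint32_t *index_out)
{
   size_t i;

   for (i = 0; i < n; i++) {
      if (table[i] == field) {
         *index_out = (uint32_t) i;
         return true;
      }
   }
   return false;
}

/* The reverse direction is a direct table read. */
uint32_t toy_uncompact_field(const uint32_t *table, uint32_t index)
{
   return table[index];
}
```

A pattern such as 0x06000 compacts to index 2 and expands back to the same
bits, while a pattern that is not in the table is rejected and the
instruction would have to stay in its full-size encoding.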

diff --git a/assembler/Makefile.am b/assembler/Makefile.am
index d5affd0..48e38d0 100644
--- a/assembler/Makefile.am
+++ b/assembler/Makefile.am
@@ -4,17 +4,18 @@ noinst_LTLIBRARIES = libbrw.la
 
 bin_PROGRAMS = intel-gen4asm intel-gen4disasm
 
-libbrw_la_SOURCES =	\
-	brw_compat.h	\
-	brw_context.c	\
-	brw_context.h	\
-	brw_disasm.c	\
-	brw_defines.h	\
-	brw_eu.h	\
-	brw_reg.h	\
-	brw_structs.h	\
-	ralloc.c	\
-	ralloc.h	\
+libbrw_la_SOURCES =		\
+	brw_compat.h		\
+	brw_context.c		\
+	brw_context.h		\
+	brw_disasm.c		\
+	brw_defines.h		\
+	brw_eu.h		\
+	brw_eu_compact.c	\
+	brw_reg.h		\
+	brw_structs.h		\
+	ralloc.c		\
+	ralloc.h		\
 	$(NULL)
 
 AM_YFLAGS = -d --warnings=all
diff --git a/assembler/brw_compat.h b/assembler/brw_compat.h
index 9300190..5102a02 100644
--- a/assembler/brw_compat.h
+++ b/assembler/brw_compat.h
@@ -57,6 +57,8 @@ extern "C" {
 #define PRINTFLIKE(f, a)
 #endif
 
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof(x[0]))
+
 #ifdef __cplusplus
 } /* end of extern "C" */
 #endif
diff --git a/assembler/brw_context.h b/assembler/brw_context.h
index f0e3a35..16a9f70 100644
--- a/assembler/brw_context.h
+++ b/assembler/brw_context.h
@@ -36,6 +36,10 @@
 extern "C" {
 #endif
 
+#ifndef INTEL_DEBUG
+#define INTEL_DEBUG (0)
+#endif
+
 struct intel_context
 {
    int gen;
diff --git a/assembler/brw_eu.h b/assembler/brw_eu.h
index f3e99fa..6d656a4 100644
--- a/assembler/brw_eu.h
+++ b/assembler/brw_eu.h
@@ -34,6 +34,8 @@
 #define BRW_EU_H
 
 #include <stdbool.h>
+#include <stdio.h>
+#include "gen4asm.h"
 #include "brw_context.h"
 #include "brw_structs.h"
 #include "brw_defines.h"
@@ -383,6 +385,20 @@ void brw_set_uip_jip(struct brw_compile *p);
 
 uint32_t brw_swap_cmod(uint32_t cmod);
 
+/* brw_eu_compact.c */
+void brw_init_compaction_tables(struct intel_context *intel);
+void brw_compact_instructions(struct brw_compile *p);
+void brw_uncompact_instruction(struct intel_context *intel,
+			       struct brw_instruction *dst,
+			       struct brw_compact_instruction *src);
+bool brw_try_compact_instruction(struct brw_compile *p,
+                                 struct brw_compact_instruction *dst,
+                                 struct brw_instruction *src);
+
+void brw_debug_compact_uncompact(struct intel_context *intel,
+				 struct brw_instruction *orig,
+				 struct brw_instruction *uncompacted);
+
 /* brw_optimize.c */
 void brw_optimize(struct brw_compile *p);
 void brw_remove_duplicate_mrf_moves(struct brw_compile *p);
diff --git a/assembler/brw_eu_compact.c b/assembler/brw_eu_compact.c
new file mode 100644
index 0000000..d362ed3
--- /dev/null
+++ b/assembler/brw_eu_compact.c
@@ -0,0 +1,810 @@
+/*
+ * Copyright © 2012 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/** @file brw_eu_compact.c
+ *
+ * Instruction compaction is a feature of gm45 and newer hardware that allows
+ * for a smaller instruction encoding.
+ *
+ * The instruction cache is on the order of 32KB, and many programs generate
+ * far more instructions than that.  The instruction cache is built to barely
+ * keep up with instruction dispatch ability in cache hit cases -- L1
+ * instruction cache misses that still hit in the next level could limit
+ * throughput by around 50%.
+ *
+ * The idea of instruction compaction is that most instructions use a tiny
+ * subset of the GPU functionality, so we can encode what would be a 16 byte
+ * instruction in 8 bytes using some lookup tables for various fields.
+ */
+
+#include <string.h>
+
+#include "brw_compat.h"
+#include "brw_context.h"
+#include "brw_eu.h"
+
+static const uint32_t gen6_control_index_table[32] = {
+   0b00000000000000000,
+   0b01000000000000000,
+   0b00110000000000000,
+   0b00000000100000000,
+   0b00010000000000000,
+   0b00001000100000000,
+   0b00000000100000010,
+   0b00000000000000010,
+   0b01000000100000000,
+   0b01010000000000000,
+   0b10110000000000000,
+   0b00100000000000000,
+   0b11010000000000000,
+   0b11000000000000000,
+   0b01001000100000000,
+   0b01000000000001000,
+   0b01000000000000100,
+   0b00000000000001000,
+   0b00000000000000100,
+   0b00111000100000000,
+   0b00001000100000010,
+   0b00110000100000000,
+   0b00110000000000001,
+   0b00100000000000001,
+   0b00110000000000010,
+   0b00110000000000101,
+   0b00110000000001001,
+   0b00110000000010000,
+   0b00110000000000011,
+   0b00110000000000100,
+   0b00110000100001000,
+   0b00100000000001001
+};
+
+static const uint32_t gen6_datatype_table[32] = {
+   0b001001110000000000,
+   0b001000110000100000,
+   0b001001110000000001,
+   0b001000000001100000,
+   0b001010110100101001,
+   0b001000000110101101,
+   0b001100011000101100,
+   0b001011110110101101,
+   0b001000000111101100,
+   0b001000000001100001,
+   0b001000110010100101,
+   0b001000000001000001,
+   0b001000001000110001,
+   0b001000001000101001,
+   0b001000000000100000,
+   0b001000001000110010,
+   0b001010010100101001,
+   0b001011010010100101,
+   0b001000000110100101,
+   0b001100011000101001,
+   0b001011011000101100,
+   0b001011010110100101,
+   0b001011110110100101,
+   0b001111011110111101,
+   0b001111011110111100,
+   0b001111011110111101,
+   0b001111011110011101,
+   0b001111011110111110,
+   0b001000000000100001,
+   0b001000000000100010,
+   0b001001111111011101,
+   0b001000001110111110,
+};
+
+static const uint32_t gen6_subreg_table[32] = {
+   0b000000000000000,
+   0b000000000000100,
+   0b000000110000000,
+   0b111000000000000,
+   0b011110000001000,
+   0b000010000000000,
+   0b000000000010000,
+   0b000110000001100,
+   0b001000000000000,
+   0b000001000000000,
+   0b000001010010100,
+   0b000000001010110,
+   0b010000000000000,
+   0b110000000000000,
+   0b000100000000000,
+   0b000000010000000,
+   0b000000000001000,
+   0b100000000000000,
+   0b000001010000000,
+   0b001010000000000,
+   0b001100000000000,
+   0b000000001010100,
+   0b101101010010100,
+   0b010100000000000,
+   0b000000010001111,
+   0b011000000000000,
+   0b111110000000000,
+   0b101000000000000,
+   0b000000000001111,
+   0b000100010001111,
+   0b001000010001111,
+   0b000110000000000,
+};
+
+static const uint32_t gen6_src_index_table[32] = {
+   0b000000000000,
+   0b010110001000,
+   0b010001101000,
+   0b001000101000,
+   0b011010010000,
+   0b000100100000,
+   0b010001101100,
+   0b010101110000,
+   0b011001111000,
+   0b001100101000,
+   0b010110001100,
+   0b001000100000,
+   0b010110001010,
+   0b000000000010,
+   0b010101010000,
+   0b010101101000,
+   0b111101001100,
+   0b111100101100,
+   0b011001110000,
+   0b010110001001,
+   0b010101011000,
+   0b001101001000,
+   0b010000101100,
+   0b010000000000,
+   0b001101110000,
+   0b001100010000,
+   0b001100000000,
+   0b010001101010,
+   0b001101111000,
+   0b000001110000,
+   0b001100100000,
+   0b001101010000,
+};
+
+static const uint32_t gen7_control_index_table[32] = {
+   0b0000000000000000010,
+   0b0000100000000000000,
+   0b0000100000000000001,
+   0b0000100000000000010,
+   0b0000100000000000011,
+   0b0000100000000000100,
+   0b0000100000000000101,
+   0b0000100000000000111,
+   0b0000100000000001000,
+   0b0000100000000001001,
+   0b0000100000000001101,
+   0b0000110000000000000,
+   0b0000110000000000001,
+   0b0000110000000000010,
+   0b0000110000000000011,
+   0b0000110000000000100,
+   0b0000110000000000101,
+   0b0000110000000000111,
+   0b0000110000000001001,
+   0b0000110000000001101,
+   0b0000110000000010000,
+   0b0000110000100000000,
+   0b0001000000000000000,
+   0b0001000000000000010,
+   0b0001000000000000100,
+   0b0001000000100000000,
+   0b0010110000000000000,
+   0b0010110000000010000,
+   0b0011000000000000000,
+   0b0011000000100000000,
+   0b0101000000000000000,
+   0b0101000000100000000
+};
+
+static const uint32_t gen7_datatype_table[32] = {
+   0b001000000000000001,
+   0b001000000000100000,
+   0b001000000000100001,
+   0b001000000001100001,
+   0b001000000010111101,
+   0b001000001011111101,
+   0b001000001110100001,
+   0b001000001110100101,
+   0b001000001110111101,
+   0b001000010000100001,
+   0b001000110000100000,
+   0b001000110000100001,
+   0b001001010010100101,
+   0b001001110010100100,
+   0b001001110010100101,
+   0b001111001110111101,
+   0b001111011110011101,
+   0b001111011110111100,
+   0b001111011110111101,
+   0b001111111110111100,
+   0b000000001000001100,
+   0b001000000000111101,
+   0b001000000010100101,
+   0b001000010000100000,
+   0b001001010010100100,
+   0b001001110010000100,
+   0b001010010100001001,
+   0b001101111110111101,
+   0b001111111110111101,
+   0b001011110110101100,
+   0b001010010100101000,
+   0b001010110100101000
+};
+
+static const uint32_t gen7_subreg_table[32] = {
+   0b000000000000000,
+   0b000000000000001,
+   0b000000000001000,
+   0b000000000001111,
+   0b000000000010000,
+   0b000000010000000,
+   0b000000100000000,
+   0b000000110000000,
+   0b000001000000000,
+   0b000001000010000,
+   0b000010100000000,
+   0b001000000000000,
+   0b001000000000001,
+   0b001000010000001,
+   0b001000010000010,
+   0b001000010000011,
+   0b001000010000100,
+   0b001000010000111,
+   0b001000010001000,
+   0b001000010001110,
+   0b001000010001111,
+   0b001000110000000,
+   0b001000111101000,
+   0b010000000000000,
+   0b010000110000000,
+   0b011000000000000,
+   0b011110010000111,
+   0b100000000000000,
+   0b101000000000000,
+   0b110000000000000,
+   0b111000000000000,
+   0b111000000011100
+};
+
+static const uint32_t gen7_src_index_table[32] = {
+   0b000000000000,
+   0b000000000010,
+   0b000000010000,
+   0b000000010010,
+   0b000000011000,
+   0b000000100000,
+   0b000000101000,
+   0b000001001000,
+   0b000001010000,
+   0b000001110000,
+   0b000001111000,
+   0b001100000000,
+   0b001100000010,
+   0b001100001000,
+   0b001100010000,
+   0b001100010010,
+   0b001100100000,
+   0b001100101000,
+   0b001100111000,
+   0b001101000000,
+   0b001101000010,
+   0b001101001000,
+   0b001101010000,
+   0b001101100000,
+   0b001101101000,
+   0b001101110000,
+   0b001101110001,
+   0b001101111000,
+   0b010001101000,
+   0b010001101001,
+   0b010001101010,
+   0b010110001000
+};
+
+static const uint32_t *control_index_table;
+static const uint32_t *datatype_table;
+static const uint32_t *subreg_table;
+static const uint32_t *src_index_table;
+
+static bool
+set_control_index(struct intel_context *intel,
+                  struct brw_compact_instruction *dst,
+                  struct brw_instruction *src)
+{
+   uint32_t *src_u32 = (uint32_t *)src;
+   uint32_t uncompacted = 0;
+
+   uncompacted |= ((src_u32[0] >> 8) & 0xffff) << 0;
+   uncompacted |= ((src_u32[0] >> 31) & 0x1) << 16;
+   /* On gen7, the flag register number gets integrated into the control
+    * index.
+    */
+   if (intel->gen >= 7)
+      uncompacted |= ((src_u32[2] >> 25) & 0x3) << 17;
+
+   for (int i = 0; i < 32; i++) {
+      if (control_index_table[i] == uncompacted) {
+	 dst->dw0.control_index = i;
+	 return true;
+      }
+   }
+
+   return false;
+}
+
+static bool
+set_datatype_index(struct brw_compact_instruction *dst,
+                   struct brw_instruction *src)
+{
+   uint32_t uncompacted = 0;
+
+   uncompacted |= src->bits1.ud & 0x7fff;
+   uncompacted |= (src->bits1.ud >> 29) << 15;
+
+   for (int i = 0; i < 32; i++) {
+      if (datatype_table[i] == uncompacted) {
+	 dst->dw0.data_type_index = i;
+	 return true;
+      }
+   }
+
+   return false;
+}
+
+static bool
+set_subreg_index(struct brw_compact_instruction *dst,
+                 struct brw_instruction *src)
+{
+   uint32_t uncompacted = 0;
+
+   uncompacted |= src->bits1.da1.dest_subreg_nr << 0;
+   uncompacted |= src->bits2.da1.src0_subreg_nr << 5;
+   uncompacted |= src->bits3.da1.src1_subreg_nr << 10;
+
+   for (int i = 0; i < 32; i++) {
+      if (subreg_table[i] == uncompacted) {
+	 dst->dw0.sub_reg_index = i;
+	 return true;
+      }
+   }
+
+   return false;
+}
+
+static bool
+get_src_index(uint32_t uncompacted,
+              uint32_t *compacted)
+{
+   for (int i = 0; i < 32; i++) {
+      if (src_index_table[i] == uncompacted) {
+	 *compacted = i;
+	 return true;
+      }
+   }
+
+   return false;
+}
+
+static bool
+set_src0_index(struct brw_compact_instruction *dst,
+               struct brw_instruction *src)
+{
+   uint32_t compacted, uncompacted = 0;
+
+   uncompacted |= (src->bits2.ud >> 13) & 0xfff;
+
+   if (!get_src_index(uncompacted, &compacted))
+      return false;
+
+   dst->dw0.src0_index = compacted & 0x3;
+   dst->dw1.src0_index = compacted >> 2;
+
+   return true;
+}
+
+static bool
+set_src1_index(struct brw_compact_instruction *dst,
+               struct brw_instruction *src)
+{
+   uint32_t compacted, uncompacted = 0;
+
+   uncompacted |= (src->bits3.ud >> 13) & 0xfff;
+
+   if (!get_src_index(uncompacted, &compacted))
+      return false;
+
+   dst->dw1.src1_index = compacted;
+
+   return true;
+}
+
+/**
+ * Tries to compact instruction src into dst.
+ *
+ * It doesn't modify dst unless src is compactable, which is relied on by
+ * brw_compact_instructions().
+ */
+bool
+brw_try_compact_instruction(struct brw_compile *p,
+                            struct brw_compact_instruction *dst,
+                            struct brw_instruction *src)
+{
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+   struct brw_compact_instruction temp;
+
+   if (src->header.opcode == BRW_OPCODE_IF ||
+       src->header.opcode == BRW_OPCODE_ELSE ||
+       src->header.opcode == BRW_OPCODE_ENDIF ||
+       src->header.opcode == BRW_OPCODE_HALT ||
+       src->header.opcode == BRW_OPCODE_DO ||
+       src->header.opcode == BRW_OPCODE_WHILE) {
+      /* FINISHME: The fixup code below, and brw_set_uip_jip and friends, need
+       * to be able to handle compacted flow control instructions.
+       */
+      return false;
+   }
+
+   /* FINISHME: immediates */
+   if (src->bits1.da1.src0_reg_file == BRW_IMMEDIATE_VALUE ||
+       src->bits1.da1.src1_reg_file == BRW_IMMEDIATE_VALUE)
+      return false;
+
+   memset(&temp, 0, sizeof(temp));
+
+   temp.dw0.opcode = src->header.opcode;
+   temp.dw0.debug_control = src->header.debug_control;
+   if (!set_control_index(intel, &temp, src))
+      return false;
+   if (!set_datatype_index(&temp, src))
+      return false;
+   if (!set_subreg_index(&temp, src))
+      return false;
+   temp.dw0.acc_wr_control = src->header.acc_wr_control;
+   temp.dw0.conditionalmod = src->header.destreg__conditionalmod;
+   if (intel->gen <= 6)
+      temp.dw0.flag_subreg_nr = src->bits2.da1.flag_subreg_nr;
+   temp.dw0.cmpt_ctrl = 1;
+   if (!set_src0_index(&temp, src))
+      return false;
+   if (!set_src1_index(&temp, src))
+      return false;
+   temp.dw1.dst_reg_nr = src->bits1.da1.dest_reg_nr;
+   temp.dw1.src0_reg_nr = src->bits2.da1.src0_reg_nr;
+   temp.dw1.src1_reg_nr = src->bits3.da1.src1_reg_nr;
+
+   *dst = temp;
+
+   return true;
+}
+
+static void
+set_uncompacted_control(struct intel_context *intel,
+                        struct brw_instruction *dst,
+                        struct brw_compact_instruction *src)
+{
+   uint32_t *dst_u32 = (uint32_t *)dst;
+   uint32_t uncompacted = control_index_table[src->dw0.control_index];
+
+   dst_u32[0] |= ((uncompacted >> 0) & 0xffff) << 8;
+   dst_u32[0] |= ((uncompacted >> 16) & 0x1) << 31;
+
+   if (intel->gen >= 7)
+      dst_u32[2] |= ((uncompacted >> 17) & 0x3) << 25;
+}
+
+static void
+set_uncompacted_datatype(struct brw_instruction *dst,
+                         struct brw_compact_instruction *src)
+{
+   uint32_t uncompacted = datatype_table[src->dw0.data_type_index];
+
+   dst->bits1.ud &= ~(0x7 << 29);
+   dst->bits1.ud |= ((uncompacted >> 15) & 0x7) << 29;
+   dst->bits1.ud &= ~0x7fff;
+   dst->bits1.ud |= uncompacted & 0x7fff;
+}
+
+static void
+set_uncompacted_subreg(struct brw_instruction *dst,
+                       struct brw_compact_instruction *src)
+{
+   uint32_t uncompacted = subreg_table[src->dw0.sub_reg_index];
+
+   dst->bits1.da1.dest_subreg_nr = (uncompacted >> 0)  & 0x1f;
+   dst->bits2.da1.src0_subreg_nr = (uncompacted >> 5)  & 0x1f;
+   dst->bits3.da1.src1_subreg_nr = (uncompacted >> 10) & 0x1f;
+}
+
+static void
+set_uncompacted_src0(struct brw_instruction *dst,
+                     struct brw_compact_instruction *src)
+{
+   uint32_t compacted = src->dw0.src0_index | src->dw1.src0_index << 2;
+   uint32_t uncompacted = src_index_table[compacted];
+
+   dst->bits2.ud |= uncompacted << 13;
+}
+
+static void
+set_uncompacted_src1(struct brw_instruction *dst,
+                     struct brw_compact_instruction *src)
+{
+   uint32_t uncompacted = src_index_table[src->dw1.src1_index];
+
+   dst->bits3.ud |= uncompacted << 13;
+}
+
+void
+brw_uncompact_instruction(struct intel_context *intel,
+                          struct brw_instruction *dst,
+                          struct brw_compact_instruction *src)
+{
+   memset(dst, 0, sizeof(*dst));
+
+   dst->header.opcode = src->dw0.opcode;
+   dst->header.debug_control = src->dw0.debug_control;
+
+   set_uncompacted_control(intel, dst, src);
+   set_uncompacted_datatype(dst, src);
+   set_uncompacted_subreg(dst, src);
+   dst->header.acc_wr_control = src->dw0.acc_wr_control;
+   dst->header.destreg__conditionalmod = src->dw0.conditionalmod;
+   if (intel->gen <= 6)
+      dst->bits2.da1.flag_subreg_nr = src->dw0.flag_subreg_nr;
+   set_uncompacted_src0(dst, src);
+   set_uncompacted_src1(dst, src);
+   dst->bits1.da1.dest_reg_nr = src->dw1.dst_reg_nr;
+   dst->bits2.da1.src0_reg_nr = src->dw1.src0_reg_nr;
+   dst->bits3.da1.src1_reg_nr = src->dw1.src1_reg_nr;
+}
+
+void brw_debug_compact_uncompact(struct intel_context *intel,
+                                 struct brw_instruction *orig,
+                                 struct brw_instruction *uncompacted)
+{
+   fprintf(stderr, "Instruction compact/uncompact changed (gen%d):\n",
+           intel->gen);
+
+   fprintf(stderr, "  before: ");
+   brw_disasm(stderr, orig, intel->gen);
+
+   fprintf(stderr, "  after:  ");
+   brw_disasm(stderr, uncompacted, intel->gen);
+
+   uint32_t *before_bits = (uint32_t *)orig;
+   uint32_t *after_bits = (uint32_t *)uncompacted;
+   printf("  changed bits:\n");
+   for (int i = 0; i < 128; i++) {
+      uint32_t before = before_bits[i / 32] & (1 << (i & 31));
+      uint32_t after = after_bits[i / 32] & (1 << (i & 31));
+
+      if (before != after) {
+         printf("  bit %d, %s to %s\n", i,
+                before ? "set" : "unset",
+                after ? "set" : "unset");
+      }
+   }
+}
+
+static int
+compacted_between(int old_ip, int old_target_ip, int *compacted_counts)
+{
+   int this_compacted_count = compacted_counts[old_ip];
+   int target_compacted_count = compacted_counts[old_target_ip];
+   return target_compacted_count - this_compacted_count;
+}
+
+static void
+update_uip_jip(struct brw_instruction *insn, int this_old_ip,
+               int *compacted_counts)
+{
+   int target_old_ip;
+
+   target_old_ip = this_old_ip + insn->bits3.break_cont.jip;
+   insn->bits3.break_cont.jip -= compacted_between(this_old_ip,
+                                                   target_old_ip,
+                                                   compacted_counts);
+
+   target_old_ip = this_old_ip + insn->bits3.break_cont.uip;
+   insn->bits3.break_cont.uip -= compacted_between(this_old_ip,
+                                                   target_old_ip,
+                                                   compacted_counts);
+}
+
+void
+brw_init_compaction_tables(struct intel_context *intel)
+{
+   assert(gen6_control_index_table[ARRAY_SIZE(gen6_control_index_table) - 1] != 0);
+   assert(gen6_datatype_table[ARRAY_SIZE(gen6_datatype_table) - 1] != 0);
+   assert(gen6_subreg_table[ARRAY_SIZE(gen6_subreg_table) - 1] != 0);
+   assert(gen6_src_index_table[ARRAY_SIZE(gen6_src_index_table) - 1] != 0);
+   assert(gen7_control_index_table[ARRAY_SIZE(gen7_control_index_table) - 1] != 0);
+   assert(gen7_datatype_table[ARRAY_SIZE(gen7_datatype_table) - 1] != 0);
+   assert(gen7_subreg_table[ARRAY_SIZE(gen7_subreg_table) - 1] != 0);
+   assert(gen7_src_index_table[ARRAY_SIZE(gen7_src_index_table) - 1] != 0);
+
+   switch (intel->gen) {
+   case 7:
+      control_index_table = gen7_control_index_table;
+      datatype_table = gen7_datatype_table;
+      subreg_table = gen7_subreg_table;
+      src_index_table = gen7_src_index_table;
+      break;
+   case 6:
+      control_index_table = gen6_control_index_table;
+      datatype_table = gen6_datatype_table;
+      subreg_table = gen6_subreg_table;
+      src_index_table = gen6_src_index_table;
+      break;
+   default:
+      return;
+   }
+}
+
+void
+brw_compact_instructions(struct brw_compile *p)
+{
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+   void *store = p->store;
+   /* For an instruction at byte offset 8*i before compaction, this is the number
+    * of compacted instructions that preceded it.
+    */
+   int compacted_counts[p->next_insn_offset / 8];
+   /* For an instruction at byte offset 8*i after compaction, this is the
+    * 8-byte offset it was at before compaction.
+    */
+   int old_ip[p->next_insn_offset / 8];
+
+   if (intel->gen < 6)
+      return;
+
+   int src_offset;
+   int offset = 0;
+   int compacted_count = 0;
+   for (src_offset = 0; src_offset < p->nr_insn * 16;) {
+      struct brw_instruction *src = store + src_offset;
+      void *dst = store + offset;
+
+      old_ip[offset / 8] = src_offset / 8;
+      compacted_counts[src_offset / 8] = compacted_count;
+
+      struct brw_instruction saved = *src;
+
+      if (!src->header.cmpt_control &&
+          brw_try_compact_instruction(p, dst, src)) {
+         compacted_count++;
+
+         if (INTEL_DEBUG) {
+            struct brw_instruction uncompacted;
+            brw_uncompact_instruction(intel, &uncompacted, dst);
+            if (memcmp(&saved, &uncompacted, sizeof(uncompacted))) {
+               brw_debug_compact_uncompact(intel, &saved, &uncompacted);
+            }
+         }
+
+         offset += 8;
+         src_offset += 16;
+      } else {
+         int size = src->header.cmpt_control ? 8 : 16;
+
+         /* It appears that the end of thread SEND instruction needs to be
+          * aligned, or the GPU hangs.
+          */
+         if ((src->header.opcode == BRW_OPCODE_SEND ||
+              src->header.opcode == BRW_OPCODE_SENDC) &&
+             src->bits3.generic.end_of_thread &&
+             (offset & 8) != 0) {
+            struct brw_compact_instruction *align = store + offset;
+            memset(align, 0, sizeof(*align));
+            align->dw0.opcode = BRW_OPCODE_NOP;
+            align->dw0.cmpt_ctrl = 1;
+            offset += 8;
+            old_ip[offset / 8] = src_offset / 8;
+            dst = store + offset;
+         }
+
+         /* If we didn't compact this instruction, we need to move it down into
+          * place.
+          */
+         if (offset != src_offset) {
+            memmove(dst, src, size);
+         }
+         offset += size;
+         src_offset += size;
+      }
+   }
+
+   /* Fix up control flow offsets. */
+   p->next_insn_offset = offset;
+   for (offset = 0; offset < p->next_insn_offset;) {
+      struct brw_instruction *insn = store + offset;
+      int this_old_ip = old_ip[offset / 8];
+      int this_compacted_count = compacted_counts[this_old_ip];
+      int target_old_ip, target_compacted_count;
+
+      switch (insn->header.opcode) {
+      case BRW_OPCODE_BREAK:
+      case BRW_OPCODE_CONTINUE:
+      case BRW_OPCODE_HALT:
+         update_uip_jip(insn, this_old_ip, compacted_counts);
+         break;
+
+      case BRW_OPCODE_IF:
+      case BRW_OPCODE_ELSE:
+      case BRW_OPCODE_ENDIF:
+      case BRW_OPCODE_WHILE:
+         if (intel->gen == 6) {
+            target_old_ip = this_old_ip + insn->bits1.branch_gen6.jump_count;
+            target_compacted_count = compacted_counts[target_old_ip];
+            insn->bits1.branch_gen6.jump_count -= (target_compacted_count -
+                                                   this_compacted_count);
+         } else {
+            update_uip_jip(insn, this_old_ip, compacted_counts);
+         }
+         break;
+      }
+
+      if (insn->header.cmpt_control) {
+         offset += 8;
+      } else {
+         offset += 16;
+      }
+   }
+
+   /* p->nr_insn is counting the number of uncompacted instructions still, so
+    * divide.  We do want to be sure there's a valid instruction in any
+    * alignment padding, so that the next compression pass (for the FS 8/16
+    * compile passes) parses correctly.
+    */
+   if (p->next_insn_offset & 8) {
+      struct brw_compact_instruction *align = store + offset;
+      memset(align, 0, sizeof(*align));
+      align->dw0.opcode = BRW_OPCODE_NOP;
+      align->dw0.cmpt_ctrl = 1;
+      p->next_insn_offset += 8;
+   }
+   p->nr_insn = p->next_insn_offset / 16;
+
+   if (0) {
+      fprintf(stdout, "dumping compacted program\n");
+      brw_dump_compile(p, stdout, 0, p->next_insn_offset);
+
+      int cmp = 0;
+      for (offset = 0; offset < p->next_insn_offset;) {
+         struct brw_instruction *insn = store + offset;
+
+         if (insn->header.cmpt_control) {
+            offset += 8;
+            cmp++;
+         } else {
+            offset += 16;
+         }
+      }
+      fprintf(stderr, "%db/%db saved (%d%%)\n", cmp * 8, offset + cmp * 8,
+              cmp * 8 * 100 / (offset + cmp * 8));
+   }
+}
-- 
1.7.7.5
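
The control-flow fixup at the end of brw_compact_instructions() can be
illustrated in isolation. What follows is only a minimal sketch, assuming (as
the arithmetic in the patch does) that jump distances and instruction
positions are counted in 8-byte units. build_compacted_counts() and
fixup_jump() are hypothetical names used for illustration; they are not part
of Mesa or the assembler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: for each old position (old byte offset / 8), record
 * how many instructions before it were compacted from 16 to 8 bytes.
 * counts[] must hold 2 * n_insns entries. */
static void build_compacted_counts(const bool *compactable, size_t n_insns,
                                   int *counts)
{
    int compacted = 0;

    for (size_t i = 0; i < n_insns; i++) {
        counts[2 * i] = compacted;
        /* Second half of the 16-byte slot; never a jump target, filled in
         * only so that the whole array is initialised. */
        counts[2 * i + 1] = compacted;
        if (compactable[i])
            compacted++;
    }
}

/* Same idea as compacted_between() in the patch: the number of compacted
 * instructions lying between the jump and its target. */
static int compacted_between(int old_ip, int old_target_ip, const int *counts)
{
    return counts[old_target_ip] - counts[old_ip];
}

/* Hypothetical helper: translate a relative jump (in 8-byte units) from the
 * uncompacted layout to the compacted one. Each compacted instruction
 * between the jump and its target shortens the distance by one unit. */
static int fixup_jump(int old_ip, int old_jump, const int *counts)
{
    int old_target_ip = old_ip + old_jump;

    assert(old_target_ip >= 0);
    return old_jump - compacted_between(old_ip, old_target_ip, counts);
}
```

With four instructions of which the two middle ones are compactable, a forward
jump from the first to the last instruction shrinks from 6 units to 4, and a
backward jump from the last to the second goes from -4 to -2. The same
subtraction handles both directions because the difference of the two counts
changes sign with the jump.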

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 29/90] assembler: Import brw_eu.c
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (27 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 28/90] assembler: Import brw_eu_compact.c Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 30/90] assembler: Don't use -Wpointer-arith Damien Lespiau
                   ` (61 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Another step on the road to importing Mesa's emission code.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/Makefile.am |    1 +
 assembler/brw_eu.c    |  269 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 270 insertions(+), 0 deletions(-)
 create mode 100644 assembler/brw_eu.c

diff --git a/assembler/Makefile.am b/assembler/Makefile.am
index 48e38d0..e4468a4 100644
--- a/assembler/Makefile.am
+++ b/assembler/Makefile.am
@@ -11,6 +11,7 @@ libbrw_la_SOURCES =		\
 	brw_disasm.c		\
 	brw_defines.h		\
 	brw_eu.h		\
+	brw_eu.c		\
 	brw_eu_compact.c	\
 	brw_reg.h		\
 	brw_structs.h		\
diff --git a/assembler/brw_eu.c b/assembler/brw_eu.c
new file mode 100644
index 0000000..1641c95
--- /dev/null
+++ b/assembler/brw_eu.c
@@ -0,0 +1,269 @@
+/*
+ Copyright (C) Intel Corp.  2006.  All Rights Reserved.
+ Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
+ develop this 3D driver.
+ 
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+ 
+ The above copyright notice and this permission notice (including the
+ next paragraph) shall be included in all copies or substantial
+ portions of the Software.
+ 
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ 
+ **********************************************************************/
+ /*
+  * Authors:
+  *   Keith Whitwell <keith@tungstengraphics.com>
+  */
+  
+
+#include <string.h>
+
+#include "gen4asm.h"
+#include "brw_context.h"
+#include "brw_defines.h"
+#include "brw_eu.h"
+
+#include "ralloc.h"
+
+/* Returns the corresponding conditional mod for swapping src0 and
+ * src1 in e.g. CMP.
+ */
+uint32_t
+brw_swap_cmod(uint32_t cmod)
+{
+   switch (cmod) {
+   case BRW_CONDITIONAL_Z:
+   case BRW_CONDITIONAL_NZ:
+      return cmod;
+   case BRW_CONDITIONAL_G:
+      return BRW_CONDITIONAL_L;
+   case BRW_CONDITIONAL_GE:
+      return BRW_CONDITIONAL_LE;
+   case BRW_CONDITIONAL_L:
+      return BRW_CONDITIONAL_G;
+   case BRW_CONDITIONAL_LE:
+      return BRW_CONDITIONAL_GE;
+   default:
+      return ~0;
+   }
+}
+
+
+/* How does predicate control work when execution_size != 8?  Do I
+ * need to test/set for 0xffff when execution_size is 16?
+ */
+void brw_set_predicate_control_flag_value( struct brw_compile *p, GLuint value )
+{
+   p->current->header.predicate_control = BRW_PREDICATE_NONE;
+
+   if (value != 0xff) {
+      if (value != p->flag_value) {
+	 brw_push_insn_state(p);
+	 brw_MOV(p, brw_flag_reg(0, 0), brw_imm_uw(value));
+	 p->flag_value = value;
+	 brw_pop_insn_state(p);
+      }
+
+      p->current->header.predicate_control = BRW_PREDICATE_NORMAL;
+   }   
+}
+
+void brw_set_predicate_control( struct brw_compile *p, GLuint pc )
+{
+   p->current->header.predicate_control = pc;
+}
+
+void brw_set_predicate_inverse(struct brw_compile *p, bool predicate_inverse)
+{
+   p->current->header.predicate_inverse = predicate_inverse;
+}
+
+void brw_set_conditionalmod( struct brw_compile *p, GLuint conditional )
+{
+   p->current->header.destreg__conditionalmod = conditional;
+}
+
+void brw_set_flag_reg(struct brw_compile *p, int reg, int subreg)
+{
+   p->current->bits2.da1.flag_reg_nr = reg;
+   p->current->bits2.da1.flag_subreg_nr = subreg;
+}
+
+void brw_set_access_mode( struct brw_compile *p, GLuint access_mode )
+{
+   p->current->header.access_mode = access_mode;
+}
+
+void
+brw_set_compression_control(struct brw_compile *p,
+			    enum brw_compression compression_control)
+{
+   p->compressed = (compression_control == BRW_COMPRESSION_COMPRESSED);
+
+   if (p->brw->intel.gen >= 6) {
+      /* Since we don't use the 32-wide support in gen6, we translate
+       * the pre-gen6 compression control here.
+       */
+      switch (compression_control) {
+      case BRW_COMPRESSION_NONE:
+	 /* This is the "use the first set of bits of dmask/vmask/arf
+	  * according to execsize" option.
+	  */
+	 p->current->header.compression_control = GEN6_COMPRESSION_1Q;
+	 break;
+      case BRW_COMPRESSION_2NDHALF:
+	 /* For 8-wide, this is "use the second set of 8 bits." */
+	 p->current->header.compression_control = GEN6_COMPRESSION_2Q;
+	 break;
+      case BRW_COMPRESSION_COMPRESSED:
+	 /* For 16-wide instruction compression, use the first set of 16 bits
+	  * since we don't do 32-wide dispatch.
+	  */
+	 p->current->header.compression_control = GEN6_COMPRESSION_1H;
+	 break;
+      default:
+	 assert(!"not reached");
+	 p->current->header.compression_control = GEN6_COMPRESSION_1H;
+	 break;
+      }
+   } else {
+      p->current->header.compression_control = compression_control;
+   }
+}
+
+void brw_set_mask_control( struct brw_compile *p, GLuint value )
+{
+   p->current->header.mask_control = value;
+}
+
+void brw_set_saturate( struct brw_compile *p, bool enable )
+{
+   p->current->header.saturate = enable;
+}
+
+void brw_set_acc_write_control(struct brw_compile *p, GLuint value)
+{
+   if (p->brw->intel.gen >= 6)
+      p->current->header.acc_wr_control = value;
+}
+
+void brw_push_insn_state( struct brw_compile *p )
+{
+   assert(p->current != &p->stack[BRW_EU_MAX_INSN_STACK-1]);
+   memcpy(p->current+1, p->current, sizeof(struct brw_instruction));
+   p->compressed_stack[p->current - p->stack] = p->compressed;
+   p->current++;   
+}
+
+void brw_pop_insn_state( struct brw_compile *p )
+{
+   assert(p->current != p->stack);
+   p->current--;
+   p->compressed = p->compressed_stack[p->current - p->stack];
+}
+
+
+/***********************************************************************
+ */
+void
+brw_init_compile(struct brw_context *brw, struct brw_compile *p, void *mem_ctx)
+{
+   memset(p, 0, sizeof(*p));
+
+   p->brw = brw;
+   /*
+    * Set the initial instruction store array size to 1024, if found that
+    * isn't enough, then it will double the store size at brw_next_insn()
+    * until out of memory.
+    */
+   p->store_size = 1024;
+   p->store = rzalloc_array(mem_ctx, struct brw_instruction, p->store_size);
+   p->nr_insn = 0;
+   p->current = p->stack;
+   p->compressed = false;
+   memset(p->current, 0, sizeof(p->current[0]));
+
+   p->mem_ctx = mem_ctx;
+
+   /* Some defaults?
+    */
+   brw_set_mask_control(p, BRW_MASK_ENABLE); /* what does this do? */
+   brw_set_saturate(p, 0);
+   brw_set_compression_control(p, BRW_COMPRESSION_NONE);
+   brw_set_predicate_control_flag_value(p, 0xff); 
+
+   /* Set up control flow stack */
+   p->if_stack_depth = 0;
+   p->if_stack_array_size = 16;
+   p->if_stack = rzalloc_array(mem_ctx, int, p->if_stack_array_size);
+
+   p->loop_stack_depth = 0;
+   p->loop_stack_array_size = 16;
+   p->loop_stack = rzalloc_array(mem_ctx, int, p->loop_stack_array_size);
+   p->if_depth_in_loop = rzalloc_array(mem_ctx, int, p->loop_stack_array_size);
+
+   brw_init_compaction_tables(&brw->intel);
+}
+
+
+const GLuint *brw_get_program( struct brw_compile *p,
+			       GLuint *sz )
+{
+   brw_compact_instructions(p);
+
+   *sz = p->next_insn_offset;
+   return (const GLuint *)p->store;
+}
+
+void
+brw_dump_compile(struct brw_compile *p, FILE *out, int start, int end)
+{
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+   void *store = p->store;
+   bool dump_hex = false;
+
+   for (int offset = start; offset < end;) {
+      struct brw_instruction *insn = store + offset;
+      struct brw_instruction uncompacted;
+      printf("0x%08x: ", offset);
+
+      if (insn->header.cmpt_control) {
+	 struct brw_compact_instruction *compacted = (void *)insn;
+	 if (dump_hex) {
+	    printf("0x%08x 0x%08x                       ",
+		   ((uint32_t *)insn)[1],
+		   ((uint32_t *)insn)[0]);
+	 }
+
+	 brw_uncompact_instruction(intel, &uncompacted, compacted);
+	 insn = &uncompacted;
+	 offset += 8;
+      } else {
+	 if (dump_hex) {
+	    printf("0x%08x 0x%08x 0x%08x 0x%08x ",
+		   ((uint32_t *)insn)[3],
+		   ((uint32_t *)insn)[2],
+		   ((uint32_t *)insn)[1],
+		   ((uint32_t *)insn)[0]);
+	 }
+	 offset += 16;
+      }
+
+      brw_disasm(stdout, insn, p->brw->intel.gen);
+   }
+}
-- 
1.7.7.5
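
The default-instruction stack handled by brw_push_insn_state() and
brw_pop_insn_state() in this file can be reduced to a few lines. This is a
rough sketch rather than the Mesa code: struct insn_state, struct compile_ctx,
MAX_STATE_STACK and the three functions are hypothetical stand-ins for
struct brw_instruction, struct brw_compile, BRW_EU_MAX_INSN_STACK and the
brw_*() functions, and the compressed_stack bookkeeping is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_STATE_STACK 4 /* stand-in for BRW_EU_MAX_INSN_STACK */

/* Hypothetical, much reduced version of the default instruction state. */
struct insn_state {
    unsigned predicate_control;
    bool saturate;
};

/* Hypothetical, much reduced version of struct brw_compile. */
struct compile_ctx {
    struct insn_state stack[MAX_STATE_STACK];
    struct insn_state *current; /* points at the top of stack[] */
};

static void ctx_init(struct compile_ctx *c)
{
    memset(c, 0, sizeof(*c));
    c->current = c->stack;
}

/* Duplicate the current defaults into the next slot and make it current,
 * so that later changes can be undone by a pop. */
static void push_state(struct compile_ctx *c)
{
    assert(c->current != &c->stack[MAX_STATE_STACK - 1]);
    memcpy(c->current + 1, c->current, sizeof(*c->current));
    c->current++;
}

/* Drop the current defaults and go back to the ones saved by the push. */
static void pop_state(struct compile_ctx *c)
{
    assert(c->current != c->stack);
    c->current--;
}
```

This is the pattern brw_set_predicate_control_flag_value() relies on: it
pushes the state, emits a MOV to the flag register, then pops, so the MOV
cannot leak settings into the instructions emitted afterwards.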


* [PATCH 30/90] assembler: Don't use -Wpointer-arith
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (28 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 29/90] assembler: Import brw_eu.c Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 31/90] assembler: Import brw_eu_emit.c Damien Lespiau
                   ` (60 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Mesa's code uses the GNU C extension that allows additions and
subtractions on void* (in units of 1 byte), which -Wpointer-arith warns about.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 configure.ac |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/configure.ac b/configure.ac
index 832c6e4..cd1c201 100644
--- a/configure.ac
+++ b/configure.ac
@@ -64,7 +64,7 @@ XORG_DEFAULT_OPTIONS
 # it generates waaaay to many warnings.
 ASSEMBLER_WARN_CFLAGS=""
 if test "x$GCC" = "xyes"; then
-	ASSEMBLER_WARN_CFLAGS="-Wall -Wpointer-arith -Wstrict-prototypes \
+	ASSEMBLER_WARN_CFLAGS="-Wall -Wstrict-prototypes \
 	-Wmissing-prototypes -Wmissing-declarations \
 	-Wnested-externs -fno-strict-aliasing"
 fi
-- 
1.7.7.5
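
For reference, the construct in question is of the form "store + offset" with
store declared as void *, as in brw_compact_instructions(). GNU C treats the
pointee as having size 1 there; ISO C does not define arithmetic on void *.
A portable spelling, should the warning ever be re-enabled, could look like
the sketch below. byte_offset() is a hypothetical helper, not something
present in Mesa or intel-gpu-tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: advance a void pointer by a byte count without
 * relying on the GNU C extension. Casting to char * makes the unit of the
 * addition explicit, since sizeof(char) is 1 by definition. */
static void *byte_offset(void *base, size_t offset)
{
    assert(base != NULL);
    return (char *)base + offset;
}
```

With this helper, "store + offset" would become byte_offset(store, offset),
which yields the same address under GCC as the extension does.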


* [PATCH 31/90] assembler: Import brw_eu_emit.c
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (29 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 30/90] assembler: Don't use -Wpointer-arith Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 32/90] assembler: Use BRW_WRITEMASK_XYZW instead of the 0xf constant Damien Lespiau
                   ` (59 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Finally importing the meaty brw_eu_emit.c code that emits instructions.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/Makefile.am   |    1 +
 assembler/brw_compat.h  |    1 +
 assembler/brw_eu_emit.c | 2549 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 2551 insertions(+), 0 deletions(-)
 create mode 100644 assembler/brw_eu_emit.c

diff --git a/assembler/Makefile.am b/assembler/Makefile.am
index e4468a4..113e550 100644
--- a/assembler/Makefile.am
+++ b/assembler/Makefile.am
@@ -13,6 +13,7 @@ libbrw_la_SOURCES =		\
 	brw_eu.h		\
 	brw_eu.c		\
 	brw_eu_compact.c	\
+	brw_eu_emit.c		\
 	brw_reg.h		\
 	brw_structs.h		\
 	ralloc.c		\
diff --git a/assembler/brw_compat.h b/assembler/brw_compat.h
index 5102a02..4bf7f31 100644
--- a/assembler/brw_compat.h
+++ b/assembler/brw_compat.h
@@ -58,6 +58,7 @@ extern "C" {
 #endif
 
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof(x[0]))
+#define Elements(x) ARRAY_SIZE(x)
 
 #ifdef __cplusplus
 } /* end of extern "C" */
diff --git a/assembler/brw_eu_emit.c b/assembler/brw_eu_emit.c
new file mode 100644
index 0000000..ea4baeb
--- /dev/null
+++ b/assembler/brw_eu_emit.c
@@ -0,0 +1,2549 @@
+/*
+ Copyright (C) Intel Corp.  2006.  All Rights Reserved.
+ Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
+ develop this 3D driver.
+ 
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+ 
+ The above copyright notice and this permission notice (including the
+ next paragraph) shall be included in all copies or substantial
+ portions of the Software.
+ 
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ 
+ **********************************************************************/
+ /*
+  * Authors:
+  *   Keith Whitwell <keith@tungstengraphics.com>
+  */
+     
+#include <string.h>
+
+#include "brw_context.h"
+#include "brw_defines.h"
+#include "brw_eu.h"
+
+#include "ralloc.h"
+
+/***********************************************************************
+ * Internal helper for constructing instructions
+ */
+
+static void guess_execution_size(struct brw_compile *p,
+				 struct brw_instruction *insn,
+				 struct brw_reg reg)
+{
+   if (reg.width == BRW_WIDTH_8 && p->compressed)
+      insn->header.execution_size = BRW_EXECUTE_16;
+   else
+      insn->header.execution_size = reg.width;	/* note - definitions are compatible */
+}
+
+
+/**
+ * Prior to Sandybridge, the SEND instruction accepted non-MRF source
+ * registers, implicitly moving the operand to a message register.
+ *
+ * On Sandybridge, this is no longer the case.  This function performs the
+ * explicit move; it should be called before emitting a SEND instruction.
+ */
+void
+gen6_resolve_implied_move(struct brw_compile *p,
+			  struct brw_reg *src,
+			  GLuint msg_reg_nr)
+{
+   struct intel_context *intel = &p->brw->intel;
+   if (intel->gen < 6)
+      return;
+
+   if (src->file == BRW_MESSAGE_REGISTER_FILE)
+      return;
+
+   if (src->file != BRW_ARCHITECTURE_REGISTER_FILE || src->nr != BRW_ARF_NULL) {
+      brw_push_insn_state(p);
+      brw_set_mask_control(p, BRW_MASK_DISABLE);
+      brw_set_compression_control(p, BRW_COMPRESSION_NONE);
+      brw_MOV(p, retype(brw_message_reg(msg_reg_nr), BRW_REGISTER_TYPE_UD),
+	      retype(*src, BRW_REGISTER_TYPE_UD));
+      brw_pop_insn_state(p);
+   }
+   *src = brw_message_reg(msg_reg_nr);
+}
+
+static void
+gen7_convert_mrf_to_grf(struct brw_compile *p, struct brw_reg *reg)
+{
+   /* From the BSpec / ISA Reference / send - [DevIVB+]:
+    * "The send with EOT should use register space R112-R127 for <src>. This is
+    *  to enable loading of a new thread into the same slot while the message
+    *  with EOT for current thread is pending dispatch."
+    *
+    * Since we're pretending to have 16 MRFs anyway, we may as well use the
+    * registers required for messages with EOT.
+    */
+   struct intel_context *intel = &p->brw->intel;
+   if (intel->gen == 7 && reg->file == BRW_MESSAGE_REGISTER_FILE) {
+      reg->file = BRW_GENERAL_REGISTER_FILE;
+      reg->nr += GEN7_MRF_HACK_START;
+   }
+}
+
+
+void
+brw_set_dest(struct brw_compile *p, struct brw_instruction *insn,
+	     struct brw_reg dest)
+{
+   if (dest.file != BRW_ARCHITECTURE_REGISTER_FILE &&
+       dest.file != BRW_MESSAGE_REGISTER_FILE)
+      assert(dest.nr < 128);
+
+   gen7_convert_mrf_to_grf(p, &dest);
+
+   insn->bits1.da1.dest_reg_file = dest.file;
+   insn->bits1.da1.dest_reg_type = dest.type;
+   insn->bits1.da1.dest_address_mode = dest.address_mode;
+
+   if (dest.address_mode == BRW_ADDRESS_DIRECT) {   
+      insn->bits1.da1.dest_reg_nr = dest.nr;
+
+      if (insn->header.access_mode == BRW_ALIGN_1) {
+	 insn->bits1.da1.dest_subreg_nr = dest.subnr;
+	 if (dest.hstride == BRW_HORIZONTAL_STRIDE_0)
+	    dest.hstride = BRW_HORIZONTAL_STRIDE_1;
+	 insn->bits1.da1.dest_horiz_stride = dest.hstride;
+      }
+      else {
+	 insn->bits1.da16.dest_subreg_nr = dest.subnr / 16;
+	 insn->bits1.da16.dest_writemask = dest.dw1.bits.writemask;
+	 /* even if ignored in da16, still need to set as '01' */
+	 insn->bits1.da16.dest_horiz_stride = 1;
+      }
+   }
+   else {
+      insn->bits1.ia1.dest_subreg_nr = dest.subnr;
+
+      /* These are different sizes in align1 vs align16:
+       */
+      if (insn->header.access_mode == BRW_ALIGN_1) {
+	 insn->bits1.ia1.dest_indirect_offset = dest.dw1.bits.indirect_offset;
+	 if (dest.hstride == BRW_HORIZONTAL_STRIDE_0)
+	    dest.hstride = BRW_HORIZONTAL_STRIDE_1;
+	 insn->bits1.ia1.dest_horiz_stride = dest.hstride;
+      }
+      else {
+	 insn->bits1.ia16.dest_indirect_offset = dest.dw1.bits.indirect_offset;
+	 /* even if ignored in ia16, still need to set as '01' */
+	 insn->bits1.ia16.dest_horiz_stride = 1;
+      }
+   }
+
+   /* Set the execution size based on dest.width and p->compressed
+    * (see guess_execution_size()):
+    */
+   guess_execution_size(p, insn, dest);
+}
+
+extern int reg_type_size[];
+
+static void
+validate_reg(struct brw_instruction *insn, struct brw_reg reg)
+{
+   int hstride_for_reg[] = {0, 1, 2, 4};
+   int vstride_for_reg[] = {0, 1, 2, 4, 8, 16, 32, 64, 128, 256};
+   int width_for_reg[] = {1, 2, 4, 8, 16};
+   int execsize_for_reg[] = {1, 2, 4, 8, 16};
+   int width, hstride, vstride, execsize;
+
+   if (reg.file == BRW_IMMEDIATE_VALUE) {
+      /* 3.3.6: Region Parameters.  Restriction: Immediate vectors
+       * mean the destination has to be 128-bit aligned and the
+       * destination horiz stride has to be a word.
+       */
+      if (reg.type == BRW_REGISTER_TYPE_V) {
+	 assert(hstride_for_reg[insn->bits1.da1.dest_horiz_stride] *
+		reg_type_size[insn->bits1.da1.dest_reg_type] == 2);
+      }
+
+      return;
+   }
+
+   if (reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
+       reg.nr == BRW_ARF_NULL)
+      return;
+
+   assert(reg.hstride >= 0 && reg.hstride < Elements(hstride_for_reg));
+   hstride = hstride_for_reg[reg.hstride];
+
+   if (reg.vstride == 0xf) {
+      vstride = -1;
+   } else {
+      assert(reg.vstride >= 0 && reg.vstride < Elements(vstride_for_reg));
+      vstride = vstride_for_reg[reg.vstride];
+   }
+
+   assert(reg.width >= 0 && reg.width < Elements(width_for_reg));
+   width = width_for_reg[reg.width];
+
+   assert(insn->header.execution_size >= 0 &&
+	  insn->header.execution_size < Elements(execsize_for_reg));
+   execsize = execsize_for_reg[insn->header.execution_size];
+
+   /* Restrictions from 3.3.10: Register Region Restrictions. */
+   /* 3. */
+   assert(execsize >= width);
+
+   /* 4. */
+   if (execsize == width && hstride != 0) {
+      assert(vstride == -1 || vstride == width * hstride);
+   }
+
+   /* 5. */
+   if (execsize == width && hstride == 0) {
+      /* no restriction on vstride. */
+   }
+
+   /* 6. */
+   if (width == 1) {
+      assert(hstride == 0);
+   }
+
+   /* 7. */
+   if (execsize == 1 && width == 1) {
+      assert(hstride == 0);
+      assert(vstride == 0);
+   }
+
+   /* 8. */
+   if (vstride == 0 && hstride == 0) {
+      assert(width == 1);
+   }
+
+   /* 10. Check destination issues. */
+}
+
+void
+brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
+	     struct brw_reg reg)
+{
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+
+   if (reg.file != BRW_ARCHITECTURE_REGISTER_FILE)
+      assert(reg.nr < 128);
+
+   gen7_convert_mrf_to_grf(p, &reg);
+
+   if (intel->gen >= 6 && (insn->header.opcode == BRW_OPCODE_SEND ||
+                           insn->header.opcode == BRW_OPCODE_SENDC)) {
+      /* Any source modifiers or regions will be ignored, since this just
+       * identifies the MRF/GRF to start reading the message contents from.
+       * Check for some likely failures.
+       */
+      assert(!reg.negate);
+      assert(!reg.abs);
+      assert(reg.address_mode == BRW_ADDRESS_DIRECT);
+   }
+
+   validate_reg(insn, reg);
+
+   insn->bits1.da1.src0_reg_file = reg.file;
+   insn->bits1.da1.src0_reg_type = reg.type;
+   insn->bits2.da1.src0_abs = reg.abs;
+   insn->bits2.da1.src0_negate = reg.negate;
+   insn->bits2.da1.src0_address_mode = reg.address_mode;
+
+   if (reg.file == BRW_IMMEDIATE_VALUE) {
+      insn->bits3.ud = reg.dw1.ud;
+   
+      /* Required to set some fields in src1 as well:
+       */
+      insn->bits1.da1.src1_reg_file = 0; /* arf */
+      insn->bits1.da1.src1_reg_type = reg.type;
+   }
+   else 
+   {
+      if (reg.address_mode == BRW_ADDRESS_DIRECT) {
+	 if (insn->header.access_mode == BRW_ALIGN_1) {
+	    insn->bits2.da1.src0_subreg_nr = reg.subnr;
+	    insn->bits2.da1.src0_reg_nr = reg.nr;
+	 }
+	 else {
+	    insn->bits2.da16.src0_subreg_nr = reg.subnr / 16;
+	    insn->bits2.da16.src0_reg_nr = reg.nr;
+	 }
+      }
+      else {
+	 insn->bits2.ia1.src0_subreg_nr = reg.subnr;
+
+	 if (insn->header.access_mode == BRW_ALIGN_1) {
+	    insn->bits2.ia1.src0_indirect_offset = reg.dw1.bits.indirect_offset; 
+	 }
+	 else {
+	    insn->bits2.ia16.src0_subreg_nr = reg.dw1.bits.indirect_offset;
+	 }
+      }
+
+      if (insn->header.access_mode == BRW_ALIGN_1) {
+	 if (reg.width == BRW_WIDTH_1 && 
+	     insn->header.execution_size == BRW_EXECUTE_1) {
+	    insn->bits2.da1.src0_horiz_stride = BRW_HORIZONTAL_STRIDE_0;
+	    insn->bits2.da1.src0_width = BRW_WIDTH_1;
+	    insn->bits2.da1.src0_vert_stride = BRW_VERTICAL_STRIDE_0;
+	 }
+	 else {
+	    insn->bits2.da1.src0_horiz_stride = reg.hstride;
+	    insn->bits2.da1.src0_width = reg.width;
+	    insn->bits2.da1.src0_vert_stride = reg.vstride;
+	 }
+      }
+      else {
+	 insn->bits2.da16.src0_swz_x = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_X);
+	 insn->bits2.da16.src0_swz_y = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_Y);
+	 insn->bits2.da16.src0_swz_z = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_Z);
+	 insn->bits2.da16.src0_swz_w = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_W);
+
+	 /* This is an oddity of the fact we're using the same
+	  * descriptions for registers in align_16 as align_1:
+	  */
+	 if (reg.vstride == BRW_VERTICAL_STRIDE_8)
+	    insn->bits2.da16.src0_vert_stride = BRW_VERTICAL_STRIDE_4;
+	 else
+	    insn->bits2.da16.src0_vert_stride = reg.vstride;
+      }
+   }
+}
+
+
+void brw_set_src1(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg reg)
+{
+   assert(reg.file != BRW_MESSAGE_REGISTER_FILE);
+
+   if (reg.file != BRW_ARCHITECTURE_REGISTER_FILE)
+      assert(reg.nr < 128);
+
+   gen7_convert_mrf_to_grf(p, &reg);
+
+   validate_reg(insn, reg);
+
+   insn->bits1.da1.src1_reg_file = reg.file;
+   insn->bits1.da1.src1_reg_type = reg.type;
+   insn->bits3.da1.src1_abs = reg.abs;
+   insn->bits3.da1.src1_negate = reg.negate;
+
+   /* Only src1 can be immediate in two-argument instructions.
+    */
+   assert(insn->bits1.da1.src0_reg_file != BRW_IMMEDIATE_VALUE);
+
+   if (reg.file == BRW_IMMEDIATE_VALUE) {
+      insn->bits3.ud = reg.dw1.ud;
+   }
+   else {
+      /* This is a hardware restriction, which may or may not be lifted
+       * in the future:
+       */
+      assert (reg.address_mode == BRW_ADDRESS_DIRECT);
+      /* assert (reg.file == BRW_GENERAL_REGISTER_FILE); */
+
+      if (insn->header.access_mode == BRW_ALIGN_1) {
+	 insn->bits3.da1.src1_subreg_nr = reg.subnr;
+	 insn->bits3.da1.src1_reg_nr = reg.nr;
+      }
+      else {
+	 insn->bits3.da16.src1_subreg_nr = reg.subnr / 16;
+	 insn->bits3.da16.src1_reg_nr = reg.nr;
+      }
+
+      if (insn->header.access_mode == BRW_ALIGN_1) {
+	 if (reg.width == BRW_WIDTH_1 && 
+	     insn->header.execution_size == BRW_EXECUTE_1) {
+	    insn->bits3.da1.src1_horiz_stride = BRW_HORIZONTAL_STRIDE_0;
+	    insn->bits3.da1.src1_width = BRW_WIDTH_1;
+	    insn->bits3.da1.src1_vert_stride = BRW_VERTICAL_STRIDE_0;
+	 }
+	 else {
+	    insn->bits3.da1.src1_horiz_stride = reg.hstride;
+	    insn->bits3.da1.src1_width = reg.width;
+	    insn->bits3.da1.src1_vert_stride = reg.vstride;
+	 }
+      }
+      else {
+	 insn->bits3.da16.src1_swz_x = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_X);
+	 insn->bits3.da16.src1_swz_y = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_Y);
+	 insn->bits3.da16.src1_swz_z = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_Z);
+	 insn->bits3.da16.src1_swz_w = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_W);
+
+	 /* This is an oddity of the fact we're using the same
+	  * descriptions for registers in align_16 as align_1:
+	  */
+	 if (reg.vstride == BRW_VERTICAL_STRIDE_8)
+	    insn->bits3.da16.src1_vert_stride = BRW_VERTICAL_STRIDE_4;
+	 else
+	    insn->bits3.da16.src1_vert_stride = reg.vstride;
+      }
+   }
+}
+
+/**
+ * Set the Message Descriptor and Extended Message Descriptor fields
+ * for SEND messages.
+ *
+ * \note This zeroes out the Function Control bits, so it must be called
+ *       \b before filling out any message-specific data.  Callers can
+ *       choose not to fill in irrelevant bits; they will be zero.
+ */
+static void
+brw_set_message_descriptor(struct brw_compile *p,
+			   struct brw_instruction *inst,
+			   enum brw_message_target sfid,
+			   unsigned msg_length,
+			   unsigned response_length,
+			   bool header_present,
+			   bool end_of_thread)
+{
+   struct intel_context *intel = &p->brw->intel;
+
+   brw_set_src1(p, inst, brw_imm_d(0));
+
+   if (intel->gen >= 5) {
+      inst->bits3.generic_gen5.header_present = header_present;
+      inst->bits3.generic_gen5.response_length = response_length;
+      inst->bits3.generic_gen5.msg_length = msg_length;
+      inst->bits3.generic_gen5.end_of_thread = end_of_thread;
+
+      if (intel->gen >= 6) {
+	 /* On Gen6+ Message target/SFID goes in bits 27:24 of the header */
+	 inst->header.destreg__conditionalmod = sfid;
+      } else {
+	 /* Set Extended Message Descriptor (ex_desc) */
+	 inst->bits2.send_gen5.sfid = sfid;
+	 inst->bits2.send_gen5.end_of_thread = end_of_thread;
+      }
+   } else {
+      inst->bits3.generic.response_length = response_length;
+      inst->bits3.generic.msg_length = msg_length;
+      inst->bits3.generic.msg_target = sfid;
+      inst->bits3.generic.end_of_thread = end_of_thread;
+   }
+}
+
+static void brw_set_math_message( struct brw_compile *p,
+				  struct brw_instruction *insn,
+				  GLuint function,
+				  GLuint integer_type,
+				  bool low_precision,
+				  GLuint dataType )
+{
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+   unsigned msg_length;
+   unsigned response_length;
+
+   /* Infer message length from the function */
+   switch (function) {
+   case BRW_MATH_FUNCTION_POW:
+   case BRW_MATH_FUNCTION_INT_DIV_QUOTIENT:
+   case BRW_MATH_FUNCTION_INT_DIV_REMAINDER:
+   case BRW_MATH_FUNCTION_INT_DIV_QUOTIENT_AND_REMAINDER:
+      msg_length = 2;
+      break;
+   default:
+      msg_length = 1;
+      break;
+   }
+
+   /* Infer response length from the function */
+   switch (function) {
+   case BRW_MATH_FUNCTION_SINCOS:
+   case BRW_MATH_FUNCTION_INT_DIV_QUOTIENT_AND_REMAINDER:
+      response_length = 2;
+      break;
+   default:
+      response_length = 1;
+      break;
+   }
+
+
+   brw_set_message_descriptor(p, insn, BRW_SFID_MATH,
+			      msg_length, response_length, false, false);
+   if (intel->gen == 5) {
+      insn->bits3.math_gen5.function = function;
+      insn->bits3.math_gen5.int_type = integer_type;
+      insn->bits3.math_gen5.precision = low_precision;
+      insn->bits3.math_gen5.saturate = insn->header.saturate;
+      insn->bits3.math_gen5.data_type = dataType;
+      insn->bits3.math_gen5.snapshot = 0;
+   } else {
+      insn->bits3.math.function = function;
+      insn->bits3.math.int_type = integer_type;
+      insn->bits3.math.precision = low_precision;
+      insn->bits3.math.saturate = insn->header.saturate;
+      insn->bits3.math.data_type = dataType;
+   }
+   insn->header.saturate = 0;
+}
+
+
+static void brw_set_ff_sync_message(struct brw_compile *p,
+				    struct brw_instruction *insn,
+				    bool allocate,
+				    GLuint response_length,
+				    bool end_of_thread)
+{
+   brw_set_message_descriptor(p, insn, BRW_SFID_URB,
+			      1, response_length, true, end_of_thread);
+   insn->bits3.urb_gen5.opcode = 1; /* FF_SYNC */
+   insn->bits3.urb_gen5.offset = 0; /* Not used by FF_SYNC */
+   insn->bits3.urb_gen5.swizzle_control = 0; /* Not used by FF_SYNC */
+   insn->bits3.urb_gen5.allocate = allocate;
+   insn->bits3.urb_gen5.used = 0; /* Not used by FF_SYNC */
+   insn->bits3.urb_gen5.complete = 0; /* Not used by FF_SYNC */
+}
+
+static void brw_set_urb_message( struct brw_compile *p,
+				 struct brw_instruction *insn,
+				 bool allocate,
+				 bool used,
+				 GLuint msg_length,
+				 GLuint response_length,
+				 bool end_of_thread,
+				 bool complete,
+				 GLuint offset,
+				 GLuint swizzle_control )
+{
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+
+   brw_set_message_descriptor(p, insn, BRW_SFID_URB,
+			      msg_length, response_length, true, end_of_thread);
+   if (intel->gen == 7) {
+      insn->bits3.urb_gen7.opcode = 0;	/* URB_WRITE_HWORD */
+      insn->bits3.urb_gen7.offset = offset;
+      assert(swizzle_control != BRW_URB_SWIZZLE_TRANSPOSE);
+      insn->bits3.urb_gen7.swizzle_control = swizzle_control;
+      /* per_slot_offset = 0 makes it ignore offsets in message header */
+      insn->bits3.urb_gen7.per_slot_offset = 0;
+      insn->bits3.urb_gen7.complete = complete;
+   } else if (intel->gen >= 5) {
+      insn->bits3.urb_gen5.opcode = 0;	/* URB_WRITE */
+      insn->bits3.urb_gen5.offset = offset;
+      insn->bits3.urb_gen5.swizzle_control = swizzle_control;
+      insn->bits3.urb_gen5.allocate = allocate;
+      insn->bits3.urb_gen5.used = used;	/* ? */
+      insn->bits3.urb_gen5.complete = complete;
+   } else {
+      insn->bits3.urb.opcode = 0;	/* ? */
+      insn->bits3.urb.offset = offset;
+      insn->bits3.urb.swizzle_control = swizzle_control;
+      insn->bits3.urb.allocate = allocate;
+      insn->bits3.urb.used = used;	/* ? */
+      insn->bits3.urb.complete = complete;
+   }
+}
+
+void
+brw_set_dp_write_message(struct brw_compile *p,
+			 struct brw_instruction *insn,
+			 GLuint binding_table_index,
+			 GLuint msg_control,
+			 GLuint msg_type,
+			 GLuint msg_length,
+			 bool header_present,
+			 GLuint last_render_target,
+			 GLuint response_length,
+			 GLuint end_of_thread,
+			 GLuint send_commit_msg)
+{
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+   unsigned sfid;
+
+   if (intel->gen >= 7) {
+      /* Use the Render Cache for RT writes; otherwise use the Data Cache */
+      if (msg_type == GEN6_DATAPORT_WRITE_MESSAGE_RENDER_TARGET_WRITE)
+	 sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
+      else
+	 sfid = GEN7_SFID_DATAPORT_DATA_CACHE;
+   } else if (intel->gen == 6) {
+      /* Use the render cache for all write messages. */
+      sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
+   } else {
+      sfid = BRW_SFID_DATAPORT_WRITE;
+   }
+
+   brw_set_message_descriptor(p, insn, sfid, msg_length, response_length,
+			      header_present, end_of_thread);
+
+   if (intel->gen >= 7) {
+      insn->bits3.gen7_dp.binding_table_index = binding_table_index;
+      insn->bits3.gen7_dp.msg_control = msg_control |
+                                        last_render_target << 6;
+      insn->bits3.gen7_dp.msg_type = msg_type;
+   } else if (intel->gen == 6) {
+      insn->bits3.gen6_dp.binding_table_index = binding_table_index;
+      insn->bits3.gen6_dp.msg_control = msg_control |
+                                        last_render_target << 5;
+      insn->bits3.gen6_dp.msg_type = msg_type;
+      insn->bits3.gen6_dp.send_commit_msg = send_commit_msg;
+   } else if (intel->gen == 5) {
+      insn->bits3.dp_write_gen5.binding_table_index = binding_table_index;
+      insn->bits3.dp_write_gen5.msg_control = msg_control;
+      insn->bits3.dp_write_gen5.last_render_target = last_render_target;
+      insn->bits3.dp_write_gen5.msg_type = msg_type;
+      insn->bits3.dp_write_gen5.send_commit_msg = send_commit_msg;
+   } else {
+      insn->bits3.dp_write.binding_table_index = binding_table_index;
+      insn->bits3.dp_write.msg_control = msg_control;
+      insn->bits3.dp_write.last_render_target = last_render_target;
+      insn->bits3.dp_write.msg_type = msg_type;
+      insn->bits3.dp_write.send_commit_msg = send_commit_msg;
+   }
+}
+
+void
+brw_set_dp_read_message(struct brw_compile *p,
+			struct brw_instruction *insn,
+			GLuint binding_table_index,
+			GLuint msg_control,
+			GLuint msg_type,
+			GLuint target_cache,
+			GLuint msg_length,
+                        bool header_present,
+			GLuint response_length)
+{
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+   unsigned sfid;
+
+   if (intel->gen >= 7) {
+      sfid = GEN7_SFID_DATAPORT_DATA_CACHE;
+   } else if (intel->gen == 6) {
+      if (target_cache == BRW_DATAPORT_READ_TARGET_RENDER_CACHE)
+	 sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
+      else
+	 sfid = GEN6_SFID_DATAPORT_SAMPLER_CACHE;
+   } else {
+      sfid = BRW_SFID_DATAPORT_READ;
+   }
+
+   brw_set_message_descriptor(p, insn, sfid, msg_length, response_length,
+			      header_present, false);
+
+   if (intel->gen >= 7) {
+      insn->bits3.gen7_dp.binding_table_index = binding_table_index;
+      insn->bits3.gen7_dp.msg_control = msg_control;
+      insn->bits3.gen7_dp.msg_type = msg_type;
+   } else if (intel->gen == 6) {
+      insn->bits3.gen6_dp.binding_table_index = binding_table_index;
+      insn->bits3.gen6_dp.msg_control = msg_control;
+      insn->bits3.gen6_dp.msg_type = msg_type;
+      insn->bits3.gen6_dp.send_commit_msg = 0;
+   } else if (intel->gen == 5) {
+      insn->bits3.dp_read_gen5.binding_table_index = binding_table_index;
+      insn->bits3.dp_read_gen5.msg_control = msg_control;
+      insn->bits3.dp_read_gen5.msg_type = msg_type;
+      insn->bits3.dp_read_gen5.target_cache = target_cache;
+   } else if (intel->is_g4x) {
+      insn->bits3.dp_read_g4x.binding_table_index = binding_table_index; /*0:7*/
+      insn->bits3.dp_read_g4x.msg_control = msg_control;  /*8:10*/
+      insn->bits3.dp_read_g4x.msg_type = msg_type;  /*11:13*/
+      insn->bits3.dp_read_g4x.target_cache = target_cache;  /*14:15*/
+   } else {
+      insn->bits3.dp_read.binding_table_index = binding_table_index; /*0:7*/
+      insn->bits3.dp_read.msg_control = msg_control;  /*8:11*/
+      insn->bits3.dp_read.msg_type = msg_type;  /*12:13*/
+      insn->bits3.dp_read.target_cache = target_cache;  /*14:15*/
+   }
+}
+
+void
+brw_set_sampler_message(struct brw_compile *p,
+                        struct brw_instruction *insn,
+                        GLuint binding_table_index,
+                        GLuint sampler,
+                        GLuint msg_type,
+                        GLuint response_length,
+                        GLuint msg_length,
+                        GLuint header_present,
+                        GLuint simd_mode,
+                        GLuint return_format)
+{
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+
+   brw_set_message_descriptor(p, insn, BRW_SFID_SAMPLER, msg_length,
+			      response_length, header_present, false);
+
+   if (intel->gen >= 7) {
+      insn->bits3.sampler_gen7.binding_table_index = binding_table_index;
+      insn->bits3.sampler_gen7.sampler = sampler;
+      insn->bits3.sampler_gen7.msg_type = msg_type;
+      insn->bits3.sampler_gen7.simd_mode = simd_mode;
+   } else if (intel->gen >= 5) {
+      insn->bits3.sampler_gen5.binding_table_index = binding_table_index;
+      insn->bits3.sampler_gen5.sampler = sampler;
+      insn->bits3.sampler_gen5.msg_type = msg_type;
+      insn->bits3.sampler_gen5.simd_mode = simd_mode;
+   } else if (intel->is_g4x) {
+      insn->bits3.sampler_g4x.binding_table_index = binding_table_index;
+      insn->bits3.sampler_g4x.sampler = sampler;
+      insn->bits3.sampler_g4x.msg_type = msg_type;
+   } else {
+      insn->bits3.sampler.binding_table_index = binding_table_index;
+      insn->bits3.sampler.sampler = sampler;
+      insn->bits3.sampler.msg_type = msg_type;
+      insn->bits3.sampler.return_format = return_format;
+   }
+}
+
+
+#define next_insn brw_next_insn
+struct brw_instruction *
+brw_next_insn(struct brw_compile *p, GLuint opcode)
+{
+   struct brw_instruction *insn;
+
+   if (p->nr_insn + 1 > p->store_size) {
+      if (0)
+         printf("increasing the store size to %d\n", p->store_size << 1);
+      p->store_size <<= 1;
+      p->store = reralloc(p->mem_ctx, p->store,
+                          struct brw_instruction, p->store_size);
+      if (!p->store)
+         assert(!"realloc eu store memory failed");
+   }
+
+   p->next_insn_offset += 16;
+   insn = &p->store[p->nr_insn++];
+   memcpy(insn, p->current, sizeof(*insn));
+
+   /* Reset this one-shot flag: 
+    */
+
+   if (p->current->header.destreg__conditionalmod) {
+      p->current->header.destreg__conditionalmod = 0;
+      p->current->header.predicate_control = BRW_PREDICATE_NORMAL;
+   }
+
+   insn->header.opcode = opcode;
+   return insn;
+}
+
+static struct brw_instruction *brw_alu1( struct brw_compile *p,
+					 GLuint opcode,
+					 struct brw_reg dest,
+					 struct brw_reg src )
+{
+   struct brw_instruction *insn = next_insn(p, opcode);
+   brw_set_dest(p, insn, dest);
+   brw_set_src0(p, insn, src);
+   return insn;
+}
+
+static struct brw_instruction *brw_alu2(struct brw_compile *p,
+					GLuint opcode,
+					struct brw_reg dest,
+					struct brw_reg src0,
+					struct brw_reg src1 )
+{
+   struct brw_instruction *insn = next_insn(p, opcode);   
+   brw_set_dest(p, insn, dest);
+   brw_set_src0(p, insn, src0);
+   brw_set_src1(p, insn, src1);
+   return insn;
+}
+
+static int
+get_3src_subreg_nr(struct brw_reg reg)
+{
+   if (reg.vstride == BRW_VERTICAL_STRIDE_0) {
+      assert(brw_is_single_value_swizzle(reg.dw1.bits.swizzle));
+      return reg.subnr / 4 + BRW_GET_SWZ(reg.dw1.bits.swizzle, 0);
+   } else {
+      return reg.subnr / 4;
+   }
+}
+
+static struct brw_instruction *brw_alu3(struct brw_compile *p,
+					GLuint opcode,
+					struct brw_reg dest,
+					struct brw_reg src0,
+					struct brw_reg src1,
+					struct brw_reg src2)
+{
+   struct brw_instruction *insn = next_insn(p, opcode);
+
+   gen7_convert_mrf_to_grf(p, &dest);
+
+   assert(insn->header.access_mode == BRW_ALIGN_16);
+
+   assert(dest.file == BRW_GENERAL_REGISTER_FILE ||
+	  dest.file == BRW_MESSAGE_REGISTER_FILE);
+   assert(dest.nr < 128);
+   assert(dest.address_mode == BRW_ADDRESS_DIRECT);
+   assert(dest.type == BRW_REGISTER_TYPE_F);
+   insn->bits1.da3src.dest_reg_file = (dest.file == BRW_MESSAGE_REGISTER_FILE);
+   insn->bits1.da3src.dest_reg_nr = dest.nr;
+   insn->bits1.da3src.dest_subreg_nr = dest.subnr / 16;
+   insn->bits1.da3src.dest_writemask = dest.dw1.bits.writemask;
+   guess_execution_size(p, insn, dest);
+
+   assert(src0.file == BRW_GENERAL_REGISTER_FILE);
+   assert(src0.address_mode == BRW_ADDRESS_DIRECT);
+   assert(src0.nr < 128);
+   assert(src0.type == BRW_REGISTER_TYPE_F);
+   insn->bits2.da3src.src0_swizzle = src0.dw1.bits.swizzle;
+   insn->bits2.da3src.src0_subreg_nr = get_3src_subreg_nr(src0);
+   insn->bits2.da3src.src0_reg_nr = src0.nr;
+   insn->bits1.da3src.src0_abs = src0.abs;
+   insn->bits1.da3src.src0_negate = src0.negate;
+   insn->bits2.da3src.src0_rep_ctrl = src0.vstride == BRW_VERTICAL_STRIDE_0;
+
+   assert(src1.file == BRW_GENERAL_REGISTER_FILE);
+   assert(src1.address_mode == BRW_ADDRESS_DIRECT);
+   assert(src1.nr < 128);
+   assert(src1.type == BRW_REGISTER_TYPE_F);
+   insn->bits2.da3src.src1_swizzle = src1.dw1.bits.swizzle;
+   insn->bits2.da3src.src1_subreg_nr_low = get_3src_subreg_nr(src1) & 0x3;
+   insn->bits3.da3src.src1_subreg_nr_high = get_3src_subreg_nr(src1) >> 2;
+   insn->bits2.da3src.src1_rep_ctrl = src1.vstride == BRW_VERTICAL_STRIDE_0;
+   insn->bits3.da3src.src1_reg_nr = src1.nr;
+   insn->bits1.da3src.src1_abs = src1.abs;
+   insn->bits1.da3src.src1_negate = src1.negate;
+
+   assert(src2.file == BRW_GENERAL_REGISTER_FILE);
+   assert(src2.address_mode == BRW_ADDRESS_DIRECT);
+   assert(src2.nr < 128);
+   assert(src2.type == BRW_REGISTER_TYPE_F);
+   insn->bits3.da3src.src2_swizzle = src2.dw1.bits.swizzle;
+   insn->bits3.da3src.src2_subreg_nr = get_3src_subreg_nr(src2);
+   insn->bits3.da3src.src2_rep_ctrl = src2.vstride == BRW_VERTICAL_STRIDE_0;
+   insn->bits3.da3src.src2_reg_nr = src2.nr;
+   insn->bits1.da3src.src2_abs = src2.abs;
+   insn->bits1.da3src.src2_negate = src2.negate;
+
+   return insn;
+}
+
+
+/***********************************************************************
+ * Convenience routines.
+ */
+#define ALU1(OP)					\
+struct brw_instruction *brw_##OP(struct brw_compile *p,	\
+	      struct brw_reg dest,			\
+	      struct brw_reg src0)   			\
+{							\
+   return brw_alu1(p, BRW_OPCODE_##OP, dest, src0);    	\
+}
+
+#define ALU2(OP)					\
+struct brw_instruction *brw_##OP(struct brw_compile *p,	\
+	      struct brw_reg dest,			\
+	      struct brw_reg src0,			\
+	      struct brw_reg src1)   			\
+{							\
+   return brw_alu2(p, BRW_OPCODE_##OP, dest, src0, src1);	\
+}
+
+#define ALU3(OP)					\
+struct brw_instruction *brw_##OP(struct brw_compile *p,	\
+	      struct brw_reg dest,			\
+	      struct brw_reg src0,			\
+	      struct brw_reg src1,			\
+	      struct brw_reg src2)   			\
+{							\
+   return brw_alu3(p, BRW_OPCODE_##OP, dest, src0, src1, src2);	\
+}
+
+/* Rounding operations (other than RNDD) require two instructions - the first
+ * stores a rounded value (possibly the wrong way) in the dest register, but
+ * also sets a per-channel "increment bit" in the flag register.  A predicated
+ * add of 1.0 fixes dest to contain the desired result.
+ *
+ * Sandybridge and later appear to round correctly without an ADD.
+ */
+#define ROUND(OP)							      \
+void brw_##OP(struct brw_compile *p,					      \
+	      struct brw_reg dest,					      \
+	      struct brw_reg src)					      \
+{									      \
+   struct brw_instruction *rnd, *add;					      \
+   rnd = next_insn(p, BRW_OPCODE_##OP);					      \
+   brw_set_dest(p, rnd, dest);						      \
+   brw_set_src0(p, rnd, src);						      \
+									      \
+   if (p->brw->intel.gen < 6) {						      \
+      /* turn on round-increments */					      \
+      rnd->header.destreg__conditionalmod = BRW_CONDITIONAL_R;		      \
+      add = brw_ADD(p, dest, dest, brw_imm_f(1.0f));			      \
+      add->header.predicate_control = BRW_PREDICATE_NORMAL;		      \
+   }									      \
+}
+
+
+ALU1(MOV)
+ALU2(SEL)
+ALU1(NOT)
+ALU2(AND)
+ALU2(OR)
+ALU2(XOR)
+ALU2(SHR)
+ALU2(SHL)
+ALU2(RSR)
+ALU2(RSL)
+ALU2(ASR)
+ALU1(FRC)
+ALU1(RNDD)
+ALU2(MAC)
+ALU2(MACH)
+ALU1(LZD)
+ALU2(DP4)
+ALU2(DPH)
+ALU2(DP3)
+ALU2(DP2)
+ALU2(LINE)
+ALU2(PLN)
+ALU3(MAD)
+
+ROUND(RNDZ)
+ROUND(RNDE)
+
+
+struct brw_instruction *brw_ADD(struct brw_compile *p,
+				struct brw_reg dest,
+				struct brw_reg src0,
+				struct brw_reg src1)
+{
+   /* 6.2.2: add */
+   if (src0.type == BRW_REGISTER_TYPE_F ||
+       (src0.file == BRW_IMMEDIATE_VALUE &&
+	src0.type == BRW_REGISTER_TYPE_VF)) {
+      assert(src1.type != BRW_REGISTER_TYPE_UD);
+      assert(src1.type != BRW_REGISTER_TYPE_D);
+   }
+
+   if (src1.type == BRW_REGISTER_TYPE_F ||
+       (src1.file == BRW_IMMEDIATE_VALUE &&
+	src1.type == BRW_REGISTER_TYPE_VF)) {
+      assert(src0.type != BRW_REGISTER_TYPE_UD);
+      assert(src0.type != BRW_REGISTER_TYPE_D);
+   }
+
+   return brw_alu2(p, BRW_OPCODE_ADD, dest, src0, src1);
+}
+
+struct brw_instruction *brw_AVG(struct brw_compile *p,
+                                struct brw_reg dest,
+                                struct brw_reg src0,
+                                struct brw_reg src1)
+{
+   assert(dest.type == src0.type);
+   assert(src0.type == src1.type);
+   switch (src0.type) {
+   case BRW_REGISTER_TYPE_B:
+   case BRW_REGISTER_TYPE_UB:
+   case BRW_REGISTER_TYPE_W:
+   case BRW_REGISTER_TYPE_UW:
+   case BRW_REGISTER_TYPE_D:
+   case BRW_REGISTER_TYPE_UD:
+      break;
+   default:
+      assert(!"Bad type for brw_AVG");
+   }
+
+   return brw_alu2(p, BRW_OPCODE_AVG, dest, src0, src1);
+}
+
+struct brw_instruction *brw_MUL(struct brw_compile *p,
+				struct brw_reg dest,
+				struct brw_reg src0,
+				struct brw_reg src1)
+{
+   /* 6.32.38: mul */
+   if (src0.type == BRW_REGISTER_TYPE_D ||
+       src0.type == BRW_REGISTER_TYPE_UD ||
+       src1.type == BRW_REGISTER_TYPE_D ||
+       src1.type == BRW_REGISTER_TYPE_UD) {
+      assert(dest.type != BRW_REGISTER_TYPE_F);
+   }
+
+   if (src0.type == BRW_REGISTER_TYPE_F ||
+       (src0.file == BRW_IMMEDIATE_VALUE &&
+	src0.type == BRW_REGISTER_TYPE_VF)) {
+      assert(src1.type != BRW_REGISTER_TYPE_UD);
+      assert(src1.type != BRW_REGISTER_TYPE_D);
+   }
+
+   if (src1.type == BRW_REGISTER_TYPE_F ||
+       (src1.file == BRW_IMMEDIATE_VALUE &&
+	src1.type == BRW_REGISTER_TYPE_VF)) {
+      assert(src0.type != BRW_REGISTER_TYPE_UD);
+      assert(src0.type != BRW_REGISTER_TYPE_D);
+   }
+
+   assert(src0.file != BRW_ARCHITECTURE_REGISTER_FILE ||
+	  src0.nr != BRW_ARF_ACCUMULATOR);
+   assert(src1.file != BRW_ARCHITECTURE_REGISTER_FILE ||
+	  src1.nr != BRW_ARF_ACCUMULATOR);
+
+   return brw_alu2(p, BRW_OPCODE_MUL, dest, src0, src1);
+}
+
+
+void brw_NOP(struct brw_compile *p)
+{
+   struct brw_instruction *insn = next_insn(p, BRW_OPCODE_NOP);   
+   brw_set_dest(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
+   brw_set_src0(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
+   brw_set_src1(p, insn, brw_imm_ud(0x0));
+}
+
+
+
+
+
+/***********************************************************************
+ * Comparisons, if/else/endif
+ */
+
+struct brw_instruction *brw_JMPI(struct brw_compile *p, 
+                                 struct brw_reg dest,
+                                 struct brw_reg src0,
+                                 struct brw_reg src1)
+{
+   struct brw_instruction *insn = brw_alu2(p, BRW_OPCODE_JMPI, dest, src0, src1);
+
+   insn->header.execution_size = 1;
+   insn->header.compression_control = BRW_COMPRESSION_NONE;
+   insn->header.mask_control = BRW_MASK_DISABLE;
+
+   p->current->header.predicate_control = BRW_PREDICATE_NONE;
+
+   return insn;
+}
+
+static void
+push_if_stack(struct brw_compile *p, struct brw_instruction *inst)
+{
+   p->if_stack[p->if_stack_depth] = inst - p->store;
+
+   p->if_stack_depth++;
+   if (p->if_stack_array_size <= p->if_stack_depth) {
+      p->if_stack_array_size *= 2;
+      p->if_stack = reralloc(p->mem_ctx, p->if_stack, int,
+			     p->if_stack_array_size);
+   }
+}
+
+static struct brw_instruction *
+pop_if_stack(struct brw_compile *p)
+{
+   p->if_stack_depth--;
+   return &p->store[p->if_stack[p->if_stack_depth]];
+}
+
+static void
+push_loop_stack(struct brw_compile *p, struct brw_instruction *inst)
+{
+   if (p->loop_stack_array_size <= p->loop_stack_depth + 1) {
+      p->loop_stack_array_size *= 2;
+      p->loop_stack = reralloc(p->mem_ctx, p->loop_stack, int,
+			       p->loop_stack_array_size);
+      p->if_depth_in_loop = reralloc(p->mem_ctx, p->if_depth_in_loop, int,
+				     p->loop_stack_array_size);
+   }
+
+   p->loop_stack[p->loop_stack_depth] = inst - p->store;
+   p->loop_stack_depth++;
+   p->if_depth_in_loop[p->loop_stack_depth] = 0;
+}
+
+static struct brw_instruction *
+get_inner_do_insn(struct brw_compile *p)
+{
+   return &p->store[p->loop_stack[p->loop_stack_depth - 1]];
+}
+
+/* EU takes the value from the flag register and pushes it onto some
+ * sort of a stack (presumably merging with any flag value already on
+ * the stack).  Within an if block, the flags at the top of the stack
+ * control execution on each channel of the unit, e.g. on each of the
+ * 16 pixel values in our wm programs.
+ *
+ * When the matching 'else' instruction is reached (presumably by
+ * countdown of the instruction count patched in by our ELSE/ENDIF
+ * functions), the relevant flags are inverted.
+ *
+ * When the matching 'endif' instruction is reached, the flags are
+ * popped off.  If the stack is now empty, normal execution resumes.
+ */
+struct brw_instruction *
+brw_IF(struct brw_compile *p, GLuint execute_size)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *insn;
+
+   insn = next_insn(p, BRW_OPCODE_IF);
+
+   /* Override the defaults for this instruction:
+    */
+   if (intel->gen < 6) {
+      brw_set_dest(p, insn, brw_ip_reg());
+      brw_set_src0(p, insn, brw_ip_reg());
+      brw_set_src1(p, insn, brw_imm_d(0x0));
+   } else if (intel->gen == 6) {
+      brw_set_dest(p, insn, brw_imm_w(0));
+      insn->bits1.branch_gen6.jump_count = 0;
+      brw_set_src0(p, insn, vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_D)));
+      brw_set_src1(p, insn, vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_D)));
+   } else {
+      brw_set_dest(p, insn, vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_D)));
+      brw_set_src0(p, insn, vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_D)));
+      brw_set_src1(p, insn, brw_imm_ud(0));
+      insn->bits3.break_cont.jip = 0;
+      insn->bits3.break_cont.uip = 0;
+   }
+
+   insn->header.execution_size = execute_size;
+   insn->header.compression_control = BRW_COMPRESSION_NONE;
+   insn->header.predicate_control = BRW_PREDICATE_NORMAL;
+   insn->header.mask_control = BRW_MASK_ENABLE;
+   if (!p->single_program_flow)
+      insn->header.thread_control = BRW_THREAD_SWITCH;
+
+   p->current->header.predicate_control = BRW_PREDICATE_NONE;
+
+   push_if_stack(p, insn);
+   p->if_depth_in_loop[p->loop_stack_depth]++;
+   return insn;
+}
+
+/* This function is only used for gen6-style IF instructions with an
+ * embedded comparison (conditional modifier).  It is not used on gen7.
+ */
+struct brw_instruction *
+gen6_IF(struct brw_compile *p, uint32_t conditional,
+	struct brw_reg src0, struct brw_reg src1)
+{
+   struct brw_instruction *insn;
+
+   insn = next_insn(p, BRW_OPCODE_IF);
+
+   brw_set_dest(p, insn, brw_imm_w(0));
+   if (p->compressed) {
+      insn->header.execution_size = BRW_EXECUTE_16;
+   } else {
+      insn->header.execution_size = BRW_EXECUTE_8;
+   }
+   insn->bits1.branch_gen6.jump_count = 0;
+   brw_set_src0(p, insn, src0);
+   brw_set_src1(p, insn, src1);
+
+   assert(insn->header.compression_control == BRW_COMPRESSION_NONE);
+   assert(insn->header.predicate_control == BRW_PREDICATE_NONE);
+   insn->header.destreg__conditionalmod = conditional;
+
+   if (!p->single_program_flow)
+      insn->header.thread_control = BRW_THREAD_SWITCH;
+
+   push_if_stack(p, insn);
+   return insn;
+}
+
+/**
+ * In single-program-flow (SPF) mode, convert IF and ELSE into ADDs.
+ */
+static void
+convert_IF_ELSE_to_ADD(struct brw_compile *p,
+		       struct brw_instruction *if_inst,
+		       struct brw_instruction *else_inst)
+{
+   /* The next instruction (where the ENDIF would be, if it existed) */
+   struct brw_instruction *next_inst = &p->store[p->nr_insn];
+
+   assert(p->single_program_flow);
+   assert(if_inst != NULL && if_inst->header.opcode == BRW_OPCODE_IF);
+   assert(else_inst == NULL || else_inst->header.opcode == BRW_OPCODE_ELSE);
+   assert(if_inst->header.execution_size == BRW_EXECUTE_1);
+
+   /* Convert IF to an ADD instruction that moves the instruction pointer
+    * to the first instruction of the ELSE block.  If there is no ELSE
+    * block, point to where ENDIF would be.  Reverse the predicate.
+    *
+    * There's no need to execute an ENDIF since we don't need to do any
+    * stack operations, and if we're currently executing, we just want to
+    * continue normally.
+    */
+   if_inst->header.opcode = BRW_OPCODE_ADD;
+   if_inst->header.predicate_inverse = 1;
+
+   if (else_inst != NULL) {
+      /* Convert ELSE to an ADD instruction that points where the ENDIF
+       * would be.
+       */
+      else_inst->header.opcode = BRW_OPCODE_ADD;
+
+      if_inst->bits3.ud = (else_inst - if_inst + 1) * 16;
+      else_inst->bits3.ud = (next_inst - else_inst) * 16;
+   } else {
+      if_inst->bits3.ud = (next_inst - if_inst) * 16;
+   }
+}
+
+/**
+ * Patch IF and ELSE instructions with appropriate jump targets.
+ */
+static void
+patch_IF_ELSE(struct brw_compile *p,
+	      struct brw_instruction *if_inst,
+	      struct brw_instruction *else_inst,
+	      struct brw_instruction *endif_inst)
+{
+   struct intel_context *intel = &p->brw->intel;
+
+   /* We shouldn't be patching IF and ELSE instructions in single program flow
+    * mode when gen < 6, because in single program flow mode on those
+    * platforms, we convert flow control instructions to conditional ADDs that
+    * operate on IP (see brw_ENDIF).
+    *
+    * However, on Gen6, writing to IP doesn't work in single program flow mode
+    * (see the SandyBridge PRM, Volume 4 part 2, p79: "When SPF is ON, IP may
+    * not be updated by non-flow control instructions.").  And on later
+    * platforms, there is no significant benefit to converting control flow
+    * instructions to conditional ADDs.  So we do patch IF and ELSE
+    * instructions in single program flow mode on those platforms.
+    */
+   if (intel->gen < 6)
+      assert(!p->single_program_flow);
+
+   assert(if_inst != NULL && if_inst->header.opcode == BRW_OPCODE_IF);
+   assert(endif_inst != NULL);
+   assert(else_inst == NULL || else_inst->header.opcode == BRW_OPCODE_ELSE);
+
+   unsigned br = 1;
+   /* Jump counts are in units of 64-bit chunks, so one 128-bit instruction
+    * takes 2 chunks.
+    */
+   if (intel->gen >= 5)
+      br = 2;
+
+   assert(endif_inst->header.opcode == BRW_OPCODE_ENDIF);
+   endif_inst->header.execution_size = if_inst->header.execution_size;
+
+   if (else_inst == NULL) {
+      /* Patch IF -> ENDIF */
+      if (intel->gen < 6) {
+	 /* Turn it into an IFF, which means no mask stack operations for
+	  * all-false and jumping past the ENDIF.
+	  */
+	 if_inst->header.opcode = BRW_OPCODE_IFF;
+	 if_inst->bits3.if_else.jump_count = br * (endif_inst - if_inst + 1);
+	 if_inst->bits3.if_else.pop_count = 0;
+	 if_inst->bits3.if_else.pad0 = 0;
+      } else if (intel->gen == 6) {
+	 /* As of gen6, there is no IFF and IF must point to the ENDIF. */
+	 if_inst->bits1.branch_gen6.jump_count = br * (endif_inst - if_inst);
+      } else {
+	 if_inst->bits3.break_cont.uip = br * (endif_inst - if_inst);
+	 if_inst->bits3.break_cont.jip = br * (endif_inst - if_inst);
+      }
+   } else {
+      else_inst->header.execution_size = if_inst->header.execution_size;
+
+      /* Patch IF -> ELSE */
+      if (intel->gen < 6) {
+	 if_inst->bits3.if_else.jump_count = br * (else_inst - if_inst);
+	 if_inst->bits3.if_else.pop_count = 0;
+	 if_inst->bits3.if_else.pad0 = 0;
+      } else if (intel->gen == 6) {
+	 if_inst->bits1.branch_gen6.jump_count = br * (else_inst - if_inst + 1);
+      }
+
+      /* Patch ELSE -> ENDIF */
+      if (intel->gen < 6) {
+	 /* BRW_OPCODE_ELSE pre-gen6 should point just past the
+	  * matching ENDIF.
+	  */
+	 else_inst->bits3.if_else.jump_count = br*(endif_inst - else_inst + 1);
+	 else_inst->bits3.if_else.pop_count = 1;
+	 else_inst->bits3.if_else.pad0 = 0;
+      } else if (intel->gen == 6) {
+	 /* BRW_OPCODE_ELSE on gen6 should point to the matching ENDIF. */
+	 else_inst->bits1.branch_gen6.jump_count = br*(endif_inst - else_inst);
+      } else {
+	 /* The IF instruction's JIP should point just past the ELSE */
+	 if_inst->bits3.break_cont.jip = br * (else_inst - if_inst + 1);
+	 /* The IF instruction's UIP and ELSE's JIP should point to ENDIF */
+	 if_inst->bits3.break_cont.uip = br * (endif_inst - if_inst);
+	 else_inst->bits3.break_cont.jip = br * (endif_inst - else_inst);
+      }
+   }
+}
+
+void
+brw_ELSE(struct brw_compile *p)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *insn;
+
+   insn = next_insn(p, BRW_OPCODE_ELSE);
+
+   if (intel->gen < 6) {
+      brw_set_dest(p, insn, brw_ip_reg());
+      brw_set_src0(p, insn, brw_ip_reg());
+      brw_set_src1(p, insn, brw_imm_d(0x0));
+   } else if (intel->gen == 6) {
+      brw_set_dest(p, insn, brw_imm_w(0));
+      insn->bits1.branch_gen6.jump_count = 0;
+      brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src1(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+   } else {
+      brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src1(p, insn, brw_imm_ud(0));
+      insn->bits3.break_cont.jip = 0;
+      insn->bits3.break_cont.uip = 0;
+   }
+
+   insn->header.compression_control = BRW_COMPRESSION_NONE;
+   insn->header.mask_control = BRW_MASK_ENABLE;
+   if (!p->single_program_flow)
+      insn->header.thread_control = BRW_THREAD_SWITCH;
+
+   push_if_stack(p, insn);
+}
+
+void
+brw_ENDIF(struct brw_compile *p)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *insn = NULL;
+   struct brw_instruction *else_inst = NULL;
+   struct brw_instruction *if_inst = NULL;
+   struct brw_instruction *tmp;
+   bool emit_endif = true;
+
+   /* In single program flow mode, we can express IF and ELSE instructions
+    * equivalently as ADD instructions that operate on IP.  On platforms prior
+    * to Gen6, flow control instructions cause an implied thread switch, so
+    * this is a significant savings.
+    *
+    * However, on Gen6, writing to IP doesn't work in single program flow mode
+    * (see the SandyBridge PRM, Volume 4 part 2, p79: "When SPF is ON, IP may
+    * not be updated by non-flow control instructions.").  And on later
+    * platforms, there is no significant benefit to converting control flow
+    * instructions to conditional ADDs.  So we only do this trick on Gen4 and
+    * Gen5.
+    */
+   if (intel->gen < 6 && p->single_program_flow)
+      emit_endif = false;
+
+   /*
+    * A single next_insn() may change the base address of the instruction
+    * store memory (p->store), so call it first, before deriving any
+    * instruction pointer from an index.
+    */
+   if (emit_endif)
+      insn = next_insn(p, BRW_OPCODE_ENDIF);
+
+   /* Pop the IF and (optional) ELSE instructions from the stack */
+   p->if_depth_in_loop[p->loop_stack_depth]--;
+   tmp = pop_if_stack(p);
+   if (tmp->header.opcode == BRW_OPCODE_ELSE) {
+      else_inst = tmp;
+      tmp = pop_if_stack(p);
+   }
+   if_inst = tmp;
+
+   if (!emit_endif) {
+      /* ENDIF is useless; don't bother emitting it. */
+      convert_IF_ELSE_to_ADD(p, if_inst, else_inst);
+      return;
+   }
+
+   if (intel->gen < 6) {
+      brw_set_dest(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
+      brw_set_src0(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
+      brw_set_src1(p, insn, brw_imm_d(0x0));
+   } else if (intel->gen == 6) {
+      brw_set_dest(p, insn, brw_imm_w(0));
+      brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src1(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+   } else {
+      brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src1(p, insn, brw_imm_ud(0));
+   }
+
+   insn->header.compression_control = BRW_COMPRESSION_NONE;
+   insn->header.mask_control = BRW_MASK_ENABLE;
+   insn->header.thread_control = BRW_THREAD_SWITCH;
+
+   /* Also pop item off the stack in the endif instruction: */
+   if (intel->gen < 6) {
+      insn->bits3.if_else.jump_count = 0;
+      insn->bits3.if_else.pop_count = 1;
+      insn->bits3.if_else.pad0 = 0;
+   } else if (intel->gen == 6) {
+      insn->bits1.branch_gen6.jump_count = 2;
+   } else {
+      insn->bits3.break_cont.jip = 2;
+   }
+   patch_IF_ELSE(p, if_inst, else_inst, insn);
+}
+
+struct brw_instruction *brw_BREAK(struct brw_compile *p)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *insn;
+
+   insn = next_insn(p, BRW_OPCODE_BREAK);
+   if (intel->gen >= 6) {
+      brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src1(p, insn, brw_imm_d(0x0));
+   } else {
+      brw_set_dest(p, insn, brw_ip_reg());
+      brw_set_src0(p, insn, brw_ip_reg());
+      brw_set_src1(p, insn, brw_imm_d(0x0));
+      insn->bits3.if_else.pad0 = 0;
+      insn->bits3.if_else.pop_count = p->if_depth_in_loop[p->loop_stack_depth];
+   }
+   insn->header.compression_control = BRW_COMPRESSION_NONE;
+   insn->header.execution_size = BRW_EXECUTE_8;
+
+   return insn;
+}
+
+struct brw_instruction *gen6_CONT(struct brw_compile *p)
+{
+   struct brw_instruction *insn;
+
+   insn = next_insn(p, BRW_OPCODE_CONTINUE);
+   brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+   brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+   brw_set_dest(p, insn, brw_ip_reg());
+   brw_set_src0(p, insn, brw_ip_reg());
+   brw_set_src1(p, insn, brw_imm_d(0x0));
+
+   insn->header.compression_control = BRW_COMPRESSION_NONE;
+   insn->header.execution_size = BRW_EXECUTE_8;
+   return insn;
+}
+
+struct brw_instruction *brw_CONT(struct brw_compile *p)
+{
+   struct brw_instruction *insn;
+   insn = next_insn(p, BRW_OPCODE_CONTINUE);
+   brw_set_dest(p, insn, brw_ip_reg());
+   brw_set_src0(p, insn, brw_ip_reg());
+   brw_set_src1(p, insn, brw_imm_d(0x0));
+   insn->header.compression_control = BRW_COMPRESSION_NONE;
+   insn->header.execution_size = BRW_EXECUTE_8;
+   /* insn->header.mask_control = BRW_MASK_DISABLE; */
+   insn->bits3.if_else.pad0 = 0;
+   insn->bits3.if_else.pop_count = p->if_depth_in_loop[p->loop_stack_depth];
+   return insn;
+}
+
+struct brw_instruction *gen6_HALT(struct brw_compile *p)
+{
+   struct brw_instruction *insn;
+
+   insn = next_insn(p, BRW_OPCODE_HALT);
+   brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+   brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+   brw_set_src1(p, insn, brw_imm_d(0x0)); /* UIP and JIP, updated later. */
+
+   if (p->compressed) {
+      insn->header.execution_size = BRW_EXECUTE_16;
+   } else {
+      insn->header.compression_control = BRW_COMPRESSION_NONE;
+      insn->header.execution_size = BRW_EXECUTE_8;
+   }
+   return insn;
+}
+
+/* DO/WHILE loop:
+ *
+ * The DO/WHILE is just an unterminated loop -- break or continue are
+ * used for control within the loop.  We have a few ways they can be
+ * done.
+ *
+ * For uniform control flow, the WHILE is just a jump, so ADD ip, ip,
+ * jip and no DO instruction.
+ *
+ * For non-uniform control flow pre-gen6, there's a DO instruction to
+ * push the mask, and a WHILE to jump back, and BREAK to get out and
+ * pop the mask.
+ *
+ * For gen6, there's no more mask stack, so no need for DO.  WHILE
+ * just points back to the first instruction of the loop.
+ */
+struct brw_instruction *brw_DO(struct brw_compile *p, GLuint execute_size)
+{
+   struct intel_context *intel = &p->brw->intel;
+
+   if (intel->gen >= 6 || p->single_program_flow) {
+      push_loop_stack(p, &p->store[p->nr_insn]);
+      return &p->store[p->nr_insn];
+   } else {
+      struct brw_instruction *insn = next_insn(p, BRW_OPCODE_DO);
+
+      push_loop_stack(p, insn);
+
+      /* Override the defaults for this instruction:
+       */
+      brw_set_dest(p, insn, brw_null_reg());
+      brw_set_src0(p, insn, brw_null_reg());
+      brw_set_src1(p, insn, brw_null_reg());
+
+      insn->header.compression_control = BRW_COMPRESSION_NONE;
+      insn->header.execution_size = execute_size;
+      insn->header.predicate_control = BRW_PREDICATE_NONE;
+      /* insn->header.mask_control = BRW_MASK_ENABLE; */
+      /* insn->header.mask_control = BRW_MASK_DISABLE; */
+
+      return insn;
+   }
+}
+
+/**
+ * For pre-gen6, we patch BREAK/CONT instructions to point at the WHILE
+ * instruction here.
+ *
+ * For gen6+, see brw_set_uip_jip(), which doesn't care so much about the loop
+ * nesting, since it can always just point to the end of the block/current loop.
+ */
+static void
+brw_patch_break_cont(struct brw_compile *p, struct brw_instruction *while_inst)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *do_inst = get_inner_do_insn(p);
+   struct brw_instruction *inst;
+   int br = (intel->gen == 5) ? 2 : 1;
+
+   for (inst = while_inst - 1; inst != do_inst; inst--) {
+      /* If the jump count is != 0, that means that this instruction has already
+       * been patched because it's part of a loop inside of the one we're
+       * patching.
+       */
+      if (inst->header.opcode == BRW_OPCODE_BREAK &&
+	  inst->bits3.if_else.jump_count == 0) {
+	 inst->bits3.if_else.jump_count = br * ((while_inst - inst) + 1);
+      } else if (inst->header.opcode == BRW_OPCODE_CONTINUE &&
+		 inst->bits3.if_else.jump_count == 0) {
+	 inst->bits3.if_else.jump_count = br * (while_inst - inst);
+      }
+   }
+}
+
+struct brw_instruction *brw_WHILE(struct brw_compile *p)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *insn, *do_insn;
+   GLuint br = 1;
+
+   if (intel->gen >= 5)
+      br = 2;
+
+   if (intel->gen >= 7) {
+      insn = next_insn(p, BRW_OPCODE_WHILE);
+      do_insn = get_inner_do_insn(p);
+
+      brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src1(p, insn, brw_imm_ud(0));
+      insn->bits3.break_cont.jip = br * (do_insn - insn);
+
+      insn->header.execution_size = BRW_EXECUTE_8;
+   } else if (intel->gen == 6) {
+      insn = next_insn(p, BRW_OPCODE_WHILE);
+      do_insn = get_inner_do_insn(p);
+
+      brw_set_dest(p, insn, brw_imm_w(0));
+      insn->bits1.branch_gen6.jump_count = br * (do_insn - insn);
+      brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+      brw_set_src1(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
+
+      insn->header.execution_size = BRW_EXECUTE_8;
+   } else {
+      if (p->single_program_flow) {
+	 insn = next_insn(p, BRW_OPCODE_ADD);
+         do_insn = get_inner_do_insn(p);
+
+	 brw_set_dest(p, insn, brw_ip_reg());
+	 brw_set_src0(p, insn, brw_ip_reg());
+	 brw_set_src1(p, insn, brw_imm_d((do_insn - insn) * 16));
+	 insn->header.execution_size = BRW_EXECUTE_1;
+      } else {
+	 insn = next_insn(p, BRW_OPCODE_WHILE);
+         do_insn = get_inner_do_insn(p);
+
+	 assert(do_insn->header.opcode == BRW_OPCODE_DO);
+
+	 brw_set_dest(p, insn, brw_ip_reg());
+	 brw_set_src0(p, insn, brw_ip_reg());
+	 brw_set_src1(p, insn, brw_imm_d(0));
+
+	 insn->header.execution_size = do_insn->header.execution_size;
+	 insn->bits3.if_else.jump_count = br * (do_insn - insn + 1);
+	 insn->bits3.if_else.pop_count = 0;
+	 insn->bits3.if_else.pad0 = 0;
+
+	 brw_patch_break_cont(p, insn);
+      }
+   }
+   insn->header.compression_control = BRW_COMPRESSION_NONE;
+   p->current->header.predicate_control = BRW_PREDICATE_NONE;
+
+   p->loop_stack_depth--;
+
+   return insn;
+}
+
+
+/* FORWARD JUMPS:
+ */
+void brw_land_fwd_jump(struct brw_compile *p, int jmp_insn_idx)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *jmp_insn = &p->store[jmp_insn_idx];
+   GLuint jmpi = 1;
+
+   if (intel->gen >= 5)
+      jmpi = 2;
+
+   assert(jmp_insn->header.opcode == BRW_OPCODE_JMPI);
+   assert(jmp_insn->bits1.da1.src1_reg_file == BRW_IMMEDIATE_VALUE);
+
+   jmp_insn->bits3.ud = jmpi * (p->nr_insn - jmp_insn_idx - 1);
+}
+
+
+
+/* To integrate with the above, it makes sense that the comparison
+ * instruction should populate the flag register.  It might be simpler
+ * just to use the flag reg for most WM tasks?
+ */
+void brw_CMP(struct brw_compile *p,
+	     struct brw_reg dest,
+	     GLuint conditional,
+	     struct brw_reg src0,
+	     struct brw_reg src1)
+{
+   struct brw_instruction *insn = next_insn(p, BRW_OPCODE_CMP);
+
+   insn->header.destreg__conditionalmod = conditional;
+   brw_set_dest(p, insn, dest);
+   brw_set_src0(p, insn, src0);
+   brw_set_src1(p, insn, src1);
+
+/*    guess_execution_size(insn, src0); */
+
+
+   /* Make it so that future instructions will use the computed flag
+    * value until brw_set_predicate_control_flag_value() is called
+    * again.  
+    */
+   if (dest.file == BRW_ARCHITECTURE_REGISTER_FILE &&
+       dest.nr == 0) {
+      p->current->header.predicate_control = BRW_PREDICATE_NORMAL;
+      p->flag_value = 0xff;
+   }
+}
+
+/* Issue a 'wait' instruction on notification register n1; the host can
+   program MMIO to wake up the thread. */
+void brw_WAIT (struct brw_compile *p)
+{
+   struct brw_instruction *insn = next_insn(p, BRW_OPCODE_WAIT);
+   struct brw_reg src = brw_notification_1_reg();
+
+   brw_set_dest(p, insn, src);
+   brw_set_src0(p, insn, src);
+   brw_set_src1(p, insn, brw_null_reg());
+   insn->header.execution_size = 0; /* must */
+   insn->header.predicate_control = 0;
+   insn->header.compression_control = 0;
+}
+
+
+/***********************************************************************
+ * Helpers for the various SEND message types:
+ */
+
+/** Extended math function, float[8].
+ */
+void brw_math( struct brw_compile *p,
+	       struct brw_reg dest,
+	       GLuint function,
+	       GLuint msg_reg_nr,
+	       struct brw_reg src,
+	       GLuint data_type,
+	       GLuint precision )
+{
+   struct intel_context *intel = &p->brw->intel;
+
+   if (intel->gen >= 6) {
+      struct brw_instruction *insn = next_insn(p, BRW_OPCODE_MATH);
+
+      assert(dest.file == BRW_GENERAL_REGISTER_FILE);
+      assert(src.file == BRW_GENERAL_REGISTER_FILE);
+
+      assert(dest.hstride == BRW_HORIZONTAL_STRIDE_1);
+      if (intel->gen == 6)
+	 assert(src.hstride == BRW_HORIZONTAL_STRIDE_1);
+
+      /* Source modifiers are ignored for extended math instructions on Gen6. */
+      if (intel->gen == 6) {
+	 assert(!src.negate);
+	 assert(!src.abs);
+      }
+
+      if (function == BRW_MATH_FUNCTION_INT_DIV_QUOTIENT ||
+	  function == BRW_MATH_FUNCTION_INT_DIV_REMAINDER ||
+	  function == BRW_MATH_FUNCTION_INT_DIV_QUOTIENT_AND_REMAINDER) {
+	 assert(src.type != BRW_REGISTER_TYPE_F);
+      } else {
+	 assert(src.type == BRW_REGISTER_TYPE_F);
+      }
+
+      /* Math is the same ISA format as other opcodes, except that CondModifier
+       * becomes FC[3:0] and ThreadCtrl becomes FC[5:4].
+       */
+      insn->header.destreg__conditionalmod = function;
+
+      brw_set_dest(p, insn, dest);
+      brw_set_src0(p, insn, src);
+      brw_set_src1(p, insn, brw_null_reg());
+   } else {
+      struct brw_instruction *insn = next_insn(p, BRW_OPCODE_SEND);
+
+      /* Example code doesn't set predicate_control for send
+       * instructions.
+       */
+      insn->header.predicate_control = 0;
+      insn->header.destreg__conditionalmod = msg_reg_nr;
+
+      brw_set_dest(p, insn, dest);
+      brw_set_src0(p, insn, src);
+      brw_set_math_message(p,
+			   insn,
+			   function,
+			   src.type == BRW_REGISTER_TYPE_D,
+			   precision,
+			   data_type);
+   }
+}
+
+/** Extended math function with two source operands (gen6+).
+ */
+void brw_math2(struct brw_compile *p,
+	       struct brw_reg dest,
+	       GLuint function,
+	       struct brw_reg src0,
+	       struct brw_reg src1)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *insn = next_insn(p, BRW_OPCODE_MATH);
+
+   assert(intel->gen >= 6);
+   (void) intel;
+
+
+   assert(dest.file == BRW_GENERAL_REGISTER_FILE);
+   assert(src0.file == BRW_GENERAL_REGISTER_FILE);
+   assert(src1.file == BRW_GENERAL_REGISTER_FILE);
+
+   assert(dest.hstride == BRW_HORIZONTAL_STRIDE_1);
+   if (intel->gen == 6) {
+      assert(src0.hstride == BRW_HORIZONTAL_STRIDE_1);
+      assert(src1.hstride == BRW_HORIZONTAL_STRIDE_1);
+   }
+
+   if (function == BRW_MATH_FUNCTION_INT_DIV_QUOTIENT ||
+       function == BRW_MATH_FUNCTION_INT_DIV_REMAINDER ||
+       function == BRW_MATH_FUNCTION_INT_DIV_QUOTIENT_AND_REMAINDER) {
+      assert(src0.type != BRW_REGISTER_TYPE_F);
+      assert(src1.type != BRW_REGISTER_TYPE_F);
+   } else {
+      assert(src0.type == BRW_REGISTER_TYPE_F);
+      assert(src1.type == BRW_REGISTER_TYPE_F);
+   }
+
+   /* Source modifiers are ignored for extended math instructions on Gen6. */
+   if (intel->gen == 6) {
+      assert(!src0.negate);
+      assert(!src0.abs);
+      assert(!src1.negate);
+      assert(!src1.abs);
+   }
+
+   /* Math is the same ISA format as other opcodes, except that CondModifier
+    * becomes FC[3:0] and ThreadCtrl becomes FC[5:4].
+    */
+   insn->header.destreg__conditionalmod = function;
+
+   brw_set_dest(p, insn, dest);
+   brw_set_src0(p, insn, src0);
+   brw_set_src1(p, insn, src1);
+}
+
+
+/**
+ * Write a block of OWORDs (half a GRF each) to the scratch buffer,
+ * using a constant offset per channel.
+ *
+ * The offset must be aligned to oword size (16 bytes).  Used for
+ * register spilling.
+ */
+void brw_oword_block_write_scratch(struct brw_compile *p,
+				   struct brw_reg mrf,
+				   int num_regs,
+				   GLuint offset)
+{
+   struct intel_context *intel = &p->brw->intel;
+   uint32_t msg_control, msg_type;
+   int mlen;
+
+   if (intel->gen >= 6)
+      offset /= 16;
+
+   mrf = retype(mrf, BRW_REGISTER_TYPE_UD);
+
+   if (num_regs == 1) {
+      msg_control = BRW_DATAPORT_OWORD_BLOCK_2_OWORDS;
+      mlen = 2;
+   } else {
+      msg_control = BRW_DATAPORT_OWORD_BLOCK_4_OWORDS;
+      mlen = 3;
+   }
+
+   /* Set up the message header.  This is g0, with g0.2 filled with
+    * the offset.  We don't want to leave our offset around in g0 or
+    * it'll screw up texture samples, so set it up inside the message
+    * reg.
+    */
+   {
+      brw_push_insn_state(p);
+      brw_set_mask_control(p, BRW_MASK_DISABLE);
+      brw_set_compression_control(p, BRW_COMPRESSION_NONE);
+
+      brw_MOV(p, mrf, retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD));
+
+      /* set message header global offset field (reg 0, element 2) */
+      brw_MOV(p,
+	      retype(brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE,
+				  mrf.nr,
+				  2), BRW_REGISTER_TYPE_UD),
+	      brw_imm_ud(offset));
+
+      brw_pop_insn_state(p);
+   }
+
+   {
+      struct brw_reg dest;
+      struct brw_instruction *insn = next_insn(p, BRW_OPCODE_SEND);
+      int send_commit_msg;
+      struct brw_reg src_header = retype(brw_vec8_grf(0, 0),
+					 BRW_REGISTER_TYPE_UW);
+
+      if (insn->header.compression_control != BRW_COMPRESSION_NONE) {
+	 insn->header.compression_control = BRW_COMPRESSION_NONE;
+	 src_header = vec16(src_header);
+      }
+      assert(insn->header.predicate_control == BRW_PREDICATE_NONE);
+      insn->header.destreg__conditionalmod = mrf.nr;
+
+      /* Until gen6, writes followed by reads from the same location
+       * are not guaranteed to be ordered unless write_commit is set.
+       * If set, then a no-op write is issued to the destination
+       * register to set a dependency, and a read from the destination
+       * can be used to ensure the ordering.
+       *
+       * For gen6, only writes between different threads need ordering
+       * protection.  Our use of DP writes is all about register
+       * spilling within a thread.
+       */
+      if (intel->gen >= 6) {
+	 dest = retype(vec16(brw_null_reg()), BRW_REGISTER_TYPE_UW);
+	 send_commit_msg = 0;
+      } else {
+	 dest = src_header;
+	 send_commit_msg = 1;
+      }
+
+      brw_set_dest(p, insn, dest);
+      if (intel->gen >= 6) {
+	 brw_set_src0(p, insn, mrf);
+      } else {
+	 brw_set_src0(p, insn, brw_null_reg());
+      }
+
+      if (intel->gen >= 6)
+	 msg_type = GEN6_DATAPORT_WRITE_MESSAGE_OWORD_BLOCK_WRITE;
+      else
+	 msg_type = BRW_DATAPORT_WRITE_MESSAGE_OWORD_BLOCK_WRITE;
+
+      brw_set_dp_write_message(p,
+			       insn,
+			       255, /* binding table index (255=stateless) */
+			       msg_control,
+			       msg_type,
+			       mlen,
+			       true, /* header_present */
+			       0, /* not a render target */
+			       send_commit_msg, /* response_length */
+			       0, /* eot */
+			       send_commit_msg);
+   }
+}
+
+
+/**
+ * Read a block of owords (half a GRF each) from the scratch buffer
+ * using a constant index per channel.
+ *
+ * Offset must be aligned to oword size (16 bytes).  Used for register
+ * spilling.
+ */
+void
+brw_oword_block_read_scratch(struct brw_compile *p,
+			     struct brw_reg dest,
+			     struct brw_reg mrf,
+			     int num_regs,
+			     GLuint offset)
+{
+   struct intel_context *intel = &p->brw->intel;
+   uint32_t msg_control;
+   int rlen;
+
+   if (intel->gen >= 6)
+      offset /= 16;
+
+   mrf = retype(mrf, BRW_REGISTER_TYPE_UD);
+   dest = retype(dest, BRW_REGISTER_TYPE_UW);
+
+   if (num_regs == 1) {
+      msg_control = BRW_DATAPORT_OWORD_BLOCK_2_OWORDS;
+      rlen = 1;
+   } else {
+      msg_control = BRW_DATAPORT_OWORD_BLOCK_4_OWORDS;
+      rlen = 2;
+   }
+
+   {
+      brw_push_insn_state(p);
+      brw_set_compression_control(p, BRW_COMPRESSION_NONE);
+      brw_set_mask_control(p, BRW_MASK_DISABLE);
+
+      brw_MOV(p, mrf, retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD));
+
+      /* set message header global offset field (reg 0, element 2) */
+      brw_MOV(p,
+	      retype(brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE,
+				  mrf.nr,
+				  2), BRW_REGISTER_TYPE_UD),
+	      brw_imm_ud(offset));
+
+      brw_pop_insn_state(p);
+   }
+
+   {
+      struct brw_instruction *insn = next_insn(p, BRW_OPCODE_SEND);
+
+      assert(insn->header.predicate_control == 0);
+      insn->header.compression_control = BRW_COMPRESSION_NONE;
+      insn->header.destreg__conditionalmod = mrf.nr;
+
+      brw_set_dest(p, insn, dest);	/* UW? */
+      if (intel->gen >= 6) {
+	 brw_set_src0(p, insn, mrf);
+      } else {
+	 brw_set_src0(p, insn, brw_null_reg());
+      }
+
+      brw_set_dp_read_message(p,
+			      insn,
+			      255, /* binding table index (255=stateless) */
+			      msg_control,
+			      BRW_DATAPORT_READ_MESSAGE_OWORD_BLOCK_READ, /* msg_type */
+			      BRW_DATAPORT_READ_TARGET_RENDER_CACHE,
+			      1, /* msg_length */
+                              true, /* header_present */
+			      rlen);
+   }
+}
+
+/**
+ * Read a float[4] vector from the data port Data Cache (const buffer).
+ * Location (in buffer) should be a multiple of 16.
+ * Used for fetching shader constants.
+ */
+void brw_oword_block_read(struct brw_compile *p,
+			  struct brw_reg dest,
+			  struct brw_reg mrf,
+			  uint32_t offset,
+			  uint32_t bind_table_index)
+{
+   struct intel_context *intel = &p->brw->intel;
+
+   /* On newer hardware, offset is in units of owords. */
+   if (intel->gen >= 6)
+      offset /= 16;
+
+   mrf = retype(mrf, BRW_REGISTER_TYPE_UD);
+
+   brw_push_insn_state(p);
+   brw_set_predicate_control(p, BRW_PREDICATE_NONE);
+   brw_set_compression_control(p, BRW_COMPRESSION_NONE);
+   brw_set_mask_control(p, BRW_MASK_DISABLE);
+
+   brw_MOV(p, mrf, retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD));
+
+   /* set message header global offset field (reg 0, element 2) */
+   brw_MOV(p,
+	   retype(brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE,
+			       mrf.nr,
+			       2), BRW_REGISTER_TYPE_UD),
+	   brw_imm_ud(offset));
+
+   struct brw_instruction *insn = next_insn(p, BRW_OPCODE_SEND);
+   insn->header.destreg__conditionalmod = mrf.nr;
+
+   /* cast dest to a uword[8] vector */
+   dest = retype(vec8(dest), BRW_REGISTER_TYPE_UW);
+
+   brw_set_dest(p, insn, dest);
+   if (intel->gen >= 6) {
+      brw_set_src0(p, insn, mrf);
+   } else {
+      brw_set_src0(p, insn, brw_null_reg());
+   }
+
+   brw_set_dp_read_message(p,
+			   insn,
+			   bind_table_index,
+			   BRW_DATAPORT_OWORD_BLOCK_1_OWORDLOW,
+			   BRW_DATAPORT_READ_MESSAGE_OWORD_BLOCK_READ,
+			   BRW_DATAPORT_READ_TARGET_DATA_CACHE,
+			   1, /* msg_length */
+                           true, /* header_present */
+			   1); /* response_length (1 reg, 2 owords!) */
+
+   brw_pop_insn_state(p);
+}
+
+
+void brw_fb_WRITE(struct brw_compile *p,
+		  int dispatch_width,
+                  GLuint msg_reg_nr,
+                  struct brw_reg src0,
+                  GLuint msg_control,
+                  GLuint binding_table_index,
+                  GLuint msg_length,
+                  GLuint response_length,
+                  bool eot,
+                  bool header_present)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *insn;
+   GLuint msg_type;
+   struct brw_reg dest;
+
+   if (dispatch_width == 16)
+      dest = retype(vec16(brw_null_reg()), BRW_REGISTER_TYPE_UW);
+   else
+      dest = retype(vec8(brw_null_reg()), BRW_REGISTER_TYPE_UW);
+
+   if (intel->gen >= 6) {
+      insn = next_insn(p, BRW_OPCODE_SENDC);
+   } else {
+      insn = next_insn(p, BRW_OPCODE_SEND);
+   }
+   /* The execution mask is ignored for render target writes. */
+   insn->header.predicate_control = 0;
+   insn->header.compression_control = BRW_COMPRESSION_NONE;
+
+   if (intel->gen >= 6) {
+      /* headerless version, just submit color payload */
+      src0 = brw_message_reg(msg_reg_nr);
+
+      msg_type = GEN6_DATAPORT_WRITE_MESSAGE_RENDER_TARGET_WRITE;
+   } else {
+      insn->header.destreg__conditionalmod = msg_reg_nr;
+
+      msg_type = BRW_DATAPORT_WRITE_MESSAGE_RENDER_TARGET_WRITE;
+   }
+
+   brw_set_dest(p, insn, dest);
+   brw_set_src0(p, insn, src0);
+   brw_set_dp_write_message(p,
+			    insn,
+			    binding_table_index,
+			    msg_control,
+			    msg_type,
+			    msg_length,
+			    header_present,
+			    eot, /* last render target write */
+			    response_length,
+			    eot,
+			    0 /* send_commit_msg */);
+}
+
+
+/**
+ * Texture sample instruction.
+ * Note: the msg_type plus msg_length values determine exactly what kind
+ * of sampling operation is performed.  See volume 4, page 161 of docs.
+ */
+void brw_SAMPLE(struct brw_compile *p,
+		struct brw_reg dest,
+		GLuint msg_reg_nr,
+		struct brw_reg src0,
+		GLuint binding_table_index,
+		GLuint sampler,
+		GLuint writemask,
+		GLuint msg_type,
+		GLuint response_length,
+		GLuint msg_length,
+		GLuint header_present,
+		GLuint simd_mode,
+		GLuint return_format)
+{
+   struct intel_context *intel = &p->brw->intel;
+   bool need_stall = 0;
+
+   if (writemask == 0) {
+      /*printf("%s: zero writemask??\n", __FUNCTION__); */
+      return;
+   }
+   
+   /* Hardware doesn't do destination dependency checking on send
+    * instructions properly.  Add a workaround which generates the
+    * dependency by other means.  In practice it seems like this bug
+    * only crops up for texture samples, and only where registers are
+    * written by the send and then written again later without being
+    * read in between.  Luckily for us, we already track that
+    * information and use it to modify the writemask for the
+    * instruction, so that is a guide for whether a workaround is
+    * needed.
+    */
+   if (writemask != BRW_WRITEMASK_XYZW) {
+      GLuint dst_offset = 0;
+      GLuint i, newmask = 0, len = 0;
+
+      for (i = 0; i < 4; i++) {
+	 if (writemask & (1<<i))
+	    break;
+	 dst_offset += 2;
+      }
+      for (; i < 4; i++) {
+	 if (!(writemask & (1<<i)))
+	    break;
+	 newmask |= 1<<i;
+	 len++;
+      }
+
+      if (newmask != writemask) {
+	 need_stall = 1;
+         /* printf("need stall %x %x\n", newmask , writemask); */
+      }
+      else {
+	 bool dispatch_16 = false;
+
+	 struct brw_reg m1 = brw_message_reg(msg_reg_nr);
+
+	 guess_execution_size(p, p->current, dest);
+	 if (p->current->header.execution_size == BRW_EXECUTE_16)
+	    dispatch_16 = true;
+
+	 newmask = ~newmask & BRW_WRITEMASK_XYZW;
+
+	 brw_push_insn_state(p);
+
+	 brw_set_compression_control(p, BRW_COMPRESSION_NONE);
+	 brw_set_mask_control(p, BRW_MASK_DISABLE);
+
+	 brw_MOV(p, retype(m1, BRW_REGISTER_TYPE_UD),
+		 retype(brw_vec8_grf(0,0), BRW_REGISTER_TYPE_UD));
+  	 brw_MOV(p, get_element_ud(m1, 2), brw_imm_ud(newmask << 12)); 
+
+	 brw_pop_insn_state(p);
+
+  	 src0 = retype(brw_null_reg(), BRW_REGISTER_TYPE_UW); 
+	 dest = offset(dest, dst_offset);
+
+	 /* For 16-wide dispatch, masked channels are skipped in the
+	  * response.  For 8-wide, masked channels still take up slots,
+	  * and are just not written to.
+	  */
+	 if (dispatch_16)
+	    response_length = len * 2;
+      }
+   }
+
+   {
+      struct brw_instruction *insn;
+   
+      gen6_resolve_implied_move(p, &src0, msg_reg_nr);
+
+      insn = next_insn(p, BRW_OPCODE_SEND);
+      insn->header.predicate_control = 0; /* XXX */
+      insn->header.compression_control = BRW_COMPRESSION_NONE;
+      if (intel->gen < 6)
+	  insn->header.destreg__conditionalmod = msg_reg_nr;
+
+      brw_set_dest(p, insn, dest);
+      brw_set_src0(p, insn, src0);
+      brw_set_sampler_message(p, insn,
+			      binding_table_index,
+			      sampler,
+			      msg_type,
+			      response_length, 
+			      msg_length,
+			      header_present,
+			      simd_mode,
+			      return_format);
+   }
+
+   if (need_stall) {
+      struct brw_reg reg = vec8(offset(dest, response_length-1));
+
+      /*  mov (8) r9.0<1>:f    r9.0<8;8,1>:f    { Align1 }
+       */
+      brw_push_insn_state(p);
+      brw_set_compression_control(p, BRW_COMPRESSION_NONE);
+      brw_MOV(p, retype(reg, BRW_REGISTER_TYPE_UD),
+	      retype(reg, BRW_REGISTER_TYPE_UD));
+      brw_pop_insn_state(p);
+   }
+
+}
+
+/* All these variables are pretty confusing - we might be better off
+ * using bitmasks and macros for this, in the old style.  Or perhaps
+ * just having the caller instantiate the fields in dword3 itself.
+ */
+void brw_urb_WRITE(struct brw_compile *p,
+		   struct brw_reg dest,
+		   GLuint msg_reg_nr,
+		   struct brw_reg src0,
+		   bool allocate,
+		   bool used,
+		   GLuint msg_length,
+		   GLuint response_length,
+		   bool eot,
+		   bool writes_complete,
+		   GLuint offset,
+		   GLuint swizzle)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *insn;
+
+   gen6_resolve_implied_move(p, &src0, msg_reg_nr);
+
+   if (intel->gen == 7) {
+      /* Enable Channel Masks in the URB_WRITE_HWORD message header */
+      brw_push_insn_state(p);
+      brw_set_access_mode(p, BRW_ALIGN_1);
+      brw_OR(p, retype(brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE, msg_reg_nr, 5),
+		       BRW_REGISTER_TYPE_UD),
+	        retype(brw_vec1_grf(0, 5), BRW_REGISTER_TYPE_UD),
+		brw_imm_ud(0xff00));
+      brw_pop_insn_state(p);
+   }
+
+   insn = next_insn(p, BRW_OPCODE_SEND);
+
+   assert(msg_length < BRW_MAX_MRF);
+
+   brw_set_dest(p, insn, dest);
+   brw_set_src0(p, insn, src0);
+   brw_set_src1(p, insn, brw_imm_d(0));
+
+   if (intel->gen < 6)
+      insn->header.destreg__conditionalmod = msg_reg_nr;
+
+   brw_set_urb_message(p,
+		       insn,
+		       allocate,
+		       used,
+		       msg_length,
+		       response_length, 
+		       eot, 
+		       writes_complete, 
+		       offset,
+		       swizzle);
+}
+
+static int
+next_ip(struct brw_compile *p, int ip)
+{
+   struct brw_instruction *insn = (void *)p->store + ip;
+
+   if (insn->header.cmpt_control)
+      return ip + 8;
+   else
+      return ip + 16;
+}
+
+static int
+brw_find_next_block_end(struct brw_compile *p, int start)
+{
+   int ip;
+   void *store = p->store;
+
+   for (ip = next_ip(p, start); ip < p->next_insn_offset; ip = next_ip(p, ip)) {
+      struct brw_instruction *insn = store + ip;
+
+      switch (insn->header.opcode) {
+      case BRW_OPCODE_ENDIF:
+      case BRW_OPCODE_ELSE:
+      case BRW_OPCODE_WHILE:
+      case BRW_OPCODE_HALT:
+	 return ip;
+      }
+   }
+
+   return 0;
+}
+
+/* There is no DO instruction on gen6, so to find the end of the loop
+ * we have to see if the loop is jumping back before our start
+ * instruction.
+ */
+static int
+brw_find_loop_end(struct brw_compile *p, int start)
+{
+   struct intel_context *intel = &p->brw->intel;
+   int ip;
+   int scale = 8;
+   void *store = p->store;
+
+   /* Always start after the instruction (such as a WHILE) we're trying to fix
+    * up.
+    */
+   for (ip = next_ip(p, start); ip < p->next_insn_offset; ip = next_ip(p, ip)) {
+      struct brw_instruction *insn = store + ip;
+
+      if (insn->header.opcode == BRW_OPCODE_WHILE) {
+	 int jip = intel->gen == 6 ? insn->bits1.branch_gen6.jump_count
+				   : insn->bits3.break_cont.jip;
+	 if (ip + jip * scale <= start)
+	    return ip;
+      }
+   }
+   assert(!"not reached");
+   return start;
+}
+
+/* After program generation, go back and update the UIP and JIP of
+ * BREAK, CONT, and HALT instructions to their correct locations.
+ */
+void
+brw_set_uip_jip(struct brw_compile *p)
+{
+   struct intel_context *intel = &p->brw->intel;
+   int ip;
+   int scale = 8;
+   void *store = p->store;
+
+   if (intel->gen < 6)
+      return;
+
+   for (ip = 0; ip < p->next_insn_offset; ip = next_ip(p, ip)) {
+      struct brw_instruction *insn = store + ip;
+
+      if (insn->header.cmpt_control) {
+	 /* Fixups for compacted BREAK/CONTINUE not supported yet. */
+	 assert(insn->header.opcode != BRW_OPCODE_BREAK &&
+		insn->header.opcode != BRW_OPCODE_CONTINUE &&
+		insn->header.opcode != BRW_OPCODE_HALT);
+	 continue;
+      }
+
+      int block_end_ip = brw_find_next_block_end(p, ip);
+      switch (insn->header.opcode) {
+      case BRW_OPCODE_BREAK:
+         assert(block_end_ip != 0);
+	 insn->bits3.break_cont.jip = (block_end_ip - ip) / scale;
+	 /* Gen7 UIP points to WHILE; Gen6 points just after it */
+	 insn->bits3.break_cont.uip =
+	    (brw_find_loop_end(p, ip) - ip +
+             (intel->gen == 6 ? 16 : 0)) / scale;
+	 break;
+      case BRW_OPCODE_CONTINUE:
+         assert(block_end_ip != 0);
+	 insn->bits3.break_cont.jip = (block_end_ip - ip) / scale;
+	 insn->bits3.break_cont.uip =
+            (brw_find_loop_end(p, ip) - ip) / scale;
+
+	 assert(insn->bits3.break_cont.uip != 0);
+	 assert(insn->bits3.break_cont.jip != 0);
+	 break;
+
+      case BRW_OPCODE_ENDIF:
+         if (block_end_ip == 0)
+            insn->bits3.break_cont.jip = 2;
+         else
+            insn->bits3.break_cont.jip = (block_end_ip - ip) / scale;
+	 break;
+
+      case BRW_OPCODE_HALT:
+	 /* From the Sandy Bridge PRM (volume 4, part 2, section 8.3.19):
+	  *
+	  *    "In case of the halt instruction not inside any conditional
+	  *     code block, the value of <JIP> and <UIP> should be the
+	  *     same. In case of the halt instruction inside conditional code
+	  *     block, the <UIP> should be the end of the program, and the
+	  *     <JIP> should be end of the most inner conditional code block."
+	  *
+	  * The uip will have already been set by whoever set up the
+	  * instruction.
+	  */
+	 if (block_end_ip == 0) {
+	    insn->bits3.break_cont.jip = insn->bits3.break_cont.uip;
+	 } else {
+	    insn->bits3.break_cont.jip = (block_end_ip - ip) / scale;
+	 }
+	 assert(insn->bits3.break_cont.uip != 0);
+	 assert(insn->bits3.break_cont.jip != 0);
+	 break;
+      }
+   }
+}
+
+void brw_ff_sync(struct brw_compile *p,
+		   struct brw_reg dest,
+		   GLuint msg_reg_nr,
+		   struct brw_reg src0,
+		   bool allocate,
+		   GLuint response_length,
+		   bool eot)
+{
+   struct intel_context *intel = &p->brw->intel;
+   struct brw_instruction *insn;
+
+   gen6_resolve_implied_move(p, &src0, msg_reg_nr);
+
+   insn = next_insn(p, BRW_OPCODE_SEND);
+   brw_set_dest(p, insn, dest);
+   brw_set_src0(p, insn, src0);
+   brw_set_src1(p, insn, brw_imm_d(0));
+
+   if (intel->gen < 6)
+      insn->header.destreg__conditionalmod = msg_reg_nr;
+
+   brw_set_ff_sync_message(p,
+			   insn,
+			   allocate,
+			   response_length,
+			   eot);
+}
+
+/**
+ * Emit the SEND instruction necessary to generate stream output data on Gen6
+ * (for transform feedback).
+ *
+ * If send_commit_msg is true, this is the last piece of stream output data
+ * from this thread, so send the data as a committed write.  According to the
+ * Sandy Bridge PRM (volume 2 part 1, section 4.5.1):
+ *
+ *   "Prior to End of Thread with a URB_WRITE, the kernel must ensure all
+ *   writes are complete by sending the final write as a committed write."
+ */
+void
+brw_svb_write(struct brw_compile *p,
+              struct brw_reg dest,
+              GLuint msg_reg_nr,
+              struct brw_reg src0,
+              GLuint binding_table_index,
+              bool   send_commit_msg)
+{
+   struct brw_instruction *insn;
+
+   gen6_resolve_implied_move(p, &src0, msg_reg_nr);
+
+   insn = next_insn(p, BRW_OPCODE_SEND);
+   brw_set_dest(p, insn, dest);
+   brw_set_src0(p, insn, src0);
+   brw_set_src1(p, insn, brw_imm_d(0));
+   brw_set_dp_write_message(p, insn,
+                            binding_table_index,
+                            0, /* msg_control: ignored */
+                            GEN6_DATAPORT_WRITE_MESSAGE_STREAMED_VB_WRITE,
+                            1, /* msg_length */
+                            true, /* header_present */
+                            0, /* last_render_target: ignored */
+                            send_commit_msg, /* response_length */
+                            0, /* end_of_thread */
+                            send_commit_msg); /* send_commit_msg */
+}
+
+/**
+ * This instruction is generated as a single-channel align1 instruction by
+ * both the VS and FS stages when using INTEL_DEBUG=shader_time.
+ *
+ * We can't use the typed atomic op in the FS because that has the execution
+ * mask ANDed with the pixel mask, but we just want to write the one dword for
+ * all the pixels.
+ *
+ * We don't use the SIMD4x2 atomic ops in the VS because we want to just write
+ * one u32.  So we use the same untyped atomic write message as the pixel
+ * shader.
+ *
+ * The untyped atomic operation requires a BUFFER surface type with RAW
+ * format, and is only accessible through the legacy DATA_CACHE dataport
+ * messages.
+ */
+void brw_shader_time_add(struct brw_compile *p,
+                         int base_mrf,
+                         uint32_t surf_index)
+{
+   struct intel_context *intel = &p->brw->intel;
+   assert(intel->gen >= 7);
+
+   brw_push_insn_state(p);
+   brw_set_access_mode(p, BRW_ALIGN_1);
+   brw_set_mask_control(p, BRW_MASK_DISABLE);
+   struct brw_instruction *send = brw_next_insn(p, BRW_OPCODE_SEND);
+   brw_pop_insn_state(p);
+
+   /* We use brw_vec1_reg and unmasked because we want to increment the given
+    * offset only once.
+    */
+   brw_set_dest(p, send, brw_vec1_reg(BRW_ARCHITECTURE_REGISTER_FILE,
+                                      BRW_ARF_NULL, 0));
+   brw_set_src0(p, send, brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE,
+                                      base_mrf, 0));
+
+   bool header_present = false;
+   bool eot = false;
+   uint32_t mlen = 2; /* offset, value */
+   uint32_t rlen = 0;
+   brw_set_message_descriptor(p, send,
+                              GEN7_SFID_DATAPORT_DATA_CACHE,
+                              mlen, rlen, header_present, eot);
+
+   send->bits3.ud |= 6 << 14; /* untyped atomic op */
+   send->bits3.ud |= 0 << 13; /* no return data */
+   send->bits3.ud |= 1 << 12; /* SIMD8 mode */
+   send->bits3.ud |= BRW_AOP_ADD << 8;
+   send->bits3.ud |= surf_index << 0;
+}
-- 
1.7.7.5
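
The writemask scan at the top of brw_SAMPLE() in the imported file can be
hard to follow inside the diff, so here is a minimal standalone sketch of it.
The names scan_writemask, struct mask_run and WRITEMASK_XYZW are made up for
this illustration and are not part of the patch; the 4-bit mask layout (X in
bit 0) is assumed from the code above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed: four channel bits, X in bit 0, W in bit 3. */
#define WRITEMASK_XYZW 0xf

/* Hypothetical result type, not present in the patch. */
struct mask_run {
    unsigned first;   /* index of the first enabled channel */
    unsigned len;     /* number of consecutive enabled channels from there */
    bool contiguous;  /* true when that run covers every enabled channel */
};

static struct mask_run scan_writemask(unsigned writemask)
{
    struct mask_run run = { 0, 0, false };
    unsigned i, newmask = 0;

    /* brw_SAMPLE() returns early on an empty mask, so exclude it here. */
    assert(writemask != 0 && (writemask & ~WRITEMASK_XYZW) == 0);

    /* Skip the leading disabled channels. */
    for (i = 0; i < 4; i++) {
        if (writemask & (1u << i))
            break;
    }
    run.first = i;

    /* Collect the first run of enabled channels. */
    for (; i < 4; i++) {
        if (!(writemask & (1u << i)))
            break;
        newmask |= 1u << i;
        run.len++;
    }

    run.contiguous = (newmask == writemask);
    return run;
}
```

In brw_SAMPLE(), a contiguous run leads to the header-based channel mask and
a destination offset of two registers per skipped channel, while a mask with
a hole in it (for instance 0x5) falls back to the need_stall path.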


* [PATCH 32/90] assembler: Use BRW_WRITEMASK_XYZW instead of the 0xf constant
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (30 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 31/90] assembler: Import brw_eu_emit.c Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 33/90] assembler: Remove the writemask_set field of struct dest_operand Damien Lespiau
                   ` (58 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 55708ca..a27375b 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -32,6 +32,7 @@
 #include <assert.h>
 #include "gen4asm.h"
 #include "brw_defines.h"
+#include "brw_reg.h"
 
 #define DEFAULT_EXECSIZE (ffs(program_defaults.execute_size) - 1)
 #define DEFAULT_DSTREGION -1
@@ -58,7 +59,7 @@ static struct dst_operand ip_dst =
     .reg_type = BRW_REGISTER_TYPE_UD,
     .address_mode = BRW_ADDRESS_DIRECT,
     .horiz_stride = 1,
-    .writemask = 0xF,
+    .writemask = BRW_WRITEMASK_XYZW,
 };
 static struct src_operand ip_src =
 {
@@ -2431,7 +2432,7 @@ chansel:	X | Y | Z | W
 writemask:	/* empty */
 		{
 		  $$.writemask_set = 0;
-		  $$.writemask = 0xf;
+		  $$.writemask = BRW_WRITEMASK_XYZW;
 		}
 		| DOT writemask_x writemask_y writemask_z writemask_w
 		{
@@ -3134,7 +3135,7 @@ void set_direct_dst_operand(struct dst_operand *dst, struct direct_reg *reg,
 	dst->reg_type = type;
 	dst->horiz_stride = 1;
 	dst->writemask_set = 0;
-	dst->writemask = 0xf;
+	dst->writemask = BRW_WRITEMASK_XYZW;
 }
 
 void set_direct_src_operand(struct src_operand *src, struct direct_reg *reg,
-- 
1.7.7.5


* [PATCH 33/90] assembler: Remove the writemask_set field of struct dest_operand
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (31 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 32/90] assembler: Use BRW_WRITEMASK_XYZW instead of the 0xf constant Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 34/90] assembler: Use subreg_nr to store the address register subreg Damien Lespiau
                   ` (57 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

writemask_set gets in the way of switching to using struct brw_reg and
it's possible to derive it from the writemask value.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
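
As an illustration of "derive it from the writemask value", the check that
replaces writemask_set in the hunks below can be read as the following
predicate. This is only a sketch: writemask_is_partial and WRITEMASK_XYZW
are hypothetical names, and the 0xf value is assumed from the constant
replaced in patch 32.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match BRW_WRITEMASK_XYZW (all four channels enabled). */
#define WRITEMASK_XYZW 0xf

/*
 * True when the mask selects some, but not all, channels. Both an unset
 * mask (0) and the full mask are treated as "no writemask given".
 */
static bool writemask_is_partial(int writemask)
{
    assert((writemask & ~WRITEMASK_XYZW) == 0);
    return writemask != 0 && writemask != WRITEMASK_XYZW;
}
```

One consequence of this formulation, as far as I can tell from the grammar,
is that an explicit ".xyzw" can no longer be told apart from an omitted
writemask, so it would not trigger the align1 error any more.
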
 assembler/gen4asm.h |    1 -
 assembler/gram.y    |   10 ++++------
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 71b8a4d..e57a699 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -91,7 +91,6 @@ struct indirect_reg {
 struct dst_operand {
 	int reg_file, reg_nr, subreg_nr, reg_type;
 
-	int writemask_set;
 	int writemask;
 
 	int horiz_stride;
diff --git a/assembler/gram.y b/assembler/gram.y
index a27375b..62dad6d 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1541,7 +1541,6 @@ dstoperand:	symbol_reg dstregion
 		  $$.address_subreg_nr = $1.address_subreg_nr;
 		  $$.indirect_offset = $1.indirect_offset;
 		  $$.horiz_stride = $2;
-		  $$.writemask_set = $3.writemask_set;
 		  $$.writemask = $3.writemask;
 		  $$.reg_type = $4.type;
 		}
@@ -2431,12 +2430,10 @@ chansel:	X | Y | Z | W
  */
 writemask:	/* empty */
 		{
-		  $$.writemask_set = 0;
 		  $$.writemask = BRW_WRITEMASK_XYZW;
 		}
 		| DOT writemask_x writemask_y writemask_z writemask_w
 		{
-		  $$.writemask_set = 1;
 		  $$.writemask = $2 | $3 | $4 | $5;
 		}
 ;
@@ -2843,7 +2840,8 @@ int set_instruction_dest(struct brw_instruction *instr,
 		instr->bits1.da1.dest_reg_nr = dest->reg_nr;
 		instr->bits1.da1.dest_horiz_stride = dest->horiz_stride;
 		instr->bits1.da1.dest_address_mode = dest->address_mode;
-		if (dest->writemask_set) {
+		if (dest->writemask != 0 &&
+		    dest->writemask != BRW_WRITEMASK_XYZW) {
 			fprintf(stderr, "error: write mask set in align1 "
 				"instruction\n");
 			return 1;
@@ -2863,7 +2861,8 @@ int set_instruction_dest(struct brw_instruction *instr,
 		instr->bits1.ia1.dest_horiz_stride = dest->horiz_stride;
 		instr->bits1.ia1.dest_indirect_offset = dest->indirect_offset;
 		instr->bits1.ia1.dest_address_mode = dest->address_mode;
-		if (dest->writemask_set) {
+		if (dest->writemask != 0 &&
+		    dest->writemask != BRW_WRITEMASK_XYZW) {
 			fprintf(stderr, "error: write mask set in align1 "
 				"instruction\n");
 			return 1;
@@ -3134,7 +3133,6 @@ void set_direct_dst_operand(struct dst_operand *dst, struct direct_reg *reg,
 	dst->subreg_nr = reg->subreg_nr;
 	dst->reg_type = type;
 	dst->horiz_stride = 1;
-	dst->writemask_set = 0;
 	dst->writemask = BRW_WRITEMASK_XYZW;
 }
 
-- 
1.7.7.5


* [PATCH 34/90] assembler: Use subreg_nr to store the address register subreg
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (32 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 33/90] assembler: Remove the writemask_set field of struct dest_operand Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 35/90] assembler: Simplify get_subreg_address() Damien Lespiau
                   ` (56 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Another step towards using struct brw_reg for source and destination
operands.

Instead of having a separate field to store the sub register number of
the address register in indirect access mode, we can reuse the subreg_nr
field that was only used for direct access so far.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
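
To make the field reuse concrete, here is a small sketch of an operand whose
subreg_nr is interpreted according to address_mode. The struct and helpers
are simplified stand-ins written for this note (they are not the assembler's
struct dst_operand or struct src_operand); only the 0/1 encoding of
address_mode is taken from the comments in gen4asm.h.

```c
#include <assert.h>

/* Same encoding as the gen4asm.h comment: 0 if direct, 1 if indirect. */
enum address_mode { ADDRESS_DIRECT = 0, ADDRESS_INDIRECT = 1 };

/* Hypothetical, trimmed-down operand. */
struct operand {
    int reg_nr;
    int subreg_nr;        /* direct: subregister of reg_nr;
                           * indirect: subregister of the address register */
    int address_mode;
    int indirect_offset;  /* only meaningful in indirect mode */
};

static struct operand make_direct(int reg_nr, int subreg_nr)
{
    struct operand op = { reg_nr, subreg_nr, ADDRESS_DIRECT, 0 };
    return op;
}

static struct operand make_indirect(int addr_subreg_nr, int indirect_offset)
{
    struct operand op = { 0, addr_subreg_nr, ADDRESS_INDIRECT,
                          indirect_offset };
    return op;
}

/* Reads the address-register subreg; only valid for indirect operands. */
static int address_subreg(const struct operand *op)
{
    assert(op->address_mode == ADDRESS_INDIRECT);
    return op->subreg_nr;
}
```

The two meanings never coexist for a given operand, which is why a single
field is enough once address_mode is consulted first.
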
 assembler/gen4asm.h |    2 --
 assembler/gram.y    |   22 +++++++++++-----------
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index e57a699..a2ab5e8 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -97,7 +97,6 @@ struct dst_operand {
 	int address_mode; /* 0 if direct, 1 if register-indirect */
 
 	/* Indirect addressing */
-	int address_subreg_nr;
 	int indirect_offset;
 };
 
@@ -114,7 +113,6 @@ struct src_operand {
 	int default_region;
 
 	int address_mode; /* 0 if direct, 1 if register-indirect */
-	int address_subreg_nr;
 	int indirect_offset; /* XXX */
 
 	int swizzle_set;
diff --git a/assembler/gram.y b/assembler/gram.y
index 62dad6d..844904d 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1538,7 +1538,7 @@ dstoperand:	symbol_reg dstregion
 		  $$.reg_nr = $1.reg_nr;
 		  $$.subreg_nr = $1.subreg_nr;
 		  $$.address_mode = $1.address_mode;
-		  $$.address_subreg_nr = $1.address_subreg_nr;
+		  $$.subreg_nr = $1.subreg_nr;
 		  $$.indirect_offset = $1.indirect_offset;
 		  $$.horiz_stride = $2;
 		  $$.writemask = $3.writemask;
@@ -1676,7 +1676,7 @@ dstreg:		directgenreg
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
 		  $$.reg_file = $1.reg_file;
-		  $$.address_subreg_nr = $1.address_subreg_nr;
+		  $$.subreg_nr = $1.address_subreg_nr;
 		  $$.indirect_offset = $1.indirect_offset;
 		}
 		| indirectmsgreg
@@ -1684,7 +1684,7 @@ dstreg:		directgenreg
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
 		  $$.reg_file = $1.reg_file;
-		  $$.address_subreg_nr = $1.address_subreg_nr;
+		  $$.subreg_nr = $1.address_subreg_nr;
 		  $$.indirect_offset = $1.indirect_offset;
 		}
 ;
@@ -1908,7 +1908,7 @@ indirectsrcoperand:
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
 		  $$.reg_file = $3.reg_file;
-		  $$.address_subreg_nr = $3.address_subreg_nr;
+		  $$.subreg_nr = $3.address_subreg_nr;
 		  $$.indirect_offset = $3.indirect_offset;
 		  $$.reg_type = $5.type;
 		  $$.vert_stride = $4.vert_stride;
@@ -2286,7 +2286,7 @@ relativelocation2:
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
 		  $$.reg_file = $1.reg_file;
-		  $$.address_subreg_nr = $1.address_subreg_nr;
+		  $$.subreg_nr = $1.address_subreg_nr;
 		  $$.indirect_offset = $1.indirect_offset;
 		  $$.reg_type = $3.type;
 		  $$.vert_stride = $2.vert_stride;
@@ -2857,7 +2857,7 @@ int set_instruction_dest(struct brw_instruction *instr,
 	} else if (instr->header.access_mode == BRW_ALIGN_1) {
 		instr->bits1.ia1.dest_reg_file = dest->reg_file;
 		instr->bits1.ia1.dest_reg_type = dest->reg_type;
-		instr->bits1.ia1.dest_subreg_nr = get_indirect_subreg_address(dest->address_subreg_nr);
+		instr->bits1.ia1.dest_subreg_nr = dest->subreg_nr;
 		instr->bits1.ia1.dest_horiz_stride = dest->horiz_stride;
 		instr->bits1.ia1.dest_indirect_offset = dest->indirect_offset;
 		instr->bits1.ia1.dest_address_mode = dest->address_mode;
@@ -2870,7 +2870,7 @@ int set_instruction_dest(struct brw_instruction *instr,
 	} else {
 		instr->bits1.ia16.dest_reg_file = dest->reg_file;
 		instr->bits1.ia16.dest_reg_type = dest->reg_type;
-		instr->bits1.ia16.dest_subreg_nr = get_indirect_subreg_address(dest->address_subreg_nr);
+		instr->bits1.ia16.dest_subreg_nr = get_indirect_subreg_address(dest->subreg_nr);
 		instr->bits1.ia16.dest_writemask = dest->writemask;
 		instr->bits1.ia16.dest_horiz_stride = ffs(1);
 		instr->bits1.ia16.dest_indirect_offset = (dest->indirect_offset >> 4); /* half register aligned */
@@ -2921,7 +2921,7 @@ int set_instruction_src0(struct brw_instruction *instr,
         } else {
             if (instr->header.access_mode == BRW_ALIGN_1) {
 		instr->bits2.ia1.src0_indirect_offset = src->indirect_offset;
-		instr->bits2.ia1.src0_subreg_nr = get_indirect_subreg_address(src->address_subreg_nr);
+		instr->bits2.ia1.src0_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
 		instr->bits2.ia1.src0_abs = src->abs;
 		instr->bits2.ia1.src0_negate = src->negate;
 		instr->bits2.ia1.src0_address_mode = src->address_mode;
@@ -2937,7 +2937,7 @@ int set_instruction_src0(struct brw_instruction *instr,
 		instr->bits2.ia16.src0_swz_x = src->swizzle_x;
 		instr->bits2.ia16.src0_swz_y = src->swizzle_y;
 		instr->bits2.ia16.src0_indirect_offset = (src->indirect_offset >> 4); /* half register aligned */
-		instr->bits2.ia16.src0_subreg_nr = get_indirect_subreg_address(src->address_subreg_nr);
+		instr->bits2.ia16.src0_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
 		instr->bits2.ia16.src0_abs = src->abs;
 		instr->bits2.ia16.src0_negate = src->negate;
 		instr->bits2.ia16.src0_address_mode = src->address_mode;
@@ -3004,7 +3004,7 @@ int set_instruction_src1(struct brw_instruction *instr,
 	} else {
             if (instr->header.access_mode == BRW_ALIGN_1) {
 		instr->bits3.ia1.src1_indirect_offset = src->indirect_offset;
-		instr->bits3.ia1.src1_subreg_nr = get_indirect_subreg_address(src->address_subreg_nr);
+		instr->bits3.ia1.src1_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
 		instr->bits3.ia1.src1_abs = src->abs;
 		instr->bits3.ia1.src1_negate = src->negate;
 		instr->bits3.ia1.src1_address_mode = src->address_mode;
@@ -3020,7 +3020,7 @@ int set_instruction_src1(struct brw_instruction *instr,
 		instr->bits3.ia16.src1_swz_x = src->swizzle_x;
 		instr->bits3.ia16.src1_swz_y = src->swizzle_y;
 		instr->bits3.ia16.src1_indirect_offset = (src->indirect_offset >> 4); /* half register aligned */
-		instr->bits3.ia16.src1_subreg_nr = get_indirect_subreg_address(src->address_subreg_nr);
+		instr->bits3.ia16.src1_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
 		instr->bits3.ia16.src1_abs = src->abs;
 		instr->bits3.ia16.src1_negate = src->negate;
 		instr->bits3.ia16.src1_address_mode = src->address_mode;
-- 
1.7.7.5

* [PATCH 35/90] assembler: Simplify get_subreg_address()
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (33 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 34/90] assembler: Use subreg_nr to store the address register subreg Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 36/90] assembler: Make print_instruction() take an instruction Damien Lespiau
                   ` (55 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

This function can only be called to resolve subreg_nr in direct mode
(there is another function for the indirect case) and it makes no sense
to call it with an immediate operand.

Express those facts with asserts and simplify the logic.
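
For readers who want to try the simplified logic outside the assembler,
here is a minimal standalone sketch. The enum values, type_size() and the
advanced_flag parameter are hypothetical stand-ins (in gram.y the flag is
not a parameter and the constants are the BRW_* ones). It assumes the
register file field only encodes the architecture, general, message and
immediate files, which is why ruling out the immediate case can replace
the old three-way register file test.

```c
#include <assert.h>

/* Hypothetical stand-ins for the BRW_* constants used by the assembler. */
enum { ADDR_DIRECT, ADDR_INDIRECT };
enum { FILE_ARF, FILE_GRF, FILE_MRF, FILE_IMM };
enum { TYPE_UB, TYPE_UW, TYPE_UD };

/* Size in bytes of one element of the given (hypothetical) type. */
static int type_size(int type)
{
    switch (type) {
    case TYPE_UD: return 4;
    case TYPE_UW: return 2;
    default:      return 1;
    }
}

/*
 * Direct addressing only: the preconditions are asserted instead of
 * being handled with nested branches.
 */
static int subreg_address(int regfile, int type, int subreg,
                          int address_mode, int advanced_flag)
{
    int unit_size = 1;

    assert(address_mode == ADDR_DIRECT);
    assert(regfile != FILE_IMM);

    /* In "advanced" syntax the subregister is counted in elements. */
    if (advanced_flag)
        unit_size = type_size(type);

    return subreg * unit_size;
}
```

With the flag set, subregister 2 of a 4-byte type resolves to byte 8;
without it the subregister number is passed through unchanged.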

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   17 +++++------------
 1 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 844904d..f608d82 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -2716,18 +2716,11 @@ static int get_subreg_address(GLuint regfile, GLuint type, GLuint subreg, GLuint
 {
     int unit_size = 1;
 
-    if (address_mode == BRW_ADDRESS_DIRECT) {
-        if (advanced_flag == 1) {
-            if ((regfile == BRW_GENERAL_REGISTER_FILE ||
-                 regfile == BRW_MESSAGE_REGISTER_FILE || 
-                 regfile == BRW_ARCHITECTURE_REGISTER_FILE)) {
-                
-                unit_size = get_type_size(type);
-            } 
-        }
-    } else {
-        unit_size = 1;
-    }
+    assert(address_mode == BRW_ADDRESS_DIRECT);
+    assert(regfile != BRW_IMMEDIATE_VALUE);
+
+    if (advanced_flag)
+	unit_size = get_type_size(type);
 
     return subreg * unit_size;
 }
-- 
1.7.7.5

* [PATCH 36/90] assembler: Make print_instruction() take an instruction
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (34 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 35/90] assembler: Simplify get_subreg_address() Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 37/90] assembler: Refactor the code adding instructions and labels Damien Lespiau
                   ` (54 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

No need to use a brw_program_instruction there as a brw_instruction is
what you really dump anyway, especially when the plan is to use
brw_compile from Mesa sooner rather than later.
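
As an illustration of what dumping a bare 16-byte instruction looks like,
here is a small sketch that is not part of the patch. struct fake_insn
and format_insn() are hypothetical names; the sketch formats into a
buffer rather than a FILE so the result can be checked, and it copies the
words out with memcpy() instead of casting the pointer.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct brw_instruction: four 32-bit words. */
struct fake_insn {
    uint32_t dw[4];
};

/* Format one instruction the way the non-binary output mode does. */
static int format_insn(char *buf, size_t len, const struct fake_insn *insn)
{
    uint32_t w[4];

    assert(buf != NULL && insn != NULL);

    /* Copy the raw words out; the caller only hands us the opcode bits. */
    memcpy(w, insn, sizeof(w));

    return snprintf(buf, len,
                    "   { 0x%08" PRIx32 ", 0x%08" PRIx32
                    ", 0x%08" PRIx32 ", 0x%08" PRIx32 " },\n",
                    w[0], w[1], w[2], w[3]);
}
```

The point is the same as in the patch: the function only needs the opcode
bits, not the list entry that happens to carry them.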

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/main.c |   44 ++++++++++++++++++++++----------------------
 1 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/assembler/main.c b/assembler/main.c
index 1b411c7..28daf3e 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -239,35 +239,35 @@ static void free_entry_point_table(struct entry_point_item *p) {
 }
 
 static void
-print_instruction(FILE *output, struct brw_program_instruction *entry)
+print_instruction(FILE *output, struct brw_instruction *instruction)
 {
 	if (binary_like_output) {
 		fprintf(output, "\t0x%02x, 0x%02x, 0x%02x, 0x%02x, "
 				"0x%02x, 0x%02x, 0x%02x, 0x%02x,\n"
 				"\t0x%02x, 0x%02x, 0x%02x, 0x%02x, "
 				"0x%02x, 0x%02x, 0x%02x, 0x%02x,\n",
-			((unsigned char *)(&entry->instruction))[0],
-			((unsigned char *)(&entry->instruction))[1],
-			((unsigned char *)(&entry->instruction))[2],
-			((unsigned char *)(&entry->instruction))[3],
-			((unsigned char *)(&entry->instruction))[4],
-			((unsigned char *)(&entry->instruction))[5],
-			((unsigned char *)(&entry->instruction))[6],
-			((unsigned char *)(&entry->instruction))[7],
-			((unsigned char *)(&entry->instruction))[8],
-			((unsigned char *)(&entry->instruction))[9],
-			((unsigned char *)(&entry->instruction))[10],
-			((unsigned char *)(&entry->instruction))[11],
-			((unsigned char *)(&entry->instruction))[12],
-			((unsigned char *)(&entry->instruction))[13],
-			((unsigned char *)(&entry->instruction))[14],
-			((unsigned char *)(&entry->instruction))[15]);
+			((unsigned char *)instruction)[0],
+			((unsigned char *)instruction)[1],
+			((unsigned char *)instruction)[2],
+			((unsigned char *)instruction)[3],
+			((unsigned char *)instruction)[4],
+			((unsigned char *)instruction)[5],
+			((unsigned char *)instruction)[6],
+			((unsigned char *)instruction)[7],
+			((unsigned char *)instruction)[8],
+			((unsigned char *)instruction)[9],
+			((unsigned char *)instruction)[10],
+			((unsigned char *)instruction)[11],
+			((unsigned char *)instruction)[12],
+			((unsigned char *)instruction)[13],
+			((unsigned char *)instruction)[14],
+			((unsigned char *)instruction)[15]);
 	} else {
 		fprintf(output, "   { 0x%08x, 0x%08x, 0x%08x, 0x%08x },\n",
-			((int *)(&entry->instruction))[0],
-			((int *)(&entry->instruction))[1],
-			((int *)(&entry->instruction))[2],
-			((int *)(&entry->instruction))[3]);
+			((int *)instruction)[0],
+			((int *)instruction)[1],
+			((int *)instruction)[2],
+			((int *)instruction)[3]);
 	}
 }
 int main(int argc, char **argv)
@@ -470,7 +470,7 @@ int main(int argc, char **argv)
 		entry = entry1) {
 	    entry1 = entry->next;
 	    if (!entry->islabel)
-		print_instruction(output, entry);
+		print_instruction(output, &entry->instruction);
 	    else
 		free(entry->string);
 	    free(entry);
-- 
1.7.7.5

* [PATCH 37/90] assembler: Refactor the code adding instructions and labels
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (35 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 36/90] assembler: Make print_instruction() take an instruction Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 38/90] assembler: Make explicit that labels are part of the instructions list Damien Lespiau
                   ` (53 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Factoring out the code from the grammar will allow us to switch to
using brw_compile in a cleaner way.
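
The pattern being factored out is a singly linked list with first/last
pointers. A minimal standalone sketch follows, with hypothetical names
(struct entry, struct program, program_add) and an int payload standing
in for the instruction. Unlike the helpers in the patch it checks the
calloc() result, which is an addition of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list entry; the real one carries an instruction or label. */
struct entry {
    int value;
    struct entry *next;
};

struct program {
    struct entry *first, *last;
};

static void program_init(struct program *p)
{
    memset(p, 0, sizeof(*p));
}

/* Append at the tail in O(1) thanks to the last pointer. */
static void program_append(struct program *p, struct entry *e)
{
    e->next = NULL;
    if (p->last)
        p->last->next = e;
    else
        p->first = e;
    p->last = e;
}

/* Allocate an entry, fill it and append it; returns NULL on failure. */
static struct entry *program_add(struct program *p, int value)
{
    struct entry *e = calloc(1, sizeof(*e));

    if (!e)
        return NULL;
    e->value = value;
    program_append(p, e);
    return e;
}

static void program_free(struct program *p)
{
    struct entry *e = p->first, *next;

    for (; e; e = next) {
        next = e->next;
        free(e);
    }
    program_init(p);
}
```

Each grammar action then shrinks to one call that appends to the program,
with an extra init call for the rules that start a new one.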

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   84 ++++++++++++++++++++++++++++-------------------------
 1 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index f608d82..a1c09f7 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -97,6 +97,42 @@ void set_direct_dst_operand(struct dst_operand *dst, struct direct_reg *reg,
 void set_direct_src_operand(struct src_operand *src, struct direct_reg *reg,
 			    int type);
 
+static void brw_program_init(struct brw_program *p)
+{
+   memset(p, 0, sizeof(struct brw_program));
+}
+
+static void brw_program_append_entry(struct brw_program *p,
+				     struct brw_program_instruction *entry)
+{
+    entry->next = NULL;
+    if (p->last)
+	p->last->next = entry;
+    else
+	p->first = entry;
+    p->last = entry;
+}
+
+static void brw_program_add_instruction(struct brw_program *p,
+					struct brw_instruction *instruction)
+{
+    struct brw_program_instruction *list_entry;
+
+    list_entry = calloc(sizeof(struct brw_program_instruction), 1);
+    list_entry->instruction = *instruction;
+    brw_program_append_entry(p, list_entry);
+}
+
+static void brw_program_add_label(struct brw_program *p, const char *label)
+{
+    struct brw_program_instruction *list_entry;
+
+    list_entry = calloc(sizeof(struct brw_program_instruction), 1);
+    list_entry->string = strdup(label);
+    list_entry->islabel = 1;
+    brw_program_append_entry(p, list_entry);
+}
+
 %}
 
 %start ROOT
@@ -345,59 +381,27 @@ instrseq:	instrseq pragma
 		}
 		| instrseq instruction SEMICOLON
 		{
-		  struct brw_program_instruction *list_entry =
-		    calloc(sizeof(struct brw_program_instruction), 1);
-		  list_entry->instruction = $2;
-		  list_entry->next = NULL;
-		  if ($1.last) {
-			$1.last->next = list_entry;
-		  } else {
-			$1.first = list_entry;
-		  }
-		  $1.last = list_entry;
+		  brw_program_add_instruction(&$1, &$2);
 		  $$ = $1;
 		}
 		| instruction SEMICOLON
 		{
-		  struct brw_program_instruction *list_entry =
-		    calloc(sizeof(struct brw_program_instruction), 1);
-		  list_entry->instruction = $1;
-
-		  list_entry->next = NULL;
-
-		  $$.first = list_entry;
-		  $$.last = list_entry;
+		  brw_program_init(&$$);
+		  brw_program_add_instruction(&$$, &$1);
 		}
-        | instrseq SEMICOLON
+		| instrseq SEMICOLON
 		{
 		    $$ = $1;
 		}
-        | instrseq label
+		| instrseq label
         	{
-          struct brw_program_instruction *list_entry =
-            calloc(sizeof(struct brw_program_instruction), 1);
-          list_entry->string = strdup($2);
-          list_entry->islabel = 1;
-		  list_entry->next = NULL;
-		  if ($1.last) {
-			$1.last->next = list_entry;
-		  } else {
-			$1.first = list_entry;
-		  }
-		  $1.last = list_entry;
+		  brw_program_add_label(&$1, $2);
 		  $$ = $1;
                 }
 		| label
 		{
-		  struct brw_program_instruction *list_entry =
-		    calloc(sizeof(struct brw_program_instruction), 1);
-                  list_entry->string = strdup($1);
-                  list_entry->islabel = 1;
-
-		  list_entry->next = NULL;
-
-		  $$.first = list_entry;
-		  $$.last = list_entry;
+		  brw_program_init(&$$);
+		  brw_program_add_label(&$$, $1);
 		}
 		| pragma
 		{
-- 
1.7.7.5

* [PATCH 38/90] assembler: Make explicit that labels are part of the instructions list
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (36 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 37/90] assembler: Refactor the code adding instructions and labels Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 39/90] assembler: Don't change the size of opcodes! Damien Lespiau
                   ` (52 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The output of the parsing is a list of struct brw_program_instruction.
These instructions can be either GEN instructions aka struct
brw_instruction or labels. To make this more explicit we now have a type
to test to determine which instruction we are dealing with.

This will also allow us to pull the relocation bits into struct
brw_program_instruction instead of having them in the structure
representing the opcodes.
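
A reduced sketch of the tagged-union idea, with hypothetical names
(struct entry, entry_is_label, count_gen) rather than the ones from
gen4asm.h: the type field says which union member is valid, accessors
assert on it, and consumers that only care about GEN instructions skip
the labels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum entry_type { ENTRY_GEN, ENTRY_LABEL };

/* Hypothetical payloads: a 4-dword opcode or a label name. */
struct gen_insn { unsigned dw[4]; };
struct label    { const char *name; };

struct entry {
    enum entry_type type;   /* tells which union member is valid */
    union {
        struct gen_insn gen;
        struct label label;
    } u;
    struct entry *next;
};

static bool entry_is_label(const struct entry *e)
{
    return e->type == ENTRY_LABEL;
}

/* Accessor that refuses to read the wrong union member. */
static const char *entry_label_name(const struct entry *e)
{
    assert(entry_is_label(e));
    return e->u.label.name;
}

/* Count the entries that will actually be emitted as opcodes. */
static unsigned count_gen(const struct entry *head)
{
    unsigned n = 0;

    for (; head; head = head->next)
        if (!entry_is_label(head))
            n++;
    return n;
}
```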

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/disasm-main.c |    2 +-
 assembler/gen4asm.h     |   36 +++++++++++++++++++++----
 assembler/gram.y        |    7 +++--
 assembler/main.c        |   66 +++++++++++++++++++++++++++--------------------
 4 files changed, 73 insertions(+), 38 deletions(-)

diff --git a/assembler/disasm-main.c b/assembler/disasm-main.c
index b900e91..fbb6ae3 100644
--- a/assembler/disasm-main.c
+++ b/assembler/disasm-main.c
@@ -167,6 +167,6 @@ int main(int argc, char **argv)
     }
 	    
     for (inst = program->first; inst; inst = inst->next)
-	brw_disasm (output, &inst->instruction, gen);
+	brw_disasm (output, &inst->instruction.gen, gen);
     exit (0);
 }
diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index a2ab5e8..aa380e1 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -30,6 +30,8 @@
 #define __GEN4ASM_H__
 
 #include <inttypes.h>
+#include <stdbool.h>
+#include <assert.h>
 
 typedef unsigned char GLubyte;
 typedef short GLshort;
@@ -133,18 +135,40 @@ typedef struct {
     } u;
 } imm32_t;
 
+enum assembler_instruction_type {
+    GEN4ASM_INSTRUCTION_GEN,
+    GEN4ASM_INSTRUCTION_LABEL,
+};
+
+struct label_instruction {
+    char   *name;
+};
+
 /**
  * This structure is just the list container for instructions accumulated by
  * the parser and labels.
  */
 struct brw_program_instruction {
-	struct brw_instruction instruction;
-	struct brw_program_instruction *next;
-	GLuint islabel;
-	GLuint inst_offset;
-	char   *string;
+    enum assembler_instruction_type type;
+    unsigned inst_offset;
+    union {
+	struct brw_instruction gen;
+	struct label_instruction label;
+    } instruction;
+    struct brw_program_instruction *next;
 };
 
+static inline bool is_label(struct brw_program_instruction *instruction)
+{
+    return instruction->type == GEN4ASM_INSTRUCTION_LABEL;
+}
+
+static inline char *label_name(struct brw_program_instruction *i)
+{
+    assert(is_label(i));
+    return i->instruction.label.name;
+}
+
 /**
  * This structure is a list of instructions.  It is the final output of the
  * parser.
@@ -188,7 +212,7 @@ struct declared_register {
 };
 struct declared_register *find_register(char *name);
 void insert_register(struct declared_register *reg);
-void add_label(char *name, int addr);
+void add_label(struct brw_program_instruction *instruction);
 int label_to_addr(char *name, int start_addr);
 
 int yyparse(void);
diff --git a/assembler/gram.y b/assembler/gram.y
index a1c09f7..cf65f9f 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -119,7 +119,8 @@ static void brw_program_add_instruction(struct brw_program *p,
     struct brw_program_instruction *list_entry;
 
     list_entry = calloc(sizeof(struct brw_program_instruction), 1);
-    list_entry->instruction = *instruction;
+    list_entry->type = GEN4ASM_INSTRUCTION_GEN;
+    list_entry->instruction.gen = *instruction;
     brw_program_append_entry(p, list_entry);
 }
 
@@ -128,8 +129,8 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
     struct brw_program_instruction *list_entry;
 
     list_entry = calloc(sizeof(struct brw_program_instruction), 1);
-    list_entry->string = strdup(label);
-    list_entry->islabel = 1;
+    list_entry->type = GEN4ASM_INSTRUCTION_LABEL;
+    list_entry->instruction.label.name = strdup(label);
     brw_program_append_entry(p, list_entry);
 }
 
diff --git a/assembler/main.c b/assembler/main.c
index 28daf3e..eb75230 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -31,6 +31,7 @@
 #include <string.h>
 #include <getopt.h>
 #include <unistd.h>
+#include <assert.h>
 
 #include "gen4asm.h"
 
@@ -154,14 +155,17 @@ void insert_register(struct declared_register *reg)
     insert_hash_item(declared_register_table, reg->name, reg);
 }
 
-void add_label(char *name, int addr)
+void add_label(struct brw_program_instruction *i)
 {
     struct label_item **p = &label_table;
+
+    assert(is_label(i));
+
     while(*p)
         p = &((*p)->next);
     *p = calloc(1, sizeof(**p));
-    (*p)->name = name;
-    (*p)->addr = addr;
+    (*p)->name = label_name(i);
+    (*p)->addr = i->inst_offset;
 }
 
 /* Some assembly code have duplicated labels.
@@ -220,11 +224,14 @@ static int read_entry_file(char *fn)
 	return 0;
 }
 
-static int is_entry_point(char *s)
+static int is_entry_point(struct brw_program_instruction *i)
 {
 	struct entry_point_item *p;
+
+	assert(i->type == GEN4ASM_INSTRUCTION_LABEL);
+
 	for (p = entry_point_table; p; p = p->next) {
-	    if (strcmp(p->str, s) == 0)
+	    if (strcmp(p->str, i->instruction.label.name) == 0)
 		return 1;
 	}
 	return 0;
@@ -379,24 +386,24 @@ int main(int argc, char **argv)
 		entry != NULL; entry = entry->next) {
 	    entry->inst_offset = inst_offset;
 	    entry1 = entry->next;
-	    if (entry1 && entry1->islabel && is_entry_point(entry1->string)) {
+	    if (entry1 && is_label(entry1) && is_entry_point(entry1)) {
 		// insert NOP instructions until (inst_offset+1) % 4 == 0
 		while (((inst_offset+1) % 4) != 0) {
 		    tmp_entry = calloc(sizeof(*tmp_entry), 1);
-		    tmp_entry->instruction.header.opcode = BRW_OPCODE_NOP;
+		    tmp_entry->instruction.gen.header.opcode = BRW_OPCODE_NOP;
 		    entry->next = tmp_entry;
 		    tmp_entry->next = entry1;
 		    entry = tmp_entry;
 		    tmp_entry->inst_offset = ++inst_offset;
 		}
 	    }
-	    if (!entry->islabel)
+	    if (!is_label(entry))
               inst_offset++;
 	}
 
 	for (entry = compiled_program.first; entry; entry = entry->next)
-	    if (entry->islabel)
-		add_label(entry->string, entry->inst_offset);
+	    if (is_label(entry))
+		add_label(entry);
 
 	if (need_export) {
 		if (export_filename) {
@@ -406,15 +413,18 @@ int main(int argc, char **argv)
 		}
 		for (entry = compiled_program.first;
 			entry != NULL; entry = entry->next) {
-		    if (entry->islabel) 
+		    if (is_label(entry))
 			fprintf(export_file, "#define %s_IP %d\n",
-				entry->string, (IS_GENx(5) ? 2 : 1)*(entry->inst_offset));
+				label_name(entry), (IS_GENx(5) ? 2 : 1)*(entry->inst_offset));
 		}
 		fclose(export_file);
 	}
 
 	for (entry = compiled_program.first; entry; entry = entry->next) {
-	    struct brw_instruction *inst = & entry->instruction;
+	    struct brw_instruction *inst = & entry->instruction.gen;
+
+	    if (is_label(entry))
+		continue;
 
 	    if (inst->first_reloc_target)
 		inst->first_reloc_offset = label_to_addr(inst->first_reloc_target, entry->inst_offset) - entry->inst_offset;
@@ -424,14 +434,14 @@ int main(int argc, char **argv)
 
 	    if (inst->second_reloc_offset) {
 		// this is a branch instruction with two offset arguments
-		entry->instruction.bits3.break_cont.jip = jump_distance(inst->first_reloc_offset);
-		entry->instruction.bits3.break_cont.uip = jump_distance(inst->second_reloc_offset);
+		inst->bits3.break_cont.jip = jump_distance(inst->first_reloc_offset);
+		inst->bits3.break_cont.uip = jump_distance(inst->second_reloc_offset);
 	    } else if (inst->first_reloc_offset) {
 		// this is a branch instruction with one offset argument
 		int offset = inst->first_reloc_offset;
 		/* bspec: Unlike other flow control instructions, the offset used by JMPI is relative to the incremented instruction pointer rather than the IP value for the instruction itself. */
 		
-		int is_jmpi = entry->instruction.header.opcode == BRW_OPCODE_JMPI; // target relative to the post-incremented IP, so delta == 1 if JMPI
+		int is_jmpi = inst->header.opcode == BRW_OPCODE_JMPI; // target relative to the post-incremented IP, so delta == 1 if JMPI
 		if(is_jmpi)
 		    offset --;
 		offset = jump_distance(offset);
@@ -439,25 +449,25 @@ int main(int argc, char **argv)
 			offset = offset * 8;
 
 		if(!IS_GENp(6)) {
-		    entry->instruction.bits3.JIP = offset;
-		    if(entry->instruction.header.opcode == BRW_OPCODE_ELSE)
-			entry->instruction.bits3.break_cont.uip = 1; /* Set the istack pop count, which must always be 1. */
+		    inst->bits3.JIP = offset;
+		    if(inst->header.opcode == BRW_OPCODE_ELSE)
+			inst->bits3.break_cont.uip = 1; /* Set the istack pop count, which must always be 1. */
 		} else if(IS_GENx(6)) {
 		    /* TODO: endif JIP pos is not in Gen6 spec. may be bits1 */
-		    int opcode = entry->instruction.header.opcode;
+		    int opcode = inst->header.opcode;
 		    if(opcode == BRW_OPCODE_CALL || opcode == BRW_OPCODE_JMPI)
-			entry->instruction.bits3.JIP = offset; // for CALL, JMPI
+			inst->bits3.JIP = offset; // for CALL, JMPI
 		    else
-			entry->instruction.bits1.branch_gen6.jump_count = offset; // for CASE,ELSE,FORK,IF,WHILE
+			inst->bits1.branch_gen6.jump_count = offset; // for CASE,ELSE,FORK,IF,WHILE
 		} else if(IS_GENp(7)) {
-		    int opcode = entry->instruction.header.opcode;
+		    int opcode = inst->header.opcode;
 		    /* Gen7 JMPI Restrictions in bspec:
 		     * The JIP data type must be Signed DWord
 		     */
 		    if(opcode == BRW_OPCODE_JMPI)
-			entry->instruction.bits3.JIP = offset;
+			inst->bits3.JIP = offset;
 		    else
-			entry->instruction.bits3.break_cont.jip = offset;
+			inst->bits3.break_cont.jip = offset;
 		}
 	    }
 	}
@@ -469,10 +479,10 @@ int main(int argc, char **argv)
 		entry != NULL;
 		entry = entry1) {
 	    entry1 = entry->next;
-	    if (!entry->islabel)
-		print_instruction(output, &entry->instruction);
+	    if (!is_label(entry))
+		print_instruction(output, &entry->instruction.gen);
 	    else
-		free(entry->string);
+		free(entry->instruction.label.name);
 	    free(entry);
 	}
 	if (binary_like_output)
-- 
1.7.7.5

* [PATCH 39/90] assembler: Don't change the size of opcodes!
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (37 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 38/90] assembler: Make explicit that labels are part of the instructions list Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 40/90] assembler: Make sure nobody adds a field back to struct brw_instruction Damien Lespiau
                   ` (51 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Until now, the assembler had relocation-related fields added to struct
brw_instruction. This changes the size of the structure and breaks code
that assumes the opcode structure is really 16 bytes, for instance the
emission code in brw_eu_emit.c.

With this commit, we build on the infrastructure that slowly emerged in
the few previous commits to add a relocatable instruction with the
needed fields.
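
To make the size argument concrete, here is a hedged sketch with
hypothetical names (struct gen_insn, struct reloc_insn, reloc_init,
reloc_resolve_first). The opcode stays at exactly 16 bytes, checked at
compile time with C11 _Static_assert, and the relocation data lives in a
wrapper around it. The resolve step mirrors the "label address minus
instruction offset" computation done in main.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 128-bit opcode layout. */
struct gen_insn {
    uint32_t dw[4];
};

/* The emission code relies on this; fail the build if it ever changes. */
_Static_assert(sizeof(struct gen_insn) == 16,
               "opcode structure must stay 16 bytes");

/* Relocation data wraps the opcode instead of growing it. */
struct reloc_insn {
    struct gen_insn gen;
    const char *first_target, *second_target; /* JIP and UIP labels */
    int first_offset, second_offset;          /* in number of instructions */
};

static void reloc_init(struct reloc_insn *r,
                       const char *first, const char *second)
{
    memset(r, 0, sizeof(*r));
    r->first_target = first;
    r->second_target = second;
}

/* Turn an absolute label address into an offset relative to the jump. */
static void reloc_resolve_first(struct reloc_insn *r,
                                int label_addr, int insn_offset)
{
    assert(r->first_target != NULL);
    r->first_offset = label_addr - insn_offset;
}
```

A backward jump simply yields a negative offset, as the last case in a
quick check would show (label at 2, instruction at 4 gives -2).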

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_structs.h |    3 -
 assembler/gen4asm.h     |   13 +++
 assembler/gram.y        |  193 ++++++++++++++++++++++++++--------------------
 assembler/main.c        |   23 +++---
 4 files changed, 134 insertions(+), 98 deletions(-)

diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index db7a9be..e650bf5 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -1463,9 +1463,6 @@ struct brw_instruction
       GLuint ud;
       float f;
    } bits3;
-
-   char *first_reloc_target, *second_reloc_target; // first for JIP, second for UIP
-   GLint first_reloc_offset, second_reloc_offset; // in number of instructions
 };
 
 struct brw_compact_instruction {
diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index aa380e1..aeb2b9c 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -137,6 +137,7 @@ typedef struct {
 
 enum assembler_instruction_type {
     GEN4ASM_INSTRUCTION_GEN,
+    GEN4ASM_INSTRUCTION_GEN_RELOCATABLE,
     GEN4ASM_INSTRUCTION_LABEL,
 };
 
@@ -144,6 +145,12 @@ struct label_instruction {
     char   *name;
 };
 
+struct relocatable_instruction {
+    struct brw_instruction gen;
+    char *first_reloc_target, *second_reloc_target; // JIP and UIP respectively
+    GLint first_reloc_offset, second_reloc_offset; // in number of instructions
+};
+
 /**
  * This structure is just the list container for instructions accumulated by
  * the parser and labels.
@@ -153,6 +160,7 @@ struct brw_program_instruction {
     unsigned inst_offset;
     union {
 	struct brw_instruction gen;
+	struct relocatable_instruction reloc;
 	struct label_instruction label;
     } instruction;
     struct brw_program_instruction *next;
@@ -169,6 +177,11 @@ static inline char *label_name(struct brw_program_instruction *i)
     return i->instruction.label.name;
 }
 
+static inline bool is_relocatable(struct brw_program_instruction *intruction)
+{
+    return intruction->type == GEN4ASM_INSTRUCTION_GEN_RELOCATABLE;
+}
+
 /**
  * This structure is a list of instructions.  It is the final output of the
  * parser.
diff --git a/assembler/gram.y b/assembler/gram.y
index cf65f9f..342c66d 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -124,6 +124,17 @@ static void brw_program_add_instruction(struct brw_program *p,
     brw_program_append_entry(p, list_entry);
 }
 
+static void brw_program_add_relocatable(struct brw_program *p,
+					struct relocatable_instruction *reloc)
+{
+    struct brw_program_instruction *list_entry;
+
+    list_entry = calloc(sizeof(struct brw_program_instruction), 1);
+    list_entry->type = GEN4ASM_INSTRUCTION_GEN_RELOCATABLE;
+    list_entry->instruction.reloc = *reloc;
+    brw_program_append_entry(p, list_entry);
+}
+
 static void brw_program_add_label(struct brw_program *p, const char *label)
 {
     struct brw_program_instruction *list_entry;
@@ -143,6 +154,7 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 	int integer;
 	double number;
 	struct brw_instruction instruction;
+	struct relocatable_instruction relocatable;
 	struct brw_program program;
 	struct region region;
 	struct regtype regtype;
@@ -227,14 +239,14 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 %type <integer> simple_int
 %type <instruction> instruction unaryinstruction binaryinstruction
 %type <instruction> binaryaccinstruction trinaryinstruction sendinstruction
-%type <instruction> jumpinstruction
-%type <instruction> breakinstruction syncinstruction
+%type <instruction> syncinstruction
 %type <instruction> msgtarget
 %type <instruction> instoptions instoption_list predicate
 %type <instruction> mathinstruction
-%type <instruction> subroutineinstruction
-%type <instruction> multibranchinstruction
-%type <instruction> nopinstruction loopinstruction ifelseinstruction haltinstruction
+%type <instruction> nopinstruction
+%type <relocatable> relocatableinstruction breakinstruction
+%type <relocatable> ifelseinstruction loopinstruction haltinstruction
+%type <relocatable> multibranchinstruction subroutineinstruction jumpinstruction
 %type <string> label
 %type <program> instrseq
 %type <integer> instoption
@@ -390,6 +402,16 @@ instrseq:	instrseq pragma
 		  brw_program_init(&$$);
 		  brw_program_add_instruction(&$$, &$1);
 		}
+		| instrseq relocatableinstruction SEMICOLON
+		{
+		  brw_program_add_relocatable(&$1, &$2);
+		  $$ = $1;
+		}
+		| relocatableinstruction SEMICOLON
+		{
+		  brw_program_init(&$$);
+		  brw_program_add_relocatable(&$$, &$1);
+		}
 		| instrseq SEMICOLON
 		{
 		    $$ = $1;
@@ -422,16 +444,19 @@ instruction:	unaryinstruction
 		| binaryaccinstruction
 		| trinaryinstruction
 		| sendinstruction
-		| jumpinstruction
-		| ifelseinstruction
-		| breakinstruction
 		| syncinstruction
 		| mathinstruction
-		| subroutineinstruction
-		| multibranchinstruction
 		| nopinstruction
-		| haltinstruction
-		| loopinstruction
+;
+
+/* relocatableinstruction groups the instructions that need a relocation pass */
+relocatableinstruction:	ifelseinstruction
+			| loopinstruction
+			| haltinstruction
+			| multibranchinstruction
+			| subroutineinstruction
+			| jumpinstruction
+			| breakinstruction
 ;
 
 ifelseinstruction: ENDIF
@@ -442,11 +467,11 @@ ifelseinstruction: ENDIF
 		    YYERROR;
 		  }
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $1;
-		  $$.header.thread_control |= BRW_THREAD_SWITCH;
-		  $$.bits1.da1.dest_horiz_stride = 1;
-		  $$.bits1.da1.src1_reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.bits1.da1.src1_reg_type = BRW_REGISTER_TYPE_UD;
+		  $$.gen.header.opcode = $1;
+		  $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
+		  $$.gen.bits1.da1.dest_horiz_stride = 1;
+		  $$.gen.bits1.da1.src1_reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.gen.bits1.da1.src1_reg_type = BRW_REGISTER_TYPE_UD;
 		}
 		| ENDIF execsize relativelocation instoptions
 		{
@@ -457,8 +482,8 @@ ifelseinstruction: ENDIF
 		    YYERROR;
 		  }
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $1;
-		  $$.header.execution_size = $2;
+		  $$.gen.header.opcode = $1;
+		  $$.gen.header.execution_size = $2;
 		  $$.first_reloc_target = $3.reloc_target;
 		  $$.first_reloc_offset = $3.imm32;
 		}
@@ -470,18 +495,18 @@ ifelseinstruction: ENDIF
 		    $3.imm32 |= (1 << 16);
 
 		    memset(&$$, 0, sizeof($$));
-		    $$.header.opcode = $1;
-		    $$.header.execution_size = $2;
-		    $$.header.thread_control |= BRW_THREAD_SWITCH;
-		    set_instruction_dest(&$$, &ip_dst);
-		    set_instruction_src0(&$$, &ip_src);
-		    set_instruction_src1(&$$, &$3);
+		    $$.gen.header.opcode = $1;
+		    $$.gen.header.execution_size = $2;
+		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
+		    set_instruction_dest(&$$.gen, &ip_dst);
+		    set_instruction_src0(&$$.gen, &ip_src);
+		    set_instruction_src1(&$$.gen, &$3);
 		    $$.first_reloc_target = $3.reloc_target;
 		    $$.first_reloc_offset = $3.imm32;
 		  } else if(IS_GENp(6)) {
 		    memset(&$$, 0, sizeof($$));
-		    $$.header.opcode = $1;
-		    $$.header.execution_size = $2;
+		    $$.gen.header.opcode = $1;
+		    $$.gen.header.execution_size = $2;
 		    $$.first_reloc_target = $3.reloc_target;
 		    $$.first_reloc_offset = $3.imm32;
 		  } else {
@@ -504,14 +529,14 @@ ifelseinstruction: ENDIF
 		    YYERROR;
 		  }
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$, &$1);
-		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
+		  set_instruction_predicate(&$$.gen, &$1);
+		  $$.gen.header.opcode = $2;
+		  $$.gen.header.execution_size = $3;
 		  if(!IS_GENp(6)) {
-		    $$.header.thread_control |= BRW_THREAD_SWITCH;
-		    set_instruction_dest(&$$, &ip_dst);
-		    set_instruction_src0(&$$, &ip_src);
-		    set_instruction_src1(&$$, &$4);
+		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
+		    set_instruction_dest(&$$.gen, &ip_dst);
+		    set_instruction_src0(&$$.gen, &ip_src);
+		    set_instruction_src1(&$$.gen, &$4);
 		  }
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
@@ -524,9 +549,9 @@ ifelseinstruction: ENDIF
 		    YYERROR;
 		  }
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$, &$1);
-		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
+		  set_instruction_predicate(&$$.gen, &$1);
+		  $$.gen.header.opcode = $2;
+		  $$.gen.header.execution_size = $3;
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
 		  $$.second_reloc_target = $5.reloc_target;
@@ -542,14 +567,14 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		     * offset is the second source operand.  The offset is added
 		     * to the pre-incremented IP.
 		     */
-		    set_instruction_dest(&$$, &ip_dst);
+		    set_instruction_dest(&$$.gen, &ip_dst);
 		    memset(&$$, 0, sizeof($$));
-		    set_instruction_predicate(&$$, &$1);
-		    $$.header.opcode = $2;
-		    $$.header.execution_size = $3;
-		    $$.header.thread_control |= BRW_THREAD_SWITCH;
-		    set_instruction_src0(&$$, &ip_src);
-		    set_instruction_src1(&$$, &$4);
+		    set_instruction_predicate(&$$.gen, &$1);
+		    $$.gen.header.opcode = $2;
+		    $$.gen.header.execution_size = $3;
+		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
+		    set_instruction_src0(&$$.gen, &ip_src);
+		    set_instruction_src1(&$$.gen, &$4);
 		    $$.first_reloc_target = $4.reloc_target;
 		    $$.first_reloc_offset = $4.imm32;
 		  } else if (IS_GENp(6)) {
@@ -557,9 +582,9 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		         dest must have the same element size as src0.
 		         dest horizontal stride must be 1. */
 		    memset(&$$, 0, sizeof($$));
-		    set_instruction_predicate(&$$, &$1);
-		    $$.header.opcode = $2;
-		    $$.header.execution_size = $3;
+		    set_instruction_predicate(&$$.gen, &$1);
+		    $$.gen.header.opcode = $2;
+		    $$.gen.header.execution_size = $3;
 		    $$.first_reloc_target = $4.reloc_target;
 		    $$.first_reloc_offset = $4.imm32;
 		  } else {
@@ -571,7 +596,7 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		{
 		  // deprecated
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $1;
+		  $$.gen.header.opcode = $1;
 		};
 
 haltinstruction: predicate HALT execsize relativelocation relativelocation instoptions
@@ -579,15 +604,15 @@ haltinstruction: predicate HALT execsize relativelocation relativelocation insto
 		  // for Gen6, Gen7
 		  /* Gen6, Gen7 bspec: dst and src0 must be the null reg. */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$, &$1);
-		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
+		  set_instruction_predicate(&$$.gen, &$1);
+		  $$.gen.header.opcode = $2;
+		  $$.gen.header.execution_size = $3;
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
 		  $$.second_reloc_target = $5.reloc_target;
 		  $$.second_reloc_offset = $5.imm32;
-		  set_instruction_dest(&$$, &dst_null_reg);
-		  set_instruction_src0(&$$, &src_null_reg);
+		  set_instruction_dest(&$$.gen, &dst_null_reg);
+		  set_instruction_src0(&$$.gen, &src_null_reg);
 		};
 
 multibranchinstruction:
@@ -595,28 +620,28 @@ multibranchinstruction:
 		{
 		  /* Gen7 bspec: dest must be null. use Switch option */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$, &$1);
-		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
-		  $$.header.thread_control |= BRW_THREAD_SWITCH;
+		  set_instruction_predicate(&$$.gen, &$1);
+		  $$.gen.header.opcode = $2;
+		  $$.gen.header.execution_size = $3;
+		  $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
-		  set_instruction_dest(&$$, &dst_null_reg);
+		  set_instruction_dest(&$$.gen, &dst_null_reg);
 		}
 		| predicate BRC execsize relativelocation relativelocation instoptions
 		{
 		  /* Gen7 bspec: dest must be null. src0 must be null. use Switch option */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$, &$1);
-		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
-		  $$.header.thread_control |= BRW_THREAD_SWITCH;
+		  set_instruction_predicate(&$$.gen, &$1);
+		  $$.gen.header.opcode = $2;
+		  $$.gen.header.execution_size = $3;
+		  $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
 		  $$.second_reloc_target = $5.reloc_target;
 		  $$.second_reloc_offset = $5.imm32;
-		  set_instruction_dest(&$$, &dst_null_reg);
-		  set_instruction_src0(&$$, &src_null_reg);
+		  set_instruction_dest(&$$.gen, &dst_null_reg);
+		  set_instruction_src0(&$$.gen, &src_null_reg);
 		}
 ;
 
@@ -638,12 +663,12 @@ subroutineinstruction:
 		       execution size must be 2.
 		   */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$, &$1);
-		  $$.header.opcode = $2;
-		  $$.header.execution_size = 1; /* execution size must be 2. Here 1 is encoded 2. */
+		  set_instruction_predicate(&$$.gen, &$1);
+		  $$.gen.header.opcode = $2;
+		  $$.gen.header.execution_size = 1; /* execution size must be 2. Here 1 is encoded 2. */
 
 		  $4.reg_type = BRW_REGISTER_TYPE_D; /* dest type should be DWORD */
-		  set_instruction_dest(&$$, &$4);
+		  set_instruction_dest(&$$.gen, &$4);
 
 		  struct src_operand src0;
 		  memset(&src0, 0, sizeof(src0));
@@ -652,7 +677,7 @@ subroutineinstruction:
 		  src0.horiz_stride = 1; /*encoded 1*/
 		  src0.width = 1; /*encoded 2*/
 		  src0.vert_stride = 2; /*encoded 2*/
-		  set_instruction_src0(&$$, &src0);
+		  set_instruction_src0(&$$.gen, &src0);
 
 		  $$.first_reloc_target = $5.reloc_target;
 		  $$.first_reloc_offset = $5.imm32;
@@ -666,15 +691,15 @@ subroutineinstruction:
 		       src0 region control must be <2,2,1> (not specified clearly. should be same as CALL)
 		   */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$, &$1);
-		  $$.header.opcode = $2;
-		  $$.header.execution_size = 1; /* execution size of RET should be 2 */
-		  set_instruction_dest(&$$, &dst_null_reg);
+		  set_instruction_predicate(&$$.gen, &$1);
+		  $$.gen.header.opcode = $2;
+		  $$.gen.header.execution_size = 1; /* execution size of RET should be 2 */
+		  set_instruction_dest(&$$.gen, &dst_null_reg);
 		  $5.reg_type = BRW_REGISTER_TYPE_D;
 		  $5.horiz_stride = 1; /*encoded 1*/
 		  $5.width = 1; /*encoded 2*/
 		  $5.vert_stride = 2; /*encoded 2*/
-		  set_instruction_src0(&$$, &$5);
+		  set_instruction_src0(&$$.gen, &$5);
 		}
 ;
 
@@ -1089,14 +1114,14 @@ jumpinstruction: predicate JMPI execsize relativelocation2
 		   * is the post-incremented IP plus the offset.
 		   */
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-		  $$.header.execution_size = ffs(1) - 1;
+		  $$.gen.header.opcode = $2;
+		  $$.gen.header.execution_size = ffs(1) - 1;
 		  if(advanced_flag)
-		  	$$.header.mask_control = BRW_MASK_DISABLE;
-		  set_instruction_predicate(&$$, &$1);
-		  set_instruction_dest(&$$, &ip_dst);
-		  set_instruction_src0(&$$, &ip_src);
-		  set_instruction_src1(&$$, &$4);
+			$$.gen.header.mask_control = BRW_MASK_DISABLE;
+		  set_instruction_predicate(&$$.gen, &$1);
+		  set_instruction_dest(&$$.gen, &ip_dst);
+		  set_instruction_src0(&$$.gen, &ip_src);
+		  set_instruction_src1(&$$.gen, &$4);
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
 		}
@@ -1123,9 +1148,9 @@ breakinstruction: predicate breakop execsize relativelocation relativelocation i
 		{
 		  // for Gen6, Gen7
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$, &$1);
-		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
+		  set_instruction_predicate(&$$.gen, &$1);
+		  $$.gen.header.opcode = $2;
+		  $$.gen.header.execution_size = $3;
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
 		  $$.second_reloc_target = $5.reloc_target;
diff --git a/assembler/main.c b/assembler/main.c
index eb75230..85f0790 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -421,24 +421,25 @@ int main(int argc, char **argv)
 	}
 
 	for (entry = compiled_program.first; entry; entry = entry->next) {
-	    struct brw_instruction *inst = & entry->instruction.gen;
+	    struct relocatable_instruction *reloc = &entry->instruction.reloc;
+	    struct brw_instruction *inst = &reloc->gen;
 
-	    if (is_label(entry))
+	    if (!is_relocatable(entry))
 		continue;
 
-	    if (inst->first_reloc_target)
-		inst->first_reloc_offset = label_to_addr(inst->first_reloc_target, entry->inst_offset) - entry->inst_offset;
+	    if (reloc->first_reloc_target)
+		reloc->first_reloc_offset = label_to_addr(reloc->first_reloc_target, entry->inst_offset) - entry->inst_offset;
 
-	    if (inst->second_reloc_target)
-		inst->second_reloc_offset = label_to_addr(inst->second_reloc_target, entry->inst_offset) - entry->inst_offset;
+	    if (reloc->second_reloc_target)
+		reloc->second_reloc_offset = label_to_addr(reloc->second_reloc_target, entry->inst_offset) - entry->inst_offset;
 
-	    if (inst->second_reloc_offset) {
+	    if (reloc->second_reloc_offset) {
 		// this is a branch instruction with two offset arguments
-		inst->bits3.break_cont.jip = jump_distance(inst->first_reloc_offset);
-		inst->bits3.break_cont.uip = jump_distance(inst->second_reloc_offset);
-	    } else if (inst->first_reloc_offset) {
+		inst->bits3.break_cont.jip = jump_distance(reloc->first_reloc_offset);
+		inst->bits3.break_cont.uip = jump_distance(reloc->second_reloc_offset);
+	    } else if (reloc->first_reloc_offset) {
 		// this is a branch instruction with one offset argument
-		int offset = inst->first_reloc_offset;
+		int offset = reloc->first_reloc_offset;
 		/* bspec: Unlike other flow control instructions, the offset used by JMPI is relative to the incremented instruction pointer rather than the IP value for the instruction itself. */
 		
 		int is_jmpi = inst->header.opcode == BRW_OPCODE_JMPI; // target relative to the post-incremented IP, so delta == 1 if JMPI
-- 
1.7.7.5


* [PATCH 40/90] assembler: Make sure nobody adds a field back to struct brw_instruction
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (38 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 39/90] assembler: Don't change the size of opcodes! Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 41/90] assembler: Don't expose functions only used in main.c Damien Lespiau
                   ` (50 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Adding a field there would break the library, so we might as well check
for it at compile time.
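
For readers unfamiliar with the trick, here is a minimal sketch of the same
compile-time check outside the assembler. It assumes a hypothetical
struct demo_instruction made of four 32-bit words as a stand-in for
struct brw_instruction; the typedef name is also changed from the patch
(no leading underscore) purely for illustration. If the size expression is
false, the array gets a negative length and the compiler rejects the file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct brw_instruction: four 32-bit words. */
struct demo_instruction {
    uint32_t header;
    uint32_t bits1;
    uint32_t bits2;
    uint32_t bits3;
};

/*
 * Declares a typedef containing an array whose length is 1 when the size
 * matches and -1 (a compile error) when it does not. Nothing is emitted
 * at run time.
 */
#define STRUCT_SIZE_ASSERT(TYPE, SIZE) \
typedef struct { \
    char compile_time_assert_ ## TYPE ## _size[ \
        (sizeof (struct TYPE) == (SIZE)) ? 1 : -1]; \
} TYPE ## _size_check

/* Compiles only while the struct stays exactly 16 bytes. */
STRUCT_SIZE_ASSERT(demo_instruction, 16);

/* Helper so the size can also be inspected at run time. */
static size_t demo_instruction_size(void)
{
    return sizeof(struct demo_instruction);
}
```

Adding a fifth member to demo_instruction should make the
STRUCT_SIZE_ASSERT line fail to compile, which is the behaviour the patch
relies on.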

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index aeb2b9c..388cc75 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -55,6 +55,15 @@ extern long int gen_level;
 
 void yyerror (char *msg);
 
+#define STRUCT_SIZE_ASSERT(TYPE, SIZE) \
+typedef struct { \
+          char compile_time_assert_ ## TYPE ## _size[ \
+              (sizeof (struct TYPE) == (SIZE)) ? 1 : -1]; \
+        } _ ## TYPE ## SizeCheck
+
+/* ensure nobody changes the size of struct brw_instruction */
+STRUCT_SIZE_ASSERT(brw_instruction, 16);
+
 /**
  * This structure is the internal representation of directly-addressed
  * registers in the parser.
-- 
1.7.7.5


* [PATCH 41/90] assembler: Don't expose functions only used in main.c
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (39 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 40/90] assembler: Make sure nobody adds a field back to struct brw_instruction Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 42/90] assembler: Make struct declared_register use struct brw_reg Damien Lespiau
                   ` (49 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

and make them static.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    2 --
 assembler/main.c    |    4 ++--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 388cc75..8dd08b7 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -234,8 +234,6 @@ struct declared_register {
 };
 struct declared_register *find_register(char *name);
 void insert_register(struct declared_register *reg);
-void add_label(struct brw_program_instruction *instruction);
-int label_to_addr(char *name, int start_addr);
 
 int yyparse(void);
 int yylex(void);
diff --git a/assembler/main.c b/assembler/main.c
index 85f0790..176835b 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -155,7 +155,7 @@ void insert_register(struct declared_register *reg)
     insert_hash_item(declared_register_table, reg->name, reg);
 }
 
-void add_label(struct brw_program_instruction *i)
+static void add_label(struct brw_program_instruction *i)
 {
     struct label_item **p = &label_table;
 
@@ -170,7 +170,7 @@ void add_label(struct brw_program_instruction *i)
 
 /* Some assembly code have duplicated labels.
    Start from start_addr. Search as a loop. Return the first label found. */
-int label_to_addr(char *name, int start_addr)
+static int label_to_addr(char *name, int start_addr)
 {
     /* return the first label just after start_addr, or the first label from the head */
     struct label_item *p;
-- 
1.7.7.5


* [PATCH 42/90] assembler: Make struct declared_register use struct brw_reg
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (40 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 41/90] assembler: Don't expose functions only used in main.c Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 43/90] assembler: Replace struct direct_reg by " Damien Lespiau
                   ` (48 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

It's time to start converting the emission code in gram.y to use the libbrw
infrastructure. Let's start by using brw_reg for declared registers.
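
One detail of the diff below is the symbol_reg_p rule, which now adds the
sub-register offset and carries any overflow into the register number in a
single step, instead of first accumulating into the sub-register field. A
rough sketch of that arithmetic follows; struct demo_reg and
demo_reg_offset() are hypothetical names invented here, not part of the
assembler or libbrw, and units_per_reg stands for either 32 or
32 / get_type_size(...) depending on the mode.

```c
#include <assert.h>

/* Hypothetical, simplified register: only the fields the rule touches. */
struct demo_reg {
    unsigned file;
    unsigned nr;     /* register number */
    unsigned subnr;  /* sub-register number */
};

/*
 * Offset a register by reg_off whole registers and sub_off sub-register
 * units, carrying any overflow of the sub-register into the register
 * number. The sum is computed in a local so that subnr only ever holds
 * a value smaller than units_per_reg.
 */
static struct demo_reg demo_reg_offset(struct demo_reg reg,
                                       unsigned reg_off,
                                       unsigned sub_off,
                                       unsigned units_per_reg)
{
    unsigned total;

    assert(units_per_reg != 0);

    total = reg.subnr + sub_off;
    reg.nr += reg_off;
    reg.nr += total / units_per_reg;
    reg.subnr = total % units_per_reg;
    return reg;
}
```

For example, starting from nr 2 and subnr 4, an offset of one register and
30 units with 32 units per register gives nr 4 and subnr 2. Keeping the
intermediate sum out of the struct matters if the sub-register field is
narrower than an int, which is an assumption about struct brw_reg rather
than something this message states.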

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    4 +++-
 assembler/gram.y    |   46 +++++++++++++++++++++++-----------------------
 2 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 8dd08b7..d81f597 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -33,6 +33,8 @@
 #include <stdbool.h>
 #include <assert.h>
 
+#include "brw_reg.h"
+
 typedef unsigned char GLubyte;
 typedef short GLshort;
 typedef unsigned int GLuint;
@@ -226,7 +228,7 @@ extern struct program_defaults program_defaults;
 
 struct declared_register {
     char *name;
-    struct direct_reg base;
+    struct brw_reg reg;
     int element_size;
     struct region src_region;
     int dst_region;
diff --git a/assembler/gram.y b/assembler/gram.y
index 342c66d..7b4cdee 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -353,9 +353,9 @@ declare_pragma:	DECLARE_PRAGMA STRING declare_base declare_elementsize declare_s
 			reg = calloc(sizeof(struct declared_register), 1);
 			reg->name = $2;
 		    }
-		    reg->base.reg_file = $3.reg_file;
-		    reg->base.reg_nr = $3.reg_nr;
-		    reg->base.subreg_nr = $3.subreg_nr;
+		    reg->reg.file = $3.reg_file;
+		    reg->reg.nr = $3.reg_nr;
+		    reg->reg.subnr = $3.subreg_nr;
 		    reg->element_size = $4;
 		    reg->src_region = $5;
 		    reg->dst_region = $6;
@@ -1548,9 +1548,9 @@ dst:		dstoperand | dstoperandex
 dstoperand:	symbol_reg dstregion
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.base.reg_file;
-		  $$.reg_nr = $1.base.reg_nr;
-		  $$.subreg_nr = $1.base.subreg_nr;
+		  $$.reg_file = $1.reg.file;
+		  $$.reg_nr = $1.reg.nr;
+		  $$.subreg_nr = $1.reg.subnr;
 		  if ($2 == DEFAULT_DSTREGION) {
 		      $$.horiz_stride = $1.dst_region;
 		  } else {
@@ -1657,7 +1657,7 @@ symbol_reg_p: STRING LPAREN exp RPAREN
 		    }
 
 		    memcpy(&$$, dcl_reg, sizeof(*dcl_reg));
-		    $$.base.reg_nr += $3;
+		    $$.reg.nr += $3;
 		    free($1);
 		}
 		| STRING LPAREN exp COMMA exp RPAREN
@@ -1670,15 +1670,15 @@ symbol_reg_p: STRING LPAREN exp RPAREN
 		    }
 
 		    memcpy(&$$, dcl_reg, sizeof(*dcl_reg));
-		    $$.base.reg_nr += $3;
-		    $$.base.subreg_nr += $5;
+		    $$.reg.nr += $3;
 		    if(advanced_flag) {
-		        $$.base.reg_nr += $$.base.subreg_nr / (32 / get_type_size(dcl_reg->type));
-		        $$.base.subreg_nr = $$.base.subreg_nr % (32 / get_type_size(dcl_reg->type));
+			int size = get_type_size(dcl_reg->type);
+		        $$.reg.nr += ($$.reg.subnr + $5) / (32 / size);
+		        $$.reg.subnr = ($$.reg.subnr + $5) % (32 / size);
 		    } else {
-		        $$.base.reg_nr += $$.base.subreg_nr / 32;
-		        $$.base.subreg_nr = $$.base.subreg_nr % 32;
-			}
+		        $$.reg.nr += ($$.reg.subnr + $5) / 32;
+		        $$.reg.subnr = ($$.reg.subnr + $5) % 32;
+		    }
 		    free($1);
 		}
 ;
@@ -1857,9 +1857,9 @@ srcarchoperandex_typed: flagreg | addrreg | maskreg
 sendleadreg: symbol_reg
              {
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.base.reg_file;
-		  $$.reg_nr = $1.base.reg_nr;
-		  $$.subreg_nr = $1.base.subreg_nr;
+		  $$.reg_file = $1.reg.file;
+		  $$.reg_nr = $1.reg.nr;
+		  $$.subreg_nr = $1.reg.subnr;
              }
              | directgenreg | directmsgreg
 ;
@@ -1871,9 +1871,9 @@ directsrcoperand:	negate abs symbol_reg region regtype
 		{
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $3.base.reg_file;
-		  $$.reg_nr = $3.base.reg_nr;
-		  $$.subreg_nr = $3.base.subreg_nr;
+		  $$.reg_file = $3.reg.file;
+		  $$.reg_nr = $3.reg.nr;
+		  $$.subreg_nr = $3.reg.subnr;
 		  if ($5.is_default) {
 		    $$.reg_type = $3.type;
 		  } else {
@@ -2303,9 +2303,9 @@ relativelocation2:
 		{
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $1.base.reg_file;
-		  $$.reg_nr = $1.base.reg_nr;
-		  $$.subreg_nr = $1.base.subreg_nr;
+		  $$.reg_file = $1.reg.file;
+		  $$.reg_nr = $1.reg.nr;
+		  $$.subreg_nr = $1.reg.subnr;
 		  $$.reg_type = $1.type;
 		  $$.vert_stride = $1.src_region.vert_stride;
 		  $$.width = $1.src_region.width;
-- 
1.7.7.5


* [PATCH 43/90] assembler: Replace struct direct_reg by struct brw_reg
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (41 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 42/90] assembler: Make struct declared_register use struct brw_reg Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 44/90] assembler: Replace struct indirect_reg " Damien Lespiau
                   ` (47 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

More code simplification can be layered on top of this (by using some
brw_* helpers to create registers), but that is for another commit.
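
To illustrate the kind of simplification meant here, the repeated
"memset, then set file, nr and subnr" sequence in the grammar actions could
collapse into one constructor-style call. The sketch below uses a
hypothetical struct demo_reg and demo_make_reg(); it is not one of the
actual brw_* helpers, whose names and signatures are not shown in this
message.

```c
#include <string.h>

/* Hypothetical, simplified register with only the fields used below. */
struct demo_reg {
    unsigned file;
    unsigned nr;
    unsigned subnr;
};

/* Hypothetical file identifier, standing in for a BRW_*_REGISTER_FILE value. */
enum { DEMO_GENERAL_REGISTER_FILE = 1 };

/*
 * Build a zero-initialised register in one call, replacing the
 * memset() plus three assignments repeated in each grammar action.
 */
static struct demo_reg demo_make_reg(unsigned file, unsigned nr,
                                     unsigned subnr)
{
    struct demo_reg reg;

    memset(&reg, 0, sizeof(reg));
    reg.file = file;
    reg.nr = nr;
    reg.subnr = subnr;
    return reg;
}
```

With such a helper, an action like directgenreg would reduce to a single
assignment of the form $$ = demo_make_reg(DEMO_GENERAL_REGISTER_FILE, $1, $2).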

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    8 --
 assembler/gram.y    |  202 +++++++++++++++++++++++++-------------------------
 2 files changed, 101 insertions(+), 109 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index d81f597..122baf0 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -66,14 +66,6 @@ typedef struct { \
 /* ensure nobody changes the size of struct brw_instruction */
 STRUCT_SIZE_ASSERT(brw_instruction, 16);
 
-/**
- * This structure is the internal representation of directly-addressed
- * registers in the parser.
- */
-struct direct_reg {
-	int reg_file, reg_nr, subreg_nr;
-};
-
 struct condition {
     	int cond;
 	int flag_reg_nr;
diff --git a/assembler/gram.y b/assembler/gram.y
index 7b4cdee..71dbea9 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -92,9 +92,9 @@ void set_instruction_options(struct brw_instruction *instr,
 			     struct brw_instruction *options);
 void set_instruction_predicate(struct brw_instruction *instr,
 			       struct brw_instruction *predicate);
-void set_direct_dst_operand(struct dst_operand *dst, struct direct_reg *reg,
+void set_direct_dst_operand(struct dst_operand *dst, struct brw_reg *reg,
 			    int type);
-void set_direct_src_operand(struct src_operand *src, struct direct_reg *reg,
+void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
 			    int type);
 
 static void brw_program_init(struct brw_program *p)
@@ -158,7 +158,7 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 	struct brw_program program;
 	struct region region;
 	struct regtype regtype;
-	struct direct_reg direct_reg;
+	struct brw_reg direct_reg;
 	struct indirect_reg indirect_reg;
 	struct condition condition;
 	struct declared_register symbol_reg;
@@ -263,7 +263,7 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 %type <region> region region_wh indirectregion declare_srcregion;
 %type <regtype> regtype
 %type <direct_reg> directgenreg directmsgreg addrreg accreg flagreg maskreg
-%type <direct_reg> maskstackreg notifyreg 
+%type <direct_reg> maskstackreg notifyreg
 /* %type <direct_reg>  maskstackdepthreg */
 %type <direct_reg> statereg controlreg ipreg nullreg
 %type <direct_reg> dstoperandex_typed srcarchoperandex_typed
@@ -926,7 +926,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-		  $$.header.destreg__conditionalmod = $5.reg_nr; /* msg reg index */
+		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 
@@ -949,7 +949,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-		  $$.header.destreg__conditionalmod = $5.reg_nr; /* msg reg index */
+		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$4) != 0)
@@ -996,7 +996,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       src0.reg_type = BRW_REGISTER_TYPE_D;
                   }
 
-                  src0.reg_nr = $5.reg_nr;
+                  src0.reg_nr = $5.nr;
                   src0.subreg_nr = 0;
                   set_instruction_src0(&$$, &src0);
 
@@ -1042,7 +1042,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       src0.reg_type = BRW_REGISTER_TYPE_D;
                   }
 
-                  src0.reg_nr = $5.reg_nr;
+                  src0.reg_nr = $5.nr;
                   src0.subreg_nr = 0;
                   set_instruction_src0(&$$, &src0);
 
@@ -1060,7 +1060,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-		  $$.header.destreg__conditionalmod = $5.reg_nr; /* msg reg index */
+		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$4) != 0)
@@ -1082,7 +1082,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.execution_size = $3;
-		  $$.header.destreg__conditionalmod = $5.reg_nr; /* msg reg index */
+		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 
@@ -1582,45 +1582,45 @@ dstoperand:	symbol_reg dstregion
 dstoperandex:	dstoperandex_typed dstregion regtype
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.reg_file;
-		  $$.reg_nr = $1.reg_nr;
-		  $$.subreg_nr = $1.subreg_nr;
+		  $$.reg_file = $1.file;
+		  $$.reg_nr = $1.nr;
+		  $$.subreg_nr = $1.subnr;
 		  $$.horiz_stride = $2;
 		  $$.reg_type = $3.type;
 		}
 		| maskstackreg
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.reg_file;
-		  $$.reg_nr = $1.reg_nr;
-		  $$.subreg_nr = $1.subreg_nr;
+		  $$.reg_file = $1.file;
+		  $$.reg_nr = $1.nr;
+		  $$.subreg_nr = $1.subnr;
 		  $$.horiz_stride = 1;
 		  $$.reg_type = BRW_REGISTER_TYPE_UW;
 		}
 		| controlreg
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.reg_file;
-		  $$.reg_nr = $1.reg_nr;
-		  $$.subreg_nr = $1.subreg_nr;
+		  $$.reg_file = $1.file;
+		  $$.reg_nr = $1.nr;
+		  $$.subreg_nr = $1.subnr;
 		  $$.horiz_stride = 1;
 		  $$.reg_type = BRW_REGISTER_TYPE_UD;
 		}
 		| ipreg
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.reg_file;
-		  $$.reg_nr = $1.reg_nr;
-		  $$.subreg_nr = $1.subreg_nr;
+		  $$.reg_file = $1.file;
+		  $$.reg_nr = $1.nr;
+		  $$.subreg_nr = $1.subnr;
 		  $$.horiz_stride = 1;
 		  $$.reg_type = BRW_REGISTER_TYPE_UD;
 		}
 		| nullreg dstregion regtype
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.reg_file;
-		  $$.reg_nr = $1.reg_nr;
-		  $$.subreg_nr = $1.subreg_nr;
+		  $$.reg_file = $1.file;
+		  $$.reg_nr = $1.nr;
+		  $$.subreg_nr = $1.subnr;
 		  $$.horiz_stride = $2;
 		  $$.reg_type = $3.type;
 		}
@@ -1689,17 +1689,17 @@ dstreg:		directgenreg
 		{
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $1.reg_file;
-		  $$.reg_nr = $1.reg_nr;
-		  $$.subreg_nr = $1.subreg_nr;
+		  $$.reg_file = $1.file;
+		  $$.reg_nr = $1.nr;
+		  $$.subreg_nr = $1.subnr;
 		}
 		| directmsgreg
 		{
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $1.reg_file;
-		  $$.reg_nr = $1.reg_nr;
-		  $$.subreg_nr = $1.subreg_nr;
+		  $$.reg_file = $1.file;
+		  $$.reg_nr = $1.nr;
+		  $$.subreg_nr = $1.subnr;
 		}
 		| indirectgenreg
 		{
@@ -1809,10 +1809,10 @@ directsrcaccoperand:	directsrcoperand
 srcarchoperandex: srcarchoperandex_typed region regtype
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.reg_file;
+		  $$.reg_file = $1.file;
 		  $$.reg_type = $3.type;
-		  $$.subreg_nr = $1.subreg_nr;
-		  $$.reg_nr = $1.reg_nr;
+		  $$.subreg_nr = $1.subnr;
+		  $$.reg_nr = $1.nr;
 		  $$.vert_stride = $2.vert_stride;
 		  $$.width = $2.width;
 		  $$.horiz_stride = $2.horiz_stride;
@@ -1857,9 +1857,9 @@ srcarchoperandex_typed: flagreg | addrreg | maskreg
 sendleadreg: symbol_reg
              {
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.reg.file;
-		  $$.reg_nr = $1.reg.nr;
-		  $$.subreg_nr = $1.reg.subnr;
+		  $$.file = $1.reg.file;
+		  $$.nr = $1.reg.nr;
+		  $$.subnr = $1.reg.subnr;
              }
              | directgenreg | directmsgreg
 ;
@@ -1900,9 +1900,9 @@ directsrcoperand:	negate abs symbol_reg region regtype
 		  else{
 		    memset (&$$, '\0', sizeof ($$));
 		    $$.address_mode = BRW_ADDRESS_DIRECT;
-		    $$.reg_file = $1.reg_file;
-		    $$.reg_nr = $1.reg_nr;
-		    $$.subreg_nr = $1.subreg_nr;
+		    $$.reg_file = $1.file;
+		    $$.reg_nr = $1.nr;
+		    $$.subreg_nr = $1.subnr;
 		    $$.vert_stride = $2.vert_stride;
 		    $$.width = $2.width;
 		    $$.horiz_stride = $2.horiz_stride;
@@ -1913,9 +1913,9 @@ directsrcoperand:	negate abs symbol_reg region regtype
 		{
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $3.reg_file;
-		  $$.reg_nr = $3.reg_nr;
-		  $$.subreg_nr = $3.subreg_nr;
+		  $$.reg_file = $3.file;
+		  $$.reg_nr = $3.nr;
+		  $$.subreg_nr = $3.subnr;
 		  $$.reg_type = $5.type;
 		  $$.vert_stride = $4.vert_stride;
 		  $$.width = $4.width;
@@ -1966,13 +1966,13 @@ addrparam:	addrreg COMMA immaddroffset
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.address_subreg_nr = $1.subreg_nr;
+		  $$.address_subreg_nr = $1.subnr;
 		  $$.indirect_offset = $3;
 		}
 		| addrreg 
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.address_subreg_nr = $1.subreg_nr;
+		  $$.address_subreg_nr = $1.subnr;
 		  $$.indirect_offset = 0;
 		}
 ;
@@ -2000,9 +2000,9 @@ subregnum:	DOT exp
 directgenreg:	GENREG subregnum
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_GENERAL_REGISTER_FILE;
-		  $$.reg_nr = $1;
-		  $$.subreg_nr = $2;
+		  $$.file = BRW_GENERAL_REGISTER_FILE;
+		  $$.nr = $1;
+		  $$.subnr = $2;
 		}
 ;
 
@@ -2018,9 +2018,9 @@ indirectgenreg: GENREGFILE LSQUARE addrparam RSQUARE
 directmsgreg:	MSGREG subregnum
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_MESSAGE_REGISTER_FILE;
-		  $$.reg_nr = $1;
-		  $$.subreg_nr = $2;
+		  $$.file = BRW_MESSAGE_REGISTER_FILE;
+		  $$.nr = $1;
+		  $$.subnr = $2;
 		}
 ;
 
@@ -2041,9 +2041,9 @@ addrreg:	ADDRESSREG subregnum
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_ADDRESS | $1;
-		  $$.subreg_nr = $2;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_ADDRESS | $1;
+		  $$.subnr = $2;
 		}
 ;
 
@@ -2055,9 +2055,9 @@ accreg:		ACCREG subregnum
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_ACCUMULATOR | $1;
-		  $$.subreg_nr = $2;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_ACCUMULATOR | $1;
+		  $$.subnr = $2;
 		}
 ;
 
@@ -2077,9 +2077,9 @@ flagreg:	FLAGREG subregnum
 		  }
 
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_FLAG | $1;
-		  $$.subreg_nr = $2;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_FLAG | $1;
+		  $$.subnr = $2;
 		}
 ;
 
@@ -2091,16 +2091,16 @@ maskreg:	MASKREG subregnum
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_MASK;
-		  $$.subreg_nr = $2;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_MASK;
+		  $$.subnr = $2;
 		}
 		| mask_subreg
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_MASK;
-		  $$.subreg_nr = $1;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_MASK;
+		  $$.subnr = $1;
 		}
 ;
 
@@ -2115,16 +2115,16 @@ maskstackreg:	MASKSTACKREG subregnum
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_MASK_STACK;
-		  $$.subreg_nr = $2;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_MASK_STACK;
+		  $$.subnr = $2;
 		}
 		| maskstack_subreg
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_MASK_STACK;
-		  $$.subreg_nr = $1;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_MASK_STACK;
+		  $$.subnr = $1;
 		}
 ;
 
@@ -2168,14 +2168,14 @@ notifyreg:	NOTIFYREG regtype
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
 
                   if (IS_GENp(6)) {
-		    $$.reg_nr = BRW_ARF_NOTIFICATION_COUNT;
-                    $$.subreg_nr = $1;
+		    $$.nr = BRW_ARF_NOTIFICATION_COUNT;
+                    $$.subnr = $1;
                   } else {
-		    $$.reg_nr = BRW_ARF_NOTIFICATION_COUNT | $1;
-                    $$.subreg_nr = 0;
+		    $$.nr = BRW_ARF_NOTIFICATION_COUNT | $1;
+                    $$.subnr = 0;
                   }
 		}
 /*
@@ -2208,9 +2208,9 @@ statereg:	STATEREG subregnum
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_STATE | $1;
-		  $$.subreg_nr = $2;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_STATE | $1;
+		  $$.subnr = $2;
 		}
 ;
 
@@ -2227,27 +2227,27 @@ controlreg:	CONTROLREG subregnum
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_CONTROL | $1;
-		  $$.subreg_nr = $2;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_CONTROL | $1;
+		  $$.subnr = $2;
 		}
 ;
 
 ipreg:		IPREG regtype
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_IP;
-		  $$.subreg_nr = 0;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_IP;
+		  $$.subnr = 0;
 		}
 ;
 
 nullreg:	NULL_TOKEN
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.reg_nr = BRW_ARF_NULL;
-		  $$.subreg_nr = 0;
+		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  $$.nr = BRW_ARF_NULL;
+		  $$.subnr = 0;
 		}
 ;
 
@@ -2504,8 +2504,8 @@ predicate:	/* empty */
 		   * set a predicate for one flag register and conditional
 		   * modification on the other flag register.
 		   */
-		  $$.bits2.da1.flag_reg_nr = ($3.reg_nr & 0xF);
-		  $$.bits2.da1.flag_subreg_nr = $3.subreg_nr;
+		  $$.bits2.da1.flag_reg_nr = ($3.nr & 0xF);
+		  $$.bits2.da1.flag_subreg_nr = $3.subnr;
 		  $$.header.predicate_inverse = $2;
 		}
 ;
@@ -2570,8 +2570,8 @@ conditionalmodifier: condition
 		| condition DOT flagreg
 		{
 		    $$.cond = $1;
-		    $$.flag_reg_nr = ($3.reg_nr & 0xF);
-		    $$.flag_subreg_nr = $3.subreg_nr;
+		    $$.flag_reg_nr = ($3.nr & 0xF);
+		    $$.flag_subreg_nr = $3.subnr;
 		}
 
 condition: /* empty */    { $$ = BRW_CONDITIONAL_NONE; }
@@ -3146,28 +3146,28 @@ void set_instruction_predicate(struct brw_instruction *instr,
 	instr->bits2.da1.flag_subreg_nr = predicate->bits2.da1.flag_subreg_nr;
 }
 
-void set_direct_dst_operand(struct dst_operand *dst, struct direct_reg *reg,
+void set_direct_dst_operand(struct dst_operand *dst, struct brw_reg *reg,
 			    int type)
 {
 	memset(dst, 0, sizeof(*dst));
 	dst->address_mode = BRW_ADDRESS_DIRECT;
-	dst->reg_file = reg->reg_file;
-	dst->reg_nr = reg->reg_nr;
-	dst->subreg_nr = reg->subreg_nr;
+	dst->reg_file = reg->file;
+	dst->reg_nr = reg->nr;
+	dst->subreg_nr = reg->subnr;
 	dst->reg_type = type;
 	dst->horiz_stride = 1;
 	dst->writemask = BRW_WRITEMASK_XYZW;
 }
 
-void set_direct_src_operand(struct src_operand *src, struct direct_reg *reg,
+void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
 			    int type)
 {
 	memset(src, 0, sizeof(*src));
 	src->address_mode = BRW_ADDRESS_DIRECT;
-	src->reg_file = reg->reg_file;
+	src->reg_file = reg->file;
 	src->reg_type = type;
-	src->subreg_nr = reg->subreg_nr;
-	src->reg_nr = reg->reg_nr;
+	src->subreg_nr = reg->subnr;
+	src->reg_nr = reg->nr;
 	src->vert_stride = 0;
 	src->width = 0;
 	src->horiz_stride = 0;
-- 
1.7.7.5

* [PATCH 44/90] assembler: Replace struct indirect_reg by struct brw_reg
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (42 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 43/90] assembler: Replace struct direct_reg by " Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 45/90] assembler: Unify the direct and indirect register type Damien Lespiau
                   ` (46 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

More code simplification can be layered on top of that (by using some
brw_* helpers to create registers), but that is for another commit.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    8 --------
 assembler/gram.y    |   46 +++++++++++++++++++++++-----------------------
 2 files changed, 23 insertions(+), 31 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 122baf0..8a3e95b 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -80,14 +80,6 @@ struct regtype {
     int type;
     int is_default;
 };
-/**
- * This structure is the internal representation of register-indirect addressed
- * registers in the parser.
- */
-
-struct indirect_reg {
-	int reg_file, address_subreg_nr, indirect_offset;
-};
 
 /**
  * This structure is the internal representation of destination operands in the
diff --git a/assembler/gram.y b/assembler/gram.y
index 71dbea9..169026c 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -159,7 +159,7 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 	struct region region;
 	struct regtype regtype;
 	struct brw_reg direct_reg;
-	struct indirect_reg indirect_reg;
+	struct brw_reg indirect_reg;
 	struct condition condition;
 	struct declared_register symbol_reg;
 	imm32_t imm32;
@@ -1705,17 +1705,17 @@ dstreg:		directgenreg
 		{
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
-		  $$.reg_file = $1.reg_file;
-		  $$.subreg_nr = $1.address_subreg_nr;
-		  $$.indirect_offset = $1.indirect_offset;
+		  $$.reg_file = $1.file;
+		  $$.subreg_nr = $1.subnr;
+		  $$.indirect_offset = $1.dw1.bits.indirect_offset;
 		}
 		| indirectmsgreg
 		{
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
-		  $$.reg_file = $1.reg_file;
-		  $$.subreg_nr = $1.address_subreg_nr;
-		  $$.indirect_offset = $1.indirect_offset;
+		  $$.reg_file = $1.file;
+		  $$.subreg_nr = $1.subnr;
+		  $$.indirect_offset = $1.dw1.bits.indirect_offset;
 		}
 ;
 
@@ -1937,9 +1937,9 @@ indirectsrcoperand:
 		{
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
-		  $$.reg_file = $3.reg_file;
-		  $$.subreg_nr = $3.address_subreg_nr;
-		  $$.indirect_offset = $3.indirect_offset;
+		  $$.reg_file = $3.file;
+		  $$.subreg_nr = $3.subnr;
+		  $$.indirect_offset = $3.dw1.bits.indirect_offset;
 		  $$.reg_type = $5.type;
 		  $$.vert_stride = $4.vert_stride;
 		  $$.width = $4.width;
@@ -1966,14 +1966,14 @@ addrparam:	addrreg COMMA immaddroffset
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.address_subreg_nr = $1.subnr;
-		  $$.indirect_offset = $3;
+		  $$.subnr = $1.subnr;
+		  $$.dw1.bits.indirect_offset = $3;
 		}
 		| addrreg 
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.address_subreg_nr = $1.subnr;
-		  $$.indirect_offset = 0;
+		  $$.subnr = $1.subnr;
+		  $$.dw1.bits.indirect_offset = 0;
 		}
 ;
 
@@ -2009,9 +2009,9 @@ directgenreg:	GENREG subregnum
 indirectgenreg: GENREGFILE LSQUARE addrparam RSQUARE
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_GENERAL_REGISTER_FILE;
-		  $$.address_subreg_nr = $3.address_subreg_nr;
-		  $$.indirect_offset = $3.indirect_offset;
+		  $$.file = BRW_GENERAL_REGISTER_FILE;
+		  $$.subnr = $3.subnr;
+		  $$.dw1.bits.indirect_offset = $3.dw1.bits.indirect_offset;
 		}
 ;
 
@@ -2027,9 +2027,9 @@ directmsgreg:	MSGREG subregnum
 indirectmsgreg: MSGREGFILE LSQUARE addrparam RSQUARE
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_MESSAGE_REGISTER_FILE;
-		  $$.address_subreg_nr = $3.address_subreg_nr;
-		  $$.indirect_offset = $3.indirect_offset;
+		  $$.file = BRW_MESSAGE_REGISTER_FILE;
+		  $$.subnr = $3.subnr;
+		  $$.dw1.bits.indirect_offset = $3.dw1.bits.indirect_offset;
 		}
 ;
 
@@ -2315,9 +2315,9 @@ relativelocation2:
 		{
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
-		  $$.reg_file = $1.reg_file;
-		  $$.subreg_nr = $1.address_subreg_nr;
-		  $$.indirect_offset = $1.indirect_offset;
+		  $$.reg_file = $1.file;
+		  $$.subreg_nr = $1.subnr;
+		  $$.indirect_offset = $1.dw1.bits.indirect_offset;
 		  $$.reg_type = $3.type;
 		  $$.vert_stride = $2.vert_stride;
 		  $$.width = $2.width;
-- 
1.7.7.5

* [PATCH 45/90] assembler: Unify the direct and indirect register type
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (43 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 44/90] assembler: Replace struct indirect_reg " Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 46/90] assembler: Replace struct dst_operand by struct brw_reg Damien Lespiau
                   ` (45 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

They are all struct brw_reg registers now.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   19 +++++++++----------
 1 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 169026c..e015e0a 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -158,8 +158,7 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 	struct brw_program program;
 	struct region region;
 	struct regtype regtype;
-	struct brw_reg direct_reg;
-	struct brw_reg indirect_reg;
+	struct brw_reg reg;
 	struct condition condition;
 	struct declared_register symbol_reg;
 	imm32_t imm32;
@@ -262,13 +261,13 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 %type <integer> predctrl predstate
 %type <region> region region_wh indirectregion declare_srcregion;
 %type <regtype> regtype
-%type <direct_reg> directgenreg directmsgreg addrreg accreg flagreg maskreg
-%type <direct_reg> maskstackreg notifyreg
-/* %type <direct_reg>  maskstackdepthreg */
-%type <direct_reg> statereg controlreg ipreg nullreg
-%type <direct_reg> dstoperandex_typed srcarchoperandex_typed
-%type <direct_reg> sendleadreg
-%type <indirect_reg> indirectgenreg indirectmsgreg addrparam
+%type <reg> directgenreg directmsgreg addrreg accreg flagreg maskreg
+%type <reg> maskstackreg notifyreg
+/* %type <reg>  maskstackdepthreg */
+%type <reg> statereg controlreg ipreg nullreg
+%type <reg> dstoperandex_typed srcarchoperandex_typed
+%type <reg> sendleadreg
+%type <reg> indirectgenreg indirectmsgreg addrparam
 %type <integer> mask_subreg maskstack_subreg 
 %type <integer> declare_elementsize declare_dstregion declare_type
 /* %type <intger> maskstackdepth_subreg */
@@ -1955,7 +1954,7 @@ indirectsrcoperand:
 ;
 
 /* 1.4.4: Address Registers */
-/* Returns a partially-completed indirect_reg consisting of the address
+/* Returns a partially-completed struct brw_reg consisting of the address
  * register fields for register-indirect access.
  */
 addrparam:	addrreg COMMA immaddroffset
-- 
1.7.7.5

* [PATCH 46/90] assembler: Replace struct dst_operand by struct brw_reg
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (44 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 45/90] assembler: Unify the direct and indirect register type Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 47/90] assembler: Consolidate the swizzling configuration on 8 bits Damien Lespiau
                   ` (44 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

One more step on the road to replacing all register-like structures by
struct brw_reg.

Two things in this commit are worth noting:

* As we are using more and more brw_reg, a lot of the field-by-field
  assignments can be replaced by a single assignment, which reduces the
  amount of code

* As the destination horizontal stride is now stored on 2 bits in
  brw_reg, the handling of DEFAULT_DSTREGION (aka (int)-1) can no longer
  be deferred until the destination operand is set. It has to be done
  when parsing the region, and resolve_dst_region() is a helper for that
  task.
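
To illustrate the second point, here is a minimal sketch of why a -1
sentinel cannot survive a 2-bit field. Everything prefixed sketch_ or
SKETCH_ is a hypothetical stand-in written for this note; it is not the
real struct brw_reg layout nor the assembler's DEFAULT_DSTREGION
definition, and the helper takes the declared default directly instead
of a struct declared_register.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for DEFAULT_DSTREGION, described above as (int)-1. */
#define SKETCH_DEFAULT_DSTREGION (-1)

/* Hypothetical register with a 2-bit stride field; not the real brw_reg. */
struct sketch_reg {
    unsigned file:2;
    unsigned nr:8;
    unsigned hstride:2;	/* only the values 0..3 fit here */
};

/* Storing the sentinel unresolved: the value is reduced modulo 4, so -1
 * reads back as 3 and can no longer be told apart from a real encoding. */
static unsigned sketch_store_unresolved(int region)
{
    struct sketch_reg r = { 0 };

    r.hstride = (unsigned)region;
    return r.hstride;
}

/* Same logic as resolve_dst_region() in the patch: replace the sentinel
 * by the declared default when there is one, else by 1. */
static int sketch_resolve_dst_region(const int *declared_default, int region)
{
    int resolved = region;

    if (resolved == SKETCH_DEFAULT_DSTREGION)
	resolved = declared_default ? *declared_default : 1;

    assert(resolved == 1 || resolved == 2 || resolved == 3);
    return resolved;
}
```

With these definitions, sketch_store_unresolved(SKETCH_DEFAULT_DSTREGION)
reads back as 3, which is also a legitimate stride encoding, so the
sentinel has to be resolved before it is stored in the register.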

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |   16 ----
 assembler/gram.y    |  227 ++++++++++++++++++++++-----------------------------
 2 files changed, 97 insertions(+), 146 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 8a3e95b..fe09d52 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -82,22 +82,6 @@ struct regtype {
 };
 
 /**
- * This structure is the internal representation of destination operands in the
- * parser.
- */
-struct dst_operand {
-	int reg_file, reg_nr, subreg_nr, reg_type;
-
-	int writemask;
-
-	int horiz_stride;
-	int address_mode; /* 0 if direct, 1 if register-indirect */
-
-	/* Indirect addressing */
-	int indirect_offset;
-};
-
-/**
  * This structure is the internal representation of source operands in the 
  * parser.
  */
diff --git a/assembler/gram.y b/assembler/gram.y
index e015e0a..8f2a1f9 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -47,19 +47,19 @@ static struct src_operand src_null_reg =
     .reg_nr = BRW_ARF_NULL,
     .reg_type = BRW_REGISTER_TYPE_UD,
 };
-static struct dst_operand dst_null_reg =
+static struct brw_reg dst_null_reg =
 {
-    .reg_file = BRW_ARCHITECTURE_REGISTER_FILE,
-    .reg_nr = BRW_ARF_NULL,
+    .file = BRW_ARCHITECTURE_REGISTER_FILE,
+    .nr = BRW_ARF_NULL,
 };
-static struct dst_operand ip_dst =
+static struct brw_reg ip_dst =
 {
-    .reg_file = BRW_ARCHITECTURE_REGISTER_FILE,
-    .reg_nr = BRW_ARF_IP,
-    .reg_type = BRW_REGISTER_TYPE_UD,
+    .file = BRW_ARCHITECTURE_REGISTER_FILE,
+    .nr = BRW_ARF_IP,
+    .type = BRW_REGISTER_TYPE_UD,
     .address_mode = BRW_ADDRESS_DIRECT,
-    .horiz_stride = 1,
-    .writemask = BRW_WRITEMASK_XYZW,
+    .hstride = 1,
+    .dw1.bits.writemask = BRW_WRITEMASK_XYZW,
 };
 static struct src_operand ip_src =
 {
@@ -75,13 +75,13 @@ static struct src_operand ip_src =
 
 static int get_type_size(GLuint type);
 int set_instruction_dest(struct brw_instruction *instr,
-			 struct dst_operand *dest);
+			 struct brw_reg *dest);
 int set_instruction_src0(struct brw_instruction *instr,
 			 struct src_operand *src);
 int set_instruction_src1(struct brw_instruction *instr,
 			 struct src_operand *src);
 int set_instruction_dest_three_src(struct brw_instruction *instr,
-                                   struct dst_operand *dest);
+                                   struct brw_reg *dest);
 int set_instruction_src0_three_src(struct brw_instruction *instr,
                                    struct src_operand *src);
 int set_instruction_src1_three_src(struct brw_instruction *instr,
@@ -92,7 +92,7 @@ void set_instruction_options(struct brw_instruction *instr,
 			     struct brw_instruction *options);
 void set_instruction_predicate(struct brw_instruction *instr,
 			       struct brw_instruction *predicate);
-void set_direct_dst_operand(struct dst_operand *dst, struct brw_reg *reg,
+void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
 			    int type);
 void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
 			    int type);
@@ -145,6 +145,21 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
     brw_program_append_entry(p, list_entry);
 }
 
+static int resolve_dst_region(struct declared_register *reference, int region)
+{
+    int resolved = region;
+
+    if (resolved == DEFAULT_DSTREGION) {
+	if (reference)
+	    resolved = reference->dst_region;
+        else
+            resolved = 1;
+    }
+
+    assert(resolved == 1 || resolved == 2 || resolved == 3);
+    return resolved;
+}
+
 %}
 
 %start ROOT
@@ -163,7 +178,6 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 	struct declared_register symbol_reg;
 	imm32_t imm32;
 
-	struct dst_operand dst_operand;
 	struct src_operand src_operand;
 }
 
@@ -273,8 +287,8 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 /* %type <intger> maskstackdepth_subreg */
 %type <symbol_reg> symbol_reg symbol_reg_p;
 %type <imm32> imm32
-%type <dst_operand> dst dstoperand dstoperandex dstreg post_dst writemask
-%type <dst_operand> declare_base
+%type <reg> dst dstoperand dstoperandex dstreg post_dst writemask
+%type <reg> declare_base
 %type <src_operand> directsrcoperand srcarchoperandex directsrcaccoperand
 %type <src_operand> indirectsrcoperand
 %type <src_operand> src srcimm imm32reg payload srcacc srcaccimm swizzle
@@ -352,9 +366,7 @@ declare_pragma:	DECLARE_PRAGMA STRING declare_base declare_elementsize declare_s
 			reg = calloc(sizeof(struct declared_register), 1);
 			reg->name = $2;
 		    }
-		    reg->reg.file = $3.reg_file;
-		    reg->reg.nr = $3.reg_nr;
-		    reg->reg.subnr = $3.subreg_nr;
+		    reg->reg = $3;
 		    reg->element_size = $4;
 		    reg->src_region = $5;
 		    reg->dst_region = $6;
@@ -666,7 +678,7 @@ subroutineinstruction:
 		  $$.gen.header.opcode = $2;
 		  $$.gen.header.execution_size = 1; /* execution size must be 2. Here 1 is encoded 2. */
 
-		  $4.reg_type = BRW_REGISTER_TYPE_D; /* dest type should be DWORD */
+		  $4.type = BRW_REGISTER_TYPE_D; /* dest type should be DWORD */
 		  set_instruction_dest(&$$.gen, &$4);
 
 		  struct src_operand src0;
@@ -1167,7 +1179,7 @@ maskpushop:	MSAVE | PUSH
 
 syncinstruction: predicate WAIT notifyreg
 		{
-		  struct dst_operand notify_dst;
+		  struct brw_reg notify_dst;
 		  struct src_operand notify_src;
 
 		  memset(&$$, 0, sizeof($$));
@@ -1546,32 +1558,19 @@ dst:		dstoperand | dstoperandex
 
 dstoperand:	symbol_reg dstregion
 		{
-		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.reg.file;
-		  $$.reg_nr = $1.reg.nr;
-		  $$.subreg_nr = $1.reg.subnr;
-		  if ($2 == DEFAULT_DSTREGION) {
-		      $$.horiz_stride = $1.dst_region;
-		  } else {
-		      $$.horiz_stride = $2;
-		  }
-		  $$.reg_type = $1.type;
+		  $$ = $1.reg;
+	          $$.hstride = resolve_dst_region(&$1, $2);
+		  $$.type = $1.type;
 		}
 		| dstreg dstregion writemask regtype
 		{
 		  /* Returns an instruction with just the destination register
 		   * filled in.
 		   */
-		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.reg_file;
-		  $$.reg_nr = $1.reg_nr;
-		  $$.subreg_nr = $1.subreg_nr;
-		  $$.address_mode = $1.address_mode;
-		  $$.subreg_nr = $1.subreg_nr;
-		  $$.indirect_offset = $1.indirect_offset;
-		  $$.horiz_stride = $2;
-		  $$.writemask = $3.writemask;
-		  $$.reg_type = $4.type;
+		  $$ = $1;
+	          $$.hstride = resolve_dst_region(NULL, $2);
+		  $$.dw1.bits.writemask = $3.dw1.bits.writemask;
+		  $$.type = $4.type;
 		}
 ;
 
@@ -1580,48 +1579,33 @@ dstoperand:	symbol_reg dstregion
  */
 dstoperandex:	dstoperandex_typed dstregion regtype
 		{
-		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.file;
-		  $$.reg_nr = $1.nr;
-		  $$.subreg_nr = $1.subnr;
-		  $$.horiz_stride = $2;
-		  $$.reg_type = $3.type;
+		  $$ = $1;
+	          $$.hstride = resolve_dst_region(NULL, $2);
+		  $$.type = $3.type;
 		}
 		| maskstackreg
 		{
-		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.file;
-		  $$.reg_nr = $1.nr;
-		  $$.subreg_nr = $1.subnr;
-		  $$.horiz_stride = 1;
-		  $$.reg_type = BRW_REGISTER_TYPE_UW;
+		  $$ = $1;
+		  $$.hstride = 1;
+		  $$.type = BRW_REGISTER_TYPE_UW;
 		}
 		| controlreg
 		{
-		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.file;
-		  $$.reg_nr = $1.nr;
-		  $$.subreg_nr = $1.subnr;
-		  $$.horiz_stride = 1;
-		  $$.reg_type = BRW_REGISTER_TYPE_UD;
+		  $$ = $1;
+		  $$.hstride = 1;
+		  $$.type = BRW_REGISTER_TYPE_UD;
 		}
 		| ipreg
 		{
-		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.file;
-		  $$.reg_nr = $1.nr;
-		  $$.subreg_nr = $1.subnr;
-		  $$.horiz_stride = 1;
-		  $$.reg_type = BRW_REGISTER_TYPE_UD;
+		  $$ = $1;
+		  $$.hstride = 1;
+		  $$.type = BRW_REGISTER_TYPE_UD;
 		}
 		| nullreg dstregion regtype
 		{
-		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.file;
-		  $$.reg_nr = $1.nr;
-		  $$.subreg_nr = $1.subnr;
-		  $$.horiz_stride = $2;
-		  $$.reg_type = $3.type;
+		  $$ = $1;
+	          $$.hstride = resolve_dst_region(NULL, $2);
+		  $$.type = $3.type;
 		}
 ;
 
@@ -1686,35 +1670,23 @@ symbol_reg_p: STRING LPAREN exp RPAREN
  */
 dstreg:		directgenreg
 		{
-		  memset (&$$, '\0', sizeof ($$));
+		  $$ = $1;
 		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $1.file;
-		  $$.reg_nr = $1.nr;
-		  $$.subreg_nr = $1.subnr;
 		}
 		| directmsgreg
 		{
-		  memset (&$$, '\0', sizeof ($$));
+		  $$ = $1;
 		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $1.file;
-		  $$.reg_nr = $1.nr;
-		  $$.subreg_nr = $1.subnr;
 		}
 		| indirectgenreg
 		{
-		  memset (&$$, '\0', sizeof ($$));
+		  $$ = $1;
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
-		  $$.reg_file = $1.file;
-		  $$.subreg_nr = $1.subnr;
-		  $$.indirect_offset = $1.dw1.bits.indirect_offset;
 		}
 		| indirectmsgreg
 		{
-		  memset (&$$, '\0', sizeof ($$));
+		  $$ = $1;
 		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
-		  $$.reg_file = $1.file;
-		  $$.subreg_nr = $1.subnr;
-		  $$.indirect_offset = $1.dw1.bits.indirect_offset;
 		}
 ;
 
@@ -2454,16 +2426,16 @@ chansel:	X | Y | Z | W
 ;
 
 /* 1.4.9: Write mask */
-/* Returns a partially completed dst_operand, with just the writemask bits
+/* Returns a partially completed struct brw_reg, with just the writemask bits
  * filled out.
  */
 writemask:	/* empty */
 		{
-		  $$.writemask = BRW_WRITEMASK_XYZW;
+		  $$.dw1.bits.writemask = BRW_WRITEMASK_XYZW;
 		}
 		| DOT writemask_x writemask_y writemask_z writemask_w
 		{
-		  $$.writemask = $2 | $3 | $4 | $5;
+		  $$.dw1.bits.writemask = $2 | $3 | $4 | $5;
 		}
 ;
 
@@ -2850,52 +2822,50 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
  * Fills in the destination register information in instr from the bits in dst.
  */
 int set_instruction_dest(struct brw_instruction *instr,
-			 struct dst_operand *dest)
+			 struct brw_reg *dest)
 {
-	if (dest->horiz_stride == DEFAULT_DSTREGION)
-		dest->horiz_stride = ffs(1);
 	if (dest->address_mode == BRW_ADDRESS_DIRECT &&
 	    instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits1.da1.dest_reg_file = dest->reg_file;
-		instr->bits1.da1.dest_reg_type = dest->reg_type;
-		instr->bits1.da1.dest_subreg_nr = get_subreg_address(dest->reg_file, dest->reg_type, dest->subreg_nr, dest->address_mode);
-		instr->bits1.da1.dest_reg_nr = dest->reg_nr;
-		instr->bits1.da1.dest_horiz_stride = dest->horiz_stride;
+		instr->bits1.da1.dest_reg_file = dest->file;
+		instr->bits1.da1.dest_reg_type = dest->type;
+		instr->bits1.da1.dest_subreg_nr = get_subreg_address(dest->file, dest->type, dest->subnr, dest->address_mode);
+		instr->bits1.da1.dest_reg_nr = dest->nr;
+		instr->bits1.da1.dest_horiz_stride = dest->hstride;
 		instr->bits1.da1.dest_address_mode = dest->address_mode;
-		if (dest->writemask != 0 &&
-		    dest->writemask != BRW_WRITEMASK_XYZW) {
+		if (dest->dw1.bits.writemask != 0 &&
+		    dest->dw1.bits.writemask != BRW_WRITEMASK_XYZW) {
 			fprintf(stderr, "error: write mask set in align1 "
 				"instruction\n");
 			return 1;
 		}
 	} else if (dest->address_mode == BRW_ADDRESS_DIRECT) {
-		instr->bits1.da16.dest_reg_file = dest->reg_file;
-		instr->bits1.da16.dest_reg_type = dest->reg_type;
-		instr->bits1.da16.dest_subreg_nr = get_subreg_address(dest->reg_file, dest->reg_type, dest->subreg_nr, dest->address_mode);
-		instr->bits1.da16.dest_reg_nr = dest->reg_nr;
+		instr->bits1.da16.dest_reg_file = dest->file;
+		instr->bits1.da16.dest_reg_type = dest->type;
+		instr->bits1.da16.dest_subreg_nr = get_subreg_address(dest->file, dest->type, dest->subnr, dest->address_mode);
+		instr->bits1.da16.dest_reg_nr = dest->nr;
 		instr->bits1.da16.dest_address_mode = dest->address_mode;
 		instr->bits1.da16.dest_horiz_stride = ffs(1);
-		instr->bits1.da16.dest_writemask = dest->writemask;
+		instr->bits1.da16.dest_writemask = dest->dw1.bits.writemask;
 	} else if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits1.ia1.dest_reg_file = dest->reg_file;
-		instr->bits1.ia1.dest_reg_type = dest->reg_type;
-		instr->bits1.ia1.dest_subreg_nr = dest->subreg_nr;
-		instr->bits1.ia1.dest_horiz_stride = dest->horiz_stride;
-		instr->bits1.ia1.dest_indirect_offset = dest->indirect_offset;
+		instr->bits1.ia1.dest_reg_file = dest->file;
+		instr->bits1.ia1.dest_reg_type = dest->type;
+		instr->bits1.ia1.dest_subreg_nr = dest->subnr;
+		instr->bits1.ia1.dest_horiz_stride = dest->hstride;
+		instr->bits1.ia1.dest_indirect_offset = dest->dw1.bits.indirect_offset;
 		instr->bits1.ia1.dest_address_mode = dest->address_mode;
-		if (dest->writemask != 0 &&
-		    dest->writemask != BRW_WRITEMASK_XYZW) {
+		if (dest->dw1.bits.writemask != 0 &&
+		    dest->dw1.bits.writemask != BRW_WRITEMASK_XYZW) {
 			fprintf(stderr, "error: write mask set in align1 "
 				"instruction\n");
 			return 1;
 		}
 	} else {
-		instr->bits1.ia16.dest_reg_file = dest->reg_file;
-		instr->bits1.ia16.dest_reg_type = dest->reg_type;
-		instr->bits1.ia16.dest_subreg_nr = get_indirect_subreg_address(dest->subreg_nr);
-		instr->bits1.ia16.dest_writemask = dest->writemask;
+		instr->bits1.ia16.dest_reg_file = dest->file;
+		instr->bits1.ia16.dest_reg_type = dest->type;
+		instr->bits1.ia16.dest_subreg_nr = get_indirect_subreg_address(dest->subnr);
+		instr->bits1.ia16.dest_writemask = dest->dw1.bits.writemask;
 		instr->bits1.ia16.dest_horiz_stride = ffs(1);
-		instr->bits1.ia16.dest_indirect_offset = (dest->indirect_offset >> 4); /* half register aligned */
+		instr->bits1.ia16.dest_indirect_offset = (dest->dw1.bits.indirect_offset >> 4); /* half register aligned */
 		instr->bits1.ia16.dest_address_mode = dest->address_mode;
 	}
 
@@ -3076,13 +3046,13 @@ static int reg_type_2_to_3(int reg_type)
 }
 
 int set_instruction_dest_three_src(struct brw_instruction *instr,
-                                   struct dst_operand *dest)
+                                   struct brw_reg *dest)
 {
-	instr->bits1.da3src.dest_reg_file = dest->reg_file;
-	instr->bits1.da3src.dest_reg_nr = dest->reg_nr;
-	instr->bits1.da3src.dest_subreg_nr = get_subreg_address(dest->reg_file, dest->reg_type, dest->subreg_nr, dest->address_mode) / 4; // in DWORD
-	instr->bits1.da3src.dest_writemask = dest->writemask;
-	instr->bits1.da3src.dest_reg_type = reg_type_2_to_3(dest->reg_type);
+	instr->bits1.da3src.dest_reg_file = dest->file;
+	instr->bits1.da3src.dest_reg_nr = dest->nr;
+	instr->bits1.da3src.dest_subreg_nr = get_subreg_address(dest->file, dest->type, dest->subnr, dest->address_mode) / 4; // in DWORD
+	instr->bits1.da3src.dest_writemask = dest->dw1.bits.writemask;
+	instr->bits1.da3src.dest_reg_type = reg_type_2_to_3(dest->type);
 	return 0;
 }
 
@@ -3145,17 +3115,14 @@ void set_instruction_predicate(struct brw_instruction *instr,
 	instr->bits2.da1.flag_subreg_nr = predicate->bits2.da1.flag_subreg_nr;
 }
 
-void set_direct_dst_operand(struct dst_operand *dst, struct brw_reg *reg,
+void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
 			    int type)
 {
-	memset(dst, 0, sizeof(*dst));
+	*dst = *reg;
 	dst->address_mode = BRW_ADDRESS_DIRECT;
-	dst->reg_file = reg->file;
-	dst->reg_nr = reg->nr;
-	dst->subreg_nr = reg->subnr;
-	dst->reg_type = type;
-	dst->horiz_stride = 1;
-	dst->writemask = BRW_WRITEMASK_XYZW;
+	dst->type = type;
+	dst->hstride = 1;
+	dst->dw1.bits.writemask = BRW_WRITEMASK_XYZW;
 }
 
 void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
-- 
1.7.7.5

* [PATCH 47/90] assembler: Consolidate the swizzling configuration on 8 bits
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (45 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 46/90] assembler: Replace struct dst_operand by struct brw_reg Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 48/90] assembler: Get rid of src operand's swizzle_set Damien Lespiau
                   ` (43 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    2 +-
 assembler/gram.y    |   67 +++++++++++++++++---------------------------------
 2 files changed, 24 insertions(+), 45 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index fe09d52..0048b4a 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -97,7 +97,7 @@ struct src_operand {
 	int indirect_offset; /* XXX */
 
 	int swizzle_set;
-	int swizzle_x, swizzle_y, swizzle_z, swizzle_w;
+	unsigned swizzle: 8;
 
 	uint32_t imm32; /* set if reg_file == BRW_IMMEDIATE_VALUE or it is expressing a branch offset */
 	char *reloc_target; /* bspec: branching instructions JIP and UIP are source operands */
diff --git a/assembler/gram.y b/assembler/gram.y
index 8f2a1f9..a10198b 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -67,10 +67,7 @@ static struct src_operand ip_src =
     .reg_nr = BRW_ARF_IP,
     .reg_type = BRW_REGISTER_TYPE_UD,
     .address_mode = BRW_ADDRESS_DIRECT,
-    .swizzle_x = BRW_CHANNEL_X,
-    .swizzle_y = BRW_CHANNEL_Y,
-    .swizzle_z = BRW_CHANNEL_Z,
-    .swizzle_w = BRW_CHANNEL_W,
+    .swizzle = BRW_SWIZZLE_NOOP,
 };
 
 static int get_type_size(GLuint type);
@@ -1895,10 +1892,7 @@ directsrcoperand:	negate abs symbol_reg region regtype
 		  $$.negate = $1;
 		  $$.abs = $2;
 		  $$.swizzle_set = $6.swizzle_set;
-		  $$.swizzle_x = $6.swizzle_x;
-		  $$.swizzle_y = $6.swizzle_y;
-		  $$.swizzle_z = $6.swizzle_z;
-		  $$.swizzle_w = $6.swizzle_w;
+		  $$.swizzle = $6.swizzle;
 		}
 		| srcarchoperandex
 ;
@@ -1918,10 +1912,7 @@ indirectsrcoperand:
 		  $$.negate = $1;
 		  $$.abs = $2;
 		  $$.swizzle_set = $6.swizzle_set;
-		  $$.swizzle_x = $6.swizzle_x;
-		  $$.swizzle_y = $6.swizzle_y;
-		  $$.swizzle_z = $6.swizzle_z;
-		  $$.swizzle_w = $6.swizzle_w;
+		  $$.swizzle = $6.swizzle;
 		}
 ;
 
@@ -2399,26 +2390,17 @@ srcimmtype:	/* empty */
 swizzle:	/* empty */
 		{
 		  $$.swizzle_set = 0;
-		  $$.swizzle_x = BRW_CHANNEL_X;
-		  $$.swizzle_y = BRW_CHANNEL_Y;
-		  $$.swizzle_z = BRW_CHANNEL_Z;
-		  $$.swizzle_w = BRW_CHANNEL_W;
+		  $$.swizzle = BRW_SWIZZLE_NOOP;
 		}
 		| DOT chansel
 		{
 		  $$.swizzle_set = 1;
-		  $$.swizzle_x = $2;
-		  $$.swizzle_y = $2;
-		  $$.swizzle_z = $2;
-		  $$.swizzle_w = $2;
+		  $$.swizzle = BRW_SWIZZLE4($2, $2, $2, $2);
 		}
 		| DOT chansel chansel chansel chansel
 		{
 		  $$.swizzle_set = 1;
-		  $$.swizzle_x = $2;
-		  $$.swizzle_y = $3;
-		  $$.swizzle_z = $4;
-		  $$.swizzle_w = $5;
+		  $$.swizzle = BRW_SWIZZLE4($2, $3, $4, $5);
 		}
 ;
 
@@ -2904,10 +2886,10 @@ int set_instruction_src0(struct brw_instruction *instr,
 		instr->bits2.da16.src0_vert_stride = src->vert_stride;
 		instr->bits2.da16.src0_negate = src->negate;
 		instr->bits2.da16.src0_abs = src->abs;
-		instr->bits2.da16.src0_swz_x = src->swizzle_x;
-		instr->bits2.da16.src0_swz_y = src->swizzle_y;
-		instr->bits2.da16.src0_swz_z = src->swizzle_z;
-		instr->bits2.da16.src0_swz_w = src->swizzle_w;
+		instr->bits2.da16.src0_swz_x = BRW_GET_SWZ(src->swizzle, 0);
+		instr->bits2.da16.src0_swz_y = BRW_GET_SWZ(src->swizzle, 1);
+		instr->bits2.da16.src0_swz_z = BRW_GET_SWZ(src->swizzle, 2);
+		instr->bits2.da16.src0_swz_w = BRW_GET_SWZ(src->swizzle, 3);
 		instr->bits2.da16.src0_address_mode = src->address_mode;
             }
         } else {
@@ -2926,15 +2908,15 @@ int set_instruction_src0(struct brw_instruction *instr,
 			return 1;
 		}
             } else {
-		instr->bits2.ia16.src0_swz_x = src->swizzle_x;
-		instr->bits2.ia16.src0_swz_y = src->swizzle_y;
+		instr->bits2.ia16.src0_swz_x = BRW_GET_SWZ(src->swizzle, 0);
+		instr->bits2.ia16.src0_swz_y = BRW_GET_SWZ(src->swizzle, 1);
+		instr->bits2.ia16.src0_swz_z = BRW_GET_SWZ(src->swizzle, 2);
+		instr->bits2.ia16.src0_swz_w = BRW_GET_SWZ(src->swizzle, 3);
 		instr->bits2.ia16.src0_indirect_offset = (src->indirect_offset >> 4); /* half register aligned */
 		instr->bits2.ia16.src0_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
 		instr->bits2.ia16.src0_abs = src->abs;
 		instr->bits2.ia16.src0_negate = src->negate;
 		instr->bits2.ia16.src0_address_mode = src->address_mode;
-		instr->bits2.ia16.src0_swz_z = src->swizzle_z;
-		instr->bits2.ia16.src0_swz_w = src->swizzle_w;
 		instr->bits2.ia16.src0_vert_stride = src->vert_stride;
             }
         }
@@ -2982,10 +2964,10 @@ int set_instruction_src1(struct brw_instruction *instr,
 		instr->bits3.da16.src1_vert_stride = src->vert_stride;
 		instr->bits3.da16.src1_negate = src->negate;
 		instr->bits3.da16.src1_abs = src->abs;
-		instr->bits3.da16.src1_swz_x = src->swizzle_x;
-		instr->bits3.da16.src1_swz_y = src->swizzle_y;
-		instr->bits3.da16.src1_swz_z = src->swizzle_z;
-		instr->bits3.da16.src1_swz_w = src->swizzle_w;
+		instr->bits3.da16.src1_swz_x = BRW_GET_SWZ(src->swizzle, 0);
+		instr->bits3.da16.src1_swz_y = BRW_GET_SWZ(src->swizzle, 1);
+		instr->bits3.da16.src1_swz_z = BRW_GET_SWZ(src->swizzle, 2);
+		instr->bits3.da16.src1_swz_w = BRW_GET_SWZ(src->swizzle, 3);
                 instr->bits3.da16.src1_address_mode = src->address_mode;
 		if (src->address_mode != BRW_ADDRESS_DIRECT) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
@@ -3009,15 +2991,15 @@ int set_instruction_src1(struct brw_instruction *instr,
 			return 1;
 		}
             } else {
-		instr->bits3.ia16.src1_swz_x = src->swizzle_x;
-		instr->bits3.ia16.src1_swz_y = src->swizzle_y;
+		instr->bits3.ia16.src1_swz_x = BRW_GET_SWZ(src->swizzle, 0);
+		instr->bits3.ia16.src1_swz_y = BRW_GET_SWZ(src->swizzle, 1);
+		instr->bits3.ia16.src1_swz_z = BRW_GET_SWZ(src->swizzle, 2);
+		instr->bits3.ia16.src1_swz_w = BRW_GET_SWZ(src->swizzle, 3);
 		instr->bits3.ia16.src1_indirect_offset = (src->indirect_offset >> 4); /* half register aligned */
 		instr->bits3.ia16.src1_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
 		instr->bits3.ia16.src1_abs = src->abs;
 		instr->bits3.ia16.src1_negate = src->negate;
 		instr->bits3.ia16.src1_address_mode = src->address_mode;
-		instr->bits3.ia16.src1_swz_z = src->swizzle_z;
-		instr->bits3.ia16.src1_swz_w = src->swizzle_w;
 		instr->bits3.ia16.src1_vert_stride = src->vert_stride;
             }
         }
@@ -3140,8 +3122,5 @@ void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
 	src->negate = 0;
 	src->abs = 0;
 	src->swizzle_set = 0;
-	src->swizzle_x = BRW_CHANNEL_X;
-	src->swizzle_y = BRW_CHANNEL_Y;
-	src->swizzle_z = BRW_CHANNEL_Z;
-	src->swizzle_w = BRW_CHANNEL_W;
+	src->swizzle = BRW_SWIZZLE_NOOP;
 }
-- 
1.7.7.5

* [PATCH 48/90] assembler: Get rid of src operand's swizzle_set
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (46 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 47/90] assembler: Consolidate the swizzling configuration on 8 bits Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 49/90] assembler: Use brw_reg in the source operand Damien Lespiau
                   ` (42 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

swizzle_set can be derived from the value of swizzle itself, so there is no
need for a separate field.
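
As a rough illustration of that derivation, here is a minimal sketch. It
assumes the swizzle is packed as four 2-bit channel selects, which is how
BRW_SWIZZLE4() and BRW_GET_SWZ() are used in this series. The EX_* names
are hypothetical stand-ins, not the Mesa definitions.

```c
#include <assert.h>

/* Hypothetical channel encoding for this sketch: x=0, y=1, z=2, w=3. */
enum { EX_CHAN_X = 0, EX_CHAN_Y = 1, EX_CHAN_Z = 2, EX_CHAN_W = 3 };

/* Pack four 2-bit channel selects into 8 bits (assumed layout). */
#define EX_SWIZZLE4(a, b, c, d) \
	(((a) << 0) | ((b) << 2) | ((c) << 4) | ((d) << 6))

/* Extract the channel select for component idx (0..3). */
#define EX_GET_SWZ(swz, idx)	(((swz) >> ((idx) * 2)) & 0x3)

/* Identity swizzle: .xyzw */
#define EX_SWIZZLE_NOOP \
	EX_SWIZZLE4(EX_CHAN_X, EX_CHAN_Y, EX_CHAN_Z, EX_CHAN_W)

/*
 * Same shape as the test used in set_instruction_src{0,1}(): a swizzle
 * counts as "set" when it is neither zero (operand was memset) nor the
 * identity swizzle.
 */
static int ex_swizzle_is_set(unsigned swizzle)
{
	assert(swizzle <= 0xff);	/* only 8 bits are meaningful */
	return swizzle && swizzle != EX_SWIZZLE_NOOP;
}
```

One thing to keep in mind with this scheme: if X is encoded as 0, as in the
sketch, a replicated-X swizzle also packs to 0 and cannot be told apart from
a zeroed operand.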

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    1 -
 assembler/gram.y    |   14 ++++----------
 2 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 0048b4a..b4ea647 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -96,7 +96,6 @@ struct src_operand {
 	int address_mode; /* 0 if direct, 1 if register-indirect */
 	int indirect_offset; /* XXX */
 
-	int swizzle_set;
 	unsigned swizzle: 8;
 
 	uint32_t imm32; /* set if reg_file == BRW_IMMEDIATE_VALUE or it is expressing a branch offset */
diff --git a/assembler/gram.y b/assembler/gram.y
index a10198b..c1029fa 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1891,7 +1891,6 @@ directsrcoperand:	negate abs symbol_reg region regtype
 		  $$.default_region = $4.is_default;
 		  $$.negate = $1;
 		  $$.abs = $2;
-		  $$.swizzle_set = $6.swizzle_set;
 		  $$.swizzle = $6.swizzle;
 		}
 		| srcarchoperandex
@@ -1911,7 +1910,6 @@ indirectsrcoperand:
 		  $$.horiz_stride = $4.horiz_stride;
 		  $$.negate = $1;
 		  $$.abs = $2;
-		  $$.swizzle_set = $6.swizzle_set;
 		  $$.swizzle = $6.swizzle;
 		}
 ;
@@ -2389,17 +2387,14 @@ srcimmtype:	/* empty */
  */
 swizzle:	/* empty */
 		{
-		  $$.swizzle_set = 0;
 		  $$.swizzle = BRW_SWIZZLE_NOOP;
 		}
 		| DOT chansel
 		{
-		  $$.swizzle_set = 1;
 		  $$.swizzle = BRW_SWIZZLE4($2, $2, $2, $2);
 		}
 		| DOT chansel chansel chansel chansel
 		{
-		  $$.swizzle_set = 1;
 		  $$.swizzle = BRW_SWIZZLE4($2, $3, $4, $5);
 		}
 ;
@@ -2875,7 +2870,7 @@ int set_instruction_src0(struct brw_instruction *instr,
 		instr->bits2.da1.src0_negate = src->negate;
 		instr->bits2.da1.src0_abs = src->abs;
 		instr->bits2.da1.src0_address_mode = src->address_mode;
-		if (src->swizzle_set) {
+		if (src->swizzle && src->swizzle != BRW_SWIZZLE_NOOP) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
 				"instruction\n");
 			return 1;
@@ -2902,7 +2897,7 @@ int set_instruction_src0(struct brw_instruction *instr,
 		instr->bits2.ia1.src0_horiz_stride = src->horiz_stride;
 		instr->bits2.ia1.src0_width = src->width;
 		instr->bits2.ia1.src0_vert_stride = src->vert_stride;
-		if (src->swizzle_set) {
+		if (src->swizzle && src->swizzle != BRW_SWIZZLE_NOOP) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
 				"instruction\n");
 			return 1;
@@ -2953,7 +2948,7 @@ int set_instruction_src1(struct brw_instruction *instr,
 			return 1;
 		}
 		*/
-		if (src->swizzle_set) {
+		if (src->swizzle && src->swizzle != BRW_SWIZZLE_NOOP) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
 				"instruction\n");
 			return 1;
@@ -2985,7 +2980,7 @@ int set_instruction_src1(struct brw_instruction *instr,
 		instr->bits3.ia1.src1_horiz_stride = src->horiz_stride;
 		instr->bits3.ia1.src1_width = src->width;
 		instr->bits3.ia1.src1_vert_stride = src->vert_stride;
-		if (src->swizzle_set) {
+		if (src->swizzle && src->swizzle != BRW_SWIZZLE_NOOP) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
 				"instruction\n");
 			return 1;
@@ -3121,6 +3116,5 @@ void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
 	src->horiz_stride = 0;
 	src->negate = 0;
 	src->abs = 0;
-	src->swizzle_set = 0;
 	src->swizzle = BRW_SWIZZLE_NOOP;
 }
-- 
1.7.7.5

* [PATCH 49/90] assembler: Use brw_reg in the source operand
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (47 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 48/90] assembler: Get rid of src operand's swizzle_set Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 50/90] assembler: Factor out the destination register validation Damien Lespiau
                   ` (41 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

This is the last refactoring step in the transition to struct brw_reg:
struct src_operand now embeds a struct brw_reg instead of duplicating its
fields.
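
To make the shape of the change easier to picture, here is a small sketch
of a source operand wrapping a register description. The member names
(file, nr, subnr, dw1.bits.swizzle, ...) are the ones this patch accesses.
The struct itself, its bit widths and the EX_* names are simplified
assumptions for illustration, not the real struct brw_reg.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct brw_reg (layout and widths assumed). */
struct ex_reg {
	unsigned type:4;
	unsigned file:2;
	unsigned nr:8;
	unsigned subnr:5;
	unsigned negate:1;
	unsigned abs:1;
	unsigned address_mode:1;
	union {
		struct {
			unsigned swizzle:8;
			int indirect_offset:10;
		} bits;
		uint32_t ud;
	} dw1;
};

/* The operand keeps only what the register description cannot hold. */
struct ex_src_operand {
	struct ex_reg reg;
	int default_region;
	uint32_t imm32;
};

/* Shorthand accessor, in the spirit of the SWIZZLE() macro in gram.y. */
#define EX_SWIZZLE(r)	((r).dw1.bits.swizzle)

/* Hypothetical register file values for the sketch. */
enum { EX_FILE_ARF = 0, EX_FILE_GRF = 1 };

/* Build a direct GRF source; members not named below start out as zero. */
static struct ex_src_operand ex_make_grf_src(unsigned nr, unsigned subnr,
					     unsigned swizzle)
{
	struct ex_src_operand src = {
		.reg.file = EX_FILE_GRF,
		.reg.nr = nr,
		.reg.subnr = subnr,
		.reg.dw1.bits.swizzle = swizzle,
	};
	return src;
}
```

Nested designated initializers such as .reg.file are what let the static
operands (src_null_reg, ip_src) keep their compact form after the switch.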

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |   14 +--
 assembler/gram.y    |  532 ++++++++++++++++++++++++++-------------------------
 2 files changed, 269 insertions(+), 277 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index b4ea647..49c6ea0 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -86,19 +86,9 @@ struct regtype {
  * parser.
  */
 struct src_operand {
-	int reg_file, reg_nr, subreg_nr, reg_type;
-
-	int abs, negate;
-
-	int horiz_stride, width, vert_stride;
+	struct brw_reg reg;
 	int default_region;
-
-	int address_mode; /* 0 if direct, 1 if register-indirect */
-	int indirect_offset; /* XXX */
-
-	unsigned swizzle: 8;
-
-	uint32_t imm32; /* set if reg_file == BRW_IMMEDIATE_VALUE or it is expressing a branch offset */
+	uint32_t imm32; /* set if reg.file == BRW_IMMEDIATE_VALUE or it is expressing a branch offset */
 	char *reloc_target; /* bspec: branching instructions JIP and UIP are source operands */
 } src_operand;
 
diff --git a/assembler/gram.y b/assembler/gram.y
index c1029fa..28b722e 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -37,15 +37,17 @@
 #define DEFAULT_EXECSIZE (ffs(program_defaults.execute_size) - 1)
 #define DEFAULT_DSTREGION -1
 
+#define SWIZZLE(reg) (reg.dw1.bits.swizzle)
+
 extern long int gen_level;
 extern int advanced_flag;
 extern int yylineno;
 extern int need_export;
 static struct src_operand src_null_reg =
 {
-    .reg_file = BRW_ARCHITECTURE_REGISTER_FILE,
-    .reg_nr = BRW_ARF_NULL,
-    .reg_type = BRW_REGISTER_TYPE_UD,
+    .reg.file = BRW_ARCHITECTURE_REGISTER_FILE,
+    .reg.nr = BRW_ARF_NULL,
+    .reg.type = BRW_REGISTER_TYPE_UD,
 };
 static struct brw_reg dst_null_reg =
 {
@@ -63,11 +65,11 @@ static struct brw_reg ip_dst =
 };
 static struct src_operand ip_src =
 {
-    .reg_file = BRW_ARCHITECTURE_REGISTER_FILE,
-    .reg_nr = BRW_ARF_IP,
-    .reg_type = BRW_REGISTER_TYPE_UD,
-    .address_mode = BRW_ADDRESS_DIRECT,
-    .swizzle = BRW_SWIZZLE_NOOP,
+    .reg.file = BRW_ARCHITECTURE_REGISTER_FILE,
+    .reg.nr = BRW_ARF_IP,
+    .reg.type = BRW_REGISTER_TYPE_UD,
+    .reg.address_mode = BRW_ADDRESS_DIRECT,
+    .reg.dw1.bits.swizzle = BRW_SWIZZLE_NOOP,
 };
 
 static int get_type_size(GLuint type);
@@ -680,11 +682,11 @@ subroutineinstruction:
 
 		  struct src_operand src0;
 		  memset(&src0, 0, sizeof(src0));
-		  src0.reg_type = BRW_REGISTER_TYPE_D; /* source type should be DWORD */
+		  src0.reg.type = BRW_REGISTER_TYPE_D; /* source type should be DWORD */
 		  /* source0 region control must be <2,2,1>. */
-		  src0.horiz_stride = 1; /*encoded 1*/
-		  src0.width = 1; /*encoded 2*/
-		  src0.vert_stride = 2; /*encoded 2*/
+		  src0.reg.hstride = 1; /*encoded 1*/
+		  src0.reg.width = 1; /*encoded 2*/
+		  src0.reg.vstride = 2; /*encoded 2*/
 		  set_instruction_src0(&$$.gen, &src0);
 
 		  $$.first_reloc_target = $5.reloc_target;
@@ -703,10 +705,10 @@ subroutineinstruction:
 		  $$.gen.header.opcode = $2;
 		  $$.gen.header.execution_size = 1; /* execution size of RET should be 2 */
 		  set_instruction_dest(&$$.gen, &dst_null_reg);
-		  $5.reg_type = BRW_REGISTER_TYPE_D;
-		  $5.horiz_stride = 1; /*encoded 1*/
-		  $5.width = 1; /*encoded 2*/
-		  $5.vert_stride = 2; /*encoded 2*/
+		  $5.reg.type = BRW_REGISTER_TYPE_D;
+		  $5.reg.hstride = 1; /*encoded 1*/
+		  $5.reg.width = 1; /*encoded 2*/
+		  $5.reg.vstride = 2; /*encoded 2*/
 		  set_instruction_src0(&$$.gen, &$5);
 		}
 ;
@@ -887,16 +889,16 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       struct src_operand src0;
 
                       memset(&src0, 0, sizeof(src0));
-                      src0.address_mode = BRW_ADDRESS_DIRECT;
+                      src0.reg.address_mode = BRW_ADDRESS_DIRECT;
 
                       if (IS_GENp(7))
-                          src0.reg_file = BRW_GENERAL_REGISTER_FILE;
+                          src0.reg.file = BRW_GENERAL_REGISTER_FILE;
                       else
-                          src0.reg_file = BRW_MESSAGE_REGISTER_FILE;
+                          src0.reg.file = BRW_MESSAGE_REGISTER_FILE;
 
-                      src0.reg_type = BRW_REGISTER_TYPE_D;
-                      src0.reg_nr = $4;
-                      src0.subreg_nr = 0;
+                      src0.reg.type = BRW_REGISTER_TYPE_D;
+                      src0.reg.nr = $4;
+                      src0.reg.subnr = 0;
                       set_instruction_src0(&$$, &src0);
 		  } else {
                       if (set_instruction_src0(&$$, &$6) != 0)
@@ -948,10 +950,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  }
 		| predicate SEND execsize dst sendleadreg payload imm32reg instoptions
                 {
-		  if ($7.reg_type != BRW_REGISTER_TYPE_UD &&
-		  	  $7.reg_type != BRW_REGISTER_TYPE_D &&
-		  	  $7.reg_type != BRW_REGISTER_TYPE_V) {
-		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.imm32, $7.reg_type);
+		  if ($7.reg.type != BRW_REGISTER_TYPE_UD &&
+		      $7.reg.type != BRW_REGISTER_TYPE_D &&
+		      $7.reg.type != BRW_REGISTER_TYPE_V) {
+		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.imm32, $7.reg.type);
 			YYERROR;
 		  }
 		  memset(&$$, 0, sizeof($$));
@@ -965,7 +967,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  if (set_instruction_src0(&$$, &$6) != 0)
 		    YYERROR;
 		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.bits1.da1.src1_reg_type = $7.reg_type;
+		  $$.bits1.da1.src1_reg_type = $7.reg.type;
 		  $$.bits3.ud = $7.imm32;
                 }
 		| predicate SEND execsize dst sendleadreg sndopr imm32reg instoptions
@@ -977,10 +979,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       YYERROR;
 		  }
 
-		  if ($7.reg_type != BRW_REGISTER_TYPE_UD &&
-                      $7.reg_type != BRW_REGISTER_TYPE_D &&
-                      $7.reg_type != BRW_REGISTER_TYPE_V) {
-                      fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.imm32, $7.reg_type);
+		  if ($7.reg.type != BRW_REGISTER_TYPE_UD &&
+                      $7.reg.type != BRW_REGISTER_TYPE_D &&
+                      $7.reg.type != BRW_REGISTER_TYPE_V) {
+                      fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.imm32, $7.reg.type);
                       YYERROR;
 		  }
 
@@ -994,22 +996,22 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       YYERROR;
 
                   memset(&src0, 0, sizeof(src0));
-                  src0.address_mode = BRW_ADDRESS_DIRECT;
+                  src0.reg.address_mode = BRW_ADDRESS_DIRECT;
 
                   if (IS_GENp(7)) {
-                      src0.reg_file = BRW_GENERAL_REGISTER_FILE;
-                      src0.reg_type = BRW_REGISTER_TYPE_UB;
+                      src0.reg.file = BRW_GENERAL_REGISTER_FILE;
+                      src0.reg.type = BRW_REGISTER_TYPE_UB;
                   } else {
-                      src0.reg_file = BRW_MESSAGE_REGISTER_FILE;
-                      src0.reg_type = BRW_REGISTER_TYPE_D;
+                      src0.reg.file = BRW_MESSAGE_REGISTER_FILE;
+                      src0.reg.type = BRW_REGISTER_TYPE_D;
                   }
 
-                  src0.reg_nr = $5.nr;
-                  src0.subreg_nr = 0;
+                  src0.reg.nr = $5.nr;
+                  src0.reg.subnr = 0;
                   set_instruction_src0(&$$, &src0);
 
 		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.bits1.da1.src1_reg_type = $7.reg_type;
+		  $$.bits1.da1.src1_reg_type = $7.reg.type;
                   $$.bits3.ud = $7.imm32;
                   $$.bits3.generic_gen5.end_of_thread = !!($6 & EX_DESC_EOT_MASK);
 		}
@@ -1022,10 +1024,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       YYERROR;
 		  }
 
-                  if ($7.reg_file != BRW_ARCHITECTURE_REGISTER_FILE ||
-                      ($7.reg_nr & 0xF0) != BRW_ARF_ADDRESS ||
-                      ($7.reg_nr & 0x0F) != 0 ||
-                      $7.subreg_nr != 0) {
+                  if ($7.reg.file != BRW_ARCHITECTURE_REGISTER_FILE ||
+                      ($7.reg.nr & 0xF0) != BRW_ARF_ADDRESS ||
+                      ($7.reg.nr & 0x0F) != 0 ||
+                      $7.reg.subnr != 0) {
                       fprintf (stderr, "%d: scalar register must be a0.0<0;1,0>:ud\n", yylineno);
                       YYERROR;
 		  }
@@ -1040,18 +1042,18 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       YYERROR;
 
                   memset(&src0, 0, sizeof(src0));
-                  src0.address_mode = BRW_ADDRESS_DIRECT;
+                  src0.reg.address_mode = BRW_ADDRESS_DIRECT;
 
                   if (IS_GENp(7)) {
-                      src0.reg_file = BRW_GENERAL_REGISTER_FILE;
-                      src0.reg_type = BRW_REGISTER_TYPE_UB;
+                      src0.reg.file = BRW_GENERAL_REGISTER_FILE;
+                      src0.reg.type = BRW_REGISTER_TYPE_UB;
                   } else {
-                      src0.reg_file = BRW_MESSAGE_REGISTER_FILE;
-                      src0.reg_type = BRW_REGISTER_TYPE_D;
+                      src0.reg.file = BRW_MESSAGE_REGISTER_FILE;
+                      src0.reg.type = BRW_REGISTER_TYPE_D;
                   }
 
-                  src0.reg_nr = $5.nr;
-                  src0.subreg_nr = 0;
+                  src0.reg.nr = $5.nr;
+                  src0.reg.subnr = 0;
                   set_instruction_src0(&$$, &src0);
 
                   set_instruction_src1(&$$, &$7);
@@ -1059,10 +1061,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		}
 		| predicate SEND execsize dst sendleadreg payload sndopr imm32reg instoptions
 		{
-		  if ($8.reg_type != BRW_REGISTER_TYPE_UD &&
-		  	  $8.reg_type != BRW_REGISTER_TYPE_D &&
-		  	  $8.reg_type != BRW_REGISTER_TYPE_V) {
-		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $8.imm32, $8.reg_type);
+		  if ($8.reg.type != BRW_REGISTER_TYPE_UD &&
+		      $8.reg.type != BRW_REGISTER_TYPE_D &&
+		      $8.reg.type != BRW_REGISTER_TYPE_V) {
+		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $8.imm32, $8.reg.type);
 			YYERROR;
 		  }
 		  memset(&$$, 0, sizeof($$));
@@ -1076,7 +1078,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  if (set_instruction_src0(&$$, &$6) != 0)
 		    YYERROR;
 		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.bits1.da1.src1_reg_type = $8.reg_type;
+		  $$.bits1.da1.src1_reg_type = $8.reg.type;
 		  if (IS_GENx(5)) {
 		      $$.bits2.send_gen5.sfid = ($7 & EX_DESC_SFID_MASK);
 		      $$.bits3.ud = $8.imm32;
@@ -1756,8 +1758,8 @@ imm32reg:	imm32 srcimmtype
 		    YYERROR;
 		  }
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.reg_type = $2;
+		  $$.reg.file = BRW_IMMEDIATE_VALUE;
+		  $$.reg.type = $2;
 		  $$.imm32 = d;
 		}
 ;
@@ -1766,9 +1768,9 @@ directsrcaccoperand:	directsrcoperand
 		| accreg region regtype
 		{
 		  set_direct_src_operand(&$$, &$1, $3.type);
-		  $$.vert_stride = $2.vert_stride;
-		  $$.width = $2.width;
-		  $$.horiz_stride = $2.horiz_stride;
+		  $$.reg.vstride = $2.vert_stride;
+		  $$.reg.width = $2.width;
+		  $$.reg.hstride = $2.horiz_stride;
 		  $$.default_region = $2.is_default;
 		}
 ;
@@ -1777,16 +1779,16 @@ directsrcaccoperand:	directsrcoperand
 srcarchoperandex: srcarchoperandex_typed region regtype
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = $1.file;
-		  $$.reg_type = $3.type;
-		  $$.subreg_nr = $1.subnr;
-		  $$.reg_nr = $1.nr;
-		  $$.vert_stride = $2.vert_stride;
-		  $$.width = $2.width;
-		  $$.horiz_stride = $2.horiz_stride;
+		  $$.reg.file = $1.file;
+		  $$.reg.type = $3.type;
+		  $$.reg.subnr = $1.subnr;
+		  $$.reg.nr = $1.nr;
+		  $$.reg.vstride = $2.vert_stride;
+		  $$.reg.width = $2.width;
+		  $$.reg.hstride = $2.horiz_stride;
 		  $$.default_region = $2.is_default;
-		  $$.negate = 0;
-		  $$.abs = 0;
+		  $$.reg.negate = 0;
+		  $$.reg.abs = 0;
 		}
 		| maskstackreg
 		{
@@ -1838,26 +1840,26 @@ src:		directsrcoperand | indirectsrcoperand
 directsrcoperand:	negate abs symbol_reg region regtype
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $3.reg.file;
-		  $$.reg_nr = $3.reg.nr;
-		  $$.subreg_nr = $3.reg.subnr;
+		  $$.reg.address_mode = BRW_ADDRESS_DIRECT;
+		  $$.reg.file = $3.reg.file;
+		  $$.reg.nr = $3.reg.nr;
+		  $$.reg.subnr = $3.reg.subnr;
 		  if ($5.is_default) {
-		    $$.reg_type = $3.type;
+		    $$.reg.type = $3.type;
 		  } else {
-		    $$.reg_type = $5.type;
+		    $$.reg.type = $5.type;
 		  }
 		  if ($4.is_default) {
-		    $$.vert_stride = $3.src_region.vert_stride;
-		    $$.width = $3.src_region.width;
-		    $$.horiz_stride = $3.src_region.horiz_stride;
+		    $$.reg.vstride = $3.src_region.vert_stride;
+		    $$.reg.width = $3.src_region.width;
+		    $$.reg.hstride = $3.src_region.horiz_stride;
 		  } else {
-		    $$.vert_stride = $4.vert_stride;
-		    $$.width = $4.width;
-		    $$.horiz_stride = $4.horiz_stride;
+		    $$.reg.vstride = $4.vert_stride;
+		    $$.reg.width = $4.width;
+		    $$.reg.hstride = $4.horiz_stride;
 		  }
-		  $$.negate = $1;
-		  $$.abs = $2;
+		  $$.reg.negate = $1;
+		  $$.reg.abs = $2;
 		} 
 		| statereg region regtype 
 		{
@@ -1867,31 +1869,31 @@ directsrcoperand:	negate abs symbol_reg region regtype
 		  }
 		  else{
 		    memset (&$$, '\0', sizeof ($$));
-		    $$.address_mode = BRW_ADDRESS_DIRECT;
-		    $$.reg_file = $1.file;
-		    $$.reg_nr = $1.nr;
-		    $$.subreg_nr = $1.subnr;
-		    $$.vert_stride = $2.vert_stride;
-		    $$.width = $2.width;
-		    $$.horiz_stride = $2.horiz_stride;
-		    $$.reg_type = $3.type;
+		    $$.reg.address_mode = BRW_ADDRESS_DIRECT;
+		    $$.reg.file = $1.file;
+		    $$.reg.nr = $1.nr;
+		    $$.reg.subnr = $1.subnr;
+		    $$.reg.vstride = $2.vert_stride;
+		    $$.reg.width = $2.width;
+		    $$.reg.hstride = $2.horiz_stride;
+		    $$.reg.type = $3.type;
 		  }
 		}
 		| negate abs directgenreg region regtype swizzle
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $3.file;
-		  $$.reg_nr = $3.nr;
-		  $$.subreg_nr = $3.subnr;
-		  $$.reg_type = $5.type;
-		  $$.vert_stride = $4.vert_stride;
-		  $$.width = $4.width;
-		  $$.horiz_stride = $4.horiz_stride;
+		  $$.reg.address_mode = BRW_ADDRESS_DIRECT;
+		  $$.reg.file = $3.file;
+		  $$.reg.nr = $3.nr;
+		  $$.reg.subnr = $3.subnr;
+		  $$.reg.type = $5.type;
+		  $$.reg.vstride = $4.vert_stride;
+		  $$.reg.width = $4.width;
+		  $$.reg.hstride = $4.horiz_stride;
 		  $$.default_region = $4.is_default;
-		  $$.negate = $1;
-		  $$.abs = $2;
-		  $$.swizzle = $6.swizzle;
+		  $$.reg.negate = $1;
+		  $$.reg.abs = $2;
+		  $$.reg.dw1.bits.swizzle = $6.reg.dw1.bits.swizzle;
 		}
 		| srcarchoperandex
 ;
@@ -1900,17 +1902,17 @@ indirectsrcoperand:
 		negate abs indirectgenreg indirectregion regtype swizzle
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
-		  $$.reg_file = $3.file;
-		  $$.subreg_nr = $3.subnr;
-		  $$.indirect_offset = $3.dw1.bits.indirect_offset;
-		  $$.reg_type = $5.type;
-		  $$.vert_stride = $4.vert_stride;
-		  $$.width = $4.width;
-		  $$.horiz_stride = $4.horiz_stride;
-		  $$.negate = $1;
-		  $$.abs = $2;
-		  $$.swizzle = $6.swizzle;
+		  $$.reg.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
+		  $$.reg.file = $3.file;
+		  $$.reg.subnr = $3.subnr;
+		  $$.reg.dw1.bits.indirect_offset = $3.dw1.bits.indirect_offset;
+		  $$.reg.type = $5.type;
+		  $$.reg.vstride = $4.vert_stride;
+		  $$.reg.width = $4.width;
+		  $$.reg.hstride = $4.horiz_stride;
+		  $$.reg.negate = $1;
+		  $$.reg.abs = $2;
+		  $$.reg.dw1.bits.swizzle = $6.reg.dw1.bits.swizzle;
 		}
 ;
 
@@ -2223,15 +2225,15 @@ relativelocation:
 		  }
 
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.reg_type = BRW_REGISTER_TYPE_D;
+		  $$.reg.file = BRW_IMMEDIATE_VALUE;
+		  $$.reg.type = BRW_REGISTER_TYPE_D;
 		  $$.imm32 = $1 & 0x0000ffff;
 		}
 		| STRING
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.reg_type = BRW_REGISTER_TYPE_D;
+		  $$.reg.file = BRW_IMMEDIATE_VALUE;
+		  $$.reg.type = BRW_REGISTER_TYPE_D;
 		  $$.reloc_target = $1;
 		}
 ;
@@ -2240,48 +2242,48 @@ relativelocation2:
 		  STRING
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.reg_type = BRW_REGISTER_TYPE_D;
+		  $$.reg.file = BRW_IMMEDIATE_VALUE;
+		  $$.reg.type = BRW_REGISTER_TYPE_D;
 		  $$.reloc_target = $1;
 		}
 		| exp
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.reg_type = BRW_REGISTER_TYPE_D;
+		  $$.reg.file = BRW_IMMEDIATE_VALUE;
+		  $$.reg.type = BRW_REGISTER_TYPE_D;
 		  $$.imm32 = $1;
 		}
 		| directgenreg region regtype
 		{
 		  set_direct_src_operand(&$$, &$1, $3.type);
-		  $$.vert_stride = $2.vert_stride;
-		  $$.width = $2.width;
-		  $$.horiz_stride = $2.horiz_stride;
+		  $$.reg.vstride = $2.vert_stride;
+		  $$.reg.width = $2.width;
+		  $$.reg.hstride = $2.horiz_stride;
 		  $$.default_region = $2.is_default;
 		}
 		| symbol_reg_p
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.address_mode = BRW_ADDRESS_DIRECT;
-		  $$.reg_file = $1.reg.file;
-		  $$.reg_nr = $1.reg.nr;
-		  $$.subreg_nr = $1.reg.subnr;
-		  $$.reg_type = $1.type;
-		  $$.vert_stride = $1.src_region.vert_stride;
-		  $$.width = $1.src_region.width;
-		  $$.horiz_stride = $1.src_region.horiz_stride;
+		  $$.reg.address_mode = BRW_ADDRESS_DIRECT;
+		  $$.reg.file = $1.reg.file;
+		  $$.reg.nr = $1.reg.nr;
+		  $$.reg.subnr = $1.reg.subnr;
+		  $$.reg.type = $1.type;
+		  $$.reg.vstride = $1.src_region.vert_stride;
+		  $$.reg.width = $1.src_region.width;
+		  $$.reg.hstride = $1.src_region.horiz_stride;
 		}
 		| indirectgenreg indirectregion regtype
 		{
 		  memset (&$$, '\0', sizeof ($$));
-		  $$.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
-		  $$.reg_file = $1.file;
-		  $$.subreg_nr = $1.subnr;
-		  $$.indirect_offset = $1.dw1.bits.indirect_offset;
-		  $$.reg_type = $3.type;
-		  $$.vert_stride = $2.vert_stride;
-		  $$.width = $2.width;
-		  $$.horiz_stride = $2.horiz_stride;
+		  $$.reg.address_mode = BRW_ADDRESS_REGISTER_INDIRECT_REGISTER;
+		  $$.reg.file = $1.file;
+		  $$.reg.subnr = $1.subnr;
+		  $$.reg.dw1.bits.indirect_offset = $1.dw1.bits.indirect_offset;
+		  $$.reg.type = $3.type;
+		  $$.reg.vstride = $2.vert_stride;
+		  $$.reg.width = $2.width;
+		  $$.reg.hstride = $2.horiz_stride;
 		}
 ;
 
@@ -2387,15 +2389,15 @@ srcimmtype:	/* empty */
  */
 swizzle:	/* empty */
 		{
-		  $$.swizzle = BRW_SWIZZLE_NOOP;
+		  $$.reg.dw1.bits.swizzle = BRW_SWIZZLE_NOOP;
 		}
 		| DOT chansel
 		{
-		  $$.swizzle = BRW_SWIZZLE4($2, $2, $2, $2);
+		  $$.reg.dw1.bits.swizzle = BRW_SWIZZLE4($2, $2, $2, $2);
 		}
 		| DOT chansel chansel chansel chansel
 		{
-		  $$.swizzle = BRW_SWIZZLE4($2, $3, $4, $5);
+		  $$.reg.dw1.bits.swizzle = BRW_SWIZZLE4($2, $3, $4, $5);
 		}
 ;
 
@@ -2732,13 +2734,13 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
     if (!src->default_region)
         return;
 
-    if (src->reg_file == BRW_ARCHITECTURE_REGISTER_FILE && 
-        ((src->reg_nr & 0xF0) == BRW_ARF_ADDRESS)) {
-        src->vert_stride = ffs(0);
-        src->width = ffs(1) - 1;
-        src->horiz_stride = ffs(0);
-    } else if (src->reg_file == BRW_ARCHITECTURE_REGISTER_FILE &&
-               ((src->reg_nr & 0xF0) == BRW_ARF_ACCUMULATOR)) {
+    if (src->reg.file == BRW_ARCHITECTURE_REGISTER_FILE && 
+        ((src->reg.nr & 0xF0) == BRW_ARF_ADDRESS)) {
+        src->reg.vstride = ffs(0);
+        src->reg.width = ffs(1) - 1;
+        src->reg.hstride = ffs(0);
+    } else if (src->reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
+               ((src->reg.nr & 0xF0) == BRW_ARF_ACCUMULATOR)) {
         int horiz_stride = 1, width, vert_stride;
         if (instr->header.compression_control == BRW_COMPRESSION_COMPRESSED) {
             width = 16;
@@ -2750,15 +2752,15 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
             width = (1 << instr->header.execution_size);
 
         vert_stride = horiz_stride * width;
-        src->vert_stride = ffs(vert_stride);
-        src->width = ffs(width) - 1;
-        src->horiz_stride = ffs(horiz_stride);
-    } else if ((src->reg_file == BRW_ARCHITECTURE_REGISTER_FILE) &&
-               (src->reg_nr == BRW_ARF_NULL) &&
+        src->reg.vstride = ffs(vert_stride);
+        src->reg.width = ffs(width) - 1;
+        src->reg.hstride = ffs(horiz_stride);
+    } else if ((src->reg.file == BRW_ARCHITECTURE_REGISTER_FILE) &&
+               (src->reg.nr == BRW_ARF_NULL) &&
                (instr->header.opcode == BRW_OPCODE_SEND)) {
-        src->vert_stride = ffs(8);
-        src->width = ffs(8) - 1;
-        src->horiz_stride = ffs(1);
+        src->reg.vstride = ffs(8);
+        src->reg.width = ffs(8) - 1;
+        src->reg.hstride = ffs(1);
     } else {
 
         int horiz_stride = 1, width, vert_stride;
@@ -2781,7 +2783,7 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
                 width = (1 << instr->header.execution_size) / horiz_stride;
                 vert_stride = horiz_stride * width;
 
-                if (get_type_size(src->reg_type) * (width + src->subreg_nr) > 32) {
+                if (get_type_size(src->reg.type) * (width + src->reg.subnr) > 32) {
                     horiz_stride = 0;
                     width = 1;
                     vert_stride = 0;
@@ -2789,9 +2791,9 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
             }
         }
 
-        src->vert_stride = ffs(vert_stride);
-        src->width = ffs(width) - 1;
-        src->horiz_stride = ffs(horiz_stride);
+        src->reg.vstride = ffs(vert_stride);
+        src->reg.width = ffs(width) - 1;
+        src->reg.hstride = ffs(horiz_stride);
     }
 }
 
@@ -2856,63 +2858,63 @@ int set_instruction_src0(struct brw_instruction *instr,
 	if (advanced_flag) {
 		reset_instruction_src_region(instr, src);
 	}
-	instr->bits1.da1.src0_reg_file = src->reg_file;
-	instr->bits1.da1.src0_reg_type = src->reg_type;
-	if (src->reg_file == BRW_IMMEDIATE_VALUE) {
+	instr->bits1.da1.src0_reg_file = src->reg.file;
+	instr->bits1.da1.src0_reg_type = src->reg.type;
+	if (src->reg.file == BRW_IMMEDIATE_VALUE) {
 		instr->bits3.ud = src->imm32;
-	} else if (src->address_mode == BRW_ADDRESS_DIRECT) {
+	} else if (src->reg.address_mode == BRW_ADDRESS_DIRECT) {
             if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits2.da1.src0_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode);
-		instr->bits2.da1.src0_reg_nr = src->reg_nr;
-		instr->bits2.da1.src0_vert_stride = src->vert_stride;
-		instr->bits2.da1.src0_width = src->width;
-		instr->bits2.da1.src0_horiz_stride = src->horiz_stride;
-		instr->bits2.da1.src0_negate = src->negate;
-		instr->bits2.da1.src0_abs = src->abs;
-		instr->bits2.da1.src0_address_mode = src->address_mode;
-		if (src->swizzle && src->swizzle != BRW_SWIZZLE_NOOP) {
+		instr->bits2.da1.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
+		instr->bits2.da1.src0_reg_nr = src->reg.nr;
+		instr->bits2.da1.src0_vert_stride = src->reg.vstride;
+		instr->bits2.da1.src0_width = src->reg.width;
+		instr->bits2.da1.src0_horiz_stride = src->reg.hstride;
+		instr->bits2.da1.src0_negate = src->reg.negate;
+		instr->bits2.da1.src0_abs = src->reg.abs;
+		instr->bits2.da1.src0_address_mode = src->reg.address_mode;
+		if (SWIZZLE(src->reg) && SWIZZLE(src->reg) != BRW_SWIZZLE_NOOP) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
 				"instruction\n");
 			return 1;
 		}
             } else {
-		instr->bits2.da16.src0_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode);
-		instr->bits2.da16.src0_reg_nr = src->reg_nr;
-		instr->bits2.da16.src0_vert_stride = src->vert_stride;
-		instr->bits2.da16.src0_negate = src->negate;
-		instr->bits2.da16.src0_abs = src->abs;
-		instr->bits2.da16.src0_swz_x = BRW_GET_SWZ(src->swizzle, 0);
-		instr->bits2.da16.src0_swz_y = BRW_GET_SWZ(src->swizzle, 1);
-		instr->bits2.da16.src0_swz_z = BRW_GET_SWZ(src->swizzle, 2);
-		instr->bits2.da16.src0_swz_w = BRW_GET_SWZ(src->swizzle, 3);
-		instr->bits2.da16.src0_address_mode = src->address_mode;
+		instr->bits2.da16.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
+		instr->bits2.da16.src0_reg_nr = src->reg.nr;
+		instr->bits2.da16.src0_vert_stride = src->reg.vstride;
+		instr->bits2.da16.src0_negate = src->reg.negate;
+		instr->bits2.da16.src0_abs = src->reg.abs;
+		instr->bits2.da16.src0_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
+		instr->bits2.da16.src0_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
+		instr->bits2.da16.src0_swz_z = BRW_GET_SWZ(SWIZZLE(src->reg), 2);
+		instr->bits2.da16.src0_swz_w = BRW_GET_SWZ(SWIZZLE(src->reg), 3);
+		instr->bits2.da16.src0_address_mode = src->reg.address_mode;
             }
         } else {
             if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits2.ia1.src0_indirect_offset = src->indirect_offset;
-		instr->bits2.ia1.src0_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
-		instr->bits2.ia1.src0_abs = src->abs;
-		instr->bits2.ia1.src0_negate = src->negate;
-		instr->bits2.ia1.src0_address_mode = src->address_mode;
-		instr->bits2.ia1.src0_horiz_stride = src->horiz_stride;
-		instr->bits2.ia1.src0_width = src->width;
-		instr->bits2.ia1.src0_vert_stride = src->vert_stride;
-		if (src->swizzle && src->swizzle != BRW_SWIZZLE_NOOP) {
+		instr->bits2.ia1.src0_indirect_offset = src->reg.dw1.bits.indirect_offset;
+		instr->bits2.ia1.src0_subreg_nr = get_indirect_subreg_address(src->reg.subnr);
+		instr->bits2.ia1.src0_abs = src->reg.abs;
+		instr->bits2.ia1.src0_negate = src->reg.negate;
+		instr->bits2.ia1.src0_address_mode = src->reg.address_mode;
+		instr->bits2.ia1.src0_horiz_stride = src->reg.hstride;
+		instr->bits2.ia1.src0_width = src->reg.width;
+		instr->bits2.ia1.src0_vert_stride = src->reg.vstride;
+		if (SWIZZLE(src->reg) && SWIZZLE(src->reg) != BRW_SWIZZLE_NOOP) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
 				"instruction\n");
 			return 1;
 		}
             } else {
-		instr->bits2.ia16.src0_swz_x = BRW_GET_SWZ(src->swizzle, 0);
-		instr->bits2.ia16.src0_swz_y = BRW_GET_SWZ(src->swizzle, 1);
-		instr->bits2.ia16.src0_swz_z = BRW_GET_SWZ(src->swizzle, 2);
-		instr->bits2.ia16.src0_swz_w = BRW_GET_SWZ(src->swizzle, 3);
-		instr->bits2.ia16.src0_indirect_offset = (src->indirect_offset >> 4); /* half register aligned */
-		instr->bits2.ia16.src0_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
-		instr->bits2.ia16.src0_abs = src->abs;
-		instr->bits2.ia16.src0_negate = src->negate;
-		instr->bits2.ia16.src0_address_mode = src->address_mode;
-		instr->bits2.ia16.src0_vert_stride = src->vert_stride;
+		instr->bits2.ia16.src0_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
+		instr->bits2.ia16.src0_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
+		instr->bits2.ia16.src0_swz_z = BRW_GET_SWZ(SWIZZLE(src->reg), 2);
+		instr->bits2.ia16.src0_swz_w = BRW_GET_SWZ(SWIZZLE(src->reg), 3);
+		instr->bits2.ia16.src0_indirect_offset = (src->reg.dw1.bits.indirect_offset >> 4); /* half register aligned */
+		instr->bits2.ia16.src0_subreg_nr = get_indirect_subreg_address(src->reg.subnr);
+		instr->bits2.ia16.src0_abs = src->reg.abs;
+		instr->bits2.ia16.src0_negate = src->reg.negate;
+		instr->bits2.ia16.src0_address_mode = src->reg.address_mode;
+		instr->bits2.ia16.src0_vert_stride = src->reg.vstride;
             }
         }
 
@@ -2927,20 +2929,20 @@ int set_instruction_src1(struct brw_instruction *instr,
 	if (advanced_flag) {
 		reset_instruction_src_region(instr, src);
 	}
-	instr->bits1.da1.src1_reg_file = src->reg_file;
-	instr->bits1.da1.src1_reg_type = src->reg_type;
-	if (src->reg_file == BRW_IMMEDIATE_VALUE) {
+	instr->bits1.da1.src1_reg_file = src->reg.file;
+	instr->bits1.da1.src1_reg_type = src->reg.type;
+	if (src->reg.file == BRW_IMMEDIATE_VALUE) {
 		instr->bits3.ud = src->imm32;
-	} else if (src->address_mode == BRW_ADDRESS_DIRECT) {
+	} else if (src->reg.address_mode == BRW_ADDRESS_DIRECT) {
             if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits3.da1.src1_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode);
-		instr->bits3.da1.src1_reg_nr = src->reg_nr;
-		instr->bits3.da1.src1_vert_stride = src->vert_stride;
-		instr->bits3.da1.src1_width = src->width;
-		instr->bits3.da1.src1_horiz_stride = src->horiz_stride;
-		instr->bits3.da1.src1_negate = src->negate;
-		instr->bits3.da1.src1_abs = src->abs;
-                instr->bits3.da1.src1_address_mode = src->address_mode;
+		instr->bits3.da1.src1_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
+		instr->bits3.da1.src1_reg_nr = src->reg.nr;
+		instr->bits3.da1.src1_vert_stride = src->reg.vstride;
+		instr->bits3.da1.src1_width = src->reg.width;
+		instr->bits3.da1.src1_horiz_stride = src->reg.hstride;
+		instr->bits3.da1.src1_negate = src->reg.negate;
+		instr->bits3.da1.src1_abs = src->reg.abs;
+                instr->bits3.da1.src1_address_mode = src->reg.address_mode;
 		/* XXX why?
 		if (src->address_mode != BRW_ADDRESS_DIRECT) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
@@ -2948,23 +2950,23 @@ int set_instruction_src1(struct brw_instruction *instr,
 			return 1;
 		}
 		*/
-		if (src->swizzle && src->swizzle != BRW_SWIZZLE_NOOP) {
+		if (SWIZZLE(src->reg) && SWIZZLE(src->reg) != BRW_SWIZZLE_NOOP) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
 				"instruction\n");
 			return 1;
 		}
             } else {
-		instr->bits3.da16.src1_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode);
-		instr->bits3.da16.src1_reg_nr = src->reg_nr;
-		instr->bits3.da16.src1_vert_stride = src->vert_stride;
-		instr->bits3.da16.src1_negate = src->negate;
-		instr->bits3.da16.src1_abs = src->abs;
-		instr->bits3.da16.src1_swz_x = BRW_GET_SWZ(src->swizzle, 0);
-		instr->bits3.da16.src1_swz_y = BRW_GET_SWZ(src->swizzle, 1);
-		instr->bits3.da16.src1_swz_z = BRW_GET_SWZ(src->swizzle, 2);
-		instr->bits3.da16.src1_swz_w = BRW_GET_SWZ(src->swizzle, 3);
-                instr->bits3.da16.src1_address_mode = src->address_mode;
-		if (src->address_mode != BRW_ADDRESS_DIRECT) {
+		instr->bits3.da16.src1_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
+		instr->bits3.da16.src1_reg_nr = src->reg.nr;
+		instr->bits3.da16.src1_vert_stride = src->reg.vstride;
+		instr->bits3.da16.src1_negate = src->reg.negate;
+		instr->bits3.da16.src1_abs = src->reg.abs;
+		instr->bits3.da16.src1_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
+		instr->bits3.da16.src1_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
+		instr->bits3.da16.src1_swz_z = BRW_GET_SWZ(SWIZZLE(src->reg), 2);
+		instr->bits3.da16.src1_swz_w = BRW_GET_SWZ(SWIZZLE(src->reg), 3);
+                instr->bits3.da16.src1_address_mode = src->reg.address_mode;
+		if (src->reg.address_mode != BRW_ADDRESS_DIRECT) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
 				"instruction\n");
 			return 1;
@@ -2972,30 +2974,30 @@ int set_instruction_src1(struct brw_instruction *instr,
             }
 	} else {
             if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits3.ia1.src1_indirect_offset = src->indirect_offset;
-		instr->bits3.ia1.src1_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
-		instr->bits3.ia1.src1_abs = src->abs;
-		instr->bits3.ia1.src1_negate = src->negate;
-		instr->bits3.ia1.src1_address_mode = src->address_mode;
-		instr->bits3.ia1.src1_horiz_stride = src->horiz_stride;
-		instr->bits3.ia1.src1_width = src->width;
-		instr->bits3.ia1.src1_vert_stride = src->vert_stride;
-		if (src->swizzle && src->swizzle != BRW_SWIZZLE_NOOP) {
+		instr->bits3.ia1.src1_indirect_offset = src->reg.dw1.bits.indirect_offset;
+		instr->bits3.ia1.src1_subreg_nr = get_indirect_subreg_address(src->reg.subnr);
+		instr->bits3.ia1.src1_abs = src->reg.abs;
+		instr->bits3.ia1.src1_negate = src->reg.negate;
+		instr->bits3.ia1.src1_address_mode = src->reg.address_mode;
+		instr->bits3.ia1.src1_horiz_stride = src->reg.hstride;
+		instr->bits3.ia1.src1_width = src->reg.width;
+		instr->bits3.ia1.src1_vert_stride = src->reg.vstride;
+		if (SWIZZLE(src->reg) && SWIZZLE(src->reg) != BRW_SWIZZLE_NOOP) {
 			fprintf(stderr, "error: swizzle bits set in align1 "
 				"instruction\n");
 			return 1;
 		}
             } else {
-		instr->bits3.ia16.src1_swz_x = BRW_GET_SWZ(src->swizzle, 0);
-		instr->bits3.ia16.src1_swz_y = BRW_GET_SWZ(src->swizzle, 1);
-		instr->bits3.ia16.src1_swz_z = BRW_GET_SWZ(src->swizzle, 2);
-		instr->bits3.ia16.src1_swz_w = BRW_GET_SWZ(src->swizzle, 3);
-		instr->bits3.ia16.src1_indirect_offset = (src->indirect_offset >> 4); /* half register aligned */
-		instr->bits3.ia16.src1_subreg_nr = get_indirect_subreg_address(src->subreg_nr);
-		instr->bits3.ia16.src1_abs = src->abs;
-		instr->bits3.ia16.src1_negate = src->negate;
-		instr->bits3.ia16.src1_address_mode = src->address_mode;
-		instr->bits3.ia16.src1_vert_stride = src->vert_stride;
+		instr->bits3.ia16.src1_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
+		instr->bits3.ia16.src1_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
+		instr->bits3.ia16.src1_swz_z = BRW_GET_SWZ(SWIZZLE(src->reg), 2);
+		instr->bits3.ia16.src1_swz_w = BRW_GET_SWZ(SWIZZLE(src->reg), 3);
+		instr->bits3.ia16.src1_indirect_offset = (src->reg.dw1.bits.indirect_offset >> 4); /* half register aligned */
+		instr->bits3.ia16.src1_subreg_nr = get_indirect_subreg_address(src->reg.subnr);
+		instr->bits3.ia16.src1_abs = src->reg.abs;
+		instr->bits3.ia16.src1_negate = src->reg.negate;
+		instr->bits3.ia16.src1_address_mode = src->reg.address_mode;
+		instr->bits3.ia16.src1_vert_stride = src->reg.vstride;
             }
         }
 
@@ -3040,9 +3042,9 @@ int set_instruction_src0_three_src(struct brw_instruction *instr,
 		reset_instruction_src_region(instr, src);
 	}
 	// TODO: supporting src0 swizzle, src0 modifier, src0 rep_ctrl
-	instr->bits1.da3src.src_reg_type = reg_type_2_to_3(src->reg_type);
-	instr->bits2.da3src.src0_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode) / 4; // in DWORD
-	instr->bits2.da3src.src0_reg_nr = src->reg_nr;
+	instr->bits1.da3src.src_reg_type = reg_type_2_to_3(src->reg.type);
+	instr->bits2.da3src.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
+	instr->bits2.da3src.src0_reg_nr = src->reg.nr;
 	return 0;
 }
 
@@ -3053,10 +3055,10 @@ int set_instruction_src1_three_src(struct brw_instruction *instr,
 		reset_instruction_src_region(instr, src);
 	}
 	// TODO: supporting src1 swizzle, src1 modifier, src1 rep_ctrl
-	int v = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode) / 4; // in DWORD
+	int v = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
 	instr->bits2.da3src.src1_subreg_nr_low = v % 4; // lower 2 bits
 	instr->bits3.da3src.src1_subreg_nr_high = v / 4; // highest bit
-	instr->bits3.da3src.src1_reg_nr = src->reg_nr;
+	instr->bits3.da3src.src1_reg_nr = src->reg.nr;
 	return 0;
 }
 
@@ -3067,8 +3069,8 @@ int set_instruction_src2_three_src(struct brw_instruction *instr,
 		reset_instruction_src_region(instr, src);
 	}
 	// TODO: supporting src2 swizzle, src2 modifier, src2 rep_ctrl
-	instr->bits3.da3src.src2_subreg_nr = get_subreg_address(src->reg_file, src->reg_type, src->subreg_nr, src->address_mode) / 4; // in DWORD
-	instr->bits3.da3src.src2_reg_nr = src->reg_nr;
+	instr->bits3.da3src.src2_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
+	instr->bits3.da3src.src2_reg_nr = src->reg.nr;
 	return 0;
 }
 
@@ -3106,15 +3108,15 @@ void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
 			    int type)
 {
 	memset(src, 0, sizeof(*src));
-	src->address_mode = BRW_ADDRESS_DIRECT;
-	src->reg_file = reg->file;
-	src->reg_type = type;
-	src->subreg_nr = reg->subnr;
-	src->reg_nr = reg->nr;
-	src->vert_stride = 0;
-	src->width = 0;
-	src->horiz_stride = 0;
-	src->negate = 0;
-	src->abs = 0;
-	src->swizzle = BRW_SWIZZLE_NOOP;
+	src->reg.address_mode = BRW_ADDRESS_DIRECT;
+	src->reg.file = reg->file;
+	src->reg.type = type;
+	src->reg.subnr = reg->subnr;
+	src->reg.nr = reg->nr;
+	src->reg.vstride = 0;
+	src->reg.width = 0;
+	src->reg.hstride = 0;
+	src->reg.negate = 0;
+	src->reg.abs = 0;
+	SWIZZLE(src->reg) = BRW_SWIZZLE_NOOP;
 }
-- 
1.7.7.5

* [PATCH 50/90] assembler: Factor out the destination register validation
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (48 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 49/90] assembler: Use brw_reg in the source operand Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 51/90] assembler: Use brw_set_dest() to encode the destination Damien Lespiau
                   ` (40 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The goal is to use brw_set_dest(), so let's start by validating the
register we have before generating the opcode.
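
As a rough illustration of this "validate first, encode second" split,
here is a self-contained sketch. The sk_* types and constants are
hypothetical stand-ins for struct brw_reg, struct brw_instruction and
the BRW_* defines; they are not the real definitions, and the encoding
step is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the BRW_* defines used by the assembler. */
enum { SK_ADDRESS_DIRECT = 0, SK_ADDRESS_INDIRECT = 1 };
enum { SK_ALIGN_1 = 0, SK_ALIGN_16 = 1 };
#define SK_WRITEMASK_XYZW 0xf

/* Only the fields the check needs; the real structs carry much more. */
struct sk_dst_reg { unsigned address_mode; unsigned writemask; };
struct sk_insn { unsigned access_mode; };

/* Reject a partial write mask on a direct-addressed align1 destination. */
static bool sk_validate_dst_reg(const struct sk_insn *insn,
				const struct sk_dst_reg *reg)
{
	if (reg->address_mode == SK_ADDRESS_DIRECT &&
	    insn->access_mode == SK_ALIGN_1 &&
	    reg->writemask != 0 &&
	    reg->writemask != SK_WRITEMASK_XYZW) {
		fprintf(stderr, "error: write mask set in align1 instruction\n");
		return false;
	}

	return true;
}

/* Validation runs before any bit of the instruction is touched. */
static int sk_set_instruction_dest(struct sk_insn *insn,
				   const struct sk_dst_reg *reg)
{
	assert(insn != NULL && reg != NULL);

	if (!sk_validate_dst_reg(insn, reg))
		return 1;

	/* ... the destination encoding would follow here ... */
	return 0;
}
```

Keeping the check in its own function means the encoding step can later
be swapped for a single call without carrying the error handling along.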

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   31 +++++++++++++++++++------------
 1 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 28b722e..9c5f864 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -29,6 +29,7 @@
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
+#include <stdbool.h>
 #include <assert.h>
 #include "gen4asm.h"
 #include "brw_defines.h"
@@ -159,6 +160,21 @@ static int resolve_dst_region(struct declared_register *reference, int region)
     return resolved;
 }
 
+static bool validate_dst_reg(struct brw_instruction *insn, struct brw_reg *reg)
+{
+
+    if (reg->address_mode == BRW_ADDRESS_DIRECT &&
+	insn->header.access_mode == BRW_ALIGN_1 &&
+	reg->dw1.bits.writemask != 0 &&
+	reg->dw1.bits.writemask != BRW_WRITEMASK_XYZW)
+    {
+	fprintf(stderr, "error: write mask set in align1 instruction\n");
+	return false;
+    }
+
+    return true;
+}
+
 %}
 
 %start ROOT
@@ -2803,6 +2819,9 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
 int set_instruction_dest(struct brw_instruction *instr,
 			 struct brw_reg *dest)
 {
+	if (!validate_dst_reg(instr, dest))
+		return 1;
+
 	if (dest->address_mode == BRW_ADDRESS_DIRECT &&
 	    instr->header.access_mode == BRW_ALIGN_1) {
 		instr->bits1.da1.dest_reg_file = dest->file;
@@ -2811,12 +2830,6 @@ int set_instruction_dest(struct brw_instruction *instr,
 		instr->bits1.da1.dest_reg_nr = dest->nr;
 		instr->bits1.da1.dest_horiz_stride = dest->hstride;
 		instr->bits1.da1.dest_address_mode = dest->address_mode;
-		if (dest->dw1.bits.writemask != 0 &&
-		    dest->dw1.bits.writemask != BRW_WRITEMASK_XYZW) {
-			fprintf(stderr, "error: write mask set in align1 "
-				"instruction\n");
-			return 1;
-		}
 	} else if (dest->address_mode == BRW_ADDRESS_DIRECT) {
 		instr->bits1.da16.dest_reg_file = dest->file;
 		instr->bits1.da16.dest_reg_type = dest->type;
@@ -2832,12 +2845,6 @@ int set_instruction_dest(struct brw_instruction *instr,
 		instr->bits1.ia1.dest_horiz_stride = dest->hstride;
 		instr->bits1.ia1.dest_indirect_offset = dest->dw1.bits.indirect_offset;
 		instr->bits1.ia1.dest_address_mode = dest->address_mode;
-		if (dest->dw1.bits.writemask != 0 &&
-		    dest->dw1.bits.writemask != BRW_WRITEMASK_XYZW) {
-			fprintf(stderr, "error: write mask set in align1 "
-				"instruction\n");
-			return 1;
-		}
 	} else {
 		instr->bits1.ia16.dest_reg_file = dest->file;
 		instr->bits1.ia16.dest_reg_type = dest->type;
-- 
1.7.7.5

* [PATCH 51/90] assembler: Use brw_set_dest() to encode the destination
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (49 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 50/90] assembler: Factor out the destination register validation Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 52/90] assembler: Factor out the source register validation Damien Lespiau
                   ` (39 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

A few notes:

I needed to introduce brw context and compile structs. For now they are
only used to determine which generation we are compiling code for, but
eventually we can use more of the infrastructure.

brw_set_dest() uses the destination register width to program the
instruction execution size.
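
A minimal sketch of what that implies for the grammar actions, assuming
the width and execution-size fields share the same log2 encoding (which
is what the "ffs(1) - 1" and "1 is encoded 2" comments in gram.y
suggest). The sk_* names are hypothetical, and the real brw_set_dest()
may do more than this plain copy:

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-ins for brw_reg and brw_instruction. */
struct sk_width_reg { unsigned width; };	/* log2-encoded element count */
struct sk_insn { unsigned execution_size; };	/* log2-encoded as well */

/* Encode a power-of-two element count (1, 2, 4, 8, 16) as its log2. */
static unsigned sk_encode_pow2(unsigned count)
{
	unsigned encoded = 0;

	assert(count != 0 && (count & (count - 1)) == 0);

	while (count > 1) {
		count >>= 1;
		encoded++;
	}

	return encoded;
}

/* Assumed behaviour: the execution size is derived from the register. */
static void sk_set_dest(struct sk_insn *insn, struct sk_width_reg dest)
{
	insn->execution_size = dest.width;
}
```

This is why the grammar actions stop writing header.execution_size
themselves and store the parsed execsize in the destination's width
before calling set_instruction_dest().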

The assembler can take subnr either in bytes or in number of elements,
so we need a resolve step when setting a brw_reg.
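
The resolve step can be sketched as follows. This is an illustration
only: the sk_* names are hypothetical, the "advanced" parameter stands
in for the assembler's global advanced_flag, and type_size stands in
for what get_type_size() would return for the register type.

```c
#include <assert.h>
#include <stdbool.h>

enum { SK_ADDRESS_DIRECT = 0, SK_ADDRESS_INDIRECT = 1 };

/* Hypothetical stand-in for the few brw_reg fields involved. */
struct sk_subnr_reg {
	unsigned address_mode;
	unsigned type_size;	/* size of one element, in bytes */
	unsigned subnr;		/* as written in the source file */
};

/*
 * Rewrite subnr in place into the value expected by the encoder.
 *
 * Direct addressing: advanced mode counts elements, so scale by the
 * element size to get bytes; otherwise subnr is already in bytes.
 * Indirect addressing: non-advanced mode uses even a0.N values only
 * and halves them; advanced mode uses the value unchanged.
 */
static void sk_resolve_subnr(struct sk_subnr_reg *reg, bool advanced)
{
	assert(reg->type_size > 0);

	if (reg->address_mode == SK_ADDRESS_DIRECT) {
		if (advanced)
			reg->subnr *= reg->type_size;
	} else if (!advanced) {
		reg->subnr /= 2;
	}
}
```

Note that the step is not idempotent: resolving an already resolved
register in advanced mode would scale subnr a second time, so it should
run exactly once per operand.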

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    3 +
 assembler/gram.y    |  171 +++++++++++++++++++++++----------------------------
 assembler/main.c    |   11 +++
 3 files changed, 91 insertions(+), 94 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 49c6ea0..0e3b965 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -43,6 +43,9 @@ typedef float GLfloat;
 
 extern long int gen_level;
 
+extern struct brw_context genasm_brw_context;
+extern struct brw_compile genasm_compile;
+
 /* Predicate for Gen X and above */
 #define IS_GENp(x) (gen_level >= (x)*10)
 
diff --git a/assembler/gram.y b/assembler/gram.y
index 9c5f864..bf8d688 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -32,8 +32,7 @@
 #include <stdbool.h>
 #include <assert.h>
 #include "gen4asm.h"
-#include "brw_defines.h"
-#include "brw_reg.h"
+#include "brw_eu.h"
 
 #define DEFAULT_EXECSIZE (ffs(program_defaults.execute_size) - 1)
 #define DEFAULT_DSTREGION -1
@@ -175,6 +174,52 @@ static bool validate_dst_reg(struct brw_instruction *insn, struct brw_reg *reg)
     return true;
 }
 
+static int get_subreg_address(GLuint regfile, GLuint type, GLuint subreg, GLuint address_mode)
+{
+    int unit_size = 1;
+
+    assert(address_mode == BRW_ADDRESS_DIRECT);
+    assert(regfile != BRW_IMMEDIATE_VALUE);
+
+    if (advanced_flag)
+	unit_size = get_type_size(type);
+
+    return subreg * unit_size;
+}
+
+/* only used in indirect address mode.
+ * input: sub-register number of an address register
+ * output: the value of AddrSubRegNum in the instruction binary code
+ *
+ * input  output(advanced_flag==0)  output(advanced_flag==1)
+ *  a0.0             0                         0
+ *  a0.1        invalid input                  1
+ *  a0.2             1                         2
+ *  a0.3        invalid input                  3
+ *  a0.4             2                         4
+ *  a0.5        invalid input                  5
+ *  a0.6             3                         6
+ *  a0.7        invalid input                  7
+ *  a0.8             4                  invalid input
+ *  a0.10            5                  invalid input
+ *  a0.12            6                  invalid input
+ *  a0.14            7                  invalid input
+ */
+static int get_indirect_subreg_address(GLuint subreg)
+{
+    return advanced_flag == 0 ? subreg / 2 : subreg;
+}
+
+static void resolve_subnr(struct brw_reg *reg)
+{
+   if (reg->address_mode == BRW_ADDRESS_DIRECT)
+	reg->subnr = get_subreg_address(reg->file, reg->type, reg->subnr,
+					reg->address_mode);
+   else
+        reg->subnr = get_indirect_subreg_address(reg->subnr);
+}
+
+
 %}
 
 %start ROOT
@@ -522,8 +567,8 @@ ifelseinstruction: ENDIF
 
 		    memset(&$$, 0, sizeof($$));
 		    $$.gen.header.opcode = $1;
-		    $$.gen.header.execution_size = $2;
 		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
+		    ip_dst.width = $2;
 		    set_instruction_dest(&$$.gen, &ip_dst);
 		    set_instruction_src0(&$$.gen, &ip_src);
 		    set_instruction_src1(&$$.gen, &$3);
@@ -557,9 +602,9 @@ ifelseinstruction: ENDIF
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$.gen, &$1);
 		  $$.gen.header.opcode = $2;
-		  $$.gen.header.execution_size = $3;
 		  if(!IS_GENp(6)) {
 		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
+		    ip_dst.width = $3;
 		    set_instruction_dest(&$$.gen, &ip_dst);
 		    set_instruction_src0(&$$.gen, &ip_src);
 		    set_instruction_src1(&$$.gen, &$4);
@@ -593,11 +638,11 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		     * offset is the second source operand.  The offset is added
 		     * to the pre-incremented IP.
 		     */
+		    ip_dst.width = $3;
 		    set_instruction_dest(&$$.gen, &ip_dst);
 		    memset(&$$, 0, sizeof($$));
 		    set_instruction_predicate(&$$.gen, &$1);
 		    $$.gen.header.opcode = $2;
-		    $$.gen.header.execution_size = $3;
 		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
 		    set_instruction_src0(&$$.gen, &ip_src);
 		    set_instruction_src1(&$$.gen, &$4);
@@ -632,11 +677,11 @@ haltinstruction: predicate HALT execsize relativelocation relativelocation insto
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$.gen, &$1);
 		  $$.gen.header.opcode = $2;
-		  $$.gen.header.execution_size = $3;
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
 		  $$.second_reloc_target = $5.reloc_target;
 		  $$.second_reloc_offset = $5.imm32;
+		  dst_null_reg.width = $3;
 		  set_instruction_dest(&$$.gen, &dst_null_reg);
 		  set_instruction_src0(&$$.gen, &src_null_reg);
 		};
@@ -648,10 +693,10 @@ multibranchinstruction:
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$.gen, &$1);
 		  $$.gen.header.opcode = $2;
-		  $$.gen.header.execution_size = $3;
 		  $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
+		  dst_null_reg.width = $3;
 		  set_instruction_dest(&$$.gen, &dst_null_reg);
 		}
 		| predicate BRC execsize relativelocation relativelocation instoptions
@@ -660,12 +705,12 @@ multibranchinstruction:
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$.gen, &$1);
 		  $$.gen.header.opcode = $2;
-		  $$.gen.header.execution_size = $3;
 		  $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
 		  $$.second_reloc_target = $5.reloc_target;
 		  $$.second_reloc_offset = $5.imm32;
+		  dst_null_reg.width = $3;
 		  set_instruction_dest(&$$.gen, &dst_null_reg);
 		  set_instruction_src0(&$$.gen, &src_null_reg);
 		}
@@ -691,9 +736,9 @@ subroutineinstruction:
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$.gen, &$1);
 		  $$.gen.header.opcode = $2;
-		  $$.gen.header.execution_size = 1; /* execution size must be 2. Here 1 is encoded 2. */
 
 		  $4.type = BRW_REGISTER_TYPE_D; /* dest type should be DWORD */
+		  $4.width = 1; /* execution size must be 2. Here 1 is encoded 2. */
 		  set_instruction_dest(&$$.gen, &$4);
 
 		  struct src_operand src0;
@@ -719,7 +764,7 @@ subroutineinstruction:
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$.gen, &$1);
 		  $$.gen.header.opcode = $2;
-		  $$.gen.header.execution_size = 1; /* execution size of RET should be 2 */
+		  dst_null_reg.width = 1; /* execution size of RET should be 2 */
 		  set_instruction_dest(&$$.gen, &dst_null_reg);
 		  $5.reg.type = BRW_REGISTER_TYPE_D;
 		  $5.reg.hstride = 1; /*encoded 1*/
@@ -737,7 +782,7 @@ unaryinstruction:
 		  $$.header.opcode = $2;
 		  $$.header.destreg__conditionalmod = $3.cond;
 		  $$.header.saturate = $4;
-		  $$.header.execution_size = $5;
+		  $6.width = $5;
 		  set_instruction_options(&$$, &$8);
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$6) != 0)
@@ -756,7 +801,7 @@ unaryinstruction:
 		  }
 
 		  if (!IS_GENp(6) && 
-				get_type_size($$.bits1.da1.dest_reg_type) * (1 << $$.header.execution_size) == 64)
+				get_type_size($$.bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
 		    $$.header.compression_control = BRW_COMPRESSION_COMPRESSED;
 		}
 ;
@@ -774,9 +819,9 @@ binaryinstruction:
 		  $$.header.opcode = $2;
 		  $$.header.destreg__conditionalmod = $3.cond;
 		  $$.header.saturate = $4;
-		  $$.header.execution_size = $5;
 		  set_instruction_options(&$$, &$9);
 		  set_instruction_predicate(&$$, &$1);
+		  $6.width = $5;
 		  if (set_instruction_dest(&$$, &$6) != 0)
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$7) != 0)
@@ -795,7 +840,7 @@ binaryinstruction:
 		  }
 
 		  if (!IS_GENp(6) && 
-				get_type_size($$.bits1.da1.dest_reg_type) * (1 << $$.header.execution_size) == 64)
+				get_type_size($$.bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
 		    $$.header.compression_control = BRW_COMPRESSION_COMPRESSED;
 		}
 ;
@@ -813,7 +858,7 @@ binaryaccinstruction:
 		  $$.header.opcode = $2;
 		  $$.header.destreg__conditionalmod = $3.cond;
 		  $$.header.saturate = $4;
-		  $$.header.execution_size = $5;
+		  $6.width = $5;
 		  set_instruction_options(&$$, &$9);
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$6) != 0)
@@ -834,7 +879,7 @@ binaryaccinstruction:
 		  }
 
 		  if (!IS_GENp(6) && 
-				get_type_size($$.bits1.da1.dest_reg_type) * (1 << $$.header.execution_size) == 64)
+				get_type_size($$.bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
 		    $$.header.compression_control = BRW_COMPRESSION_COMPRESSED;
 		}
 ;
@@ -895,7 +940,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		   */
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
+		  $5.width = $3;
 		  $$.header.destreg__conditionalmod = $4; /* msg reg index */
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$5) != 0)
@@ -951,11 +996,11 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		{
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
 		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 
+		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$6) != 0)
@@ -963,6 +1008,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  /* XXX is this correct? */
 		  if (set_instruction_src1(&$$, &$7) != 0)
 		    YYERROR;
+
 		  }
 		| predicate SEND execsize dst sendleadreg payload imm32reg instoptions
                 {
@@ -974,10 +1020,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  }
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
 		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
+		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$6) != 0)
@@ -1004,10 +1050,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
                   $$.header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
 		  set_instruction_predicate(&$$, &$1);
 
+		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
                       YYERROR;
 
@@ -1050,10 +1096,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
                   $$.header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
 		  set_instruction_predicate(&$$, &$1);
 
+		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
                       YYERROR;
 
@@ -1085,10 +1131,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  }
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
 		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
+		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$6) != 0)
@@ -1107,11 +1153,11 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		{
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.execution_size = $3;
 		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 
+		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$6) != 0)
@@ -1141,10 +1187,10 @@ jumpinstruction: predicate JMPI execsize relativelocation2
 		   */
 		  memset(&$$, 0, sizeof($$));
 		  $$.gen.header.opcode = $2;
-		  $$.gen.header.execution_size = ffs(1) - 1;
 		  if(advanced_flag)
 			$$.gen.header.mask_control = BRW_MASK_DISABLE;
 		  set_instruction_predicate(&$$.gen, &$1);
+		  ip_dst.width = ffs(1) - 1;
 		  set_instruction_dest(&$$.gen, &ip_dst);
 		  set_instruction_src0(&$$.gen, &ip_src);
 		  set_instruction_src1(&$$.gen, &$4);
@@ -1158,9 +1204,9 @@ mathinstruction: predicate MATH_INST execsize dst src srcimm math_function insto
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
 		  $$.header.destreg__conditionalmod = $7;
-		  $$.header.execution_size = $3;
 		  set_instruction_options(&$$, &$8);
 		  set_instruction_predicate(&$$, &$1);
+		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$5) != 0)
@@ -1199,8 +1245,8 @@ syncinstruction: predicate WAIT notifyreg
 
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
-		  $$.header.execution_size = ffs(1) - 1;
 		  set_direct_dst_operand(&notify_dst, &$3, BRW_REGISTER_TYPE_D);
+		  notify_dst.width = ffs(1) - 1;
 		  set_instruction_dest(&$$, &notify_dst);
 		  set_direct_src_operand(&notify_src, &$3, BRW_REGISTER_TYPE_D);
 		  set_instruction_src0(&$$, &notify_src);
@@ -2708,42 +2754,6 @@ static int get_type_size(GLuint type)
     return size;
 }
 
-static int get_subreg_address(GLuint regfile, GLuint type, GLuint subreg, GLuint address_mode)
-{
-    int unit_size = 1;
-
-    assert(address_mode == BRW_ADDRESS_DIRECT);
-    assert(regfile != BRW_IMMEDIATE_VALUE);
-
-    if (advanced_flag)
-	unit_size = get_type_size(type);
-
-    return subreg * unit_size;
-}
-
-/* only used in indirect address mode.
- * input: sub-register number of an address register
- * output: the value of AddrSubRegNum in the instruction binary code
- *
- * input  output(advanced_flag==0)  output(advanced_flag==1)
- *  a0.0             0                         0
- *  a0.1        invalid input                  1
- *  a0.2             1                         2
- *  a0.3        invalid input                  3
- *  a0.4             2                         4
- *  a0.5        invalid input                  5
- *  a0.6             3                         6
- *  a0.7        invalid input                  7
- *  a0.8             4                  invalid input
- *  a0.10            5                  invalid input
- *  a0.12            6                  invalid input
- *  a0.14            7                  invalid input
- */
-static int get_indirect_subreg_address(GLuint subreg)
-{
-    return advanced_flag == 0 ? subreg / 2 : subreg;
-}
-
 static void reset_instruction_src_region(struct brw_instruction *instr, 
                                          struct src_operand *src)
 {
@@ -2822,38 +2832,11 @@ int set_instruction_dest(struct brw_instruction *instr,
 	if (!validate_dst_reg(instr, dest))
 		return 1;
 
-	if (dest->address_mode == BRW_ADDRESS_DIRECT &&
-	    instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits1.da1.dest_reg_file = dest->file;
-		instr->bits1.da1.dest_reg_type = dest->type;
-		instr->bits1.da1.dest_subreg_nr = get_subreg_address(dest->file, dest->type, dest->subnr, dest->address_mode);
-		instr->bits1.da1.dest_reg_nr = dest->nr;
-		instr->bits1.da1.dest_horiz_stride = dest->hstride;
-		instr->bits1.da1.dest_address_mode = dest->address_mode;
-	} else if (dest->address_mode == BRW_ADDRESS_DIRECT) {
-		instr->bits1.da16.dest_reg_file = dest->file;
-		instr->bits1.da16.dest_reg_type = dest->type;
-		instr->bits1.da16.dest_subreg_nr = get_subreg_address(dest->file, dest->type, dest->subnr, dest->address_mode);
-		instr->bits1.da16.dest_reg_nr = dest->nr;
-		instr->bits1.da16.dest_address_mode = dest->address_mode;
-		instr->bits1.da16.dest_horiz_stride = ffs(1);
-		instr->bits1.da16.dest_writemask = dest->dw1.bits.writemask;
-	} else if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits1.ia1.dest_reg_file = dest->file;
-		instr->bits1.ia1.dest_reg_type = dest->type;
-		instr->bits1.ia1.dest_subreg_nr = dest->subnr;
-		instr->bits1.ia1.dest_horiz_stride = dest->hstride;
-		instr->bits1.ia1.dest_indirect_offset = dest->dw1.bits.indirect_offset;
-		instr->bits1.ia1.dest_address_mode = dest->address_mode;
-	} else {
-		instr->bits1.ia16.dest_reg_file = dest->file;
-		instr->bits1.ia16.dest_reg_type = dest->type;
-		instr->bits1.ia16.dest_subreg_nr = get_indirect_subreg_address(dest->subnr);
-		instr->bits1.ia16.dest_writemask = dest->dw1.bits.writemask;
-		instr->bits1.ia16.dest_horiz_stride = ffs(1);
-		instr->bits1.ia16.dest_indirect_offset = (dest->dw1.bits.indirect_offset >> 4); /* half register aligned */
-		instr->bits1.ia16.dest_address_mode = dest->address_mode;
-	}
+	/* The assembler supports expressing subnr in bytes or in number of
+	 * elements. */
+	resolve_subnr(dest);
+
+	brw_set_dest(&genasm_compile, instr, *dest);
 
 	return 0;
 }
diff --git a/assembler/main.c b/assembler/main.c
index 176835b..cfee749 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -33,7 +33,9 @@
 #include <unistd.h>
 #include <assert.h>
 
+#include "ralloc.h"
 #include "gen4asm.h"
+#include "brw_eu.h"
 
 extern FILE *yyin;
 
@@ -48,6 +50,9 @@ char *export_filename = NULL;
 
 const char const *binary_prepend = "static const char gen_eu_bytes[] = {\n";
 
+struct brw_context genasm_brw_context;
+struct brw_compile genasm_compile;
+
 struct brw_program compiled_program;
 struct program_defaults program_defaults = {.register_type = BRW_REGISTER_TYPE_F};
 
@@ -286,6 +291,8 @@ int main(int argc, char **argv)
 	struct brw_program_instruction *entry, *entry1, *tmp_entry;
 	int err, inst_offset;
 	char o;
+	void *mem_ctx;
+
 	while ((o = getopt_long(argc, argv, "e:l:o:g:ab", longopts, NULL)) != -1) {
 		switch (o) {
 		case 'o':
@@ -358,6 +365,10 @@ int main(int argc, char **argv)
 		}
 	}
 
+	brw_init_context(&genasm_brw_context, gen_level);
+	mem_ctx = ralloc_context(NULL);
+	brw_init_compile(&genasm_brw_context, &genasm_compile, mem_ctx);
+
 	err = yyparse();
 
 	if (strcmp(argv[0], "-"))
-- 
1.7.7.5

* [PATCH 52/90] assembler: Factor out the source register validation
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (50 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 51/90] assembler: Use brw_set_dest() to encode the destination Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 53/90] assembler: ExecSize can be as big as 32 channels Damien Lespiau
                   ` (38 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The goal is to use brw_set_src[01](), so let's start by validating the
register we have before generating the opcode.
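
For readers without the tree at hand, here is a minimal, standalone
sketch of the check being factored out. The enum, struct and constant
names are hypothetical stand-ins, not the real Mesa/assembler
definitions; only the shape of the test mirrors validate_src_reg().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the Mesa/assembler definitions. */
enum access_mode { ALIGN_1, ALIGN_16 };
enum reg_file { FILE_ARF, FILE_GRF, FILE_MRF, FILE_IMM };

#define SWIZZLE_NONE 0x00 /* no swizzle bits were parsed */
#define SWIZZLE_NOOP 0xe4 /* xyzw, i.e. the identity swizzle */

struct src_reg {
    enum reg_file file;
    unsigned swizzle; /* four 2-bit channel selectors */
};

/* Returns true when the operand may be encoded in the given access mode. */
static bool validate_src(enum access_mode mode, struct src_reg reg)
{
    assert(reg.swizzle <= 0xff);

    /* Immediates carry no region or swizzle, so there is nothing to check. */
    if (reg.file == FILE_IMM)
        return true;

    /* align1 has no swizzle field: only "unset" or the identity is fine. */
    if (mode == ALIGN_1 &&
        reg.swizzle != SWIZZLE_NONE && reg.swizzle != SWIZZLE_NOOP) {
        fprintf(stderr, "error: swizzle bits set in align1 instruction\n");
        return false;
    }

    return true;
}
```

Doing the check once, before any instruction bits are written, is what
lets the per-branch copies of the same test go away in the diff below.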

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   61 ++++++++++++++++++++++-------------------------------
 1 files changed, 25 insertions(+), 36 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index bf8d688..fb2b127 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -174,6 +174,21 @@ static bool validate_dst_reg(struct brw_instruction *insn, struct brw_reg *reg)
     return true;
 }
 
+static bool validate_src_reg(struct brw_instruction *insn, struct brw_reg reg)
+{
+    if (reg.file == BRW_IMMEDIATE_VALUE)
+	return true;
+
+    if (insn->header.access_mode == BRW_ALIGN_1 &&
+	SWIZZLE(reg) && SWIZZLE(reg) != BRW_SWIZZLE_NOOP)
+    {
+	fprintf(stderr, "error: swizzle bits set in align1 instruction\n");
+	return false;
+    }
+
+    return true;
+}
+
 static int get_subreg_address(GLuint regfile, GLuint type, GLuint subreg, GLuint address_mode)
 {
     int unit_size = 1;
@@ -2845,9 +2860,12 @@ int set_instruction_dest(struct brw_instruction *instr,
 int set_instruction_src0(struct brw_instruction *instr,
 			  struct src_operand *src)
 {
-	if (advanced_flag) {
+	if (advanced_flag)
 		reset_instruction_src_region(instr, src);
-	}
+
+	if (!validate_src_reg(instr, src->reg))
+		return 1;
+
 	instr->bits1.da1.src0_reg_file = src->reg.file;
 	instr->bits1.da1.src0_reg_type = src->reg.type;
 	if (src->reg.file == BRW_IMMEDIATE_VALUE) {
@@ -2862,11 +2880,6 @@ int set_instruction_src0(struct brw_instruction *instr,
 		instr->bits2.da1.src0_negate = src->reg.negate;
 		instr->bits2.da1.src0_abs = src->reg.abs;
 		instr->bits2.da1.src0_address_mode = src->reg.address_mode;
-		if (SWIZZLE(src->reg) && SWIZZLE(src->reg) != BRW_SWIZZLE_NOOP) {
-			fprintf(stderr, "error: swizzle bits set in align1 "
-				"instruction\n");
-			return 1;
-		}
             } else {
 		instr->bits2.da16.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
 		instr->bits2.da16.src0_reg_nr = src->reg.nr;
@@ -2889,11 +2902,6 @@ int set_instruction_src0(struct brw_instruction *instr,
 		instr->bits2.ia1.src0_horiz_stride = src->reg.hstride;
 		instr->bits2.ia1.src0_width = src->reg.width;
 		instr->bits2.ia1.src0_vert_stride = src->reg.vstride;
-		if (SWIZZLE(src->reg) && SWIZZLE(src->reg) != BRW_SWIZZLE_NOOP) {
-			fprintf(stderr, "error: swizzle bits set in align1 "
-				"instruction\n");
-			return 1;
-		}
             } else {
 		instr->bits2.ia16.src0_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
 		instr->bits2.ia16.src0_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
@@ -2916,9 +2924,12 @@ int set_instruction_src0(struct brw_instruction *instr,
 int set_instruction_src1(struct brw_instruction *instr,
 			  struct src_operand *src)
 {
-	if (advanced_flag) {
+	if (advanced_flag)
 		reset_instruction_src_region(instr, src);
-	}
+
+	if (!validate_src_reg(instr, src->reg))
+		return 1;
+
 	instr->bits1.da1.src1_reg_file = src->reg.file;
 	instr->bits1.da1.src1_reg_type = src->reg.type;
 	if (src->reg.file == BRW_IMMEDIATE_VALUE) {
@@ -2933,18 +2944,6 @@ int set_instruction_src1(struct brw_instruction *instr,
 		instr->bits3.da1.src1_negate = src->reg.negate;
 		instr->bits3.da1.src1_abs = src->reg.abs;
                 instr->bits3.da1.src1_address_mode = src->reg.address_mode;
-		/* XXX why?
-		if (src->address_mode != BRW_ADDRESS_DIRECT) {
-			fprintf(stderr, "error: swizzle bits set in align1 "
-				"instruction\n");
-			return 1;
-		}
-		*/
-		if (SWIZZLE(src->reg) && SWIZZLE(src->reg) != BRW_SWIZZLE_NOOP) {
-			fprintf(stderr, "error: swizzle bits set in align1 "
-				"instruction\n");
-			return 1;
-		}
             } else {
 		instr->bits3.da16.src1_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
 		instr->bits3.da16.src1_reg_nr = src->reg.nr;
@@ -2956,11 +2955,6 @@ int set_instruction_src1(struct brw_instruction *instr,
 		instr->bits3.da16.src1_swz_z = BRW_GET_SWZ(SWIZZLE(src->reg), 2);
 		instr->bits3.da16.src1_swz_w = BRW_GET_SWZ(SWIZZLE(src->reg), 3);
                 instr->bits3.da16.src1_address_mode = src->reg.address_mode;
-		if (src->reg.address_mode != BRW_ADDRESS_DIRECT) {
-			fprintf(stderr, "error: swizzle bits set in align1 "
-				"instruction\n");
-			return 1;
-		}
             }
 	} else {
             if (instr->header.access_mode == BRW_ALIGN_1) {
@@ -2972,11 +2966,6 @@ int set_instruction_src1(struct brw_instruction *instr,
 		instr->bits3.ia1.src1_horiz_stride = src->reg.hstride;
 		instr->bits3.ia1.src1_width = src->reg.width;
 		instr->bits3.ia1.src1_vert_stride = src->reg.vstride;
-		if (SWIZZLE(src->reg) && SWIZZLE(src->reg) != BRW_SWIZZLE_NOOP) {
-			fprintf(stderr, "error: swizzle bits set in align1 "
-				"instruction\n");
-			return 1;
-		}
             } else {
 		instr->bits3.ia16.src1_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
 		instr->bits3.ia16.src1_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
-- 
1.7.7.5

* [PATCH 53/90] assembler: ExecSize can be as big as 32 channels
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (51 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 52/90] assembler: Factor out the source register validation Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 54/90] assembler: Fix comparisons between reg.type and Architecture registers Damien Lespiau
                   ` (37 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

See the IVB PRM, vol4 part3 5.2.3.
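
As a rough illustration of why the table needs the extra entry: the
lookup is indexed by the encoded ExecSize field, so an encoding of 5
(32 channels) would fall off the end of a five-entry table. The helper
names below are made up for this sketch and are not part of
brw_eu_emit.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Channel count for each encoded ExecSize value (index = encoding). */
static const int execsize_for_reg[] = {1, 2, 4, 8, 16, 32};

/* True if the encoded field has an entry in the table. */
static bool execsize_is_valid(unsigned encoded)
{
    return encoded < ARRAY_SIZE(execsize_for_reg);
}

/* Decode the ExecSize field into a number of channels. */
static int execsize_channels(unsigned encoded)
{
    assert(execsize_is_valid(encoded));
    return execsize_for_reg[encoded];
}
```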

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_eu_emit.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/assembler/brw_eu_emit.c b/assembler/brw_eu_emit.c
index ea4baeb..ed24e48 100644
--- a/assembler/brw_eu_emit.c
+++ b/assembler/brw_eu_emit.c
@@ -163,7 +163,7 @@ validate_reg(struct brw_instruction *insn, struct brw_reg reg)
    int hstride_for_reg[] = {0, 1, 2, 4};
    int vstride_for_reg[] = {0, 1, 2, 4, 8, 16, 32, 64, 128, 256};
    int width_for_reg[] = {1, 2, 4, 8, 16};
-   int execsize_for_reg[] = {1, 2, 4, 8, 16};
+   int execsize_for_reg[] = {1, 2, 4, 8, 16, 32};
    int width, hstride, vstride, execsize;
 
    if (reg.file == BRW_IMMEDIATE_VALUE) {
-- 
1.7.7.5

* [PATCH 54/90] assembler: Fix comparisons between reg.type and Architecture registers
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (52 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 53/90] assembler: ExecSize can be as big as 32 channels Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 55/90] assembler: Store immediate values in reg.dw1.ud Damien Lespiau
                   ` (36 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The assertion is there to make sure GRF and MRF registers have a reg.nr
< 128. To exclude ARF registers, reg.file has to be checked, not reg.type
(the channel type). Most likely a typo that was never caught.
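
A small sketch of how the mix-up misbehaves. The enum values are
illustrative assumptions chosen for the example (the real constants
live in Mesa's headers), and the helper names are hypothetical. With
these values, comparing the channel type against the file constant
skips the range check for a UD-typed GRF and applies it to an F-typed
ARF register.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only, assumed for this sketch. */
enum reg_file { FILE_ARF = 0, FILE_GRF = 1, FILE_MRF = 2, FILE_IMM = 3 };
enum reg_type { TYPE_UD = 0, TYPE_D = 1, TYPE_F = 7 };

struct reg {
    enum reg_file file;
    enum reg_type type;
    unsigned nr;
};

/* Before the fix: the channel type is compared to a register file. */
static bool range_check_applies_buggy(struct reg r)
{
    return (int)r.type != (int)FILE_ARF;
}

/* After the fix: only the register file decides. */
static bool range_check_applies_fixed(struct reg r)
{
    return r.file != FILE_ARF;
}

/* The intended check: GRF/MRF register numbers must be below 128. */
static void check_src_nr(struct reg r)
{
    if (range_check_applies_fixed(r))
        assert(r.nr < 128);
}
```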

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_eu_emit.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/assembler/brw_eu_emit.c b/assembler/brw_eu_emit.c
index ed24e48..119eb34 100644
--- a/assembler/brw_eu_emit.c
+++ b/assembler/brw_eu_emit.c
@@ -240,7 +240,7 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
    struct brw_context *brw = p->brw;
    struct intel_context *intel = &brw->intel;
 
-   if (reg.type != BRW_ARCHITECTURE_REGISTER_FILE)
+   if (reg.file != BRW_ARCHITECTURE_REGISTER_FILE)
       assert(reg.nr < 128);
 
    gen7_convert_mrf_to_grf(p, &reg);
@@ -332,7 +332,7 @@ void brw_set_src1(struct brw_compile *p,
 {
    assert(reg.file != BRW_MESSAGE_REGISTER_FILE);
 
-   if (reg.type != BRW_ARCHITECTURE_REGISTER_FILE)
+   if (reg.file != BRW_ARCHITECTURE_REGISTER_FILE)
       assert(reg.nr < 128);
 
    gen7_convert_mrf_to_grf(p, &reg);
-- 
1.7.7.5

* [PATCH 55/90] assembler: Store immediate values in reg.dw1.ud
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (53 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 54/90] assembler: Fix comparisons between reg.type and Architecture registers Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 56/90] assembler: Don't warn if identical declared registers are redefined Damien Lespiau
                   ` (35 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Another step towards storing the parsed operands in struct brw_reg.
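
To illustrate the idea, here is a simplified sketch of a register
description that carries its immediate in a dw1 union, so the encoder
can read the raw 32 bits from one place regardless of how the value was
parsed. The struct layout and helper names are assumptions made for the
example, not the actual struct brw_reg; the float case assumes an
IEEE 754 single-precision float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum reg_file { FILE_GRF, FILE_IMM };

/* Simplified stand-in for struct brw_reg. */
struct reg {
    enum reg_file file;
    union {
        struct {
            unsigned swizzle:8;
            unsigned writemask:4;
            int indirect_offset:10;
        } bits;          /* used by register operands */
        float f;         /* used by immediates */
        int32_t d;
        uint32_t ud;
    } dw1;
};

static struct reg imm_ud(uint32_t value)
{
    struct reg r;

    memset(&r, 0, sizeof(r));
    r.file = FILE_IMM;
    r.dw1.ud = value;
    return r;
}

static struct reg imm_f(float value)
{
    struct reg r;

    memset(&r, 0, sizeof(r));
    r.file = FILE_IMM;
    r.dw1.f = value;
    return r;
}

/* The dword an encoder would copy into the instruction for an immediate. */
static uint32_t imm_dword(struct reg r)
{
    assert(r.file == FILE_IMM);
    return r.dw1.ud;
}
```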

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    2 +-
 assembler/gram.y    |   20 ++++++++++----------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 0e3b965..58cf11a 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -91,7 +91,7 @@ struct regtype {
 struct src_operand {
 	struct brw_reg reg;
 	int default_region;
-	uint32_t imm32; /* set if reg.file == BRW_IMMEDIATE_VALUE or it is expressing a branch offset */
+	uint32_t imm32; /* set if src_operand is expressing a branch offset */
 	char *reloc_target; /* bspec: branching instructions JIP and UIP are source operands */
 } src_operand;
 
diff --git a/assembler/gram.y b/assembler/gram.y
index fb2b127..be8ff01 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1030,7 +1030,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  if ($7.reg.type != BRW_REGISTER_TYPE_UD &&
 		      $7.reg.type != BRW_REGISTER_TYPE_D &&
 		      $7.reg.type != BRW_REGISTER_TYPE_V) {
-		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.imm32, $7.reg.type);
+		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.reg.dw1.ud, $7.reg.type);
 			YYERROR;
 		  }
 		  memset(&$$, 0, sizeof($$));
@@ -1045,7 +1045,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		    YYERROR;
 		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
 		  $$.bits1.da1.src1_reg_type = $7.reg.type;
-		  $$.bits3.ud = $7.imm32;
+		  $$.bits3.ud = $7.reg.dw1.ud;
                 }
 		| predicate SEND execsize dst sendleadreg sndopr imm32reg instoptions
 		{
@@ -1059,7 +1059,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  if ($7.reg.type != BRW_REGISTER_TYPE_UD &&
                       $7.reg.type != BRW_REGISTER_TYPE_D &&
                       $7.reg.type != BRW_REGISTER_TYPE_V) {
-                      fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.imm32, $7.reg.type);
+                      fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.reg.dw1.ud, $7.reg.type);
                       YYERROR;
 		  }
 
@@ -1089,7 +1089,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 
 		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
 		  $$.bits1.da1.src1_reg_type = $7.reg.type;
-                  $$.bits3.ud = $7.imm32;
+                  $$.bits3.ud = $7.reg.dw1.ud;
                   $$.bits3.generic_gen5.end_of_thread = !!($6 & EX_DESC_EOT_MASK);
 		}
 		| predicate SEND execsize dst sendleadreg sndopr directsrcoperand instoptions
@@ -1141,7 +1141,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  if ($8.reg.type != BRW_REGISTER_TYPE_UD &&
 		      $8.reg.type != BRW_REGISTER_TYPE_D &&
 		      $8.reg.type != BRW_REGISTER_TYPE_V) {
-		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $8.imm32, $8.reg.type);
+		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $8.reg.dw1.ud, $8.reg.type);
 			YYERROR;
 		  }
 		  memset(&$$, 0, sizeof($$));
@@ -1158,11 +1158,11 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  $$.bits1.da1.src1_reg_type = $8.reg.type;
 		  if (IS_GENx(5)) {
 		      $$.bits2.send_gen5.sfid = ($7 & EX_DESC_SFID_MASK);
-		      $$.bits3.ud = $8.imm32;
+		      $$.bits3.ud = $8.reg.dw1.ud;
 		      $$.bits3.generic_gen5.end_of_thread = !!($7 & EX_DESC_EOT_MASK);
 		  }
 		  else
-		      $$.bits3.ud = $8.imm32;
+		      $$.bits3.ud = $8.reg.dw1.ud;
 		}
 		| predicate SEND execsize dst sendleadreg payload exp directsrcoperand instoptions
 		{
@@ -1837,7 +1837,7 @@ imm32reg:	imm32 srcimmtype
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.reg.file = BRW_IMMEDIATE_VALUE;
 		  $$.reg.type = $2;
-		  $$.imm32 = d;
+		  $$.reg.dw1.ud = d;
 		}
 ;
 
@@ -2869,7 +2869,7 @@ int set_instruction_src0(struct brw_instruction *instr,
 	instr->bits1.da1.src0_reg_file = src->reg.file;
 	instr->bits1.da1.src0_reg_type = src->reg.type;
 	if (src->reg.file == BRW_IMMEDIATE_VALUE) {
-		instr->bits3.ud = src->imm32;
+		instr->bits3.ud = src->reg.dw1.ud;
 	} else if (src->reg.address_mode == BRW_ADDRESS_DIRECT) {
             if (instr->header.access_mode == BRW_ALIGN_1) {
 		instr->bits2.da1.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
@@ -2933,7 +2933,7 @@ int set_instruction_src1(struct brw_instruction *instr,
 	instr->bits1.da1.src1_reg_file = src->reg.file;
 	instr->bits1.da1.src1_reg_type = src->reg.type;
 	if (src->reg.file == BRW_IMMEDIATE_VALUE) {
-		instr->bits3.ud = src->imm32;
+		instr->bits3.ud = src->reg.dw1.ud;
 	} else if (src->reg.address_mode == BRW_ADDRESS_DIRECT) {
             if (instr->header.access_mode == BRW_ALIGN_1) {
 		instr->bits3.da1.src1_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
-- 
1.7.7.5

* [PATCH 56/90] assembler: Don't warn if identical declared registers are redefined
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (54 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 55/90] assembler: Store immediate values in reg.dw1.ud Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 57/90] assembler: Add location support Damien Lespiau
                   ` (34 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

There's no real need to warn when the same register is declared twice
with identical parameters. The libva driver currently does that, and the
resulting sea of warnings hides the other errors.

Redefining a register with different parameters is a real error though,
so we don't allow that and error out in that case.
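
The policy can be sketched as a three-way classification. Everything
below (struct decl, classify_decl(), the result enum) is hypothetical
and much smaller than the real struct declared_register. This sketch
compares field by field rather than with memcmp(), which is one way to
avoid depending on the contents of padding bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced version of a declared register. */
struct decl {
    const char *name;
    int nr;
    int element_size;
    int type;
};

enum decl_result { DECL_NEW, DECL_DUPLICATE, DECL_CONFLICT };

/* Like strcmp(), but NULL is allowed and sorts before any string. */
static int strcmp0(const char *s1, const char *s2)
{
    if (!s1)
        return -(s1 != s2);
    if (!s2)
        return 1;
    return strcmp(s1, s2);
}

static bool decl_equal(const struct decl *a, const struct decl *b)
{
    return strcmp0(a->name, b->name) == 0 &&
           a->nr == b->nr &&
           a->element_size == b->element_size &&
           a->type == b->type;
}

/* Decide what to do with declaration d given the n existing ones. */
static enum decl_result classify_decl(const struct decl *table, size_t n,
                                      const struct decl *d)
{
    size_t i;

    assert(d != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp0(table[i].name, d->name) != 0)
            continue;
        /* Same name: silently accept only an identical definition. */
        return decl_equal(&table[i], d) ? DECL_DUPLICATE : DECL_CONFLICT;
    }
    return DECL_NEW;
}
```

In this sketch DECL_DUPLICATE corresponds to the silent path in the
grammar action, DECL_CONFLICT to the YYERROR path and DECL_NEW to
insert_register().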

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   74 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 59 insertions(+), 15 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index be8ff01..8b56bd9 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -96,6 +96,46 @@ void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
 void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
 			    int type);
 
+/* like strcmp, but handles NULL pointers */
+static bool strcmp0(const char *s1, const char* s2)
+{
+    if (!s1)
+	return -(s1 != s2);
+    if (!s2)
+	return s1 != s2;
+    return strcmp (s1, s2);
+}
+
+static bool region_equal(struct region *r1, struct region *r2)
+{
+    return memcmp(r1, r2, sizeof(struct region)) == 0;
+}
+
+static bool reg_equal(struct brw_reg *r1, struct brw_reg *r2)
+{
+    return memcmp(r1, r2, sizeof(struct brw_reg)) == 0;
+}
+
+static bool declared_register_equal(struct declared_register *r1,
+				     struct declared_register *r2)
+{
+    if (strcmp0(r1->name, r2->name) != 0)
+	return false;
+
+    if (!reg_equal(&r1->reg, &r2->reg))
+	return false;
+
+    if (!region_equal(&r1->src_region, &r2->src_region))
+	return false;
+
+    if (r1->element_size != r2->element_size ||
+        r1->dst_region != r2->dst_region ||
+	r1->type != r2->type)
+	return false;
+
+    return true;
+}
+
 static void brw_program_init(struct brw_program *p)
 {
    memset(p, 0, sizeof(struct brw_program));
@@ -431,23 +471,27 @@ declare_type:	TYPE EQ regtype
 ;
 declare_pragma:	DECLARE_PRAGMA STRING declare_base declare_elementsize declare_srcregion declare_dstregion declare_type
 		{
-		    struct declared_register *reg;
-		    int defined;
-		    defined = (reg = find_register($2)) != NULL;
-		    if (defined) {
-			fprintf(stderr, "WARNING: %s already defined\n", $2);
+		    struct declared_register reg, *found, *new_reg;
+
+		    reg.name = $2;
+		    reg.reg = $3;
+		    reg.element_size = $4;
+		    reg.src_region = $5;
+		    reg.dst_region = $6;
+		    reg.type = $7;
+
+		    found = find_register($2);
+		    if (found) {
+		        if (!declared_register_equal(&reg, found)) {
+			    fprintf(stderr, "Error: %s already defined and "
+				    "definitions don't agree\n", $2);
+			    YYERROR;
+			}
 			free($2); // $2 has been malloc'ed by strdup
 		    } else {
-			reg = calloc(sizeof(struct declared_register), 1);
-			reg->name = $2;
-		    }
-		    reg->reg = $3;
-		    reg->element_size = $4;
-		    reg->src_region = $5;
-		    reg->dst_region = $6;
-		    reg->type = $7;
-		    if (!defined) {
-			insert_register(reg);
+			new_reg = malloc(sizeof(struct declared_register));
+			*new_reg = reg;
+			insert_register(new_reg);
 		    }
 		}
 ;
-- 
1.7.7.5

* [PATCH 57/90] assembler: Add location support
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (55 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 56/90] assembler: Don't warn if identical declared registers are redefined Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 58/90] assembler: Add error() and warn() shorthands and use them in set_src[01] Damien Lespiau
                   ` (33 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Let's generate location information for the tokens we are parsing.
This can be used to give accurate locations when reporting errors and
warnings.
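
The bookkeeping done by YY_USER_ACTION can be sketched outside of flex
as follows. The struct and function names are invented for the example,
and unlike the lexer (which takes the line from yylineno) this sketch
counts lines itself.

```c
#include <assert.h>
#include <string.h>

/* Same four fields as bison's default location type. */
struct location {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
};

struct tracker {
    int line;
    int column;
};

static void tracker_init(struct tracker *t)
{
    t->line = 1;
    t->column = 1;
}

/* Record the span of one token and advance the column past it. */
static struct location tracker_token(struct tracker *t, const char *text)
{
    int len = (int)strlen(text);
    struct location loc;

    assert(strchr(text, '\n') == NULL); /* newlines go through tracker_newline */

    loc.first_line = loc.last_line = t->line;
    loc.first_column = t->column;
    loc.last_column = t->column + len - 1;
    t->column += len;
    return loc;
}

/* A newline starts the next line at column 1, like the "\n" rule. */
static void tracker_newline(struct tracker *t)
{
    t->line++;
    t->column = 1;
}
```

This also shows why the whitespace rule has to stop swallowing "\n":
the column can only be reset if the newline is seen on its own.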

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |    1 +
 assembler/lex.l  |   24 ++++++++++++++++++------
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 8b56bd9..9a4e510 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -276,6 +276,7 @@ static void resolve_subnr(struct brw_reg *reg)
 
 
 %}
+%locations
 
 %start ROOT
 
diff --git a/assembler/lex.l b/assembler/lex.l
index 626042f..769d98b 100644
--- a/assembler/lex.l
+++ b/assembler/lex.l
@@ -9,6 +9,15 @@
 int saved_state = 0;
 extern char *input_filename;
 
+/* Locations */
+int yycolumn = 1;
+
+#define YY_USER_ACTION						\
+	yylloc.first_line = yylloc.last_line = yylineno;	\
+	yylloc.first_column = yycolumn;				\
+	yylloc.last_column = yycolumn+yyleng-1;			\
+	yycolumn += yyleng;
+
 %}
 %x BLOCK_COMMENT
 %x CHANNEL
@@ -16,11 +25,11 @@ extern char *input_filename;
 %x FILENAME
 
 %%
-\/\/.*[\r\n] { } /* eat up single-line comments */
-"\.kernel".*[\r\n] { }
-"\.end_kernel".*[\r\n] { }
-"\.code".*[\r\n] { }
-"\.end_code".*[\r\n] { }
+\/\/.*[\r\n] { yycolumn = 1; } /* eat up single-line comments */
+"\.kernel".*[\r\n] { yycolumn = 1; }
+"\.end_kernel".*[\r\n] { yycolumn = 1; }
+"\.code".*[\r\n] { yycolumn = 1; }
+"\.end_code".*[\r\n] { yycolumn = 1; }
 
  /* eat up multi-line comments, non-nesting. */
 \/\* {
@@ -33,6 +42,7 @@ extern char *input_filename;
 <BLOCK_COMMENT>. { }
 <BLOCK_COMMENT>[\r\n] { }
 "#line"" "* { 
+	yycolumn = 1;
 	saved_state = YYSTATE;
 	BEGIN(LINENUMBER);
 }
@@ -407,7 +417,9 @@ yylval.integer = BRW_CHANNEL_W;
 	return NUMBER;
 }
 
-[ \t\n]+ { } /* eat up whitespace */
+[ \t]+ { } /* eat up whitespace */
+
+\n { yycolumn = 1; }
 
 . {
 	fprintf(stderr, "%s: %d: %s at \"%s\"\n",
-- 
1.7.7.5

* [PATCH 58/90] assembler: Add error() and warn() shorthands and use them in set_src[01]
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (56 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 57/90] assembler: Add location support Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 59/90] assembler: Add a check for when width is 1 and hstride is not 0 Damien Lespiau
                   ` (32 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Now that we have locations, we can write error() and warn() functions
that report where things went wrong.
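
A standalone sketch of the same idea, formatting into a caller-supplied
buffer instead of stderr so the result is easy to inspect. The function
name format_message() and the buffer-based macros are assumptions for
the example; the patch itself prints straight to stderr. The macros
here pass their location argument through explicitly and fold the
format string into __VA_ARGS__, which keeps them within standard C99.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

enum message_level { WARN, ERROR };

/* Reduced location: only the start of the token is reported. */
struct location {
    int first_line;
    int first_column;
};

/* Writes "[line:col: ]level: message" into buf, returns its length. */
static int format_message(char *buf, size_t size, enum message_level level,
                          const struct location *loc, const char *fmt, ...)
{
    static const char *level_str[] = { "warning", "error" };
    va_list args;
    int n, m;

    assert(level == WARN || level == ERROR);

    if (loc)
        n = snprintf(buf, size, "%d:%d: %s: ", loc->first_line,
                     loc->first_column, level_str[level]);
    else
        n = snprintf(buf, size, "%s: ", level_str[level]);
    if (n < 0 || (size_t)n >= size)
        return n; /* encoding error or no room left for the message */

    va_start(args, fmt);
    m = vsnprintf(buf + n, size - n, fmt, args);
    va_end(args);

    return m < 0 ? m : n + m;
}

#define warn(buf, size, l, ...)  format_message(buf, size, WARN, l, __VA_ARGS__)
#define error(buf, size, l, ...) format_message(buf, size, ERROR, l, __VA_ARGS__)
```

Passing NULL as the location keeps the helper usable from code paths
that build operands by hand and have no token to point at.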

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    2 +
 assembler/gram.y    |  121 ++++++++++++++++++++++++++++++++++-----------------
 2 files changed, 83 insertions(+), 40 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 58cf11a..8db7bce 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -69,6 +69,8 @@ typedef struct { \
 /* ensure nobody changes the size of struct brw_instruction */
 STRUCT_SIZE_ASSERT(brw_instruction, 16);
 
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof(x[0]))
+
 struct condition {
     	int cond;
 	int flag_reg_nr;
diff --git a/assembler/gram.y b/assembler/gram.y
index 9a4e510..f27f6fe 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -30,6 +30,7 @@
 #include <string.h>
 #include <stdlib.h>
 #include <stdbool.h>
+#include <stdarg.h>
 #include <assert.h>
 #include "gen4asm.h"
 #include "brw_eu.h"
@@ -39,6 +40,15 @@
 
 #define SWIZZLE(reg) (reg.dw1.bits.swizzle)
 
+#define YYLTYPE YYLTYPE
+typedef struct YYLTYPE
+{
+ int first_line;
+ int first_column;
+ int last_line;
+ int last_column;
+} YYLTYPE;
+
 extern long int gen_level;
 extern int advanced_flag;
 extern int yylineno;
@@ -76,9 +86,11 @@ static int get_type_size(GLuint type);
 int set_instruction_dest(struct brw_instruction *instr,
 			 struct brw_reg *dest);
 int set_instruction_src0(struct brw_instruction *instr,
-			 struct src_operand *src);
+			 struct src_operand *src,
+			 YYLTYPE *location);
 int set_instruction_src1(struct brw_instruction *instr,
-			 struct src_operand *src);
+			 struct src_operand *src,
+			 YYLTYPE *location);
 int set_instruction_dest_three_src(struct brw_instruction *instr,
                                    struct brw_reg *dest);
 int set_instruction_src0_three_src(struct brw_instruction *instr,
@@ -96,6 +108,31 @@ void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
 void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
 			    int type);
 
+enum message_level {
+    WARN,
+    ERROR,
+};
+
+static void message(enum message_level level, YYLTYPE *location,
+		    const char *fmt, ...)
+{
+    static const char *level_str[] = { "warning", "error" };
+    va_list args;
+
+    if (location)
+	fprintf(stderr, "%d:%d: %s: ", location->first_line,
+		location->first_column, level_str[level]);
+    else
+	fprintf(stderr, "%s: ", level_str[level]);
+
+    va_start(args, fmt);
+    vfprintf(stderr, fmt, args);
+    va_end(args);
+}
+
+#define warn(l, fmt, ...)	message(WARN, l, fmt, ## __VA_ARGS__)
+#define error(l, fmt, ...)	message(ERROR, l, fmt, ## __VA_ARGS__)
+
 /* like strcmp, but handles NULL pointers */
 static bool strcmp0(const char *s1, const char* s2)
 {
@@ -214,7 +251,9 @@ static bool validate_dst_reg(struct brw_instruction *insn, struct brw_reg *reg)
     return true;
 }
 
-static bool validate_src_reg(struct brw_instruction *insn, struct brw_reg reg)
+static bool validate_src_reg(struct brw_instruction *insn,
+			     struct brw_reg reg,
+			     YYLTYPE *location)
 {
     if (reg.file == BRW_IMMEDIATE_VALUE)
 	return true;
@@ -222,7 +261,7 @@ static bool validate_src_reg(struct brw_instruction *insn, struct brw_reg reg)
     if (insn->header.access_mode == BRW_ALIGN_1 &&
 	SWIZZLE(reg) && SWIZZLE(reg) != BRW_SWIZZLE_NOOP)
     {
-	fprintf(stderr, "error: swizzle bits set in align1 instruction\n");
+	error(location, "swizzle bits set in align1 instruction\n");
 	return false;
     }
 
@@ -630,8 +669,8 @@ ifelseinstruction: ENDIF
 		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
 		    ip_dst.width = $2;
 		    set_instruction_dest(&$$.gen, &ip_dst);
-		    set_instruction_src0(&$$.gen, &ip_src);
-		    set_instruction_src1(&$$.gen, &$3);
+		    set_instruction_src0(&$$.gen, &ip_src, NULL);
+		    set_instruction_src1(&$$.gen, &$3, NULL);
 		    $$.first_reloc_target = $3.reloc_target;
 		    $$.first_reloc_offset = $3.imm32;
 		  } else if(IS_GENp(6)) {
@@ -666,8 +705,8 @@ ifelseinstruction: ENDIF
 		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
 		    ip_dst.width = $3;
 		    set_instruction_dest(&$$.gen, &ip_dst);
-		    set_instruction_src0(&$$.gen, &ip_src);
-		    set_instruction_src1(&$$.gen, &$4);
+		    set_instruction_src0(&$$.gen, &ip_src, NULL);
+		    set_instruction_src1(&$$.gen, &$4, NULL);
 		  }
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
@@ -704,8 +743,8 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		    set_instruction_predicate(&$$.gen, &$1);
 		    $$.gen.header.opcode = $2;
 		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
-		    set_instruction_src0(&$$.gen, &ip_src);
-		    set_instruction_src1(&$$.gen, &$4);
+		    set_instruction_src0(&$$.gen, &ip_src, NULL);
+		    set_instruction_src1(&$$.gen, &$4, NULL);
 		    $$.first_reloc_target = $4.reloc_target;
 		    $$.first_reloc_offset = $4.imm32;
 		  } else if (IS_GENp(6)) {
@@ -743,7 +782,7 @@ haltinstruction: predicate HALT execsize relativelocation relativelocation insto
 		  $$.second_reloc_offset = $5.imm32;
 		  dst_null_reg.width = $3;
 		  set_instruction_dest(&$$.gen, &dst_null_reg);
-		  set_instruction_src0(&$$.gen, &src_null_reg);
+		  set_instruction_src0(&$$.gen, &src_null_reg, NULL);
 		};
 
 multibranchinstruction:
@@ -772,7 +811,7 @@ multibranchinstruction:
 		  $$.second_reloc_offset = $5.imm32;
 		  dst_null_reg.width = $3;
 		  set_instruction_dest(&$$.gen, &dst_null_reg);
-		  set_instruction_src0(&$$.gen, &src_null_reg);
+		  set_instruction_src0(&$$.gen, &src_null_reg, NULL);
 		}
 ;
 
@@ -808,7 +847,7 @@ subroutineinstruction:
 		  src0.reg.hstride = 1; /*encoded 1*/
 		  src0.reg.width = 1; /*encoded 2*/
 		  src0.reg.vstride = 2; /*encoded 2*/
-		  set_instruction_src0(&$$.gen, &src0);
+		  set_instruction_src0(&$$.gen, &src0, NULL);
 
 		  $$.first_reloc_target = $5.reloc_target;
 		  $$.first_reloc_offset = $5.imm32;
@@ -830,7 +869,7 @@ subroutineinstruction:
 		  $5.reg.hstride = 1; /*encoded 1*/
 		  $5.reg.width = 1; /*encoded 2*/
 		  $5.reg.vstride = 2; /*encoded 2*/
-		  set_instruction_src0(&$$.gen, &$5);
+		  set_instruction_src0(&$$.gen, &$5, NULL);
 		}
 ;
 
@@ -847,7 +886,7 @@ unaryinstruction:
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$6) != 0)
 		    YYERROR;
-		  if (set_instruction_src0(&$$, &$7) != 0)
+		  if (set_instruction_src0(&$$, &$7, &@7) != 0)
 		    YYERROR;
 
 		  if ($3.flag_subreg_nr != -1) {
@@ -884,9 +923,9 @@ binaryinstruction:
 		  $6.width = $5;
 		  if (set_instruction_dest(&$$, &$6) != 0)
 		    YYERROR;
-		  if (set_instruction_src0(&$$, &$7) != 0)
+		  if (set_instruction_src0(&$$, &$7, &@7) != 0)
 		    YYERROR;
-		  if (set_instruction_src1(&$$, &$8) != 0)
+		  if (set_instruction_src1(&$$, &$8, &@8) != 0)
 		    YYERROR;
 
 		  if ($3.flag_subreg_nr != -1) {
@@ -923,9 +962,9 @@ binaryaccinstruction:
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$6) != 0)
 		    YYERROR;
-		  if (set_instruction_src0(&$$, &$7) != 0)
+		  if (set_instruction_src0(&$$, &$7, &@7) != 0)
 		    YYERROR;
-		  if (set_instruction_src1(&$$, &$8) != 0)
+		  if (set_instruction_src1(&$$, &$8, &@8) != 0)
 		    YYERROR;
 
 		  if ($3.flag_subreg_nr != -1) {
@@ -1020,9 +1059,9 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       src0.reg.type = BRW_REGISTER_TYPE_D;
                       src0.reg.nr = $4;
                       src0.reg.subnr = 0;
-                      set_instruction_src0(&$$, &src0);
+                      set_instruction_src0(&$$, &src0, NULL);
 		  } else {
-                      if (set_instruction_src0(&$$, &$6) != 0)
+                      if (set_instruction_src0(&$$, &$6, &@6) != 0)
                           YYERROR;
 		  }
 
@@ -1063,10 +1102,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
-		  if (set_instruction_src0(&$$, &$6) != 0)
+		  if (set_instruction_src0(&$$, &$6, &@6) != 0)
 		    YYERROR;
 		  /* XXX is this correct? */
-		  if (set_instruction_src1(&$$, &$7) != 0)
+		  if (set_instruction_src1(&$$, &$7, &@7) != 0)
 		    YYERROR;
 
 		  }
@@ -1086,7 +1125,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
-		  if (set_instruction_src0(&$$, &$6) != 0)
+		  if (set_instruction_src0(&$$, &$6, &@6) != 0)
 		    YYERROR;
 		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
 		  $$.bits1.da1.src1_reg_type = $7.reg.type;
@@ -1130,7 +1169,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 
                   src0.reg.nr = $5.nr;
                   src0.reg.subnr = 0;
-                  set_instruction_src0(&$$, &src0);
+                  set_instruction_src0(&$$, &src0, NULL);
 
 		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
 		  $$.bits1.da1.src1_reg_type = $7.reg.type;
@@ -1176,9 +1215,9 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 
                   src0.reg.nr = $5.nr;
                   src0.reg.subnr = 0;
-                  set_instruction_src0(&$$, &src0);
+                  set_instruction_src0(&$$, &src0, NULL);
 
-                  set_instruction_src1(&$$, &$7);
+                  set_instruction_src1(&$$, &$7, &@7);
                   $$.bits3.generic_gen5.end_of_thread = !!($6 & EX_DESC_EOT_MASK);
 		}
 		| predicate SEND execsize dst sendleadreg payload sndopr imm32reg instoptions
@@ -1197,7 +1236,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
-		  if (set_instruction_src0(&$$, &$6) != 0)
+		  if (set_instruction_src0(&$$, &$6, &@6) != 0)
 		    YYERROR;
 		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
 		  $$.bits1.da1.src1_reg_type = $8.reg.type;
@@ -1220,10 +1259,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
-		  if (set_instruction_src0(&$$, &$6) != 0)
+		  if (set_instruction_src0(&$$, &$6, &@6) != 0)
 		    YYERROR;
 		  /* XXX is this correct? */
-		  if (set_instruction_src1(&$$, &$8) != 0)
+		  if (set_instruction_src1(&$$, &$8, &@8) != 0)
 		    YYERROR;
 		  if (IS_GENx(5)) {
                       $$.bits2.send_gen5.sfid = $7;
@@ -1252,8 +1291,8 @@ jumpinstruction: predicate JMPI execsize relativelocation2
 		  set_instruction_predicate(&$$.gen, &$1);
 		  ip_dst.width = ffs(1) - 1;
 		  set_instruction_dest(&$$.gen, &ip_dst);
-		  set_instruction_src0(&$$.gen, &ip_src);
-		  set_instruction_src1(&$$.gen, &$4);
+		  set_instruction_src0(&$$.gen, &ip_src, NULL);
+		  set_instruction_src1(&$$.gen, &$4, NULL);
 		  $$.first_reloc_target = $4.reloc_target;
 		  $$.first_reloc_offset = $4.imm32;
 		}
@@ -1269,9 +1308,9 @@ mathinstruction: predicate MATH_INST execsize dst src srcimm math_function insto
 		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
 		    YYERROR;
-		  if (set_instruction_src0(&$$, &$5) != 0)
+		  if (set_instruction_src0(&$$, &$5, &@5) != 0)
 		    YYERROR;
-		  if (set_instruction_src1(&$$, &$6) != 0)
+		  if (set_instruction_src1(&$$, &$6, &@6) != 0)
 		    YYERROR;
 		}
 ;
@@ -1309,8 +1348,8 @@ syncinstruction: predicate WAIT notifyreg
 		  notify_dst.width = ffs(1) - 1;
 		  set_instruction_dest(&$$, &notify_dst);
 		  set_direct_src_operand(&notify_src, &$3, BRW_REGISTER_TYPE_D);
-		  set_instruction_src0(&$$, &notify_src);
-		  set_instruction_src1(&$$, &src_null_reg);
+		  set_instruction_src0(&$$, &notify_src, NULL);
+		  set_instruction_src1(&$$, &src_null_reg, NULL);
 		}
 		
 ;
@@ -2903,12 +2942,13 @@ int set_instruction_dest(struct brw_instruction *instr,
 
 /* Sets the first source operand for the instruction.  Returns 0 on success. */
 int set_instruction_src0(struct brw_instruction *instr,
-			  struct src_operand *src)
+			 struct src_operand *src,
+			 YYLTYPE *location)
 {
 	if (advanced_flag)
 		reset_instruction_src_region(instr, src);
 
-	if (!validate_src_reg(instr, src->reg))
+	if (!validate_src_reg(instr, src->reg, location))
 		return 1;
 
 	instr->bits1.da1.src0_reg_file = src->reg.file;
@@ -2967,12 +3007,13 @@ int set_instruction_src0(struct brw_instruction *instr,
 /* Sets the second source operand for the instruction.  Returns 0 on success.
  */
 int set_instruction_src1(struct brw_instruction *instr,
-			  struct src_operand *src)
+			 struct src_operand *src,
+			 YYLTYPE *location)
 {
 	if (advanced_flag)
 		reset_instruction_src_region(instr, src);
 
-	if (!validate_src_reg(instr, src->reg))
+	if (!validate_src_reg(instr, src->reg, location))
 		return 1;
 
 	instr->bits1.da1.src1_reg_file = src->reg.file;
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 59/90] assembler: Add a check for when width is 1 and hstride is not 0
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (57 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 58/90] assembler: Add error() and warn() shorthands and use them in set_src[01] Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 60/90] assembler: Add a check for when ExecSize and width are 1 Damien Lespiau
                   ` (31 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

The list of region restrictions in the bspec says that we can't have:
     width == 1 && hstride != 0

We do have plenty of assembly code that doesn't respect that restriction.
So let's hide the warning under a -W flag (for now) while we fix things.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    4 ++++
 assembler/gram.y    |   29 ++++++++++++++++++++++++++++-
 assembler/main.c    |    7 ++++++-
 3 files changed, 38 insertions(+), 2 deletions(-)
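
For readers following along, here is a rough standalone sketch of the idea
(not the code from this patch): the encoded width and hstride fields are
decoded with the same lookup tables, and the rule D violation is only
reported when the WARN_ALL bit is set. The helper names violates_rule_d()
and check_rule_d() are made up for the example, and the flags use unsigned
literals, which is an illustrative choice rather than what gen4asm.h does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Same bit positions as in the patch, written with unsigned literals. */
#define WARN_ALWAYS (1u << 0)
#define WARN_ALL    (1u << 31)

/* Only WARN_ALWAYS is on by default; -W would OR in WARN_ALL. */
static unsigned int warning_flags = WARN_ALWAYS;

/* Encoded field value -> number of elements, as in validate_src_reg(). */
static const int hstride_for_reg[] = {0, 1, 2, 4};
static const int width_for_reg[] = {1, 2, 4, 8, 16};

/* Hypothetical helper: true when "width == 1 && hstride != 0" holds. */
static bool violates_rule_d(unsigned enc_width, unsigned enc_hstride)
{
    assert(enc_width < ARRAY_SIZE(width_for_reg));
    assert(enc_hstride < ARRAY_SIZE(hstride_for_reg));

    return width_for_reg[enc_width] == 1 &&
           hstride_for_reg[enc_hstride] != 0;
}

/* Hypothetical helper: returns true only if a warning was printed. */
static bool check_rule_d(unsigned enc_width, unsigned enc_hstride)
{
    if (!violates_rule_d(enc_width, enc_hstride))
        return false;
    if (!(warning_flags & WARN_ALL))
        return false; /* violation exists, but the warning is hidden */

    fprintf(stderr, "warning: region width is 1 but horizontal stride "
            "is %d (should be 0)\n", hstride_for_reg[enc_hstride]);
    return true;
}
```

With the default flags, check_rule_d(0, 1) stays silent even though the
region is invalid; once WARN_ALL is set the same call prints the warning.
In both cases the operand is still accepted, which is what keeps the
generated opcodes unchanged.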

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 8db7bce..1e67c1c 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -43,6 +43,10 @@ typedef float GLfloat;
 
 extern long int gen_level;
 
+#define WARN_ALWAYS	(1 << 0)
+#define WARN_ALL	(1 << 31)
+extern unsigned int warning_flags;
+
 extern struct brw_context genasm_context;
 extern struct brw_compile genasm_compile;
 
diff --git a/assembler/gram.y b/assembler/gram.y
index f27f6fe..96bc797 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -130,7 +130,12 @@ static void message(enum message_level level, YYLTYPE *location,
     va_end(args);
 }
 
-#define warn(l, fmt, ...)	message(WARN, location, fmt, ## __VA_ARGS__)
+#define warn(flag, l, fmt, ...)					\
+    do {							\
+	if (warning_flags & WARN_ ## flag)			\
+	    message(WARN, location, fmt, ## __VA_ARGS__);	\
+    } while(0)
+
 #define error(l, fmt, ...)	message(ERROR, location, fmt, ## __VA_ARGS__)
 
 /* like strcmp, but handles NULL pointers */
@@ -255,6 +260,10 @@ static bool validate_src_reg(struct brw_instruction *insn,
 			     struct brw_reg reg,
 			     YYLTYPE *location)
 {
+    int hstride_for_reg[] = {0, 1, 2, 4};
+    int width_for_reg[] = {1, 2, 4, 8, 16};
+    int width, hstride;
+
     if (reg.file == BRW_IMMEDIATE_VALUE)
 	return true;
 
@@ -265,6 +274,24 @@ static bool validate_src_reg(struct brw_instruction *insn,
 	return false;
     }
 
+    assert(reg.hstride >= 0 && reg.hstride < ARRAY_SIZE(hstride_for_reg));
+    hstride = hstride_for_reg[reg.hstride];
+
+    assert(reg.width >= 0 && reg.width < ARRAY_SIZE(width_for_reg));
+    width = width_for_reg[reg.width];
+
+    /* Register Region Restrictions */
+
+    /* D. If Width = 1, HorzStride must be 0 regardless of the values of
+     * ExecSize and VertStride.
+     *
+     * FIXME: In "advanced mode" hstride is set to 1, this is probably a bug
+     * to fix, but it changes the generated opcodes and thus needs validation.
+     */
+    if (width == 1 && hstride != 0)
+	warn(ALL, location, "region width is 1 but horizontal stride is %d "
+	     " (should be 0)\n", hstride);
+
     return true;
 }
 
diff --git a/assembler/main.c b/assembler/main.c
index cfee749..4fe1315 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -43,6 +43,7 @@ extern int errors;
 
 long int gen_level = 40;
 int advanced_flag = 0; /* 0: in unit of byte, 1: in unit of data element size */
+unsigned int warning_flags = WARN_ALWAYS;
 int binary_like_output = 0; /* 0: default output style, 1: nice C-style output */
 int need_export = 0;
 char *input_filename = "<stdin>";
@@ -293,7 +294,7 @@ int main(int argc, char **argv)
 	char o;
 	void *mem_ctx;
 
-	while ((o = getopt_long(argc, argv, "e:l:o:g:ab", longopts, NULL)) != -1) {
+	while ((o = getopt_long(argc, argv, "e:l:o:g:abW", longopts, NULL)) != -1) {
 		switch (o) {
 		case 'o':
 			if (strcmp(optarg, "-") != 0)
@@ -344,6 +345,10 @@ int main(int argc, char **argv)
 				entry_table_file = optarg;
 			break;
 
+		case 'W':
+			warning_flags |= WARN_ALL;
+			break;
+
 		default:
 			usage();
 			exit(1);
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 60/90] assembler: Add a check for when ExecSize and width are 1
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (58 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 59/90] assembler: Add a check for when width is 1 and hstride is not 0 Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 61/90] assembler: Add the input filename to the error/warning messages Damien Lespiau
                   ` (30 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Another check (that we hit if we try to use brw_set_src0()). Again,
protect it with the -W option.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   26 +++++++++++++++++++++++++-
 1 files changed, 25 insertions(+), 1 deletions(-)
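
A minimal sketch of the scalar check, for illustration only. It assumes the
same decoding as the patch (an encoded vstride of 0xf is mapped to -1, any
other value goes through the lookup table) and counts how many of the two
rule E warnings would apply. decode_vstride() and rule_e_violations() are
hypothetical names, not functions from gram.y.

```c
#include <assert.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: encoded vstride -> element count, -1 for 0xf. */
static int decode_vstride(unsigned enc)
{
    static const int vstride_for_reg[] = {0, 1, 2, 4, 8, 16, 32, 64, 128, 256};

    if (enc == 0xf)
        return -1;

    assert(enc < ARRAY_SIZE(vstride_for_reg));
    return vstride_for_reg[enc];
}

/*
 * Hypothetical helper: number of rule E violations for already decoded
 * values. Rule E: if ExecSize = Width = 1, both strides must be 0.
 */
static int rule_e_violations(int execsize, int width, int hstride, int vstride)
{
    int count = 0;

    if (execsize == 1 && width == 1) {
        if (hstride != 0)
            count++;
        if (vstride != 0)
            count++;
    }

    return count;
}
```

For example, rule_e_violations(1, 1, 1, 2) reports two problems, while the
same strides with an execution size of 8 report none because rule E only
covers the scalar case.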

diff --git a/assembler/gram.y b/assembler/gram.y
index 96bc797..b3578e8 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -261,8 +261,10 @@ static bool validate_src_reg(struct brw_instruction *insn,
 			     YYLTYPE *location)
 {
     int hstride_for_reg[] = {0, 1, 2, 4};
+    int vstride_for_reg[] = {0, 1, 2, 4, 8, 16, 32, 64, 128, 256};
     int width_for_reg[] = {1, 2, 4, 8, 16};
-    int width, hstride;
+    int execsize_for_reg[] = {1, 2, 4, 8, 16, 32};
+    int width, hstride, vstride, execsize;
 
     if (reg.file == BRW_IMMEDIATE_VALUE)
 	return true;
@@ -277,9 +279,20 @@ static bool validate_src_reg(struct brw_instruction *insn,
     assert(reg.hstride >= 0 && reg.hstride < ARRAY_SIZE(hstride_for_reg));
     hstride = hstride_for_reg[reg.hstride];
 
+    if (reg.vstride == 0xf) {
+	vstride = -1;
+    } else {
+	assert(reg.vstride >= 0 && reg.vstride < ARRAY_SIZE(vstride_for_reg));
+	vstride = vstride_for_reg[reg.vstride];
+    }
+
     assert(reg.width >= 0 && reg.width < ARRAY_SIZE(width_for_reg));
     width = width_for_reg[reg.width];
 
+    assert(insn->header.execution_size >= 0 &&
+	   insn->header.execution_size < ARRAY_SIZE(execsize_for_reg));
+    execsize = execsize_for_reg[insn->header.execution_size];
+
     /* Register Region Restrictions */
 
     /* D. If Width = 1, HorzStride must be 0 regardless of the values of
@@ -292,6 +305,17 @@ static bool validate_src_reg(struct brw_instruction *insn,
 	warn(ALL, location, "region width is 1 but horizontal stride is %d "
 	     " (should be 0)\n", hstride);
 
+    /* E. If ExecSize = Width = 1, both VertStride and HorzStride must be 0.
+     * This defines a scalar. */
+    if (execsize == 1 && width == 1) {
+        if (hstride != 0)
+	    warn(ALL, location, "execution size and region width are 1 but "
+		 "horizontal stride is %d (should be 0)\n", hstride);
+        if (vstride != 0)
+	    warn(ALL, location, "execution size and region width are 1 but "
+		 "vertical stride is %d (should be 0)\n", vstride);
+    }
+
     return true;
 }
 
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 61/90] assembler: Add the input filename to the error/warning messages
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (59 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 60/90] assembler: Add a check for when ExecSize and width are 1 Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 62/90] assembler: Use brw_set_src0() Damien Lespiau
                   ` (29 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    2 ++
 assembler/gram.y    |    4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 1e67c1c..9558a29 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -47,6 +47,8 @@ extern long int gen_level;
 #define WARN_ALL	(1 << 31)
 extern unsigned int warning_flags;
 
+extern char *input_filename;
+
 extern struct brw_context genasm_context;
 extern struct brw_compile genasm_compile;
 
diff --git a/assembler/gram.y b/assembler/gram.y
index b3578e8..93d3bd5 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -120,10 +120,10 @@ static void message(enum message_level level, YYLTYPE *location,
     va_list args;
 
     if (location)
-	fprintf(stderr, "%d:%d: %s: ", location->first_line,
+	fprintf(stderr, "%s:%d:%d: %s: ", input_filename, location->first_line,
 		location->first_column, level_str[level]);
     else
-	fprintf(stderr, "%s: ", level_str[level]);
+	fprintf(stderr, "%s:%s: ", input_filename, level_str[level]);
 
     va_start(args, fmt);
     vfprintf(stderr, fmt, args);
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 62/90] assembler: Use brw_set_src0()
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (60 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 61/90] assembler: Add the input filename to the error/warning messages Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 63/90] assembler: Port the warning and error reporting to warn()/error() Damien Lespiau
                   ` (28 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

Unfortunately, it's not all a walk in the park. Both internal code in the
assembler and external shaders (libva) generate registers that trigger
assertions in brw_eu_emit.c's validate_reg().

To deal with that, I chose to emit warnings under the -W flag while still
making the assembler generate the same opcodes.

We can fix all this, but it requires validation, something that I cannot
do right now.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_eu_emit.c |   25 ++++++++++++++++-
 assembler/gram.y        |   66 ++++++++++++----------------------------------
 2 files changed, 40 insertions(+), 51 deletions(-)
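
A small sketch of rule B as a predicate, assuming already decoded values
and the convention used in validate_src_reg() that a vstride of -1 stands
for the special 0xf encoding and is exempt. rule_b_ok() is a hypothetical
name for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Rule B: if ExecSize = Width and HorzStride != 0, VertStride must be
 * Width * HorzStride. Returns true when the region satisfies the rule
 * or when the rule does not apply.
 */
static bool rule_b_ok(int execsize, int width, int hstride, int vstride)
{
    assert(width > 0 && execsize > 0);

    if (execsize != width || hstride == 0)
        return true; /* rule B does not apply */

    /* Note: no semicolon may follow an 'if' condition guarding a warning,
     * otherwise the warning would fire unconditionally. */
    return vstride == -1 || vstride == width * hstride;
}
```

For instance, an <8;8,1> region with an execution size of 8 passes, while
<4;8,1> with the same execution size does not.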

diff --git a/assembler/brw_eu_emit.c b/assembler/brw_eu_emit.c
index 119eb34..21c673e 100644
--- a/assembler/brw_eu_emit.c
+++ b/assembler/brw_eu_emit.c
@@ -204,10 +204,16 @@ validate_reg(struct brw_instruction *insn, struct brw_reg reg)
    /* 3. */
    assert(execsize >= width);
 
+   /* FIXME: the assembler has a lot of code written that triggers the
+    * assertions commented out below. Let's paper over it (for now!) until we
+    * can re-validate the shaders with those little inconsistencies fixed. */
+
    /* 4. */
+#if 0
    if (execsize == width && hstride != 0) {
       assert(vstride == -1 || vstride == width * hstride);
    }
+#endif
 
    /* 5. */
    if (execsize == width && hstride == 0) {
@@ -215,15 +221,19 @@ validate_reg(struct brw_instruction *insn, struct brw_reg reg)
    }
 
    /* 6. */
+#if 0
    if (width == 1) {
       assert(hstride == 0);
    }
+#endif
 
    /* 7. */
+#if 0
    if (execsize == 1 && width == 1) {
       assert(hstride == 0);
       assert(vstride == 0);
    }
+#endif
 
    /* 8. */
    if (vstride == 0 && hstride == 0) {
@@ -269,8 +279,14 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
    
       /* Required to set some fields in src1 as well:
        */
-      insn->bits1.da1.src1_reg_file = 0; /* arf */
+
+      /* FIXME: This looks quite wrong, tampering with src1. I did not find
+       * anything in the bspec hinting that it would be needed when setting
+       * src0. Before removing this, one needs to run piglit.
+
+      insn->bits1.da1.src1_reg_file = 0;
       insn->bits1.da1.src1_reg_type = reg.type;
+       */
    }
    else 
    {
@@ -296,6 +312,10 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
       }
 
       if (insn->header.access_mode == BRW_ALIGN_1) {
+
+	 /* FIXME: While this is correct, if the assembler uses that code path
+	  * the opcodes generated are different and thus need a validation
+	  * pass.
 	 if (reg.width == BRW_WIDTH_1 && 
 	     insn->header.execution_size == BRW_EXECUTE_1) {
 	    insn->bits2.da1.src0_horiz_stride = BRW_HORIZONTAL_STRIDE_0;
@@ -303,10 +323,11 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
 	    insn->bits2.da1.src0_vert_stride = BRW_VERTICAL_STRIDE_0;
 	 }
 	 else {
+         */
 	    insn->bits2.da1.src0_horiz_stride = reg.hstride;
 	    insn->bits2.da1.src0_width = reg.width;
 	    insn->bits2.da1.src0_vert_stride = reg.vstride;
-	 }
+     /* } */
       }
       else {
 	 insn->bits2.da16.src0_swz_x = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_X);
diff --git a/assembler/gram.y b/assembler/gram.y
index 93d3bd5..b2a3660 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -295,6 +295,14 @@ static bool validate_src_reg(struct brw_instruction *insn,
 
     /* Register Region Restrictions */
 
+    /* B. If ExecSize = Width and HorzStride ≠ 0, VertStride must be set to
+     * Width * HorzStride. */
+    if (execsize == width && hstride != 0) {
+	if (vstride != -1 && vstride != width * hstride)
+	    warn(ALL, location, "execution size == width and hstride != 0 but "
+		 "vstride is not width * hstride\n");
+    }
+
     /* D. If Width = 1, HorzStride must be 0 regardless of the values of
      * ExecSize and VertStride.
      *
@@ -357,6 +365,9 @@ static int get_indirect_subreg_address(GLuint subreg)
 
 static void resolve_subnr(struct brw_reg *reg)
 {
+   if (reg->file == BRW_IMMEDIATE_VALUE)
+	return;
+
    if (reg->address_mode == BRW_ADDRESS_DIRECT)
 	reg->subnr = get_subreg_address(reg->file, reg->type, reg->subnr,
 					reg->address_mode);
@@ -2996,61 +3007,18 @@ int set_instruction_src0(struct brw_instruction *instr,
 			 struct src_operand *src,
 			 YYLTYPE *location)
 {
+
 	if (advanced_flag)
 		reset_instruction_src_region(instr, src);
 
 	if (!validate_src_reg(instr, src->reg, location))
 		return 1;
 
-	instr->bits1.da1.src0_reg_file = src->reg.file;
-	instr->bits1.da1.src0_reg_type = src->reg.type;
-	if (src->reg.file == BRW_IMMEDIATE_VALUE) {
-		instr->bits3.ud = src->reg.dw1.ud;
-	} else if (src->reg.address_mode == BRW_ADDRESS_DIRECT) {
-            if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits2.da1.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
-		instr->bits2.da1.src0_reg_nr = src->reg.nr;
-		instr->bits2.da1.src0_vert_stride = src->reg.vstride;
-		instr->bits2.da1.src0_width = src->reg.width;
-		instr->bits2.da1.src0_horiz_stride = src->reg.hstride;
-		instr->bits2.da1.src0_negate = src->reg.negate;
-		instr->bits2.da1.src0_abs = src->reg.abs;
-		instr->bits2.da1.src0_address_mode = src->reg.address_mode;
-            } else {
-		instr->bits2.da16.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
-		instr->bits2.da16.src0_reg_nr = src->reg.nr;
-		instr->bits2.da16.src0_vert_stride = src->reg.vstride;
-		instr->bits2.da16.src0_negate = src->reg.negate;
-		instr->bits2.da16.src0_abs = src->reg.abs;
-		instr->bits2.da16.src0_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
-		instr->bits2.da16.src0_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
-		instr->bits2.da16.src0_swz_z = BRW_GET_SWZ(SWIZZLE(src->reg), 2);
-		instr->bits2.da16.src0_swz_w = BRW_GET_SWZ(SWIZZLE(src->reg), 3);
-		instr->bits2.da16.src0_address_mode = src->reg.address_mode;
-            }
-        } else {
-            if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits2.ia1.src0_indirect_offset = src->reg.dw1.bits.indirect_offset;
-		instr->bits2.ia1.src0_subreg_nr = get_indirect_subreg_address(src->reg.subnr);
-		instr->bits2.ia1.src0_abs = src->reg.abs;
-		instr->bits2.ia1.src0_negate = src->reg.negate;
-		instr->bits2.ia1.src0_address_mode = src->reg.address_mode;
-		instr->bits2.ia1.src0_horiz_stride = src->reg.hstride;
-		instr->bits2.ia1.src0_width = src->reg.width;
-		instr->bits2.ia1.src0_vert_stride = src->reg.vstride;
-            } else {
-		instr->bits2.ia16.src0_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
-		instr->bits2.ia16.src0_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
-		instr->bits2.ia16.src0_swz_z = BRW_GET_SWZ(SWIZZLE(src->reg), 2);
-		instr->bits2.ia16.src0_swz_w = BRW_GET_SWZ(SWIZZLE(src->reg), 3);
-		instr->bits2.ia16.src0_indirect_offset = (src->reg.dw1.bits.indirect_offset >> 4); /* half register aligned */
-		instr->bits2.ia16.src0_subreg_nr = get_indirect_subreg_address(src->reg.subnr);
-		instr->bits2.ia16.src0_abs = src->reg.abs;
-		instr->bits2.ia16.src0_negate = src->reg.negate;
-		instr->bits2.ia16.src0_address_mode = src->reg.address_mode;
-		instr->bits2.ia16.src0_vert_stride = src->reg.vstride;
-            }
-        }
+	/* The assembler supports expressing subnr in bytes or in number of
+	 * elements. */
+	resolve_subnr(&src->reg);
+
+	brw_set_src0(&genasm_compile, instr, src->reg);
 
 	return 0;
 }
-- 
1.7.7.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 63/90] assembler: Port the warning and error reporting to warn()/error()
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (61 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 62/90] assembler: Use brw_set_src0() Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:27 ` [PATCH 64/90] assembler: Cleanup visibility of a few global variables/functions Damien Lespiau
                   ` (27 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

This way we ensure there is a single place where these are handled. The
immediate benefit is that line numbers are now always printed out, which
is quite handy.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |  274 ++++++++++++++++++++++--------------------------------
 1 files changed, 113 insertions(+), 161 deletions(-)
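
A standalone sketch of why the macros are wrapped in do { } while(0), and
of an error() that reports and then bails out. Since YYERROR only exists
inside a bison-generated parser, the example uses a plain 'return -1' as a
stand-in; report(), BAIL_OUT, parse_endif() and the two counters are all
hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdio.h>

static int messages_printed;
static int accepted;

/* Hypothetical stand-in for the assembler's message() function. */
static void report(const char *level, int line, const char *text)
{
    assert(level != NULL && text != NULL);
    fprintf(stderr, "%d: %s: %s\n", line, level, text);
    messages_printed++;
}

/* Stand-in for YYERROR: leave the enclosing function with a failure code. */
#define BAIL_OUT return -1

/*
 * The do/while(0) wrapper makes the two statements behave as a single
 * statement, so the macro can be used in an unbraced if/else.
 */
#define error(line, text)               \
    do {                                \
        report("error", line, text);    \
        BAIL_OUT;                       \
    } while (0)

/* Hypothetical action: the bare ENDIF form is rejected on gen >= 6. */
static int parse_endif(int gen, int line)
{
    if (gen >= 6)
        error(line, "should be 'ENDIF execsize relativelocation'");
    else
        accepted++;

    return 0;
}
```

Without the wrapper, the 'else' above would not compile, because the macro
would expand to two statements after an unbraced 'if'. Folding the bail-out
into error() is also what lets the grammar actions drop their braces and
explicit YYERROR lines.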

diff --git a/assembler/gram.y b/assembler/gram.y
index b2a3660..c022277 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -133,10 +133,13 @@ static void message(enum message_level level, YYLTYPE *location,
 #define warn(flag, l, fmt, ...)					\
     do {							\
 	if (warning_flags & WARN_ ## flag)			\
-	    message(WARN, location, fmt, ## __VA_ARGS__);	\
+	    message(WARN, l, fmt, ## __VA_ARGS__);	\
     } while(0)
 
-#define error(l, fmt, ...)	message(ERROR, location, fmt, ## __VA_ARGS__)
+#define error(l, fmt, ...)			\
+    do {					\
+	message(ERROR, l, fmt, ## __VA_ARGS__);	\
+    } while(0)
 
 /* like strcmp, but handles NULL pointers */
 static bool strcmp0(const char *s1, const char* s2)
@@ -510,6 +513,18 @@ static void resolve_subnr(struct brw_reg *reg)
 %type <src_operand> indirectsrcoperand
 %type <src_operand> src srcimm imm32reg payload srcacc srcaccimm swizzle
 %type <src_operand> relativelocation relativelocation2
+
+%code {
+
+#undef error
+#define error(l, fmt, ...)			\
+    do {					\
+	message(ERROR, l, fmt, ## __VA_ARGS__);	\
+	YYERROR;				\
+    } while(0)
+
+}
+
 %%
 simple_int:     INTEGER { $$ = $1; }
 		| MINUS INTEGER { $$ = -$2;}
@@ -584,11 +599,9 @@ declare_pragma:	DECLARE_PRAGMA STRING declare_base declare_elementsize declare_s
 
 		    found = find_register($2);
 		    if (found) {
-		        if (!declared_register_equal(&reg, found)) {
-			    fprintf(stderr, "Error: %s already defined and "
-				    "definitions don't agree\n", $2);
-			    YYERROR;
-			}
+		        if (!declared_register_equal(&reg, found))
+			    error(&@1, "%s already defined and definitions "
+				  "don't agree\n", $2);
 			free($2); // $2 has been malloc'ed by strdup
 		    } else {
 			new_reg = malloc(sizeof(struct declared_register));
@@ -694,10 +707,8 @@ relocatableinstruction:	ifelseinstruction
 ifelseinstruction: ENDIF
 		{
 		  // for Gen4 
-		  if(IS_GENp(6)) { // For gen6+.
-		    fprintf(stderr, "ENDIF Syntax error: should be 'ENDIF execsize relativelocation'\n");
-		    YYERROR;
-		  }
+		  if(IS_GENp(6)) // For gen6+.
+		    error(&@1, "should be 'ENDIF execsize relativelocation'\n");
 		  memset(&$$, 0, sizeof($$));
 		  $$.gen.header.opcode = $1;
 		  $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
@@ -709,10 +720,8 @@ ifelseinstruction: ENDIF
 		{
 		  // for Gen6+
 		  /* Gen6, Gen7 bspec: predication is prohibited */
-		  if(!IS_GENp(6)) { // for gen6-
-		    fprintf(stderr, "ENDIF Syntax error: should be 'ENDIF'\n");
-		    YYERROR;
-		  }
+		  if(!IS_GENp(6)) // for gen6-
+		    error(&@1, "ENDIF Syntax error: should be 'ENDIF'\n");
 		  memset(&$$, 0, sizeof($$));
 		  $$.gen.header.opcode = $1;
 		  $$.gen.header.execution_size = $2;
@@ -742,24 +751,19 @@ ifelseinstruction: ENDIF
 		    $$.first_reloc_target = $3.reloc_target;
 		    $$.first_reloc_offset = $3.imm32;
 		  } else {
-		    fprintf(stderr, "'ELSE' instruction is not implemented.\n");
-		    YYERROR;
+		    error(&@1, "'ELSE' instruction is not implemented.\n");
 		  }
 		}
 		| predicate IF execsize relativelocation
 		{
-		  /* for Gen4, Gen5 */
 		  /* The branch instructions require that the IP register
 		   * be the destination and first source operand, while the
 		   * offset is the second source operand.  The offset is added
 		   * to the pre-incremented IP.
 		   */
-		  /* for Gen6 */
-		  if(IS_GENp(7)) {
-			/* Error in Gen7+. */		   
-		    fprintf(stderr, "Syntax error: IF should be 'IF execsize JIP UIP'\n");
-		    YYERROR;
-		  }
+		  if(IS_GENp(7)) /* Error in Gen7+. */
+		    error(&@2, "IF should be 'IF execsize JIP UIP'\n");
+
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$.gen, &$1);
 		  $$.gen.header.opcode = $2;
@@ -776,10 +780,9 @@ ifelseinstruction: ENDIF
 		| predicate IF execsize relativelocation relativelocation
 		{
 		  /* for Gen7+ */
-		  if(!IS_GENp(7)) {
-		    fprintf(stderr, "Syntax error: IF should be 'IF execsize relativelocation'\n");
-		    YYERROR;
-		  }
+		  if(!IS_GENp(7))
+		    error(&@2, "IF should be 'IF execsize relativelocation'\n");
+
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$.gen, &$1);
 		  $$.gen.header.opcode = $2;
@@ -820,8 +823,7 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		    $$.first_reloc_target = $4.reloc_target;
 		    $$.first_reloc_offset = $4.imm32;
 		  } else {
-		    fprintf(stderr, "'WHILE' instruction is not implemented!\n");
-		    YYERROR;
+		    error(&@2, "'WHILE' instruction is not implemented!\n");
 		  }
 		}
 		| DO
@@ -955,7 +957,9 @@ unaryinstruction:
 		    if ($$.header.predicate_control != BRW_PREDICATE_NONE &&
                         ($1.bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
                          $1.bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
-                        fprintf(stderr, "WARNING: must use the same flag register if both prediction and conditional modifier are enabled\n");
+                        warn(ALWAYS, &@3, "must use the same flag register if "
+			     "both prediction and conditional modifier are "
+			     "enabled\n");
 
 		    $$.bits2.da1.flag_reg_nr = $3.flag_reg_nr;
 		    $$.bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
@@ -994,7 +998,9 @@ binaryinstruction:
 		    if ($$.header.predicate_control != BRW_PREDICATE_NONE &&
                         ($1.bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
                          $1.bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
-                        fprintf(stderr, "WARNING: must use the same flag register if both prediction and conditional modifier are enabled\n");
+                        warn(ALWAYS, &@3, "must use the same flag register if "
+			     "both prediction and conditional modifier are "
+			     "enabled\n");
 
 		    $$.bits2.da1.flag_reg_nr = $3.flag_reg_nr;
 		    $$.bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
@@ -1033,7 +1039,9 @@ binaryaccinstruction:
 		    if ($$.header.predicate_control != BRW_PREDICATE_NONE &&
                         ($1.bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
                          $1.bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
-                        fprintf(stderr, "WARNING: must use the same flag register if both prediction and conditional modifier are enabled\n");
+                        warn(ALWAYS, &@3, "must use the same flag register if "
+			     "both prediction and conditional modifier are "
+			     "enabled\n");
 
 		    $$.bits2.da1.flag_reg_nr = $3.flag_reg_nr;
 		    $$.bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
@@ -1082,7 +1090,9 @@ trinaryinstruction:
 		    if ($$.header.predicate_control != BRW_PREDICATE_NONE &&
                         ($1.bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
                          $1.bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
-                        fprintf(stderr, "WARNING: must use the same flag register if both prediction and conditional modifier are enabled\n");
+                        warn(ALWAYS, &@3, "must use the same flag register if "
+			     "both prediction and conditional modifier are "
+			     "enabled\n");
 		  }
 }
 ;
@@ -1176,8 +1186,8 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  if ($7.reg.type != BRW_REGISTER_TYPE_UD &&
 		      $7.reg.type != BRW_REGISTER_TYPE_D &&
 		      $7.reg.type != BRW_REGISTER_TYPE_V) {
-		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.reg.dw1.ud, $7.reg.type);
-			YYERROR;
+		    error (&@7, "non-int D/UD/V representation: %d,"
+			   "type=%d\n", $7.reg.dw1.ud, $7.reg.type);
 		  }
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
@@ -1197,16 +1207,14 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		{
 		  struct src_operand src0;
 
-		  if (!IS_GENp(6)) {
-                      fprintf(stderr, "error: the syntax of send instruction\n");
-                      YYERROR;
-		  }
+		  if (!IS_GENp(6))
+                      error(&@2, "the syntax of send instruction\n");
 
 		  if ($7.reg.type != BRW_REGISTER_TYPE_UD &&
                       $7.reg.type != BRW_REGISTER_TYPE_D &&
                       $7.reg.type != BRW_REGISTER_TYPE_V) {
-                      fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $7.reg.dw1.ud, $7.reg.type);
-                      YYERROR;
+                      error(&@7,"non-int D/UD/V representation: %d,"
+			    "type=%d\n", $7.reg.dw1.ud, $7.reg.type);
 		  }
 
 		  memset(&$$, 0, sizeof($$));
@@ -1242,17 +1250,14 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		{
 		  struct src_operand src0;
 
-		  if (!IS_GENp(6)) {
-                      fprintf(stderr, "error: the syntax of send instruction\n");
-                      YYERROR;
-		  }
+		  if (!IS_GENp(6))
+                      error(&@2, "the syntax of send instruction\n");
 
                   if ($7.reg.file != BRW_ARCHITECTURE_REGISTER_FILE ||
                       ($7.reg.nr & 0xF0) != BRW_ARF_ADDRESS ||
                       ($7.reg.nr & 0x0F) != 0 ||
                       $7.reg.subnr != 0) {
-                      fprintf (stderr, "%d: scalar register must be a0.0<0;1,0>:ud\n", yylineno);
-                      YYERROR;
+                      error (&@7, "scalar register must be a0.0<0;1,0>:ud\n");
 		  }
 
 		  memset(&$$, 0, sizeof($$));
@@ -1287,8 +1292,8 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  if ($8.reg.type != BRW_REGISTER_TYPE_UD &&
 		      $8.reg.type != BRW_REGISTER_TYPE_D &&
 		      $8.reg.type != BRW_REGISTER_TYPE_V) {
-		    fprintf (stderr, "%d: non-int D/UD/V representation: %d,type=%d\n", yylineno, $8.reg.dw1.ud, $8.reg.type);
-			YYERROR;
+		    error(&@8, "non-int D/UD/V representation: %d,"
+			  "type=%d\n", $8.reg.dw1.ud, $8.reg.type);
 		  }
 		  memset(&$$, 0, sizeof($$));
 		  $$.header.opcode = $2;
@@ -1476,8 +1481,7 @@ msgtarget:	NULL_TOKEN
 		| MATH math_function saturate math_signed math_scalar
 		{
 		  if (IS_GENp(6)) {
-                      fprintf (stderr, "Gen6+ doesn't have math function\n");
-                      YYERROR;
+                      error (&@1, "Gen6+ doesn't have math function\n");
 		  } else if (IS_GENx(5)) {
                       $$.bits2.send_gen5.sfid = BRW_SFID_MATH;
                       $$.bits3.generic_gen5.header_present = 0;
@@ -1682,16 +1686,14 @@ msgtarget:	NULL_TOKEN
                       $$.bits3.vme_gen6.message_type = $9;
                       $$.bits3.generic_gen5.header_present = 1; 
 		  } else {
-                      fprintf (stderr, "Gen6- doesn't have vme function\n");
-                      YYERROR;
+                      error (&@1, "Gen6- doesn't have vme function\n");
 		  }    
 		} 
 		| CRE LPAREN INTEGER COMMA INTEGER RPAREN
 		{
-		   if (gen_level < 75) {
-                      fprintf (stderr, "Below Gen7.5 doesn't have CRE function\n");
-                      YYERROR;
-		    }
+		   if (gen_level < 75)
+                      error (&@1, "Below Gen7.5 doesn't have CRE function\n");
+
 		   $$.bits3.generic.msg_target = HSW_SFID_CRE;
 
                    $$.bits2.send_gen5.sfid = HSW_SFID_CRE;
@@ -1711,8 +1713,7 @@ msgtarget:	NULL_TOKEN
                             $3 != GEN6_SFID_DATAPORT_RENDER_CACHE &&
                             $3 != GEN6_SFID_DATAPORT_CONSTANT_CACHE &&
                             $3 != GEN7_SFID_DATAPORT_DATA_CACHE) {
-                            fprintf (stderr, "error: wrong cache type\n");
-                            YYERROR;
+                            error (&@3, "error: wrong cache type\n");
                         }
 
                         $$.bits3.gen7_dp.category = $11;
@@ -1723,8 +1724,7 @@ msgtarget:	NULL_TOKEN
                         if ($3 != GEN6_SFID_DATAPORT_SAMPLER_CACHE &&
                             $3 != GEN6_SFID_DATAPORT_RENDER_CACHE &&
                             $3 != GEN6_SFID_DATAPORT_CONSTANT_CACHE) {
-                            fprintf (stderr, "error: wrong cache type\n");
-                            YYERROR;
+                            error (&@3, "error: wrong cache type\n");
                         }
 
                         $$.bits3.gen6_dp.send_commit_msg = $11;
@@ -1732,8 +1732,7 @@ msgtarget:	NULL_TOKEN
                         $$.bits3.gen6_dp.msg_control = $7;
                         $$.bits3.gen6_dp.msg_type = $5;
                     } else if (!IS_GENp(5)) {
-                        fprintf (stderr, "Gen6- doesn't support data port for sampler/render/constant/data cache\n");
-                        YYERROR;
+                        error (&@1, "Gen6- doesn't support data port for sampler/render/constant/data cache\n");
                     }
 		} 
 ;
@@ -1838,10 +1837,8 @@ symbol_reg:	STRING %prec STR_SYMBOL_REG
 		{
 		    struct declared_register *dcl_reg = find_register($1);
 
-		    if (dcl_reg == NULL) {
-			fprintf(stderr, "can't find register %s\n", $1);
-			YYERROR;
-		    }
+		    if (dcl_reg == NULL)
+			error(&@1, "can't find register %s\n", $1);
 
 		    memcpy(&$$, dcl_reg, sizeof(*dcl_reg));
 		    free($1); // $1 has been malloc'ed by strdup
@@ -1856,10 +1853,8 @@ symbol_reg_p: STRING LPAREN exp RPAREN
 		{
 		    struct declared_register *dcl_reg = find_register($1);	
 
-		    if (dcl_reg == NULL) {
-			fprintf(stderr, "can't find register %s\n", $1);
-			YYERROR;
-		    }
+		    if (dcl_reg == NULL)
+			error(&@1, "can't find register %s\n", $1);
 
 		    memcpy(&$$, dcl_reg, sizeof(*dcl_reg));
 		    $$.reg.nr += $3;
@@ -1869,10 +1864,8 @@ symbol_reg_p: STRING LPAREN exp RPAREN
 		{
 		    struct declared_register *dcl_reg = find_register($1);	
 
-		    if (dcl_reg == NULL) {
-			fprintf(stderr, "can't find register %s\n", $1);
-			YYERROR;
-		    }
+		    if (dcl_reg == NULL)
+			error(&@1, "can't find register %s\n", $1);
 
 		    memcpy(&$$, dcl_reg, sizeof(*dcl_reg));
 		    $$.reg.nr += $3;
@@ -1940,8 +1933,7 @@ imm32reg:	imm32 srcimmtype
 		      d = $1.u.d;
 		      break;
 		    default:
-		      fprintf (stderr, "%d: non-int D/UD/V/VF representation: %d,type=%d\n", yylineno, $1.r, $2);
-		      YYERROR;
+		      error (&@2, "non-int D/UD/V/VF representation: %d,type=%d\n", $1.r, $2);
 		    }
 		    break;
 		  case BRW_REGISTER_TYPE_UW:
@@ -1951,8 +1943,7 @@ imm32reg:	imm32 srcimmtype
 		      d = $1.u.d;
 		      break;
 		    default:
-		      fprintf (stderr, "non-int W/UW representation\n");
-		      YYERROR;
+		      error (&@2, "non-int W/UW representation\n");
 		    }
 		    d &= 0xffff;
 		    d |= d << 16;
@@ -1966,8 +1957,7 @@ imm32reg:	imm32 srcimmtype
 		      intfloat.f = (float) $1.u.d;
 		      break;
 		    default:
-		      fprintf (stderr, "non-float F representation\n");
-		      YYERROR;
+		      error (&@2, "non-float F representation\n");
 		    }
 		    d = intfloat.i;
 		    break;
@@ -1977,8 +1967,7 @@ imm32reg:	imm32 srcimmtype
 		    YYERROR;
 #endif
 		  default:
-		    fprintf(stderr, "unknown immediate type %d\n", $2);
-		    YYERROR;
+		    error(&@2, "unknown immediate type %d\n", $2);
 		  }
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.reg.file = BRW_IMMEDIATE_VALUE;
@@ -2145,11 +2134,8 @@ indirectsrcoperand:
  */
 addrparam:	addrreg COMMA immaddroffset
 		{
-		    if ($3 < -512 || $3 > 511) {
-		    fprintf(stderr, "Address immediate offset %d out of"
-			    "range %d\n", $3, yylineno);
-		    YYERROR;
-		  }
+		  if ($3 < -512 || $3 > 511)
+		    error(&@3, "Address immediate offset %d out of range\n", $3);
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.subnr = $1.subnr;
 		  $$.dw1.bits.indirect_offset = $3;
@@ -2220,11 +2206,9 @@ indirectmsgreg: MSGREGFILE LSQUARE addrparam RSQUARE
 
 addrreg:	ADDRESSREG subregnum
 		{
-		  if ($1 != 0) {
-		    fprintf(stderr,
-			    "address register number %d out of range", $1);
-		    YYERROR;
-		  }
+		  if ($1 != 0)
+		    error(&@2, "address register number %d out of range", $1);
+
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
 		  $$.nr = BRW_ARF_ADDRESS | $1;
@@ -2234,11 +2218,8 @@ addrreg:	ADDRESSREG subregnum
 
 accreg:		ACCREG subregnum
 		{
-		  if ($1 > 1) {
-		    fprintf(stderr,
-			    "accumulator register number %d out of range", $1);
-		    YYERROR;
-		  }
+		  if ($1 > 1)
+		    error(&@1, "accumulator register number %d out of range", $1);
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
 		  $$.nr = BRW_ARF_ACCUMULATOR | $1;
@@ -2250,16 +2231,11 @@ flagreg:	FLAGREG subregnum
 		{
 		  if ((!IS_GENp(7) && $1) > 0 ||
 		      (IS_GENp(7) && $1 > 1)) {
-                    fprintf(stderr,
-			    "flag register number %d out of range\n", $1);
-		    YYERROR;
+                    error(&@2, "flag register number %d out of range\n", $1);
 		  }
 
-		  if ($2 > 1) {
-		    fprintf(stderr,
-			    "flag subregister number %d out of range\n", $1);
-		    YYERROR;
-		  }
+		  if ($2 > 1)
+		    error(&@2, "flag subregister number %d out of range\n", $1);
 
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
@@ -2270,11 +2246,9 @@ flagreg:	FLAGREG subregnum
 
 maskreg:	MASKREG subregnum
 		{
-		  if ($1 > 0) {
-		    fprintf(stderr,
-			    "mask register number %d out of range", $1);
-		    YYERROR;
-		  }
+		  if ($1 > 0)
+		    error(&@1, "mask register number %d out of range", $1);
+
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
 		  $$.nr = BRW_ARF_MASK;
@@ -2294,11 +2268,8 @@ mask_subreg:	AMASK | IMASK | LMASK | CMASK
 
 maskstackreg:	MASKSTACKREG subregnum
 		{
-		  if ($1 > 0) {
-		    fprintf(stderr,
-			    "mask stack register number %d out of range", $1);
-		    YYERROR;
-		  }
+		  if ($1 > 0)
+		    error(&@1, "mask stack register number %d out of range", $1);
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
 		  $$.nr = BRW_ARF_MASK_STACK;
@@ -2319,11 +2290,8 @@ maskstack_subreg: IMS | LMS
 /*
 maskstackdepthreg: MASKSTACKDEPTHREG subregnum
 		{
-		  if ($1 > 0) {
-		    fprintf(stderr,
-			    "mask stack register number %d out of range", $1);
-		    YYERROR;
-		  }
+		  if ($1 > 0)
+		    error(&@1, "mask stack register number %d out of range", $1);
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
 		  $$.reg_nr = BRW_ARF_MASK_STACK_DEPTH;
@@ -2346,12 +2314,10 @@ notifyreg:	NOTIFYREG regtype
 		{
 		  int num_notifyreg = (IS_GENp(6)) ? 3 : 2;
 
-		  if ($1 > num_notifyreg) {
-		    fprintf(stderr,
-			    "notification register number %d out of range",
-			    $1);
-		    YYERROR;
-		  }
+		  if ($1 > num_notifyreg)
+		    error(&@1, "notification register number %d out of range",
+			  $1);
+
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
 
@@ -2382,16 +2348,12 @@ notifyreg:	NOTIFYREG regtype
 
 statereg:	STATEREG subregnum
 		{
-		  if ($1 > 0) {
-		    fprintf(stderr,
-			    "state register number %d out of range", $1);
-		    YYERROR;
-		  }
-		  if ($2 > 1) {
-		    fprintf(stderr,
-			    "state subregister number %d out of range", $1);
-		    YYERROR;
-		  }
+		  if ($1 > 0)
+		    error(&@1, "state register number %d out of range", $1);
+
+		  if ($2 > 1)
+		    error(&@2, "state subregister number %d out of range", $1);
+
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
 		  $$.nr = BRW_ARF_STATE | $1;
@@ -2401,16 +2363,11 @@ statereg:	STATEREG subregnum
 
 controlreg:	CONTROLREG subregnum
 		{
-		  if ($1 > 0) {
-		    fprintf(stderr,
-			    "control register number %d out of range", $1);
-		    YYERROR;
-		  }
-		  if ($2 > 2) {
-		    fprintf(stderr,
-			    "control subregister number %d out of range", $1);
-		    YYERROR;
-		  }
+		  if ($1 > 0)
+		    error(&@1, "control register number %d out of range", $1);
+
+		  if ($2 > 2)
+		    error(&@2, "control subregister number %d out of range", $1);
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.file = BRW_ARCHITECTURE_REGISTER_FILE;
 		  $$.nr = BRW_ARF_CONTROL | $1;
@@ -2440,12 +2397,8 @@ nullreg:	NULL_TOKEN
 relativelocation:
 		simple_int
 		{
-		  if (($1 > 32767) || ($1 < -32768)) {
-		    fprintf(stderr,
-			    "error: relative offset %d out of range \n", 
-			    $1);
-		    YYERROR;
-		  }
+		  if (($1 > 32767) || ($1 < -32768))
+		    error(&@1, "error: relative offset %d out of range \n", $1);
 
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.reg.file = BRW_IMMEDIATE_VALUE;
@@ -2520,9 +2473,9 @@ dstregion:	/* empty */
 		  /* Returns a value for a horiz_stride field of an
 		   * instruction.
 		   */
-		  if ($2 != 1 && $2 != 2 && $2 != 4) {
-		    fprintf(stderr, "Invalid horiz size %d\n", $2);
-		  }
+		  if ($2 != 1 && $2 != 2 && $2 != 4)
+		    error(&@2, "Invalid horiz size %d\n", $2);
+
 		  $$ = ffs($2);
 		}
 ;
@@ -2723,10 +2676,9 @@ execsize:	/* empty */ %prec EMPTEXECSIZE
 		   * instruction.
 		   */
 		  if ($2 != 1 && $2 != 2 && $2 != 4 && $2 != 8 && $2 != 16 &&
-		      $2 != 32) {
-		    fprintf(stderr, "Invalid execution size %d\n", $2);
-		    YYERROR;
-		  }
+		      $2 != 32)
+		    error(&@2, "Invalid execution size %d\n", $2);
+
 		  $$ = ffs($2) - 1;
 		}
 ;
-- 
1.7.7.5


* [PATCH 64/90] assembler: Cleanup visibility of a few global variables/functions
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (62 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 63/90] assembler: Port the warning and error reporting to warn()/error() Damien Lespiau
@ 2013-02-04 15:27 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 65/90] assembler: Fix ')' placement in condition Damien Lespiau
                   ` (26 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:27 UTC (permalink / raw)
  To: intel-gfx

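Not everything has to be exported out of the compilation unit. Do a small
cleanup pass: make file-local functions and variables static, and declare
the shared globals once in gen4asm.h.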
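
As a rough illustration of the idea (not code from this patch), the sketch
below keeps a counter and a helper private to one compilation unit with
`static` and exports a single function. The names error_count, bump and
report_error are hypothetical.

```c
#include <stdio.h>

/* File-local state: invisible to other compilation units. */
static int error_count;

/* File-local helper: this symbol is not exported to the linker. */
static int bump(int *counter)
{
	return ++*counter;
}

/* The only symbol this unit exports. A shared header would declare it
 * as: int report_error(const char *msg); */
int report_error(const char *msg)
{
	fprintf(stderr, "error: %s\n", msg);
	return bump(&error_count);
}
```

Anything another file really needs (such as gen_level here) would get a
single extern declaration in a shared header instead of one per user.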

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    2 +
 assembler/gram.y    |  102 ++++++++++++++++++++++++---------------------------
 assembler/main.c    |   12 +++---
 3 files changed, 56 insertions(+), 60 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 9558a29..332c8b9 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -42,6 +42,8 @@ typedef int GLint;
 typedef float GLfloat;
 
 extern long int gen_level;
+extern int advanced_flag;
+extern int errors;
 
 #define WARN_ALWAYS	(1 << 0)
 #define WARN_ALL	(1 << 31)
diff --git a/assembler/gram.y b/assembler/gram.y
index c022277..4b5c6a3 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -49,9 +49,6 @@ typedef struct YYLTYPE
  int last_column;
 } YYLTYPE;
 
-extern long int gen_level;
-extern int advanced_flag;
-extern int yylineno;
 extern int need_export;
 static struct src_operand src_null_reg =
 {
@@ -83,30 +80,30 @@ static struct src_operand ip_src =
 };
 
 static int get_type_size(GLuint type);
-int set_instruction_dest(struct brw_instruction *instr,
-			 struct brw_reg *dest);
-int set_instruction_src0(struct brw_instruction *instr,
-			 struct src_operand *src,
-			 YYLTYPE *location);
-int set_instruction_src1(struct brw_instruction *instr,
-			 struct src_operand *src,
-			 YYLTYPE *location);
-int set_instruction_dest_three_src(struct brw_instruction *instr,
-                                   struct brw_reg *dest);
-int set_instruction_src0_three_src(struct brw_instruction *instr,
-                                   struct src_operand *src);
-int set_instruction_src1_three_src(struct brw_instruction *instr,
-                                   struct src_operand *src);
-int set_instruction_src2_three_src(struct brw_instruction *instr,
-                                   struct src_operand *src);
-void set_instruction_options(struct brw_instruction *instr,
-			     struct brw_instruction *options);
-void set_instruction_predicate(struct brw_instruction *instr,
-			       struct brw_instruction *predicate);
-void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
-			    int type);
-void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
-			    int type);
+static int set_instruction_dest(struct brw_instruction *instr,
+				struct brw_reg *dest);
+static int set_instruction_src0(struct brw_instruction *instr,
+				struct src_operand *src,
+				YYLTYPE *location);
+static int set_instruction_src1(struct brw_instruction *instr,
+				struct src_operand *src,
+				YYLTYPE *location);
+static int set_instruction_dest_three_src(struct brw_instruction *instr,
+					  struct brw_reg *dest);
+static int set_instruction_src0_three_src(struct brw_instruction *instr,
+					  struct src_operand *src);
+static int set_instruction_src1_three_src(struct brw_instruction *instr,
+					  struct src_operand *src);
+static int set_instruction_src2_three_src(struct brw_instruction *instr,
+					  struct src_operand *src);
+static void set_instruction_options(struct brw_instruction *instr,
+				    struct brw_instruction *options);
+static void set_instruction_predicate(struct brw_instruction *instr,
+				      struct brw_instruction *predicate);
+static void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
+				   int type);
+static void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
+				   int type);
 
 enum message_level {
     WARN,
@@ -2826,9 +2823,6 @@ instoption:	ALIGN1 { $$ = ALIGN1; }
 
 %%
 extern int yylineno;
-extern char *input_filename;
-
-int errors;
 
 void yyerror (char *msg)
 {
@@ -2939,8 +2933,8 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
 /**
  * Fills in the destination register information in instr from the bits in dst.
  */
-int set_instruction_dest(struct brw_instruction *instr,
-			 struct brw_reg *dest)
+static int set_instruction_dest(struct brw_instruction *instr,
+				struct brw_reg *dest)
 {
 	if (!validate_dst_reg(instr, dest))
 		return 1;
@@ -2955,9 +2949,9 @@ int set_instruction_dest(struct brw_instruction *instr,
 }
 
 /* Sets the first source operand for the instruction.  Returns 0 on success. */
-int set_instruction_src0(struct brw_instruction *instr,
-			 struct src_operand *src,
-			 YYLTYPE *location)
+static int set_instruction_src0(struct brw_instruction *instr,
+				struct src_operand *src,
+				YYLTYPE *location)
 {
 
 	if (advanced_flag)
@@ -2977,9 +2971,9 @@ int set_instruction_src0(struct brw_instruction *instr,
 
 /* Sets the second source operand for the instruction.  Returns 0 on success.
  */
-int set_instruction_src1(struct brw_instruction *instr,
-			 struct src_operand *src,
-			 YYLTYPE *location)
+static int set_instruction_src1(struct brw_instruction *instr,
+				struct src_operand *src,
+				YYLTYPE *location)
 {
 	if (advanced_flag)
 		reset_instruction_src_region(instr, src);
@@ -3060,8 +3054,8 @@ static int reg_type_2_to_3(int reg_type)
 	return r;
 }
 
-int set_instruction_dest_three_src(struct brw_instruction *instr,
-                                   struct brw_reg *dest)
+static int set_instruction_dest_three_src(struct brw_instruction *instr,
+					  struct brw_reg *dest)
 {
 	instr->bits1.da3src.dest_reg_file = dest->file;
 	instr->bits1.da3src.dest_reg_nr = dest->nr;
@@ -3071,8 +3065,8 @@ int set_instruction_dest_three_src(struct brw_instruction *instr,
 	return 0;
 }
 
-int set_instruction_src0_three_src(struct brw_instruction *instr,
-                                   struct src_operand *src)
+static int set_instruction_src0_three_src(struct brw_instruction *instr,
+					  struct src_operand *src)
 {
 	if (advanced_flag) {
 		reset_instruction_src_region(instr, src);
@@ -3084,8 +3078,8 @@ int set_instruction_src0_three_src(struct brw_instruction *instr,
 	return 0;
 }
 
-int set_instruction_src1_three_src(struct brw_instruction *instr,
-                                   struct src_operand *src)
+static int set_instruction_src1_three_src(struct brw_instruction *instr,
+					  struct src_operand *src)
 {
 	if (advanced_flag) {
 		reset_instruction_src_region(instr, src);
@@ -3098,8 +3092,8 @@ int set_instruction_src1_three_src(struct brw_instruction *instr,
 	return 0;
 }
 
-int set_instruction_src2_three_src(struct brw_instruction *instr,
-                                   struct src_operand *src)
+static int set_instruction_src2_three_src(struct brw_instruction *instr,
+					  struct src_operand *src)
 {
 	if (advanced_flag) {
 		reset_instruction_src_region(instr, src);
@@ -3110,8 +3104,8 @@ int set_instruction_src2_three_src(struct brw_instruction *instr,
 	return 0;
 }
 
-void set_instruction_options(struct brw_instruction *instr,
-			     struct brw_instruction *options)
+static void set_instruction_options(struct brw_instruction *instr,
+				    struct brw_instruction *options)
 {
 	/* XXX: more instr options */
 	instr->header.access_mode = options->header.access_mode;
@@ -3121,8 +3115,8 @@ void set_instruction_options(struct brw_instruction *instr,
 		options->header.compression_control;
 }
 
-void set_instruction_predicate(struct brw_instruction *instr,
-			       struct brw_instruction *predicate)
+static void set_instruction_predicate(struct brw_instruction *instr,
+				      struct brw_instruction *predicate)
 {
 	instr->header.predicate_control = predicate->header.predicate_control;
 	instr->header.predicate_inverse = predicate->header.predicate_inverse;
@@ -3130,8 +3124,8 @@ void set_instruction_predicate(struct brw_instruction *instr,
 	instr->bits2.da1.flag_subreg_nr = predicate->bits2.da1.flag_subreg_nr;
 }
 
-void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
-			    int type)
+static void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
+				   int type)
 {
 	*dst = *reg;
 	dst->address_mode = BRW_ADDRESS_DIRECT;
@@ -3140,8 +3134,8 @@ void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
 	dst->dw1.bits.writemask = BRW_WRITEMASK_XYZW;
 }
 
-void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
-			    int type)
+static void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
+				   int type)
 {
 	memset(src, 0, sizeof(*src));
 	src->reg.address_mode = BRW_ADDRESS_DIRECT;
diff --git a/assembler/main.c b/assembler/main.c
index 4fe1315..269bc26 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -39,17 +39,12 @@
 
 extern FILE *yyin;
 
-extern int errors;
-
 long int gen_level = 40;
 int advanced_flag = 0; /* 0: in unit of byte, 1: in unit of data element size */
 unsigned int warning_flags = WARN_ALWAYS;
-int binary_like_output = 0; /* 0: default output style, 1: nice C-style output */
 int need_export = 0;
 char *input_filename = "<stdin>";
-char *export_filename = NULL;
-
-const char const *binary_prepend = "static const char gen_eu_bytes[] = {\n";
+int errors;
 
 struct brw_context genasm_brw_context;
 struct brw_compile genasm_compile;
@@ -57,6 +52,11 @@ struct brw_compile genasm_compile;
 struct brw_program compiled_program;
 struct program_defaults program_defaults = {.register_type = BRW_REGISTER_TYPE_F};
 
+/* 0: default output style, 1: nice C-style output */
+static int binary_like_output = 0;
+static char *export_filename = NULL;
+static const char binary_prepend[] = "static const char gen_eu_bytes[] = {\n";
+
 #define HASH_SIZE 37
 
 struct hash_item {
-- 
1.7.7.5


* [PATCH 65/90] assembler: Fix ')' placement in condition
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (63 preceding siblings ...)
  2013-02-04 15:27 ` [PATCH 64/90] assembler: Cleanup visibility of a few global variables/functions Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 66/90] assembler: Implement register-indirect addressing mode in brw_set_src1() Damien Lespiau
                   ` (25 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

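A small typo in the flagreg range check: the closing parenthesis was
misplaced, so "> 0" was applied to the result of (!IS_GENp(7) && $1)
instead of to $1.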
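
A minimal sketch of the difference, assuming the register number is a
signed int (the function names and the gen7_plus parameter are
hypothetical stand-ins for the grammar action and IS_GENp(7)):

```c
#include <stdbool.h>

/* Misplaced ')': "> 0" tests the boolean result of (!gen7_plus && nr),
 * not nr itself. */
bool flag_nr_invalid_old(bool gen7_plus, int nr)
{
	return (!gen7_plus && nr) > 0 || (gen7_plus && nr > 1);
}

/* Fixed: nr is compared against 0 as intended. */
bool flag_nr_invalid_new(bool gen7_plus, int nr)
{
	return (!gen7_plus && nr > 0) || (gen7_plus && nr > 1);
}
```

Under that assumption the two forms agree for every non-negative nr and
differ only for negative values, which the old form rejects because any
non-zero nr makes the && expression true.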

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 4b5c6a3..c86e28f 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -2226,7 +2226,7 @@ accreg:		ACCREG subregnum
 
 flagreg:	FLAGREG subregnum
 		{
-		  if ((!IS_GENp(7) && $1) > 0 ||
+		  if ((!IS_GENp(7) && $1 > 0) ||
 		      (IS_GENp(7) && $1 > 1)) {
                     error(&@2, "flag register number %d out of range\n", $1);
 		  }
-- 
1.7.7.5


* [PATCH 66/90] assembler: Implement register-indirect addressing mode in brw_set_src1()
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (64 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 65/90] assembler: Fix ')' placement in condition Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 67/90] assembler: Use brw_set_src1() Damien Lespiau
                   ` (24 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

The assembler allows people to use register-indirect addressing on src1,
and that's something the hardware has supported since Crestline, so
brw_set_src1() needs to handle it.
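
The direct/indirect split can be sketched as follows. This is a simplified
model, not the real brw_* code: the enums, struct src_reg, struct
src1_fields and encode_src1() are hypothetical stand-ins for struct
brw_reg and the bits3 bitfields.

```c
/* Hypothetical, simplified stand-ins for the real brw_* types. */
enum addr_mode { ADDR_DIRECT, ADDR_INDIRECT };
enum access_mode { ALIGN_1, ALIGN_16 };

struct src_reg {
	enum addr_mode address_mode;
	unsigned nr;         /* register number (direct mode only) */
	unsigned subnr;      /* sub-register, or address sub-register */
	int indirect_offset; /* immediate offset (indirect mode only) */
};

struct src1_fields {
	enum addr_mode address_mode;
	unsigned reg_nr;
	unsigned subreg_nr;
	int indirect_offset;
};

struct src1_fields encode_src1(enum access_mode access, struct src_reg reg)
{
	struct src1_fields f = { .address_mode = reg.address_mode };

	if (reg.address_mode == ADDR_DIRECT) {
		f.reg_nr = reg.nr;
		/* Align16 encodes the sub-register in 16-byte units. */
		f.subreg_nr = access == ALIGN_1 ? reg.subnr : reg.subnr / 16;
	} else {
		/* Indirect: no register number, only the address
		 * sub-register plus an immediate offset. */
		f.subreg_nr = reg.subnr;
		f.indirect_offset = access == ALIGN_1 ?
			reg.indirect_offset : reg.indirect_offset / 16;
	}

	return f;
}
```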

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_eu_emit.c |   39 +++++++++++++++++++++++++++------------
 1 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/assembler/brw_eu_emit.c b/assembler/brw_eu_emit.c
index 21c673e..c63f1fc 100644
--- a/assembler/brw_eu_emit.c
+++ b/assembler/brw_eu_emit.c
@@ -351,6 +351,9 @@ void brw_set_src1(struct brw_compile *p,
 		  struct brw_instruction *insn,
 		  struct brw_reg reg)
 {
+   struct brw_context *brw = p->brw;
+   struct intel_context *intel = &brw->intel;
+
    assert(reg.file != BRW_MESSAGE_REGISTER_FILE);
 
    if (reg.file != BRW_ARCHITECTURE_REGISTER_FILE)
@@ -364,6 +367,7 @@ void brw_set_src1(struct brw_compile *p,
    insn->bits1.da1.src1_reg_type = reg.type;
    insn->bits3.da1.src1_abs = reg.abs;
    insn->bits3.da1.src1_negate = reg.negate;
+   insn->bits3.da1.src1_address_mode = reg.address_mode;
 
    /* Only src1 can be immediate in two-argument instructions.
     */
@@ -373,33 +377,44 @@ void brw_set_src1(struct brw_compile *p,
       insn->bits3.ud = reg.dw1.ud;
    }
    else {
-      /* This is a hardware restriction, which may or may not be lifted
-       * in the future:
-       */
-      assert (reg.address_mode == BRW_ADDRESS_DIRECT);
-      /* assert (reg.file == BRW_GENERAL_REGISTER_FILE); */
+      /* It's only BRW that does not support register-indirect addressing on
+       * src1 */
+      assert (intel->gen >= 4 || reg.address_mode == BRW_ADDRESS_DIRECT);
 
-      if (insn->header.access_mode == BRW_ALIGN_1) {
-	 insn->bits3.da1.src1_subreg_nr = reg.subnr;
-	 insn->bits3.da1.src1_reg_nr = reg.nr;
+      if (reg.address_mode == BRW_ADDRESS_DIRECT) {
+	 if (insn->header.access_mode == BRW_ALIGN_1) {
+	    insn->bits3.da1.src1_subreg_nr = reg.subnr;
+	    insn->bits3.da1.src1_reg_nr = reg.nr;
+	 }
+	 else {
+	    insn->bits3.da16.src1_subreg_nr = reg.subnr / 16;
+	    insn->bits3.da16.src1_reg_nr = reg.nr;
+	 }
       }
       else {
-	 insn->bits3.da16.src1_subreg_nr = reg.subnr / 16;
-	 insn->bits3.da16.src1_reg_nr = reg.nr;
+	 insn->bits3.ia1.src1_subreg_nr = reg.subnr;
+
+	 if (insn->header.access_mode == BRW_ALIGN_1)
+	    insn->bits3.ia1.src1_indirect_offset = reg.dw1.bits.indirect_offset;
+	 else
+	    insn->bits3.ia16.src1_indirect_offset = reg.dw1.bits.indirect_offset / 16;
       }
 
       if (insn->header.access_mode == BRW_ALIGN_1) {
+	 /* FIXME: While this is correct, if the assembler uses that code path
+	  * the opcode generated are different and thus needs a validation
+	  * pass.
 	 if (reg.width == BRW_WIDTH_1 && 
 	     insn->header.execution_size == BRW_EXECUTE_1) {
 	    insn->bits3.da1.src1_horiz_stride = BRW_HORIZONTAL_STRIDE_0;
 	    insn->bits3.da1.src1_width = BRW_WIDTH_1;
 	    insn->bits3.da1.src1_vert_stride = BRW_VERTICAL_STRIDE_0;
 	 }
-	 else {
+	 else { */
 	    insn->bits3.da1.src1_horiz_stride = reg.hstride;
 	    insn->bits3.da1.src1_width = reg.width;
 	    insn->bits3.da1.src1_vert_stride = reg.vstride;
-	 }
+     /* } */
       }
       else {
 	 insn->bits3.da16.src1_swz_x = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_X);
-- 
1.7.7.5


* [PATCH 67/90] assembler: Use brw_set_src1()
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (65 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 66/90] assembler: Implement register-indirect addressing mode in brw_set_src1() Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 68/90] assembler: Renamed the instruction field to insn Damien Lespiau
                   ` (23 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Everything is now aligned to be able to use brw_set_src1() in the
opcode generation, so use it.
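
The assembler accepts subnr either in bytes or in number of elements,
while brw_set_src1() takes bytes. The sketch below only illustrates that
conversion; it is not the actual resolve_subnr() implementation, and
enum elem_type and subnr_to_bytes() are hypothetical names.

```c
/* Hypothetical element types; each value is the element size in bytes. */
enum elem_type { TYPE_UB = 1, TYPE_UW = 2, TYPE_UD = 4 };

/* Convert a sub-register number to a byte offset.
 * in_elements == 0: subnr is already expressed in bytes.
 * in_elements != 0: subnr counts elements of the given type. */
unsigned subnr_to_bytes(unsigned subnr, enum elem_type type, int in_elements)
{
	return in_elements ? subnr * (unsigned)type : subnr;
}
```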

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   54 +++++-------------------------------------------------
 1 files changed, 5 insertions(+), 49 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index c86e28f..8d81a04 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -2981,55 +2981,11 @@ static int set_instruction_src1(struct brw_instruction *instr,
 	if (!validate_src_reg(instr, src->reg, location))
 		return 1;
 
-	instr->bits1.da1.src1_reg_file = src->reg.file;
-	instr->bits1.da1.src1_reg_type = src->reg.type;
-	if (src->reg.file == BRW_IMMEDIATE_VALUE) {
-		instr->bits3.ud = src->reg.dw1.ud;
-	} else if (src->reg.address_mode == BRW_ADDRESS_DIRECT) {
-            if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits3.da1.src1_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
-		instr->bits3.da1.src1_reg_nr = src->reg.nr;
-		instr->bits3.da1.src1_vert_stride = src->reg.vstride;
-		instr->bits3.da1.src1_width = src->reg.width;
-		instr->bits3.da1.src1_horiz_stride = src->reg.hstride;
-		instr->bits3.da1.src1_negate = src->reg.negate;
-		instr->bits3.da1.src1_abs = src->reg.abs;
-                instr->bits3.da1.src1_address_mode = src->reg.address_mode;
-            } else {
-		instr->bits3.da16.src1_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode);
-		instr->bits3.da16.src1_reg_nr = src->reg.nr;
-		instr->bits3.da16.src1_vert_stride = src->reg.vstride;
-		instr->bits3.da16.src1_negate = src->reg.negate;
-		instr->bits3.da16.src1_abs = src->reg.abs;
-		instr->bits3.da16.src1_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
-		instr->bits3.da16.src1_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
-		instr->bits3.da16.src1_swz_z = BRW_GET_SWZ(SWIZZLE(src->reg), 2);
-		instr->bits3.da16.src1_swz_w = BRW_GET_SWZ(SWIZZLE(src->reg), 3);
-                instr->bits3.da16.src1_address_mode = src->reg.address_mode;
-            }
-	} else {
-            if (instr->header.access_mode == BRW_ALIGN_1) {
-		instr->bits3.ia1.src1_indirect_offset = src->reg.dw1.bits.indirect_offset;
-		instr->bits3.ia1.src1_subreg_nr = get_indirect_subreg_address(src->reg.subnr);
-		instr->bits3.ia1.src1_abs = src->reg.abs;
-		instr->bits3.ia1.src1_negate = src->reg.negate;
-		instr->bits3.ia1.src1_address_mode = src->reg.address_mode;
-		instr->bits3.ia1.src1_horiz_stride = src->reg.hstride;
-		instr->bits3.ia1.src1_width = src->reg.width;
-		instr->bits3.ia1.src1_vert_stride = src->reg.vstride;
-            } else {
-		instr->bits3.ia16.src1_swz_x = BRW_GET_SWZ(SWIZZLE(src->reg), 0);
-		instr->bits3.ia16.src1_swz_y = BRW_GET_SWZ(SWIZZLE(src->reg), 1);
-		instr->bits3.ia16.src1_swz_z = BRW_GET_SWZ(SWIZZLE(src->reg), 2);
-		instr->bits3.ia16.src1_swz_w = BRW_GET_SWZ(SWIZZLE(src->reg), 3);
-		instr->bits3.ia16.src1_indirect_offset = (src->reg.dw1.bits.indirect_offset >> 4); /* half register aligned */
-		instr->bits3.ia16.src1_subreg_nr = get_indirect_subreg_address(src->reg.subnr);
-		instr->bits3.ia16.src1_abs = src->reg.abs;
-		instr->bits3.ia16.src1_negate = src->reg.negate;
-		instr->bits3.ia16.src1_address_mode = src->reg.address_mode;
-		instr->bits3.ia16.src1_vert_stride = src->reg.vstride;
-            }
-        }
+	/* the assembler support expressing subnr in bytes or in number of
+	 * elements. */
+	resolve_subnr(&src->reg);
+
+	brw_set_src1(&genasm_compile, instr, src->reg);
 
 	return 0;
 }
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 68/90] assembler: Renamed the instruction field to insn
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (66 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 67/90] assembler: Use brw_set_src1() Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 69/90] assembler: Unify all instructions to be brw_program_instructions Damien Lespiau
                   ` (22 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

This will mean less typing in the refactoring to come, which is to use
struct brw_program_instruction in gram.y as the type of all the
instructions.
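
For illustration only, here is a rough sketch of the shape of the list entry
after the rename. The sketch_* names and the reduced member set are
hypothetical stand-ins, not the real gen4asm.h definitions; only the union
name (insn) and the label accessor pattern mirror the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the structures in gen4asm.h. */
struct sketch_gen_insn { uint32_t dw[4]; };     /* one 128-bit instruction */
struct sketch_label    { const char *name; };

enum sketch_type { SKETCH_GEN, SKETCH_LABEL };

struct sketch_program_instruction {
    enum sketch_type type;
    union {
        struct sketch_gen_insn gen;
        struct sketch_label    label;
    } insn;                     /* this member used to be called "instruction" */
    struct sketch_program_instruction *next;
};

/* Same pattern as label_name() in the patch: check the tag, then read the
 * matching union member through the shorter name. */
static inline const char *
sketch_label_name(const struct sketch_program_instruction *i)
{
    assert(i->type == SKETCH_LABEL);
    return i->insn.label.name;
}
```

Callers that previously wrote entry->instruction.label.name now write
entry->insn.label.name; nothing else about the access changes.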

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/disasm-main.c |    6 +++---
 assembler/gen4asm.h     |    4 ++--
 assembler/gram.y        |    6 +++---
 assembler/main.c        |   10 +++++-----
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/assembler/disasm-main.c b/assembler/disasm-main.c
index fbb6ae3..87e6737 100644
--- a/assembler/disasm-main.c
+++ b/assembler/disasm-main.c
@@ -51,7 +51,7 @@ read_program (FILE *input)
 		++n;
 		if (n == 4) {
 		    entry = malloc (sizeof (struct brw_program_instruction));
-		    memcpy (&entry->instruction, inst, 4 * sizeof (uint32_t));
+		    memcpy (&entry->insn, inst, 4 * sizeof (uint32_t));
 		    entry->next = NULL;
 		    *prev = entry;
 		    prev = &entry->next;
@@ -82,7 +82,7 @@ read_program_binary (FILE *input)
 		inst[n++] = (uint8_t)temp;
 		if (n == 16) {
 		    entry = malloc (sizeof (struct brw_program_instruction));
-		    memcpy (&entry->instruction, inst, 16 * sizeof (uint8_t));
+		    memcpy (&entry->insn, inst, 16 * sizeof (uint8_t));
 		    entry->next = NULL;
 		    *prev = entry;
 		    prev = &entry->next;
@@ -167,6 +167,6 @@ int main(int argc, char **argv)
     }
 	    
     for (inst = program->first; inst; inst = inst->next)
-	brw_disasm (output, &inst->instruction.gen, gen);
+	brw_disasm (output, &inst->insn.gen, gen);
     exit (0);
 }
diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 332c8b9..0781eaf 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -143,7 +143,7 @@ struct brw_program_instruction {
 	struct brw_instruction gen;
 	struct relocatable_instruction reloc;
 	struct label_instruction label;
-    } instruction;
+    } insn;
     struct brw_program_instruction *next;
 };
 
@@ -155,7 +155,7 @@ static inline bool is_label(struct brw_program_instruction *instruction)
 static inline char *label_name(struct brw_program_instruction *i)
 {
     assert(is_label(i));
-    return i->instruction.label.name;
+    return i->insn.label.name;
 }
 
 static inline bool is_relocatable(struct brw_program_instruction *intruction)
diff --git a/assembler/gram.y b/assembler/gram.y
index 8d81a04..67a5da9 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -201,7 +201,7 @@ static void brw_program_add_instruction(struct brw_program *p,
 
     list_entry = calloc(sizeof(struct brw_program_instruction), 1);
     list_entry->type = GEN4ASM_INSTRUCTION_GEN;
-    list_entry->instruction.gen = *instruction;
+    list_entry->insn.gen = *instruction;
     brw_program_append_entry(p, list_entry);
 }
 
@@ -212,7 +212,7 @@ static void brw_program_add_relocatable(struct brw_program *p,
 
     list_entry = calloc(sizeof(struct brw_program_instruction), 1);
     list_entry->type = GEN4ASM_INSTRUCTION_GEN_RELOCATABLE;
-    list_entry->instruction.reloc = *reloc;
+    list_entry->insn.reloc = *reloc;
     brw_program_append_entry(p, list_entry);
 }
 
@@ -222,7 +222,7 @@ static void brw_program_add_label(struct brw_program *p, const char *label)
 
     list_entry = calloc(sizeof(struct brw_program_instruction), 1);
     list_entry->type = GEN4ASM_INSTRUCTION_LABEL;
-    list_entry->instruction.label.name = strdup(label);
+    list_entry->insn.label.name = strdup(label);
     brw_program_append_entry(p, list_entry);
 }
 
diff --git a/assembler/main.c b/assembler/main.c
index 269bc26..8579f96 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -237,7 +237,7 @@ static int is_entry_point(struct brw_program_instruction *i)
 	assert(i->type == GEN4ASM_INSTRUCTION_LABEL);
 
 	for (p = entry_point_table; p; p = p->next) {
-	    if (strcmp(p->str, i->instruction.label.name) == 0)
+	    if (strcmp(p->str, i->insn.label.name) == 0)
 		return 1;
 	}
 	return 0;
@@ -406,7 +406,7 @@ int main(int argc, char **argv)
 		// insert NOP instructions until (inst_offset+1) % 4 == 0
 		while (((inst_offset+1) % 4) != 0) {
 		    tmp_entry = calloc(sizeof(*tmp_entry), 1);
-		    tmp_entry->instruction.gen.header.opcode = BRW_OPCODE_NOP;
+		    tmp_entry->insn.gen.header.opcode = BRW_OPCODE_NOP;
 		    entry->next = tmp_entry;
 		    tmp_entry->next = entry1;
 		    entry = tmp_entry;
@@ -437,7 +437,7 @@ int main(int argc, char **argv)
 	}
 
 	for (entry = compiled_program.first; entry; entry = entry->next) {
-	    struct relocatable_instruction *reloc = &entry->instruction.reloc;
+	    struct relocatable_instruction *reloc = &entry->insn.reloc;
 	    struct brw_instruction *inst = &reloc->gen;
 
 	    if (!is_relocatable(entry))
@@ -497,9 +497,9 @@ int main(int argc, char **argv)
 		entry = entry1) {
 	    entry1 = entry->next;
 	    if (!is_label(entry))
-		print_instruction(output, &entry->instruction.gen);
+		print_instruction(output, &entry->insn.gen);
 	    else
-		free(entry->instruction.label.name);
+		free(entry->insn.label.name);
 	    free(entry);
 	}
 	if (binary_like_output)
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 69/90] assembler: Unify all instructions to be brw_program_instructions
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (67 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 68/90] assembler: Renamed the instruction field to insn Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 70/90] assembler: Move struct relocation out of relocatable instructions Damien Lespiau
                   ` (21 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Time to finally unify all instructions on the same structure.
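
As a rough sketch of the idea (the sketch_* types and the helper below are
hypothetical and heavily reduced; only the GEN() macro is taken verbatim from
the patch): once every grammar rule yields the same wrapper type, plain and
relocatable instructions can both reach the hardware encoding through GEN(),
and the relocation data sits alongside it in the same union. The sketch
assumes the relocatable variant starts with its gen member, so both views
alias the same bytes.

```c
#include <string.h>

/* Hypothetical, reduced stand-ins for brw_instruction and
 * relocatable_instruction. */
struct sketch_header { unsigned opcode:7; unsigned execution_size:3; };
struct sketch_gen    { struct sketch_header header; };
struct sketch_reloc  {
    struct sketch_gen gen;          /* assumed to be the first member */
    int first_reloc_offset;
};

struct sketch_insn {
    union {
        struct sketch_gen   gen;
        struct sketch_reloc reloc;
    } insn;
};

/* Same macro as the one added to gram.y by this patch. */
#define GEN(i)	(&(i)->insn.gen)

/* Hypothetical helper: fill in a branch-like instruction the way the
 * relocatable grammar actions do after the unification. */
static void
sketch_build_branch(struct sketch_insn *out, unsigned opcode,
                    unsigned execsize, int offset)
{
    memset(out, 0, sizeof(*out));
    GEN(out)->header.opcode = opcode;
    GEN(out)->header.execution_size = execsize;
    out->insn.reloc.first_reloc_offset = offset;
}
```

With a single type, helpers such as set_instruction_dest() can take the
wrapper directly instead of the caller having to pick &$$ or &$$.gen depending
on the rule.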

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |  880 +++++++++++++++++++++++++++---------------------------
 1 files changed, 441 insertions(+), 439 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 67a5da9..f078bfe 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -40,6 +40,8 @@
 
 #define SWIZZLE(reg) (reg.dw1.bits.swizzle)
 
+#define GEN(i)	(&(i)->insn.gen)
+
 #define YYLTYPE YYLTYPE
 typedef struct YYLTYPE
 {
@@ -80,26 +82,26 @@ static struct src_operand ip_src =
 };
 
 static int get_type_size(GLuint type);
-static int set_instruction_dest(struct brw_instruction *instr,
+static int set_instruction_dest(struct brw_program_instruction *instr,
 				struct brw_reg *dest);
-static int set_instruction_src0(struct brw_instruction *instr,
+static int set_instruction_src0(struct brw_program_instruction *instr,
 				struct src_operand *src,
 				YYLTYPE *location);
-static int set_instruction_src1(struct brw_instruction *instr,
+static int set_instruction_src1(struct brw_program_instruction *instr,
 				struct src_operand *src,
 				YYLTYPE *location);
-static int set_instruction_dest_three_src(struct brw_instruction *instr,
+static int set_instruction_dest_three_src(struct brw_program_instruction *instr,
 					  struct brw_reg *dest);
-static int set_instruction_src0_three_src(struct brw_instruction *instr,
+static int set_instruction_src0_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src);
-static int set_instruction_src1_three_src(struct brw_instruction *instr,
+static int set_instruction_src1_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src);
-static int set_instruction_src2_three_src(struct brw_instruction *instr,
+static int set_instruction_src2_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src);
-static void set_instruction_options(struct brw_instruction *instr,
-				    struct brw_instruction *options);
-static void set_instruction_predicate(struct brw_instruction *instr,
-				      struct brw_instruction *predicate);
+static void set_instruction_options(struct brw_program_instruction *instr,
+				    struct brw_program_instruction *options);
+static void set_instruction_predicate(struct brw_program_instruction *instr,
+				      struct brw_program_instruction *p);
 static void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
 				   int type);
 static void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
@@ -194,25 +196,26 @@ static void brw_program_append_entry(struct brw_program *p,
     p->last = entry;
 }
 
-static void brw_program_add_instruction(struct brw_program *p,
-					struct brw_instruction *instruction)
+static void
+brw_program_add_instruction(struct brw_program *p,
+			    struct brw_program_instruction *instruction)
 {
     struct brw_program_instruction *list_entry;
 
     list_entry = calloc(sizeof(struct brw_program_instruction), 1);
     list_entry->type = GEN4ASM_INSTRUCTION_GEN;
-    list_entry->insn.gen = *instruction;
+    list_entry->insn.gen = instruction->insn.gen;
     brw_program_append_entry(p, list_entry);
 }
 
 static void brw_program_add_relocatable(struct brw_program *p,
-					struct relocatable_instruction *reloc)
+					struct brw_program_instruction *reloc)
 {
     struct brw_program_instruction *list_entry;
 
     list_entry = calloc(sizeof(struct brw_program_instruction), 1);
     list_entry->type = GEN4ASM_INSTRUCTION_GEN_RELOCATABLE;
-    list_entry->insn.reloc = *reloc;
+    list_entry->insn.reloc = reloc->insn.reloc;
     brw_program_append_entry(p, list_entry);
 }
 
@@ -385,8 +388,7 @@ static void resolve_subnr(struct brw_reg *reg)
 	char *string;
 	int integer;
 	double number;
-	struct brw_instruction instruction;
-	struct relocatable_instruction relocatable;
+	struct brw_program_instruction instruction;
 	struct brw_program program;
 	struct region region;
 	struct regtype regtype;
@@ -474,9 +476,9 @@ static void resolve_subnr(struct brw_reg *reg)
 %type <instruction> instoptions instoption_list predicate
 %type <instruction> mathinstruction
 %type <instruction> nopinstruction
-%type <relocatable> relocatableinstruction breakinstruction
-%type <relocatable> ifelseinstruction loopinstruction haltinstruction
-%type <relocatable> multibranchinstruction subroutineinstruction jumpinstruction
+%type <instruction> relocatableinstruction breakinstruction
+%type <instruction> ifelseinstruction loopinstruction haltinstruction
+%type <instruction> multibranchinstruction subroutineinstruction jumpinstruction
 %type <string> label
 %type <program> instrseq
 %type <integer> instoption
@@ -707,11 +709,11 @@ ifelseinstruction: ENDIF
 		  if(IS_GENp(6)) // For gen6+.
 		    error(&@1, "should be 'ENDIF execsize relativelocation'\n");
 		  memset(&$$, 0, sizeof($$));
-		  $$.gen.header.opcode = $1;
-		  $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
-		  $$.gen.bits1.da1.dest_horiz_stride = 1;
-		  $$.gen.bits1.da1.src1_reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
-		  $$.gen.bits1.da1.src1_reg_type = BRW_REGISTER_TYPE_UD;
+		  GEN(&$$)->header.opcode = $1;
+		  GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
+		  GEN(&$$)->bits1.da1.dest_horiz_stride = 1;
+		  GEN(&$$)->bits1.da1.src1_reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
+		  GEN(&$$)->bits1.da1.src1_reg_type = BRW_REGISTER_TYPE_UD;
 		}
 		| ENDIF execsize relativelocation instoptions
 		{
@@ -720,10 +722,10 @@ ifelseinstruction: ENDIF
 		  if(!IS_GENp(6)) // for gen6-
 		    error(&@1, "ENDIF Syntax error: should be 'ENDIF'\n");
 		  memset(&$$, 0, sizeof($$));
-		  $$.gen.header.opcode = $1;
-		  $$.gen.header.execution_size = $2;
-		  $$.first_reloc_target = $3.reloc_target;
-		  $$.first_reloc_offset = $3.imm32;
+		  GEN(&$$)->header.opcode = $1;
+		  GEN(&$$)->header.execution_size = $2;
+		  $$.insn.reloc.first_reloc_target = $3.reloc_target;
+		  $$.insn.reloc.first_reloc_offset = $3.imm32;
 		}
 		| ELSE execsize relativelocation instoptions
 		{
@@ -733,20 +735,20 @@ ifelseinstruction: ENDIF
 		    $3.imm32 |= (1 << 16);
 
 		    memset(&$$, 0, sizeof($$));
-		    $$.gen.header.opcode = $1;
-		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
+		    GEN(&$$)->header.opcode = $1;
+		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		    ip_dst.width = $2;
-		    set_instruction_dest(&$$.gen, &ip_dst);
-		    set_instruction_src0(&$$.gen, &ip_src, NULL);
-		    set_instruction_src1(&$$.gen, &$3, NULL);
-		    $$.first_reloc_target = $3.reloc_target;
-		    $$.first_reloc_offset = $3.imm32;
+		    set_instruction_dest(&$$, &ip_dst);
+		    set_instruction_src0(&$$, &ip_src, NULL);
+		    set_instruction_src1(&$$, &$3, NULL);
+		    $$.insn.reloc.first_reloc_target = $3.reloc_target;
+		    $$.insn.reloc.first_reloc_offset = $3.imm32;
 		  } else if(IS_GENp(6)) {
 		    memset(&$$, 0, sizeof($$));
-		    $$.gen.header.opcode = $1;
-		    $$.gen.header.execution_size = $2;
-		    $$.first_reloc_target = $3.reloc_target;
-		    $$.first_reloc_offset = $3.imm32;
+		    GEN(&$$)->header.opcode = $1;
+		    GEN(&$$)->header.execution_size = $2;
+		    $$.insn.reloc.first_reloc_target = $3.reloc_target;
+		    $$.insn.reloc.first_reloc_offset = $3.imm32;
 		  } else {
 		    error(&@1, "'ELSE' instruction is not implemented.\n");
 		  }
@@ -762,17 +764,17 @@ ifelseinstruction: ENDIF
 		    error(&@2, "IF should be 'IF execsize JIP UIP'\n");
 
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$.gen, &$1);
-		  $$.gen.header.opcode = $2;
+		  set_instruction_predicate(&$$, &$1);
+		  GEN(&$$)->header.opcode = $2;
 		  if(!IS_GENp(6)) {
-		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
+		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		    ip_dst.width = $3;
-		    set_instruction_dest(&$$.gen, &ip_dst);
-		    set_instruction_src0(&$$.gen, &ip_src, NULL);
-		    set_instruction_src1(&$$.gen, &$4, NULL);
+		    set_instruction_dest(&$$, &ip_dst);
+		    set_instruction_src0(&$$, &ip_src, NULL);
+		    set_instruction_src1(&$$, &$4, NULL);
 		  }
-		  $$.first_reloc_target = $4.reloc_target;
-		  $$.first_reloc_offset = $4.imm32;
+		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
+		  $$.insn.reloc.first_reloc_offset = $4.imm32;
 		}
 		| predicate IF execsize relativelocation relativelocation
 		{
@@ -781,13 +783,13 @@ ifelseinstruction: ENDIF
 		    error(&@2, "IF should be 'IF execsize relativelocation'\n");
 
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$.gen, &$1);
-		  $$.gen.header.opcode = $2;
-		  $$.gen.header.execution_size = $3;
-		  $$.first_reloc_target = $4.reloc_target;
-		  $$.first_reloc_offset = $4.imm32;
-		  $$.second_reloc_target = $5.reloc_target;
-		  $$.second_reloc_offset = $5.imm32;
+		  set_instruction_predicate(&$$, &$1);
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.execution_size = $3;
+		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
+		  $$.insn.reloc.first_reloc_offset = $4.imm32;
+		  $$.insn.reloc.second_reloc_target = $5.reloc_target;
+		  $$.insn.reloc.second_reloc_offset = $5.imm32;
 		}
 ;
 
@@ -800,25 +802,25 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		     * to the pre-incremented IP.
 		     */
 		    ip_dst.width = $3;
-		    set_instruction_dest(&$$.gen, &ip_dst);
+		    set_instruction_dest(&$$, &ip_dst);
 		    memset(&$$, 0, sizeof($$));
-		    set_instruction_predicate(&$$.gen, &$1);
-		    $$.gen.header.opcode = $2;
-		    $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
-		    set_instruction_src0(&$$.gen, &ip_src, NULL);
-		    set_instruction_src1(&$$.gen, &$4, NULL);
-		    $$.first_reloc_target = $4.reloc_target;
-		    $$.first_reloc_offset = $4.imm32;
+		    set_instruction_predicate(&$$, &$1);
+		    GEN(&$$)->header.opcode = $2;
+		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
+		    set_instruction_src0(&$$, &ip_src, NULL);
+		    set_instruction_src1(&$$, &$4, NULL);
+		    $$.insn.reloc.first_reloc_target = $4.reloc_target;
+		    $$.insn.reloc.first_reloc_offset = $4.imm32;
 		  } else if (IS_GENp(6)) {
 		    /* Gen6 spec:
 		         dest must have the same element size as src0.
 		         dest horizontal stride must be 1. */
 		    memset(&$$, 0, sizeof($$));
-		    set_instruction_predicate(&$$.gen, &$1);
-		    $$.gen.header.opcode = $2;
-		    $$.gen.header.execution_size = $3;
-		    $$.first_reloc_target = $4.reloc_target;
-		    $$.first_reloc_offset = $4.imm32;
+		    set_instruction_predicate(&$$, &$1);
+		    GEN(&$$)->header.opcode = $2;
+		    GEN(&$$)->header.execution_size = $3;
+		    $$.insn.reloc.first_reloc_target = $4.reloc_target;
+		    $$.insn.reloc.first_reloc_offset = $4.imm32;
 		  } else {
 		    error(&@2, "'WHILE' instruction is not implemented!\n");
 		  }
@@ -827,7 +829,7 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		{
 		  // deprecated
 		  memset(&$$, 0, sizeof($$));
-		  $$.gen.header.opcode = $1;
+		  GEN(&$$)->header.opcode = $1;
 		};
 
 haltinstruction: predicate HALT execsize relativelocation relativelocation instoptions
@@ -835,15 +837,15 @@ haltinstruction: predicate HALT execsize relativelocation relativelocation insto
 		  // for Gen6, Gen7
 		  /* Gen6, Gen7 bspec: dst and src0 must be the null reg. */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$.gen, &$1);
-		  $$.gen.header.opcode = $2;
-		  $$.first_reloc_target = $4.reloc_target;
-		  $$.first_reloc_offset = $4.imm32;
-		  $$.second_reloc_target = $5.reloc_target;
-		  $$.second_reloc_offset = $5.imm32;
+		  set_instruction_predicate(&$$, &$1);
+		  GEN(&$$)->header.opcode = $2;
+		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
+		  $$.insn.reloc.first_reloc_offset = $4.imm32;
+		  $$.insn.reloc.second_reloc_target = $5.reloc_target;
+		  $$.insn.reloc.second_reloc_offset = $5.imm32;
 		  dst_null_reg.width = $3;
-		  set_instruction_dest(&$$.gen, &dst_null_reg);
-		  set_instruction_src0(&$$.gen, &src_null_reg, NULL);
+		  set_instruction_dest(&$$, &dst_null_reg);
+		  set_instruction_src0(&$$, &src_null_reg, NULL);
 		};
 
 multibranchinstruction:
@@ -851,28 +853,28 @@ multibranchinstruction:
 		{
 		  /* Gen7 bspec: dest must be null. use Switch option */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$.gen, &$1);
-		  $$.gen.header.opcode = $2;
-		  $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
-		  $$.first_reloc_target = $4.reloc_target;
-		  $$.first_reloc_offset = $4.imm32;
+		  set_instruction_predicate(&$$, &$1);
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
+		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
+		  $$.insn.reloc.first_reloc_offset = $4.imm32;
 		  dst_null_reg.width = $3;
-		  set_instruction_dest(&$$.gen, &dst_null_reg);
+		  set_instruction_dest(&$$, &dst_null_reg);
 		}
 		| predicate BRC execsize relativelocation relativelocation instoptions
 		{
 		  /* Gen7 bspec: dest must be null. src0 must be null. use Switch option */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$.gen, &$1);
-		  $$.gen.header.opcode = $2;
-		  $$.gen.header.thread_control |= BRW_THREAD_SWITCH;
-		  $$.first_reloc_target = $4.reloc_target;
-		  $$.first_reloc_offset = $4.imm32;
-		  $$.second_reloc_target = $5.reloc_target;
-		  $$.second_reloc_offset = $5.imm32;
+		  set_instruction_predicate(&$$, &$1);
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
+		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
+		  $$.insn.reloc.first_reloc_offset = $4.imm32;
+		  $$.insn.reloc.second_reloc_target = $5.reloc_target;
+		  $$.insn.reloc.second_reloc_offset = $5.imm32;
 		  dst_null_reg.width = $3;
-		  set_instruction_dest(&$$.gen, &dst_null_reg);
-		  set_instruction_src0(&$$.gen, &src_null_reg, NULL);
+		  set_instruction_dest(&$$, &dst_null_reg);
+		  set_instruction_src0(&$$, &src_null_reg, NULL);
 		}
 ;
 
@@ -894,12 +896,12 @@ subroutineinstruction:
 		       execution size must be 2.
 		   */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$.gen, &$1);
-		  $$.gen.header.opcode = $2;
+		  set_instruction_predicate(&$$, &$1);
+		  GEN(&$$)->header.opcode = $2;
 
 		  $4.type = BRW_REGISTER_TYPE_D; /* dest type should be DWORD */
 		  $4.width = 1; /* execution size must be 2. Here 1 is encoded 2. */
-		  set_instruction_dest(&$$.gen, &$4);
+		  set_instruction_dest(&$$, &$4);
 
 		  struct src_operand src0;
 		  memset(&src0, 0, sizeof(src0));
@@ -908,10 +910,10 @@ subroutineinstruction:
 		  src0.reg.hstride = 1; /*encoded 1*/
 		  src0.reg.width = 1; /*encoded 2*/
 		  src0.reg.vstride = 2; /*encoded 2*/
-		  set_instruction_src0(&$$.gen, &src0, NULL);
+		  set_instruction_src0(&$$, &src0, NULL);
 
-		  $$.first_reloc_target = $5.reloc_target;
-		  $$.first_reloc_offset = $5.imm32;
+		  $$.insn.reloc.first_reloc_target = $5.reloc_target;
+		  $$.insn.reloc.first_reloc_offset = $5.imm32;
 		}
 		| predicate RET execsize dstoperandex src instoptions
 		{
@@ -922,15 +924,15 @@ subroutineinstruction:
 		       src0 region control must be <2,2,1> (not specified clearly. should be same as CALL)
 		   */
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$.gen, &$1);
-		  $$.gen.header.opcode = $2;
+		  set_instruction_predicate(&$$, &$1);
+		  GEN(&$$)->header.opcode = $2;
 		  dst_null_reg.width = 1; /* execution size of RET should be 2 */
-		  set_instruction_dest(&$$.gen, &dst_null_reg);
+		  set_instruction_dest(&$$, &dst_null_reg);
 		  $5.reg.type = BRW_REGISTER_TYPE_D;
 		  $5.reg.hstride = 1; /*encoded 1*/
 		  $5.reg.width = 1; /*encoded 2*/
 		  $5.reg.vstride = 2; /*encoded 2*/
-		  set_instruction_src0(&$$.gen, &$5, NULL);
+		  set_instruction_src0(&$$, &$5, NULL);
 		}
 ;
 
@@ -939,9 +941,9 @@ unaryinstruction:
 		dst srcaccimm instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-		  $$.header.destreg__conditionalmod = $3.cond;
-		  $$.header.saturate = $4;
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
+		  GEN(&$$)->header.saturate = $4;
 		  $6.width = $5;
 		  set_instruction_options(&$$, &$8);
 		  set_instruction_predicate(&$$, &$1);
@@ -951,20 +953,20 @@ unaryinstruction:
 		    YYERROR;
 
 		  if ($3.flag_subreg_nr != -1) {
-		    if ($$.header.predicate_control != BRW_PREDICATE_NONE &&
-                        ($1.bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
-                         $1.bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
+		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
+                        (GEN(&$1)->bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
+                         GEN(&$1)->bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
                         warn(ALWAYS, &@3, "must use the same flag register if "
 			     "both prediction and conditional modifier are "
 			     "enabled\n");
 
-		    $$.bits2.da1.flag_reg_nr = $3.flag_reg_nr;
-		    $$.bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
+		    GEN(&$$)->bits2.da1.flag_reg_nr = $3.flag_reg_nr;
+		    GEN(&$$)->bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
 		  }
 
 		  if (!IS_GENp(6) && 
-				get_type_size($$.bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
-		    $$.header.compression_control = BRW_COMPRESSION_COMPRESSED;
+				get_type_size(GEN(&$$)->bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
+		    GEN(&$$)->header.compression_control = BRW_COMPRESSION_COMPRESSED;
 		}
 ;
 
@@ -978,9 +980,9 @@ binaryinstruction:
 		dst src srcimm instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-		  $$.header.destreg__conditionalmod = $3.cond;
-		  $$.header.saturate = $4;
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
+		  GEN(&$$)->header.saturate = $4;
 		  set_instruction_options(&$$, &$9);
 		  set_instruction_predicate(&$$, &$1);
 		  $6.width = $5;
@@ -992,20 +994,20 @@ binaryinstruction:
 		    YYERROR;
 
 		  if ($3.flag_subreg_nr != -1) {
-		    if ($$.header.predicate_control != BRW_PREDICATE_NONE &&
-                        ($1.bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
-                         $1.bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
+		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
+                        (GEN(&$1)->bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
+                         GEN(&$1)->bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
                         warn(ALWAYS, &@3, "must use the same flag register if "
 			     "both prediction and conditional modifier are "
 			     "enabled\n");
 
-		    $$.bits2.da1.flag_reg_nr = $3.flag_reg_nr;
-		    $$.bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
+		    GEN(&$$)->bits2.da1.flag_reg_nr = $3.flag_reg_nr;
+		    GEN(&$$)->bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
 		  }
 
 		  if (!IS_GENp(6) && 
-				get_type_size($$.bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
-		    $$.header.compression_control = BRW_COMPRESSION_COMPRESSED;
+				get_type_size(GEN(&$$)->bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
+		    GEN(&$$)->header.compression_control = BRW_COMPRESSION_COMPRESSED;
 		}
 ;
 
@@ -1019,9 +1021,9 @@ binaryaccinstruction:
 		dst srcacc srcimm instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-		  $$.header.destreg__conditionalmod = $3.cond;
-		  $$.header.saturate = $4;
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
+		  GEN(&$$)->header.saturate = $4;
 		  $6.width = $5;
 		  set_instruction_options(&$$, &$9);
 		  set_instruction_predicate(&$$, &$1);
@@ -1033,20 +1035,20 @@ binaryaccinstruction:
 		    YYERROR;
 
 		  if ($3.flag_subreg_nr != -1) {
-		    if ($$.header.predicate_control != BRW_PREDICATE_NONE &&
-                        ($1.bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
-                         $1.bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
+		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
+                        (GEN(&$1)->bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
+                         GEN(&$1)->bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
                         warn(ALWAYS, &@3, "must use the same flag register if "
 			     "both prediction and conditional modifier are "
 			     "enabled\n");
 
-		    $$.bits2.da1.flag_reg_nr = $3.flag_reg_nr;
-		    $$.bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
+		    GEN(&$$)->bits2.da1.flag_reg_nr = $3.flag_reg_nr;
+		    GEN(&$$)->bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
 		  }
 
 		  if (!IS_GENp(6) && 
-				get_type_size($$.bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
-		    $$.header.compression_control = BRW_COMPRESSION_COMPRESSED;
+				get_type_size(GEN(&$$)->bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
+		    GEN(&$$)->header.compression_control = BRW_COMPRESSION_COMPRESSED;
 		}
 ;
 
@@ -1063,15 +1065,15 @@ trinaryinstruction:
 {
 		  memset(&$$, 0, sizeof($$));
 
-		  $$.header.predicate_control = $1.header.predicate_control;
-		  $$.header.predicate_inverse = $1.header.predicate_inverse;
-		  $$.bits1.da3src.flag_reg_nr = $1.bits2.da1.flag_reg_nr;
-		  $$.bits1.da3src.flag_subreg_nr = $1.bits2.da1.flag_subreg_nr;
+		  GEN(&$$)->header.predicate_control = GEN(&$1)->header.predicate_control;
+		  GEN(&$$)->header.predicate_inverse = GEN(&$1)->header.predicate_inverse;
+		  GEN(&$$)->bits1.da3src.flag_reg_nr = GEN(&$1)->bits2.da1.flag_reg_nr;
+		  GEN(&$$)->bits1.da3src.flag_subreg_nr = GEN(&$1)->bits2.da1.flag_subreg_nr;
 
-		  $$.header.opcode = $2;
-		  $$.header.destreg__conditionalmod = $3.cond;
-		  $$.header.saturate = $4;
-		  $$.header.execution_size = $5;
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
+		  GEN(&$$)->header.saturate = $4;
+		  GEN(&$$)->header.execution_size = $5;
 
 		  if (set_instruction_dest_three_src(&$$, &$6))
 		    YYERROR;
@@ -1084,9 +1086,9 @@ trinaryinstruction:
 		  set_instruction_options(&$$, &$10);
 
 		  if ($3.flag_subreg_nr != -1) {
-		    if ($$.header.predicate_control != BRW_PREDICATE_NONE &&
-                        ($1.bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
-                         $1.bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
+		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
+                        (GEN(&$1)->bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
+                         GEN(&$1)->bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
                         warn(ALWAYS, &@3, "must use the same flag register if "
 			     "both prediction and conditional modifier are "
 			     "enabled\n");
@@ -1107,9 +1109,9 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		   * implicitly loaded if non-null.
 		   */
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
+		  GEN(&$$)->header.opcode = $2;
 		  $5.width = $3;
-		  $$.header.destreg__conditionalmod = $4; /* msg reg index */
+		  GEN(&$$)->header.destreg__conditionalmod = $4; /* msg reg index */
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$5) != 0)
 		    YYERROR;
@@ -1134,37 +1136,37 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                           YYERROR;
 		  }
 
-		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.bits1.da1.src1_reg_type = BRW_REGISTER_TYPE_D;
+		  GEN(&$$)->bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
+		  GEN(&$$)->bits1.da1.src1_reg_type = BRW_REGISTER_TYPE_D;
 
 		  if (IS_GENp(5)) {
                       if (IS_GENp(6)) {
-                          $$.header.destreg__conditionalmod = $7.bits2.send_gen5.sfid;
+                          GEN(&$$)->header.destreg__conditionalmod = GEN(&$7)->bits2.send_gen5.sfid;
                       } else {
-                          $$.header.destreg__conditionalmod = $4; /* msg reg index */
-                          $$.bits2.send_gen5.sfid = $7.bits2.send_gen5.sfid;
-                          $$.bits2.send_gen5.end_of_thread = $12.bits3.generic_gen5.end_of_thread;
+                          GEN(&$$)->header.destreg__conditionalmod = $4; /* msg reg index */
+                          GEN(&$$)->bits2.send_gen5.sfid = GEN(&$7)->bits2.send_gen5.sfid;
+                          GEN(&$$)->bits2.send_gen5.end_of_thread = GEN(&$12)->bits3.generic_gen5.end_of_thread;
                       }
 
-                      $$.bits3.generic_gen5 = $7.bits3.generic_gen5;
-                      $$.bits3.generic_gen5.msg_length = $9;
-                      $$.bits3.generic_gen5.response_length = $11;
-                      $$.bits3.generic_gen5.end_of_thread =
-                          $12.bits3.generic_gen5.end_of_thread;
+                      GEN(&$$)->bits3.generic_gen5 = GEN(&$7)->bits3.generic_gen5;
+                      GEN(&$$)->bits3.generic_gen5.msg_length = $9;
+                      GEN(&$$)->bits3.generic_gen5.response_length = $11;
+                      GEN(&$$)->bits3.generic_gen5.end_of_thread =
+                          GEN(&$12)->bits3.generic_gen5.end_of_thread;
 		  } else {
-                      $$.header.destreg__conditionalmod = $4; /* msg reg index */
-                      $$.bits3.generic = $7.bits3.generic;
-                      $$.bits3.generic.msg_length = $9;
-                      $$.bits3.generic.response_length = $11;
-                      $$.bits3.generic.end_of_thread =
-                          $12.bits3.generic.end_of_thread;
+                      GEN(&$$)->header.destreg__conditionalmod = $4; /* msg reg index */
+                      GEN(&$$)->bits3.generic = GEN(&$7)->bits3.generic;
+                      GEN(&$$)->bits3.generic.msg_length = $9;
+                      GEN(&$$)->bits3.generic.response_length = $11;
+                      GEN(&$$)->bits3.generic.end_of_thread =
+                          GEN(&$12)->bits3.generic.end_of_thread;
 		  }
 		}
 		| predicate SEND execsize dst sendleadreg payload directsrcoperand instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 
@@ -1187,8 +1189,8 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 			   "type=%d\n", $7.reg.dw1.ud, $7.reg.type);
 		  }
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 		  $4.width = $3;
@@ -1196,9 +1198,9 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$6, &@6) != 0)
 		    YYERROR;
-		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.bits1.da1.src1_reg_type = $7.reg.type;
-		  $$.bits3.ud = $7.reg.dw1.ud;
+		  GEN(&$$)->bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
+		  GEN(&$$)->bits1.da1.src1_reg_type = $7.reg.type;
+		  GEN(&$$)->bits3.ud = $7.reg.dw1.ud;
                 }
 		| predicate SEND execsize dst sendleadreg sndopr imm32reg instoptions
 		{
@@ -1215,8 +1217,8 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  }
 
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-                  $$.header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
+		  GEN(&$$)->header.opcode = $2;
+                  GEN(&$$)->header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
 		  set_instruction_predicate(&$$, &$1);
 
 		  $4.width = $3;
@@ -1238,10 +1240,10 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                   src0.reg.subnr = 0;
                   set_instruction_src0(&$$, &src0, NULL);
 
-		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.bits1.da1.src1_reg_type = $7.reg.type;
-                  $$.bits3.ud = $7.reg.dw1.ud;
-                  $$.bits3.generic_gen5.end_of_thread = !!($6 & EX_DESC_EOT_MASK);
+		  GEN(&$$)->bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
+		  GEN(&$$)->bits1.da1.src1_reg_type = $7.reg.type;
+                  GEN(&$$)->bits3.ud = $7.reg.dw1.ud;
+                  GEN(&$$)->bits3.generic_gen5.end_of_thread = !!($6 & EX_DESC_EOT_MASK);
 		}
 		| predicate SEND execsize dst sendleadreg sndopr directsrcoperand instoptions
 		{
@@ -1258,8 +1260,8 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  }
 
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-                  $$.header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
+		  GEN(&$$)->header.opcode = $2;
+                  GEN(&$$)->header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
 		  set_instruction_predicate(&$$, &$1);
 
 		  $4.width = $3;
@@ -1282,7 +1284,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                   set_instruction_src0(&$$, &src0, NULL);
 
                   set_instruction_src1(&$$, &$7, &@7);
-                  $$.bits3.generic_gen5.end_of_thread = !!($6 & EX_DESC_EOT_MASK);
+                  GEN(&$$)->bits3.generic_gen5.end_of_thread = !!($6 & EX_DESC_EOT_MASK);
 		}
 		| predicate SEND execsize dst sendleadreg payload sndopr imm32reg instoptions
 		{
@@ -1293,8 +1295,8 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 			  "type=%d\n", $8.reg.dw1.ud, $8.reg.type);
 		  }
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 		  $4.width = $3;
@@ -1302,21 +1304,21 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$6, &@6) != 0)
 		    YYERROR;
-		  $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  $$.bits1.da1.src1_reg_type = $8.reg.type;
+		  GEN(&$$)->bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
+		  GEN(&$$)->bits1.da1.src1_reg_type = $8.reg.type;
 		  if (IS_GENx(5)) {
-		      $$.bits2.send_gen5.sfid = ($7 & EX_DESC_SFID_MASK);
-		      $$.bits3.ud = $8.reg.dw1.ud;
-		      $$.bits3.generic_gen5.end_of_thread = !!($7 & EX_DESC_EOT_MASK);
+		      GEN(&$$)->bits2.send_gen5.sfid = ($7 & EX_DESC_SFID_MASK);
+		      GEN(&$$)->bits3.ud = $8.reg.dw1.ud;
+		      GEN(&$$)->bits3.generic_gen5.end_of_thread = !!($7 & EX_DESC_EOT_MASK);
 		  }
 		  else
-		      $$.bits3.ud = $8.reg.dw1.ud;
+		      GEN(&$$)->bits3.ud = $8.reg.dw1.ud;
 		}
 		| predicate SEND execsize dst sendleadreg payload exp directsrcoperand instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-		  $$.header.destreg__conditionalmod = $5.nr; /* msg reg index */
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
 
@@ -1329,7 +1331,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  if (set_instruction_src1(&$$, &$8, &@8) != 0)
 		    YYERROR;
 		  if (IS_GENx(5)) {
-                      $$.bits2.send_gen5.sfid = $7;
+                      GEN(&$$)->bits2.send_gen5.sfid = $7;
 		  }
 		}
 		
@@ -1349,24 +1351,24 @@ jumpinstruction: predicate JMPI execsize relativelocation2
 		   * is the post-incremented IP plus the offset.
 		   */
 		  memset(&$$, 0, sizeof($$));
-		  $$.gen.header.opcode = $2;
+		  GEN(&$$)->header.opcode = $2;
 		  if(advanced_flag)
-			$$.gen.header.mask_control = BRW_MASK_DISABLE;
-		  set_instruction_predicate(&$$.gen, &$1);
+			GEN(&$$)->header.mask_control = BRW_MASK_DISABLE;
+		  set_instruction_predicate(&$$, &$1);
 		  ip_dst.width = ffs(1) - 1;
-		  set_instruction_dest(&$$.gen, &ip_dst);
-		  set_instruction_src0(&$$.gen, &ip_src, NULL);
-		  set_instruction_src1(&$$.gen, &$4, NULL);
-		  $$.first_reloc_target = $4.reloc_target;
-		  $$.first_reloc_offset = $4.imm32;
+		  set_instruction_dest(&$$, &ip_dst);
+		  set_instruction_src0(&$$, &ip_src, NULL);
+		  set_instruction_src1(&$$, &$4, NULL);
+		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
+		  $$.insn.reloc.first_reloc_offset = $4.imm32;
 		}
 ;
 
 mathinstruction: predicate MATH_INST execsize dst src srcimm math_function instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
-		  $$.header.destreg__conditionalmod = $7;
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.destreg__conditionalmod = $7;
 		  set_instruction_options(&$$, &$8);
 		  set_instruction_predicate(&$$, &$1);
 		  $4.width = $3;
@@ -1383,13 +1385,13 @@ breakinstruction: predicate breakop execsize relativelocation relativelocation i
 		{
 		  // for Gen6, Gen7
 		  memset(&$$, 0, sizeof($$));
-		  set_instruction_predicate(&$$.gen, &$1);
-		  $$.gen.header.opcode = $2;
-		  $$.gen.header.execution_size = $3;
-		  $$.first_reloc_target = $4.reloc_target;
-		  $$.first_reloc_offset = $4.imm32;
-		  $$.second_reloc_target = $5.reloc_target;
-		  $$.second_reloc_offset = $5.imm32;
+		  set_instruction_predicate(&$$, &$1);
+		  GEN(&$$)->header.opcode = $2;
+		  GEN(&$$)->header.execution_size = $3;
+		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
+		  $$.insn.reloc.first_reloc_offset = $4.imm32;
+		  $$.insn.reloc.second_reloc_target = $5.reloc_target;
+		  $$.insn.reloc.second_reloc_offset = $5.imm32;
 		}
 ;
 
@@ -1407,7 +1409,7 @@ syncinstruction: predicate WAIT notifyreg
 		  struct src_operand notify_src;
 
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $2;
+		  GEN(&$$)->header.opcode = $2;
 		  set_direct_dst_operand(&notify_dst, &$3, BRW_REGISTER_TYPE_D);
 		  notify_dst.width = ffs(1) - 1;
 		  set_instruction_dest(&$$, &notify_dst);
@@ -1421,7 +1423,7 @@ syncinstruction: predicate WAIT notifyreg
 nopinstruction: NOP
 		{
 		  memset(&$$, 0, sizeof($$));
-		  $$.header.opcode = $1;
+		  GEN(&$$)->header.opcode = $1;
 		};
 
 /* XXX! */
@@ -1434,42 +1436,42 @@ post_dst:	dst
 msgtarget:	NULL_TOKEN
 		{
 		  if (IS_GENp(5)) {
-                      $$.bits2.send_gen5.sfid= BRW_SFID_NULL;
-                      $$.bits3.generic_gen5.header_present = 0;  /* ??? */
+                      GEN(&$$)->bits2.send_gen5.sfid= BRW_SFID_NULL;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 0;  /* ??? */
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_SFID_NULL;
+                      GEN(&$$)->bits3.generic.msg_target = BRW_SFID_NULL;
 		  }
 		}
 		| SAMPLER LPAREN INTEGER COMMA INTEGER COMMA
 		sampler_datatype RPAREN
 		{
 		  if (IS_GENp(7)) {
-                      $$.bits2.send_gen5.sfid = BRW_SFID_SAMPLER;
-                      $$.bits3.generic_gen5.header_present = 1;   /* ??? */
-                      $$.bits3.sampler_gen7.binding_table_index = $3;
-                      $$.bits3.sampler_gen7.sampler = $5;
-                      $$.bits3.sampler_gen7.simd_mode = 2; /* SIMD16, maybe we should add a new parameter */
+                      GEN(&$$)->bits2.send_gen5.sfid = BRW_SFID_SAMPLER;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;   /* ??? */
+                      GEN(&$$)->bits3.sampler_gen7.binding_table_index = $3;
+                      GEN(&$$)->bits3.sampler_gen7.sampler = $5;
+                      GEN(&$$)->bits3.sampler_gen7.simd_mode = 2; /* SIMD16, maybe we should add a new parameter */
 		  } else if (IS_GENp(5)) {
-                      $$.bits2.send_gen5.sfid = BRW_SFID_SAMPLER;
-                      $$.bits3.generic_gen5.header_present = 1;   /* ??? */
-                      $$.bits3.sampler_gen5.binding_table_index = $3;
-                      $$.bits3.sampler_gen5.sampler = $5;
-                      $$.bits3.sampler_gen5.simd_mode = 2; /* SIMD16, maybe we should add a new parameter */
+                      GEN(&$$)->bits2.send_gen5.sfid = BRW_SFID_SAMPLER;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;   /* ??? */
+                      GEN(&$$)->bits3.sampler_gen5.binding_table_index = $3;
+                      GEN(&$$)->bits3.sampler_gen5.sampler = $5;
+                      GEN(&$$)->bits3.sampler_gen5.simd_mode = 2; /* SIMD16, maybe we should add a new parameter */
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_SFID_SAMPLER;	
-                      $$.bits3.sampler.binding_table_index = $3;
-                      $$.bits3.sampler.sampler = $5;
+                      GEN(&$$)->bits3.generic.msg_target = BRW_SFID_SAMPLER;
+                      GEN(&$$)->bits3.sampler.binding_table_index = $3;
+                      GEN(&$$)->bits3.sampler.sampler = $5;
                       switch ($7) {
                       case TYPE_F:
-                          $$.bits3.sampler.return_format =
+                          GEN(&$$)->bits3.sampler.return_format =
                               BRW_SAMPLER_RETURN_FORMAT_FLOAT32;
                           break;
                       case TYPE_UD:
-                          $$.bits3.sampler.return_format =
+                          GEN(&$$)->bits3.sampler.return_format =
                               BRW_SAMPLER_RETURN_FORMAT_UINT32;
                           break;
                       case TYPE_D:
-                          $$.bits3.sampler.return_format =
+                          GEN(&$$)->bits3.sampler.return_format =
                               BRW_SAMPLER_RETURN_FORMAT_SINT32;
                           break;
                       }
@@ -1480,208 +1482,208 @@ msgtarget:	NULL_TOKEN
 		  if (IS_GENp(6)) {
                       error (&@1, "Gen6+ doesn't have math function\n");
 		  } else if (IS_GENx(5)) {
-                      $$.bits2.send_gen5.sfid = BRW_SFID_MATH;
-                      $$.bits3.generic_gen5.header_present = 0;
-                      $$.bits3.math_gen5.function = $2;
+                      GEN(&$$)->bits2.send_gen5.sfid = BRW_SFID_MATH;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 0;
+                      GEN(&$$)->bits3.math_gen5.function = $2;
                       if ($3 == BRW_INSTRUCTION_SATURATE)
-                          $$.bits3.math_gen5.saturate = 1;
+                          GEN(&$$)->bits3.math_gen5.saturate = 1;
                       else
-                          $$.bits3.math_gen5.saturate = 0;
-                      $$.bits3.math_gen5.int_type = $4;
-                      $$.bits3.math_gen5.precision = BRW_MATH_PRECISION_FULL;
-                      $$.bits3.math_gen5.data_type = $5;
+                          GEN(&$$)->bits3.math_gen5.saturate = 0;
+                      GEN(&$$)->bits3.math_gen5.int_type = $4;
+                      GEN(&$$)->bits3.math_gen5.precision = BRW_MATH_PRECISION_FULL;
+                      GEN(&$$)->bits3.math_gen5.data_type = $5;
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_SFID_MATH;
-                      $$.bits3.math.function = $2;
+                      GEN(&$$)->bits3.generic.msg_target = BRW_SFID_MATH;
+                      GEN(&$$)->bits3.math.function = $2;
                       if ($3 == BRW_INSTRUCTION_SATURATE)
-                          $$.bits3.math.saturate = 1;
+                          GEN(&$$)->bits3.math.saturate = 1;
                       else
-                          $$.bits3.math.saturate = 0;
-                      $$.bits3.math.int_type = $4;
-                      $$.bits3.math.precision = BRW_MATH_PRECISION_FULL;
-                      $$.bits3.math.data_type = $5;
+                          GEN(&$$)->bits3.math.saturate = 0;
+                      GEN(&$$)->bits3.math.int_type = $4;
+                      GEN(&$$)->bits3.math.precision = BRW_MATH_PRECISION_FULL;
+                      GEN(&$$)->bits3.math.data_type = $5;
 		  }
 		}
 		| GATEWAY
 		{
 		  if (IS_GENp(5)) {
-                      $$.bits2.send_gen5.sfid = BRW_SFID_MESSAGE_GATEWAY;
-                      $$.bits3.generic_gen5.header_present = 0;  /* ??? */
+                      GEN(&$$)->bits2.send_gen5.sfid = BRW_SFID_MESSAGE_GATEWAY;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 0;  /* ??? */
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_SFID_MESSAGE_GATEWAY;
+                      GEN(&$$)->bits3.generic.msg_target = BRW_SFID_MESSAGE_GATEWAY;
 		  }
 		}
 		| READ  LPAREN INTEGER COMMA INTEGER COMMA INTEGER COMMA
                 INTEGER RPAREN
 		{
 		  if (IS_GENx(7)) {
-                      $$.bits2.send_gen5.sfid = 
+                      GEN(&$$)->bits2.send_gen5.sfid =
                           GEN6_SFID_DATAPORT_SAMPLER_CACHE;
-                      $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.gen7_dp.binding_table_index = $3;
-                      $$.bits3.gen7_dp.msg_control = $7;
-                      $$.bits3.gen7_dp.msg_type = $9;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;
+                      GEN(&$$)->bits3.gen7_dp.binding_table_index = $3;
+                      GEN(&$$)->bits3.gen7_dp.msg_control = $7;
+                      GEN(&$$)->bits3.gen7_dp.msg_type = $9;
 		  } else if (IS_GENx(6)) {
-                      $$.bits2.send_gen5.sfid = 
+                      GEN(&$$)->bits2.send_gen5.sfid =
                           GEN6_SFID_DATAPORT_SAMPLER_CACHE;
-                      $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.gen6_dp_sampler_const_cache.binding_table_index = $3;
-                      $$.bits3.gen6_dp_sampler_const_cache.msg_control = $7;
-                      $$.bits3.gen6_dp_sampler_const_cache.msg_type = $9;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;
+                      GEN(&$$)->bits3.gen6_dp_sampler_const_cache.binding_table_index = $3;
+                      GEN(&$$)->bits3.gen6_dp_sampler_const_cache.msg_control = $7;
+                      GEN(&$$)->bits3.gen6_dp_sampler_const_cache.msg_type = $9;
 		  } else if (IS_GENx(5)) {
-                      $$.bits2.send_gen5.sfid = 
+                      GEN(&$$)->bits2.send_gen5.sfid =
                           BRW_SFID_DATAPORT_READ;
-                      $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.dp_read_gen5.binding_table_index = $3;
-                      $$.bits3.dp_read_gen5.target_cache = $5;
-                      $$.bits3.dp_read_gen5.msg_control = $7;
-                      $$.bits3.dp_read_gen5.msg_type = $9;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;
+                      GEN(&$$)->bits3.dp_read_gen5.binding_table_index = $3;
+                      GEN(&$$)->bits3.dp_read_gen5.target_cache = $5;
+                      GEN(&$$)->bits3.dp_read_gen5.msg_control = $7;
+                      GEN(&$$)->bits3.dp_read_gen5.msg_type = $9;
 		  } else {
-                      $$.bits3.generic.msg_target =
+                      GEN(&$$)->bits3.generic.msg_target =
                           BRW_SFID_DATAPORT_READ;
-                      $$.bits3.dp_read.binding_table_index = $3;
-                      $$.bits3.dp_read.target_cache = $5;
-                      $$.bits3.dp_read.msg_control = $7;
-                      $$.bits3.dp_read.msg_type = $9;
+                      GEN(&$$)->bits3.dp_read.binding_table_index = $3;
+                      GEN(&$$)->bits3.dp_read.target_cache = $5;
+                      GEN(&$$)->bits3.dp_read.msg_control = $7;
+                      GEN(&$$)->bits3.dp_read.msg_type = $9;
 		  }
 		}
 		| WRITE LPAREN INTEGER COMMA INTEGER COMMA INTEGER COMMA
 		INTEGER RPAREN
 		{
 		  if (IS_GENx(7)) {
-                      $$.bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
-                      $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.gen7_dp.binding_table_index = $3;
-                      $$.bits3.gen7_dp.msg_control = $5;
-                      $$.bits3.gen7_dp.msg_type = $7;
+                      GEN(&$$)->bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;
+                      GEN(&$$)->bits3.gen7_dp.binding_table_index = $3;
+                      GEN(&$$)->bits3.gen7_dp.msg_control = $5;
+                      GEN(&$$)->bits3.gen7_dp.msg_type = $7;
                   } else if (IS_GENx(6)) {
-                      $$.bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
+                      GEN(&$$)->bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
                       /* Sandybridge supports headerlesss message for render target write.
                        * Currently the GFX assembler doesn't support it. so the program must provide 
                        * message header
                        */
-                      $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.gen6_dp.binding_table_index = $3;
-                      $$.bits3.gen6_dp.msg_control = $5;
-                     $$.bits3.gen6_dp.msg_type = $7;
-                      $$.bits3.gen6_dp.send_commit_msg = $9;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;
+                      GEN(&$$)->bits3.gen6_dp.binding_table_index = $3;
+                      GEN(&$$)->bits3.gen6_dp.msg_control = $5;
+                     GEN(&$$)->bits3.gen6_dp.msg_type = $7;
+                      GEN(&$$)->bits3.gen6_dp.send_commit_msg = $9;
 		  } else if (IS_GENx(5)) {
-                      $$.bits2.send_gen5.sfid =
+                      GEN(&$$)->bits2.send_gen5.sfid =
                           BRW_SFID_DATAPORT_WRITE;
-                      $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.dp_write_gen5.binding_table_index = $3;
-                      $$.bits3.dp_write_gen5.last_render_target = ($5 & 0x8) >> 3;
-                      $$.bits3.dp_write_gen5.msg_control = $5 & 0x7;
-                      $$.bits3.dp_write_gen5.msg_type = $7;
-                      $$.bits3.dp_write_gen5.send_commit_msg = $9;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;
+                      GEN(&$$)->bits3.dp_write_gen5.binding_table_index = $3;
+                      GEN(&$$)->bits3.dp_write_gen5.last_render_target = ($5 & 0x8) >> 3;
+                      GEN(&$$)->bits3.dp_write_gen5.msg_control = $5 & 0x7;
+                      GEN(&$$)->bits3.dp_write_gen5.msg_type = $7;
+                      GEN(&$$)->bits3.dp_write_gen5.send_commit_msg = $9;
 		  } else {
-                      $$.bits3.generic.msg_target =
+                      GEN(&$$)->bits3.generic.msg_target =
                           BRW_SFID_DATAPORT_WRITE;
-                      $$.bits3.dp_write.binding_table_index = $3;
+                      GEN(&$$)->bits3.dp_write.binding_table_index = $3;
                       /* The msg control field of brw_struct.h is split into
                        * msg control and last_render_target, even though
                        * last_render_target isn't common to all write messages.
                        */
-                      $$.bits3.dp_write.last_render_target = ($5 & 0x8) >> 3;
-                      $$.bits3.dp_write.msg_control = $5 & 0x7;
-                      $$.bits3.dp_write.msg_type = $7;
-                      $$.bits3.dp_write.send_commit_msg = $9;
+                      GEN(&$$)->bits3.dp_write.last_render_target = ($5 & 0x8) >> 3;
+                      GEN(&$$)->bits3.dp_write.msg_control = $5 & 0x7;
+                      GEN(&$$)->bits3.dp_write.msg_type = $7;
+                      GEN(&$$)->bits3.dp_write.send_commit_msg = $9;
 		  }
 		}
 		| WRITE LPAREN INTEGER COMMA INTEGER COMMA INTEGER COMMA
 		INTEGER COMMA INTEGER RPAREN
 		{
 		  if (IS_GENx(7)) {
-                      $$.bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
-                      $$.bits3.generic_gen5.header_present = ($11 != 0);
-                      $$.bits3.gen7_dp.binding_table_index = $3;
-                      $$.bits3.gen7_dp.msg_control = $5;
-                      $$.bits3.gen7_dp.msg_type = $7;
+                      GEN(&$$)->bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
+                      GEN(&$$)->bits3.generic_gen5.header_present = ($11 != 0);
+                      GEN(&$$)->bits3.gen7_dp.binding_table_index = $3;
+                      GEN(&$$)->bits3.gen7_dp.msg_control = $5;
+                      GEN(&$$)->bits3.gen7_dp.msg_type = $7;
 		  } else if (IS_GENx(6)) {
-                      $$.bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
-                      $$.bits3.generic_gen5.header_present = ($11 != 0);
-                      $$.bits3.gen6_dp.binding_table_index = $3;
-                      $$.bits3.gen6_dp.msg_control = $5;
-                     $$.bits3.gen6_dp.msg_type = $7;
-                      $$.bits3.gen6_dp.send_commit_msg = $9;
+                      GEN(&$$)->bits2.send_gen5.sfid = GEN6_SFID_DATAPORT_RENDER_CACHE;
+                      GEN(&$$)->bits3.generic_gen5.header_present = ($11 != 0);
+                      GEN(&$$)->bits3.gen6_dp.binding_table_index = $3;
+                      GEN(&$$)->bits3.gen6_dp.msg_control = $5;
+                     GEN(&$$)->bits3.gen6_dp.msg_type = $7;
+                      GEN(&$$)->bits3.gen6_dp.send_commit_msg = $9;
 		  } else if (IS_GENx(5)) {
-                      $$.bits2.send_gen5.sfid =
+                      GEN(&$$)->bits2.send_gen5.sfid =
                           BRW_SFID_DATAPORT_WRITE;
-                      $$.bits3.generic_gen5.header_present = ($11 != 0);
-                      $$.bits3.dp_write_gen5.binding_table_index = $3;
-                      $$.bits3.dp_write_gen5.last_render_target = ($5 & 0x8) >> 3;
-                      $$.bits3.dp_write_gen5.msg_control = $5 & 0x7;
-                      $$.bits3.dp_write_gen5.msg_type = $7;
-                      $$.bits3.dp_write_gen5.send_commit_msg = $9;
+                      GEN(&$$)->bits3.generic_gen5.header_present = ($11 != 0);
+                      GEN(&$$)->bits3.dp_write_gen5.binding_table_index = $3;
+                      GEN(&$$)->bits3.dp_write_gen5.last_render_target = ($5 & 0x8) >> 3;
+                      GEN(&$$)->bits3.dp_write_gen5.msg_control = $5 & 0x7;
+                      GEN(&$$)->bits3.dp_write_gen5.msg_type = $7;
+                      GEN(&$$)->bits3.dp_write_gen5.send_commit_msg = $9;
 		  } else {
-                      $$.bits3.generic.msg_target =
+                      GEN(&$$)->bits3.generic.msg_target =
                           BRW_SFID_DATAPORT_WRITE;
-                      $$.bits3.dp_write.binding_table_index = $3;
+                      GEN(&$$)->bits3.dp_write.binding_table_index = $3;
                       /* The msg control field of brw_struct.h is split into
                        * msg control and last_render_target, even though
                        * last_render_target isn't common to all write messages.
                        */
-                      $$.bits3.dp_write.last_render_target = ($5 & 0x8) >> 3;
-                      $$.bits3.dp_write.msg_control = $5 & 0x7;
-                      $$.bits3.dp_write.msg_type = $7;
-                      $$.bits3.dp_write.send_commit_msg = $9;
+                      GEN(&$$)->bits3.dp_write.last_render_target = ($5 & 0x8) >> 3;
+                      GEN(&$$)->bits3.dp_write.msg_control = $5 & 0x7;
+                      GEN(&$$)->bits3.dp_write.msg_type = $7;
+                      GEN(&$$)->bits3.dp_write.send_commit_msg = $9;
 		  }
 		}
 		| URB INTEGER urb_swizzle urb_allocate urb_used urb_complete
 		{
-		  $$.bits3.generic.msg_target = BRW_SFID_URB;
+		  GEN(&$$)->bits3.generic.msg_target = BRW_SFID_URB;
 		  if (IS_GENp(5)) {
-                      $$.bits2.send_gen5.sfid = BRW_SFID_URB;
-                      $$.bits3.generic_gen5.header_present = 1;
-                      $$.bits3.urb_gen5.opcode = BRW_URB_OPCODE_WRITE;
-                      $$.bits3.urb_gen5.offset = $2;
-                      $$.bits3.urb_gen5.swizzle_control = $3;
-                      $$.bits3.urb_gen5.pad = 0;
-                      $$.bits3.urb_gen5.allocate = $4;
-                      $$.bits3.urb_gen5.used = $5;
-                      $$.bits3.urb_gen5.complete = $6;
+                      GEN(&$$)->bits2.send_gen5.sfid = BRW_SFID_URB;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;
+                      GEN(&$$)->bits3.urb_gen5.opcode = BRW_URB_OPCODE_WRITE;
+                      GEN(&$$)->bits3.urb_gen5.offset = $2;
+                      GEN(&$$)->bits3.urb_gen5.swizzle_control = $3;
+                      GEN(&$$)->bits3.urb_gen5.pad = 0;
+                      GEN(&$$)->bits3.urb_gen5.allocate = $4;
+                      GEN(&$$)->bits3.urb_gen5.used = $5;
+                      GEN(&$$)->bits3.urb_gen5.complete = $6;
 		  } else {
-                      $$.bits3.generic.msg_target = BRW_SFID_URB;
-                      $$.bits3.urb.opcode = BRW_URB_OPCODE_WRITE;
-                      $$.bits3.urb.offset = $2;
-                      $$.bits3.urb.swizzle_control = $3;
-                      $$.bits3.urb.pad = 0;
-                      $$.bits3.urb.allocate = $4;
-                      $$.bits3.urb.used = $5;
-                      $$.bits3.urb.complete = $6;
+                      GEN(&$$)->bits3.generic.msg_target = BRW_SFID_URB;
+                      GEN(&$$)->bits3.urb.opcode = BRW_URB_OPCODE_WRITE;
+                      GEN(&$$)->bits3.urb.offset = $2;
+                      GEN(&$$)->bits3.urb.swizzle_control = $3;
+                      GEN(&$$)->bits3.urb.pad = 0;
+                      GEN(&$$)->bits3.urb.allocate = $4;
+                      GEN(&$$)->bits3.urb.used = $5;
+                      GEN(&$$)->bits3.urb.complete = $6;
 		  }
 		}
 		| THREAD_SPAWNER  LPAREN INTEGER COMMA INTEGER COMMA
                         INTEGER RPAREN
 		{
-		  $$.bits3.generic.msg_target =
+		  GEN(&$$)->bits3.generic.msg_target =
 		    BRW_SFID_THREAD_SPAWNER;
 		  if (IS_GENp(5)) {
-                      $$.bits2.send_gen5.sfid = 
+                      GEN(&$$)->bits2.send_gen5.sfid =
                           BRW_SFID_THREAD_SPAWNER;
-                      $$.bits3.generic_gen5.header_present = 0;
-                      $$.bits3.thread_spawner_gen5.opcode = $3;
-                      $$.bits3.thread_spawner_gen5.requester_type  = $5;
-                      $$.bits3.thread_spawner_gen5.resource_select = $7;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 0;
+                      GEN(&$$)->bits3.thread_spawner_gen5.opcode = $3;
+                      GEN(&$$)->bits3.thread_spawner_gen5.requester_type  = $5;
+                      GEN(&$$)->bits3.thread_spawner_gen5.resource_select = $7;
 		  } else {
-                      $$.bits3.generic.msg_target =
+                      GEN(&$$)->bits3.generic.msg_target =
                           BRW_SFID_THREAD_SPAWNER;
-                      $$.bits3.thread_spawner.opcode = $3;
-                      $$.bits3.thread_spawner.requester_type  = $5;
-                      $$.bits3.thread_spawner.resource_select = $7;
+                      GEN(&$$)->bits3.thread_spawner.opcode = $3;
+                      GEN(&$$)->bits3.thread_spawner.requester_type  = $5;
+                      GEN(&$$)->bits3.thread_spawner.resource_select = $7;
 		  }
 		}
 		| VME  LPAREN INTEGER COMMA INTEGER COMMA INTEGER COMMA INTEGER RPAREN
 		{
-		  $$.bits3.generic.msg_target = GEN6_SFID_VME;
+		  GEN(&$$)->bits3.generic.msg_target = GEN6_SFID_VME;
 
 		  if (IS_GENp(6)) { 
-                      $$.bits2.send_gen5.sfid = GEN6_SFID_VME;
-                      $$.bits3.vme_gen6.binding_table_index = $3;
-                      $$.bits3.vme_gen6.search_path_index = $5;
-                      $$.bits3.vme_gen6.lut_subindex = $7;
-                      $$.bits3.vme_gen6.message_type = $9;
-                      $$.bits3.generic_gen5.header_present = 1; 
+                      GEN(&$$)->bits2.send_gen5.sfid = GEN6_SFID_VME;
+                      GEN(&$$)->bits3.vme_gen6.binding_table_index = $3;
+                      GEN(&$$)->bits3.vme_gen6.search_path_index = $5;
+                      GEN(&$$)->bits3.vme_gen6.lut_subindex = $7;
+                      GEN(&$$)->bits3.vme_gen6.message_type = $9;
+                      GEN(&$$)->bits3.generic_gen5.header_present = 1;
 		  } else {
                       error (&@1, "Gen6- doesn't have vme function\n");
 		  }    
@@ -1691,19 +1693,19 @@ msgtarget:	NULL_TOKEN
 		   if (gen_level < 75)
                       error (&@1, "Below Gen7.5 doesn't have CRE function\n");
 
-		   $$.bits3.generic.msg_target = HSW_SFID_CRE;
+		   GEN(&$$)->bits3.generic.msg_target = HSW_SFID_CRE;
 
-                   $$.bits2.send_gen5.sfid = HSW_SFID_CRE;
-                   $$.bits3.cre_gen75.binding_table_index = $3;
-                   $$.bits3.cre_gen75.message_type = $5;
-                   $$.bits3.generic_gen5.header_present = 1; 
+                   GEN(&$$)->bits2.send_gen5.sfid = HSW_SFID_CRE;
+                   GEN(&$$)->bits3.cre_gen75.binding_table_index = $3;
+                   GEN(&$$)->bits3.cre_gen75.message_type = $5;
+                   GEN(&$$)->bits3.generic_gen5.header_present = 1;
 		}
 
 		| DATA_PORT LPAREN INTEGER COMMA INTEGER COMMA INTEGER COMMA 
                 INTEGER COMMA INTEGER COMMA INTEGER RPAREN
 		{
-                    $$.bits2.send_gen5.sfid = $3;
-                    $$.bits3.generic_gen5.header_present = ($13 != 0);
+                    GEN(&$$)->bits2.send_gen5.sfid = $3;
+                    GEN(&$$)->bits3.generic_gen5.header_present = ($13 != 0);
 
                     if (IS_GENp(7)) {
                         if ($3 != GEN6_SFID_DATAPORT_SAMPLER_CACHE &&
@@ -1713,10 +1715,10 @@ msgtarget:	NULL_TOKEN
                             error (&@3, "error: wrong cache type\n");
                         }
 
-                        $$.bits3.gen7_dp.category = $11;
-                        $$.bits3.gen7_dp.binding_table_index = $9;
-                        $$.bits3.gen7_dp.msg_control = $7;
-                        $$.bits3.gen7_dp.msg_type = $5;
+                        GEN(&$$)->bits3.gen7_dp.category = $11;
+                        GEN(&$$)->bits3.gen7_dp.binding_table_index = $9;
+                        GEN(&$$)->bits3.gen7_dp.msg_control = $7;
+                        GEN(&$$)->bits3.gen7_dp.msg_type = $5;
                     } else if (IS_GENx(6)) {
                         if ($3 != GEN6_SFID_DATAPORT_SAMPLER_CACHE &&
                             $3 != GEN6_SFID_DATAPORT_RENDER_CACHE &&
@@ -1724,10 +1726,10 @@ msgtarget:	NULL_TOKEN
                             error (&@3, "error: wrong cache type\n");
                         }
 
-                        $$.bits3.gen6_dp.send_commit_msg = $11;
-                        $$.bits3.gen6_dp.binding_table_index = $9;
-                        $$.bits3.gen6_dp.msg_control = $7;
-                        $$.bits3.gen6_dp.msg_type = $5;
+                        GEN(&$$)->bits3.gen6_dp.send_commit_msg = $11;
+                        GEN(&$$)->bits3.gen6_dp.binding_table_index = $9;
+                        GEN(&$$)->bits3.gen6_dp.msg_control = $7;
+                        GEN(&$$)->bits3.gen6_dp.msg_type = $5;
                     } else if (!IS_GENp(5)) {
                         error (&@1, "Gen6- doesn't support data port for sampler/render/constant/data cache\n");
                     }
@@ -2615,21 +2617,21 @@ imm32:		exp { $$.r = imm32_d; $$.u.d = $1; }
 /* 1.4.12: Predication and modifiers */
 predicate:	/* empty */
 		{
-		  $$.header.predicate_control = BRW_PREDICATE_NONE;
-		  $$.bits2.da1.flag_reg_nr = 0;
-		  $$.bits2.da1.flag_subreg_nr = 0;
-		  $$.header.predicate_inverse = 0;
+		  GEN(&$$)->header.predicate_control = BRW_PREDICATE_NONE;
+		  GEN(&$$)->bits2.da1.flag_reg_nr = 0;
+		  GEN(&$$)->bits2.da1.flag_subreg_nr = 0;
+		  GEN(&$$)->header.predicate_inverse = 0;
 		}
 		| LPAREN predstate flagreg predctrl RPAREN
 		{
-		  $$.header.predicate_control = $4;
+		  GEN(&$$)->header.predicate_control = $4;
 		  /* XXX: Should deal with erroring when the user tries to
 		   * set a predicate for one flag register and conditional
 		   * modification on the other flag register.
 		   */
-		  $$.bits2.da1.flag_reg_nr = ($3.nr & 0xF);
-		  $$.bits2.da1.flag_subreg_nr = $3.subnr;
-		  $$.header.predicate_inverse = $2;
+		  GEN(&$$)->bits2.da1.flag_reg_nr = ($3.nr & 0xF);
+		  GEN(&$$)->bits2.da1.flag_subreg_nr = $3.subnr;
+		  GEN(&$$)->header.predicate_inverse = $2;
 		}
 ;
 
@@ -2722,40 +2724,40 @@ instoption_list:instoption_list COMMA instoption
 		  $$ = $1;
 		  switch ($3) {
 		  case ALIGN1:
-		    $$.header.access_mode = BRW_ALIGN_1;
+		    GEN(&$$)->header.access_mode = BRW_ALIGN_1;
 		    break;
 		  case ALIGN16:
-		    $$.header.access_mode = BRW_ALIGN_16;
+		    GEN(&$$)->header.access_mode = BRW_ALIGN_16;
 		    break;
 		  case SECHALF:
-		    $$.header.compression_control |= BRW_COMPRESSION_2NDHALF;
+		    GEN(&$$)->header.compression_control |= BRW_COMPRESSION_2NDHALF;
 		    break;
 		  case COMPR:
 		    if (!IS_GENp(6)) {
-                        $$.header.compression_control |=
+                        GEN(&$$)->header.compression_control |=
                             BRW_COMPRESSION_COMPRESSED;
 		    }
 		    break;
 		  case SWITCH:
-		    $$.header.thread_control |= BRW_THREAD_SWITCH;
+		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		    break;
 		  case ATOMIC:
-		    $$.header.thread_control |= BRW_THREAD_ATOMIC;
+		    GEN(&$$)->header.thread_control |= BRW_THREAD_ATOMIC;
 		    break;
 		  case NODDCHK:
-		    $$.header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
+		    GEN(&$$)->header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
 		    break;
 		  case NODDCLR:
-		    $$.header.dependency_control |= BRW_DEPENDENCY_NOTCLEARED;
+		    GEN(&$$)->header.dependency_control |= BRW_DEPENDENCY_NOTCLEARED;
 		    break;
 		  case MASK_DISABLE:
-		    $$.header.mask_control = BRW_MASK_DISABLE;
+		    GEN(&$$)->header.mask_control = BRW_MASK_DISABLE;
 		    break;
 		  case BREAKPOINT:
-		    $$.header.debug_control = BRW_DEBUG_BREAKPOINT;
+		    GEN(&$$)->header.debug_control = BRW_DEBUG_BREAKPOINT;
 		    break;
 		  case ACCWRCTRL:
-		    $$.header.acc_wr_control = BRW_ACCUMULATOR_WRITE_ENABLE;
+		    GEN(&$$)->header.acc_wr_control = BRW_ACCUMULATOR_WRITE_ENABLE;
 		  }
 		}
 		| instoption_list instoption
@@ -2763,41 +2765,41 @@ instoption_list:instoption_list COMMA instoption
 		  $$ = $1;
 		  switch ($2) {
 		  case ALIGN1:
-		    $$.header.access_mode = BRW_ALIGN_1;
+		    GEN(&$$)->header.access_mode = BRW_ALIGN_1;
 		    break;
 		  case ALIGN16:
-		    $$.header.access_mode = BRW_ALIGN_16;
+		    GEN(&$$)->header.access_mode = BRW_ALIGN_16;
 		    break;
 		  case SECHALF:
-		    $$.header.compression_control |= BRW_COMPRESSION_2NDHALF;
+		    GEN(&$$)->header.compression_control |= BRW_COMPRESSION_2NDHALF;
 		    break;
 		  case COMPR:
 			if (!IS_GENp(6)) {
-		      $$.header.compression_control |=
+		      GEN(&$$)->header.compression_control |=
 		        BRW_COMPRESSION_COMPRESSED;
 			}
 		    break;
 		  case SWITCH:
-		    $$.header.thread_control |= BRW_THREAD_SWITCH;
+		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		    break;
 		  case ATOMIC:
-		    $$.header.thread_control |= BRW_THREAD_ATOMIC;
+		    GEN(&$$)->header.thread_control |= BRW_THREAD_ATOMIC;
 		    break;
 		  case NODDCHK:
-		    $$.header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
+		    GEN(&$$)->header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
 		    break;
 		  case NODDCLR:
-		    $$.header.dependency_control |= BRW_DEPENDENCY_NOTCLEARED;
+		    GEN(&$$)->header.dependency_control |= BRW_DEPENDENCY_NOTCLEARED;
 		    break;
 		  case MASK_DISABLE:
-		    $$.header.mask_control = BRW_MASK_DISABLE;
+		    GEN(&$$)->header.mask_control = BRW_MASK_DISABLE;
 		    break;
 		  case BREAKPOINT:
-		    $$.header.debug_control = BRW_DEBUG_BREAKPOINT;
+		    GEN(&$$)->header.debug_control = BRW_DEBUG_BREAKPOINT;
 		    break;
 		  case EOT:
 		    /* XXX: EOT shouldn't be an instoption, I don't think */
-		    $$.bits3.generic.end_of_thread = 1;
+		    GEN(&$$)->bits3.generic.end_of_thread = 1;
 		    break;
 		  }
 		}
@@ -2933,59 +2935,59 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
 /**
  * Fills in the destination register information in instr from the bits in dst.
  */
-static int set_instruction_dest(struct brw_instruction *instr,
+static int set_instruction_dest(struct brw_program_instruction *instr,
 				struct brw_reg *dest)
 {
-	if (!validate_dst_reg(instr, dest))
+	if (!validate_dst_reg(GEN(instr), dest))
 		return 1;
 
 	/* the assembler support expressing subnr in bytes or in number of
 	 * elements. */
 	resolve_subnr(dest);
 
-	brw_set_dest(&genasm_compile, instr, *dest);
+	brw_set_dest(&genasm_compile, GEN(instr), *dest);
 
 	return 0;
 }
 
 /* Sets the first source operand for the instruction.  Returns 0 on success. */
-static int set_instruction_src0(struct brw_instruction *instr,
+static int set_instruction_src0(struct brw_program_instruction *instr,
 				struct src_operand *src,
 				YYLTYPE *location)
 {
 
 	if (advanced_flag)
-		reset_instruction_src_region(instr, src);
+		reset_instruction_src_region(GEN(instr), src);
 
-	if (!validate_src_reg(instr, src->reg, location))
+	if (!validate_src_reg(GEN(instr), src->reg, location))
 		return 1;
 
 	/* the assembler support expressing subnr in bytes or in number of
 	 * elements. */
 	resolve_subnr(&src->reg);
 
-	brw_set_src0(&genasm_compile, instr, src->reg);
+	brw_set_src0(&genasm_compile, GEN(instr), src->reg);
 
 	return 0;
 }
 
 /* Sets the second source operand for the instruction.  Returns 0 on success.
  */
-static int set_instruction_src1(struct brw_instruction *instr,
+static int set_instruction_src1(struct brw_program_instruction *instr,
 				struct src_operand *src,
 				YYLTYPE *location)
 {
 	if (advanced_flag)
-		reset_instruction_src_region(instr, src);
+		reset_instruction_src_region(GEN(instr), src);
 
-	if (!validate_src_reg(instr, src->reg, location))
+	if (!validate_src_reg(GEN(instr), src->reg, location))
 		return 1;
 
 	/* the assembler support expressing subnr in bytes or in number of
 	 * elements. */
 	resolve_subnr(&src->reg);
 
-	brw_set_src1(&genasm_compile, instr, src->reg);
+	brw_set_src1(&genasm_compile, GEN(instr), src->reg);
 
 	return 0;
 }
@@ -3010,74 +3012,74 @@ static int reg_type_2_to_3(int reg_type)
 	return r;
 }
 
-static int set_instruction_dest_three_src(struct brw_instruction *instr,
+static int set_instruction_dest_three_src(struct brw_program_instruction *instr,
 					  struct brw_reg *dest)
 {
-	instr->bits1.da3src.dest_reg_file = dest->file;
-	instr->bits1.da3src.dest_reg_nr = dest->nr;
-	instr->bits1.da3src.dest_subreg_nr = get_subreg_address(dest->file, dest->type, dest->subnr, dest->address_mode) / 4; // in DWORD
-	instr->bits1.da3src.dest_writemask = dest->dw1.bits.writemask;
-	instr->bits1.da3src.dest_reg_type = reg_type_2_to_3(dest->type);
+	GEN(instr)->bits1.da3src.dest_reg_file = dest->file;
+	GEN(instr)->bits1.da3src.dest_reg_nr = dest->nr;
+	GEN(instr)->bits1.da3src.dest_subreg_nr = get_subreg_address(dest->file, dest->type, dest->subnr, dest->address_mode) / 4; // in DWORD
+	GEN(instr)->bits1.da3src.dest_writemask = dest->dw1.bits.writemask;
+	GEN(instr)->bits1.da3src.dest_reg_type = reg_type_2_to_3(dest->type);
 	return 0;
 }
 
-static int set_instruction_src0_three_src(struct brw_instruction *instr,
+static int set_instruction_src0_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src)
 {
 	if (advanced_flag) {
-		reset_instruction_src_region(instr, src);
+		reset_instruction_src_region(GEN(instr), src);
 	}
 	// TODO: supporting src0 swizzle, src0 modifier, src0 rep_ctrl
-	instr->bits1.da3src.src_reg_type = reg_type_2_to_3(src->reg.type);
-	instr->bits2.da3src.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
-	instr->bits2.da3src.src0_reg_nr = src->reg.nr;
+	GEN(instr)->bits1.da3src.src_reg_type = reg_type_2_to_3(src->reg.type);
+	GEN(instr)->bits2.da3src.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
+	GEN(instr)->bits2.da3src.src0_reg_nr = src->reg.nr;
 	return 0;
 }
 
-static int set_instruction_src1_three_src(struct brw_instruction *instr,
+static int set_instruction_src1_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src)
 {
 	if (advanced_flag) {
-		reset_instruction_src_region(instr, src);
+		reset_instruction_src_region(GEN(instr), src);
 	}
 	// TODO: supporting src1 swizzle, src1 modifier, src1 rep_ctrl
 	int v = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
-	instr->bits2.da3src.src1_subreg_nr_low = v % 4; // lower 2 bits
-	instr->bits3.da3src.src1_subreg_nr_high = v / 4; // highest bit
-	instr->bits3.da3src.src1_reg_nr = src->reg.nr;
+	GEN(instr)->bits2.da3src.src1_subreg_nr_low = v % 4; // lower 2 bits
+	GEN(instr)->bits3.da3src.src1_subreg_nr_high = v / 4; // highest bit
+	GEN(instr)->bits3.da3src.src1_reg_nr = src->reg.nr;
 	return 0;
 }
 
-static int set_instruction_src2_three_src(struct brw_instruction *instr,
+static int set_instruction_src2_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src)
 {
 	if (advanced_flag) {
-		reset_instruction_src_region(instr, src);
+		reset_instruction_src_region(GEN(instr), src);
 	}
 	// TODO: supporting src2 swizzle, src2 modifier, src2 rep_ctrl
-	instr->bits3.da3src.src2_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
-	instr->bits3.da3src.src2_reg_nr = src->reg.nr;
+	GEN(instr)->bits3.da3src.src2_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
+	GEN(instr)->bits3.da3src.src2_reg_nr = src->reg.nr;
 	return 0;
 }
 
-static void set_instruction_options(struct brw_instruction *instr,
-				    struct brw_instruction *options)
+static void set_instruction_options(struct brw_program_instruction *instr,
+				    struct brw_program_instruction *options)
 {
 	/* XXX: more instr options */
-	instr->header.access_mode = options->header.access_mode;
-	instr->header.mask_control = options->header.mask_control;
-	instr->header.dependency_control = options->header.dependency_control;
-	instr->header.compression_control =
-		options->header.compression_control;
+	GEN(instr)->header.access_mode = GEN(options)->header.access_mode;
+	GEN(instr)->header.mask_control = GEN(options)->header.mask_control;
+	GEN(instr)->header.dependency_control = GEN(options)->header.dependency_control;
+	GEN(instr)->header.compression_control =
+		GEN(options)->header.compression_control;
 }
 
-static void set_instruction_predicate(struct brw_instruction *instr,
-				      struct brw_instruction *predicate)
+static void set_instruction_predicate(struct brw_program_instruction *instr,
+				      struct brw_program_instruction *p)
 {
-	instr->header.predicate_control = predicate->header.predicate_control;
-	instr->header.predicate_inverse = predicate->header.predicate_inverse;
-	instr->bits2.da1.flag_reg_nr = predicate->bits2.da1.flag_reg_nr;
-	instr->bits2.da1.flag_subreg_nr = predicate->bits2.da1.flag_subreg_nr;
+	GEN(instr)->header.predicate_control = GEN(p)->header.predicate_control;
+	GEN(instr)->header.predicate_inverse = GEN(p)->header.predicate_inverse;
+	GEN(instr)->bits2.da1.flag_reg_nr = GEN(p)->bits2.da1.flag_reg_nr;
+	GEN(instr)->bits2.da1.flag_subreg_nr = GEN(p)->bits2.da1.flag_subreg_nr;
 }
 
 static void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
-- 
1.7.7.5


* [PATCH 70/90] assembler: Move struct relocation out of relocatable instructions
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (68 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 69/90] assembler: Unify all instructions to be brw_program_instructions Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 71/90] assembler: Gather all predicate data in its own structure Damien Lespiau
                   ` (20 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Now that all instructions (relocatable or not) are struct
brw_program_instruction, we can move the relocation-specific
information out of the "relocatable instruction" structure. This will
allow us to share the relocation information between different types of
instructions.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
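The layout change described above can be sketched as follows. This is a minimal illustration, not the code from gen4asm.h: `fake_gen_instruction`, `program_instruction_sketch`, `copy_relocatable()` and `reloc_is_outside_union()` are hypothetical names, and the stand-in structures carry only the fields needed to show that the relocation data now sits next to the `insn` union rather than inside it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct brw_instruction (four dwords). */
struct fake_gen_instruction {
    unsigned dwords[4];
};

struct label_instruction {
    char *name;
};

/* Same fields as struct relocation in the patch, with int instead of GLint. */
struct relocation {
    char *first_reloc_target, *second_reloc_target; /* JIP and UIP */
    int first_reloc_offset, second_reloc_offset;    /* in number of instructions */
};

/* Simplified sketch of struct brw_program_instruction after the patch. */
struct program_instruction_sketch {
    int type;
    union {
        struct fake_gen_instruction gen;
        struct label_instruction label;
    } insn;
    struct relocation reloc; /* sibling of the union, not a member of it */
};

/* Copy the encoded instruction and its relocation data separately, in the
 * spirit of the patched brw_program_add_relocatable(). */
void copy_relocatable(struct program_instruction_sketch *dst,
                      const struct program_instruction_sketch *src)
{
    dst->insn.gen = src->insn.gen;
    dst->reloc = src->reloc;
}

/* Returns 1 when reloc starts at or after the end of the union, i.e. when
 * writing any union member cannot clobber the relocation data. */
int reloc_is_outside_union(void)
{
    size_t insn_end = offsetof(struct program_instruction_sketch, insn) +
                      sizeof(((struct program_instruction_sketch *)0)->insn);

    return offsetof(struct program_instruction_sketch, reloc) >= insn_end;
}
```

Because `reloc` no longer shares storage with the union members, any instruction type can carry relocation targets without needing a dedicated wrapper structure.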
 assembler/gen4asm.h |    5 +--
 assembler/gram.y    |   76 ++++++++++++++++++++++++++-------------------------
 assembler/main.c    |    4 +-
 3 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 0781eaf..5673e2c 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -126,8 +126,7 @@ struct label_instruction {
     char   *name;
 };
 
-struct relocatable_instruction {
-    struct brw_instruction gen;
+struct relocation {
     char *first_reloc_target, *second_reloc_target; // JIP and UIP respectively
     GLint first_reloc_offset, second_reloc_offset; // in number of instructions
 };
@@ -141,9 +140,9 @@ struct brw_program_instruction {
     unsigned inst_offset;
     union {
 	struct brw_instruction gen;
-	struct relocatable_instruction reloc;
 	struct label_instruction label;
     } insn;
+    struct relocation reloc;
     struct brw_program_instruction *next;
 };
 
diff --git a/assembler/gram.y b/assembler/gram.y
index f078bfe..9b90db9 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -208,14 +208,16 @@ brw_program_add_instruction(struct brw_program *p,
     brw_program_append_entry(p, list_entry);
 }
 
-static void brw_program_add_relocatable(struct brw_program *p,
-					struct brw_program_instruction *reloc)
+static void
+brw_program_add_relocatable(struct brw_program *p,
+			    struct brw_program_instruction *instruction)
 {
     struct brw_program_instruction *list_entry;
 
     list_entry = calloc(sizeof(struct brw_program_instruction), 1);
     list_entry->type = GEN4ASM_INSTRUCTION_GEN_RELOCATABLE;
-    list_entry->insn.reloc = reloc->insn.reloc;
+    list_entry->insn.gen = instruction->insn.gen;
+    list_entry->reloc = instruction->reloc;
     brw_program_append_entry(p, list_entry);
 }
 
@@ -724,8 +726,8 @@ ifelseinstruction: ENDIF
 		  memset(&$$, 0, sizeof($$));
 		  GEN(&$$)->header.opcode = $1;
 		  GEN(&$$)->header.execution_size = $2;
-		  $$.insn.reloc.first_reloc_target = $3.reloc_target;
-		  $$.insn.reloc.first_reloc_offset = $3.imm32;
+		  $$.reloc.first_reloc_target = $3.reloc_target;
+		  $$.reloc.first_reloc_offset = $3.imm32;
 		}
 		| ELSE execsize relativelocation instoptions
 		{
@@ -741,14 +743,14 @@ ifelseinstruction: ENDIF
 		    set_instruction_dest(&$$, &ip_dst);
 		    set_instruction_src0(&$$, &ip_src, NULL);
 		    set_instruction_src1(&$$, &$3, NULL);
-		    $$.insn.reloc.first_reloc_target = $3.reloc_target;
-		    $$.insn.reloc.first_reloc_offset = $3.imm32;
+		    $$.reloc.first_reloc_target = $3.reloc_target;
+		    $$.reloc.first_reloc_offset = $3.imm32;
 		  } else if(IS_GENp(6)) {
 		    memset(&$$, 0, sizeof($$));
 		    GEN(&$$)->header.opcode = $1;
 		    GEN(&$$)->header.execution_size = $2;
-		    $$.insn.reloc.first_reloc_target = $3.reloc_target;
-		    $$.insn.reloc.first_reloc_offset = $3.imm32;
+		    $$.reloc.first_reloc_target = $3.reloc_target;
+		    $$.reloc.first_reloc_offset = $3.imm32;
 		  } else {
 		    error(&@1, "'ELSE' instruction is not implemented.\n");
 		  }
@@ -773,8 +775,8 @@ ifelseinstruction: ENDIF
 		    set_instruction_src0(&$$, &ip_src, NULL);
 		    set_instruction_src1(&$$, &$4, NULL);
 		  }
-		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
-		  $$.insn.reloc.first_reloc_offset = $4.imm32;
+		  $$.reloc.first_reloc_target = $4.reloc_target;
+		  $$.reloc.first_reloc_offset = $4.imm32;
 		}
 		| predicate IF execsize relativelocation relativelocation
 		{
@@ -786,10 +788,10 @@ ifelseinstruction: ENDIF
 		  set_instruction_predicate(&$$, &$1);
 		  GEN(&$$)->header.opcode = $2;
 		  GEN(&$$)->header.execution_size = $3;
-		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
-		  $$.insn.reloc.first_reloc_offset = $4.imm32;
-		  $$.insn.reloc.second_reloc_target = $5.reloc_target;
-		  $$.insn.reloc.second_reloc_offset = $5.imm32;
+		  $$.reloc.first_reloc_target = $4.reloc_target;
+		  $$.reloc.first_reloc_offset = $4.imm32;
+		  $$.reloc.second_reloc_target = $5.reloc_target;
+		  $$.reloc.second_reloc_offset = $5.imm32;
 		}
 ;
 
@@ -809,8 +811,8 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		    set_instruction_src0(&$$, &ip_src, NULL);
 		    set_instruction_src1(&$$, &$4, NULL);
-		    $$.insn.reloc.first_reloc_target = $4.reloc_target;
-		    $$.insn.reloc.first_reloc_offset = $4.imm32;
+		    $$.reloc.first_reloc_target = $4.reloc_target;
+		    $$.reloc.first_reloc_offset = $4.imm32;
 		  } else if (IS_GENp(6)) {
 		    /* Gen6 spec:
 		         dest must have the same element size as src0.
@@ -819,8 +821,8 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		    set_instruction_predicate(&$$, &$1);
 		    GEN(&$$)->header.opcode = $2;
 		    GEN(&$$)->header.execution_size = $3;
-		    $$.insn.reloc.first_reloc_target = $4.reloc_target;
-		    $$.insn.reloc.first_reloc_offset = $4.imm32;
+		    $$.reloc.first_reloc_target = $4.reloc_target;
+		    $$.reloc.first_reloc_offset = $4.imm32;
 		  } else {
 		    error(&@2, "'WHILE' instruction is not implemented!\n");
 		  }
@@ -839,10 +841,10 @@ haltinstruction: predicate HALT execsize relativelocation relativelocation insto
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
 		  GEN(&$$)->header.opcode = $2;
-		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
-		  $$.insn.reloc.first_reloc_offset = $4.imm32;
-		  $$.insn.reloc.second_reloc_target = $5.reloc_target;
-		  $$.insn.reloc.second_reloc_offset = $5.imm32;
+		  $$.reloc.first_reloc_target = $4.reloc_target;
+		  $$.reloc.first_reloc_offset = $4.imm32;
+		  $$.reloc.second_reloc_target = $5.reloc_target;
+		  $$.reloc.second_reloc_offset = $5.imm32;
 		  dst_null_reg.width = $3;
 		  set_instruction_dest(&$$, &dst_null_reg);
 		  set_instruction_src0(&$$, &src_null_reg, NULL);
@@ -856,8 +858,8 @@ multibranchinstruction:
 		  set_instruction_predicate(&$$, &$1);
 		  GEN(&$$)->header.opcode = $2;
 		  GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
-		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
-		  $$.insn.reloc.first_reloc_offset = $4.imm32;
+		  $$.reloc.first_reloc_target = $4.reloc_target;
+		  $$.reloc.first_reloc_offset = $4.imm32;
 		  dst_null_reg.width = $3;
 		  set_instruction_dest(&$$, &dst_null_reg);
 		}
@@ -868,10 +870,10 @@ multibranchinstruction:
 		  set_instruction_predicate(&$$, &$1);
 		  GEN(&$$)->header.opcode = $2;
 		  GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
-		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
-		  $$.insn.reloc.first_reloc_offset = $4.imm32;
-		  $$.insn.reloc.second_reloc_target = $5.reloc_target;
-		  $$.insn.reloc.second_reloc_offset = $5.imm32;
+		  $$.reloc.first_reloc_target = $4.reloc_target;
+		  $$.reloc.first_reloc_offset = $4.imm32;
+		  $$.reloc.second_reloc_target = $5.reloc_target;
+		  $$.reloc.second_reloc_offset = $5.imm32;
 		  dst_null_reg.width = $3;
 		  set_instruction_dest(&$$, &dst_null_reg);
 		  set_instruction_src0(&$$, &src_null_reg, NULL);
@@ -912,8 +914,8 @@ subroutineinstruction:
 		  src0.reg.vstride = 2; /*encoded 2*/
 		  set_instruction_src0(&$$, &src0, NULL);
 
-		  $$.insn.reloc.first_reloc_target = $5.reloc_target;
-		  $$.insn.reloc.first_reloc_offset = $5.imm32;
+		  $$.reloc.first_reloc_target = $5.reloc_target;
+		  $$.reloc.first_reloc_offset = $5.imm32;
 		}
 		| predicate RET execsize dstoperandex src instoptions
 		{
@@ -1359,8 +1361,8 @@ jumpinstruction: predicate JMPI execsize relativelocation2
 		  set_instruction_dest(&$$, &ip_dst);
 		  set_instruction_src0(&$$, &ip_src, NULL);
 		  set_instruction_src1(&$$, &$4, NULL);
-		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
-		  $$.insn.reloc.first_reloc_offset = $4.imm32;
+		  $$.reloc.first_reloc_target = $4.reloc_target;
+		  $$.reloc.first_reloc_offset = $4.imm32;
 		}
 ;
 
@@ -1388,10 +1390,10 @@ breakinstruction: predicate breakop execsize relativelocation relativelocation i
 		  set_instruction_predicate(&$$, &$1);
 		  GEN(&$$)->header.opcode = $2;
 		  GEN(&$$)->header.execution_size = $3;
-		  $$.insn.reloc.first_reloc_target = $4.reloc_target;
-		  $$.insn.reloc.first_reloc_offset = $4.imm32;
-		  $$.insn.reloc.second_reloc_target = $5.reloc_target;
-		  $$.insn.reloc.second_reloc_offset = $5.imm32;
+		  $$.reloc.first_reloc_target = $4.reloc_target;
+		  $$.reloc.first_reloc_offset = $4.imm32;
+		  $$.reloc.second_reloc_target = $5.reloc_target;
+		  $$.reloc.second_reloc_offset = $5.imm32;
 		}
 ;
 
diff --git a/assembler/main.c b/assembler/main.c
index 8579f96..f1d78d0 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -437,8 +437,8 @@ int main(int argc, char **argv)
 	}
 
 	for (entry = compiled_program.first; entry; entry = entry->next) {
-	    struct relocatable_instruction *reloc = &entry->insn.reloc;
-	    struct brw_instruction *inst = &reloc->gen;
+	    struct relocation *reloc = &entry->reloc;
+	    struct brw_instruction *inst = &entry->insn.gen;
 
 	    if (!is_relocatable(entry))
 		continue;
-- 
1.7.7.5


* [PATCH 71/90] assembler: Gather all predicate data in its own structure
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (69 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 70/90] assembler: Move struct relocation out of relocatable instructions Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 72/90] assembler: Unify adding options to the header Damien Lespiau
                   ` (19 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Gather the predicate data in a dedicated structure rather than using a
full instruction for that. Also use set_instruction_predicate() for a
case that could not be handled that way before the refactoring (it can
now, because everyone uses the same instruction structure).

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
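A minimal sketch of the idea, under stated assumptions: `struct predicate` below has the same fields as the one added to gen4asm.h, while `insn_predicate_bits`, `make_predicate()` and `apply_predicate()` are hypothetical stand-ins for the real instruction bitfields and for set_instruction_predicate(). It also illustrates one property of the new structure: storing a wider value into the 1-bit `flag_reg_nr` field keeps only its low bit, per C's rules for unsigned bit-fields.

```c
#include <assert.h>

/* Same layout as struct predicate in the patch. */
struct predicate {
    unsigned pred_control:4;
    unsigned pred_inverse:1;
    unsigned flag_reg_nr:1;
    unsigned flag_subreg_nr:1;
};

/* Hypothetical stand-in for the predicate-related fields that live in
 * header and bits2.da1 of the real instruction encoding. */
struct insn_predicate_bits {
    unsigned predicate_control:4;
    unsigned predicate_inverse:1;
    unsigned flag_reg_nr:1;
    unsigned flag_subreg_nr:1;
};

/* Build a predicate from parsed values. Values wider than a field are
 * reduced modulo 2^width on assignment (unsigned bit-field semantics). */
struct predicate make_predicate(unsigned control, unsigned inverse,
                                unsigned reg_nr, unsigned subreg_nr)
{
    struct predicate p;

    p.pred_control = control;
    p.pred_inverse = inverse;
    p.flag_reg_nr = reg_nr;
    p.flag_subreg_nr = subreg_nr;
    return p;
}

/* Copy the predicate data into the instruction fields, mirroring what
 * set_instruction_predicate() does after the patch. */
void apply_predicate(struct insn_predicate_bits *instr,
                     const struct predicate *p)
{
    instr->predicate_control = p->pred_control;
    instr->predicate_inverse = p->pred_inverse;
    instr->flag_reg_nr = p->flag_reg_nr;
    instr->flag_subreg_nr = p->flag_subreg_nr;
}
```

Passing a small dedicated structure through the grammar, instead of a full instruction, makes it explicit which four values a predicate rule actually produces.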
 assembler/gen4asm.h |    7 ++++++
 assembler/gram.y    |   53 +++++++++++++++++++++++++--------------------------
 2 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 5673e2c..00dd73a 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -85,6 +85,13 @@ struct condition {
 	int flag_subreg_nr;
 };
 
+struct predicate {
+    unsigned pred_control:4;
+    unsigned pred_inverse:1;
+    unsigned flag_reg_nr:1;
+    unsigned flag_subreg_nr:1;
+};
+
 struct region {
     int vert_stride, width, horiz_stride;
     int is_default;        
diff --git a/assembler/gram.y b/assembler/gram.y
index 9b90db9..23d7cfb 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -101,7 +101,7 @@ static int set_instruction_src2_three_src(struct brw_program_instruction *instr,
 static void set_instruction_options(struct brw_program_instruction *instr,
 				    struct brw_program_instruction *options);
 static void set_instruction_predicate(struct brw_program_instruction *instr,
-				      struct brw_program_instruction *p);
+				      struct predicate *p);
 static void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
 				   int type);
 static void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
@@ -396,6 +396,7 @@ static void resolve_subnr(struct brw_reg *reg)
 	struct regtype regtype;
 	struct brw_reg reg;
 	struct condition condition;
+	struct predicate predicate;
 	struct declared_register symbol_reg;
 	imm32_t imm32;
 
@@ -475,7 +476,7 @@ static void resolve_subnr(struct brw_reg *reg)
 %type <instruction> binaryaccinstruction trinaryinstruction sendinstruction
 %type <instruction> syncinstruction
 %type <instruction> msgtarget
-%type <instruction> instoptions instoption_list predicate
+%type <instruction> instoptions instoption_list
 %type <instruction> mathinstruction
 %type <instruction> nopinstruction
 %type <instruction> relocatableinstruction breakinstruction
@@ -487,6 +488,7 @@ static void resolve_subnr(struct brw_reg *reg)
 %type <integer> unaryop binaryop binaryaccop breakop
 %type <integer> trinaryop
 %type <condition> conditionalmodifier 
+%type <predicate> predicate
 %type <integer> condition saturate negate abs chansel
 %type <integer> writemask_x writemask_y writemask_z writemask_w
 %type <integer> srcimmtype execsize dstregion immaddroffset
@@ -956,8 +958,8 @@ unaryinstruction:
 
 		  if ($3.flag_subreg_nr != -1) {
 		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
-                        (GEN(&$1)->bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
-                         GEN(&$1)->bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
+                        ($1.flag_reg_nr != $3.flag_reg_nr ||
+                         $1.flag_subreg_nr != $3.flag_subreg_nr))
                         warn(ALWAYS, &@3, "must use the same flag register if "
 			     "both prediction and conditional modifier are "
 			     "enabled\n");
@@ -997,8 +999,8 @@ binaryinstruction:
 
 		  if ($3.flag_subreg_nr != -1) {
 		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
-                        (GEN(&$1)->bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
-                         GEN(&$1)->bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
+                        ($1.flag_reg_nr != $3.flag_reg_nr ||
+                         $1.flag_subreg_nr != $3.flag_subreg_nr))
                         warn(ALWAYS, &@3, "must use the same flag register if "
 			     "both prediction and conditional modifier are "
 			     "enabled\n");
@@ -1038,8 +1040,8 @@ binaryaccinstruction:
 
 		  if ($3.flag_subreg_nr != -1) {
 		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
-                        (GEN(&$1)->bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
-                         GEN(&$1)->bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
+                        ($1.flag_reg_nr != $3.flag_reg_nr ||
+                         $1.flag_subreg_nr != $3.flag_subreg_nr))
                         warn(ALWAYS, &@3, "must use the same flag register if "
 			     "both prediction and conditional modifier are "
 			     "enabled\n");
@@ -1067,10 +1069,7 @@ trinaryinstruction:
 {
 		  memset(&$$, 0, sizeof($$));
 
-		  GEN(&$$)->header.predicate_control = GEN(&$1)->header.predicate_control;
-		  GEN(&$$)->header.predicate_inverse = GEN(&$1)->header.predicate_inverse;
-		  GEN(&$$)->bits1.da3src.flag_reg_nr = GEN(&$1)->bits2.da1.flag_reg_nr;
-		  GEN(&$$)->bits1.da3src.flag_subreg_nr = GEN(&$1)->bits2.da1.flag_subreg_nr;
+		  set_instruction_predicate(&$$, &$1);
 
 		  GEN(&$$)->header.opcode = $2;
 		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
@@ -1089,8 +1088,8 @@ trinaryinstruction:
 
 		  if ($3.flag_subreg_nr != -1) {
 		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
-                        (GEN(&$1)->bits2.da1.flag_reg_nr != $3.flag_reg_nr ||
-                         GEN(&$1)->bits2.da1.flag_subreg_nr != $3.flag_subreg_nr))
+                        ($1.flag_reg_nr != $3.flag_reg_nr ||
+                         $1.flag_subreg_nr != $3.flag_subreg_nr))
                         warn(ALWAYS, &@3, "must use the same flag register if "
 			     "both prediction and conditional modifier are "
 			     "enabled\n");
@@ -2619,21 +2618,21 @@ imm32:		exp { $$.r = imm32_d; $$.u.d = $1; }
 /* 1.4.12: Predication and modifiers */
 predicate:	/* empty */
 		{
-		  GEN(&$$)->header.predicate_control = BRW_PREDICATE_NONE;
-		  GEN(&$$)->bits2.da1.flag_reg_nr = 0;
-		  GEN(&$$)->bits2.da1.flag_subreg_nr = 0;
-		  GEN(&$$)->header.predicate_inverse = 0;
+		  $$.pred_control = BRW_PREDICATE_NONE;
+		  $$.flag_reg_nr = 0;
+		  $$.flag_subreg_nr = 0;
+		  $$.pred_inverse = 0;
 		}
 		| LPAREN predstate flagreg predctrl RPAREN
 		{
-		  GEN(&$$)->header.predicate_control = $4;
+		  $$.pred_control = $4;
 		  /* XXX: Should deal with erroring when the user tries to
 		   * set a predicate for one flag register and conditional
 		   * modification on the other flag register.
 		   */
-		  GEN(&$$)->bits2.da1.flag_reg_nr = ($3.nr & 0xF);
-		  GEN(&$$)->bits2.da1.flag_subreg_nr = $3.subnr;
-		  GEN(&$$)->header.predicate_inverse = $2;
+		  $$.flag_reg_nr = $3.nr;
+		  $$.flag_subreg_nr = $3.subnr;
+		  $$.pred_inverse = $2;
 		}
 ;
 
@@ -3076,12 +3075,12 @@ static void set_instruction_options(struct brw_program_instruction *instr,
 }
 
 static void set_instruction_predicate(struct brw_program_instruction *instr,
-				      struct brw_program_instruction *p)
+				      struct predicate *p)
 {
-	GEN(instr)->header.predicate_control = GEN(p)->header.predicate_control;
-	GEN(instr)->header.predicate_inverse = GEN(p)->header.predicate_inverse;
-	GEN(instr)->bits2.da1.flag_reg_nr = GEN(p)->bits2.da1.flag_reg_nr;
-	GEN(instr)->bits2.da1.flag_subreg_nr = GEN(p)->bits2.da1.flag_subreg_nr;
+	GEN(instr)->header.predicate_control = p->pred_control;
+	GEN(instr)->header.predicate_inverse = p->pred_inverse;
+	GEN(instr)->bits2.da1.flag_reg_nr = p->flag_reg_nr;
+	GEN(instr)->bits2.da1.flag_subreg_nr = p->flag_subreg_nr;
 }
 
 static void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 72/90] assembler: Unify adding options to the header
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (70 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 71/90] assembler: Gather all predicate data in its own structure Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 73/90] assembler: Isolate all the options in their own structure Damien Lespiau
                   ` (18 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Right now we have duplicated code for when the option is the last in the
list or not. Put that code in a common function.

Interestingly, the two copies had not been kept in sync: ACCWRCTRL was
only handled after a comma and EOT only without one, so both had
limitations on where they could appear in the option list. It's fixed now!

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
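A minimal sketch of the idea described above, not the code from this
patch: both instoption_list rules end up calling one helper, so every
option is accepted wherever it sits in the list. The names opt_token,
fake_insn and collect_options() are hypothetical stand-ins for the
parser tokens and struct brw_program_instruction.

```c
#include <assert.h>	/* only needed by callers checking the result */

/* Hypothetical stand-ins for the parser tokens; the real ones (ALIGN16,
 * ACCWRCTRL, EOT, ...) are generated from gram.y. */
enum opt_token { OPT_ALIGN16, OPT_ACCWRCTRL, OPT_EOT };

/* Hypothetical, heavily reduced instruction: one bit per option. */
struct fake_insn {
    unsigned access_mode:1;
    unsigned acc_wr_control:1;
    unsigned end_of_thread:1;
};

/* The single place where an option token is turned into instruction bits. */
static void add_option(struct fake_insn *insn, enum opt_token option)
{
    switch (option) {
    case OPT_ALIGN16:
	insn->access_mode = 1;
	break;
    case OPT_ACCWRCTRL:
	insn->acc_wr_control = 1;
	break;
    case OPT_EOT:
	insn->end_of_thread = 1;
	break;
    }
}

/* Stands in for the two instoption_list rules: with or without a comma
 * before the option, the action boils down to the same add_option() call. */
static struct fake_insn collect_options(const enum opt_token *opts, int n)
{
    struct fake_insn insn = { 0 };
    int i;

    for (i = 0; i < n; i++)
	add_option(&insn, opts[i]);
    return insn;
}
```

With a single helper, the order of EOT and ACCWRCTRL in the list no
longer changes which bits end up set.
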
 assembler/gram.y |  121 ++++++++++++++++++++----------------------------------
 1 files changed, 45 insertions(+), 76 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 23d7cfb..99376a2 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -526,6 +526,49 @@ static void resolve_subnr(struct brw_reg *reg)
 	YYERROR;				\
     } while(0)
 
+static void add_option(struct brw_program_instruction *insn, int option)
+{
+    switch (option) {
+    case ALIGN1:
+	GEN(insn)->header.access_mode = BRW_ALIGN_1;
+	break;
+    case ALIGN16:
+	GEN(insn)->header.access_mode = BRW_ALIGN_16;
+	break;
+    case SECHALF:
+	GEN(insn)->header.compression_control |= BRW_COMPRESSION_2NDHALF;
+	break;
+    case COMPR:
+	if (!IS_GENp(6))
+	    GEN(insn)->header.compression_control |= BRW_COMPRESSION_COMPRESSED;
+	break;
+    case SWITCH:
+	GEN(insn)->header.thread_control |= BRW_THREAD_SWITCH;
+	break;
+    case ATOMIC:
+	GEN(insn)->header.thread_control |= BRW_THREAD_ATOMIC;
+	break;
+    case NODDCHK:
+	GEN(insn)->header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
+	break;
+    case NODDCLR:
+	GEN(insn)->header.dependency_control |= BRW_DEPENDENCY_NOTCLEARED;
+	break;
+    case MASK_DISABLE:
+	GEN(insn)->header.mask_control = BRW_MASK_DISABLE;
+	break;
+    case BREAKPOINT:
+	GEN(insn)->header.debug_control = BRW_DEBUG_BREAKPOINT;
+	break;
+    case ACCWRCTRL:
+	GEN(insn)->header.acc_wr_control = BRW_ACCUMULATOR_WRITE_ENABLE;
+	break;
+    case EOT:
+	GEN(insn)->bits3.generic.end_of_thread = 1;
+	break;
+    }
+}
+
 }
 
 %%
@@ -2723,86 +2766,12 @@ instoptions:	/* empty */
 instoption_list:instoption_list COMMA instoption
 		{
 		  $$ = $1;
-		  switch ($3) {
-		  case ALIGN1:
-		    GEN(&$$)->header.access_mode = BRW_ALIGN_1;
-		    break;
-		  case ALIGN16:
-		    GEN(&$$)->header.access_mode = BRW_ALIGN_16;
-		    break;
-		  case SECHALF:
-		    GEN(&$$)->header.compression_control |= BRW_COMPRESSION_2NDHALF;
-		    break;
-		  case COMPR:
-		    if (!IS_GENp(6)) {
-                        GEN(&$$)->header.compression_control |=
-                            BRW_COMPRESSION_COMPRESSED;
-		    }
-		    break;
-		  case SWITCH:
-		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
-		    break;
-		  case ATOMIC:
-		    GEN(&$$)->header.thread_control |= BRW_THREAD_ATOMIC;
-		    break;
-		  case NODDCHK:
-		    GEN(&$$)->header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
-		    break;
-		  case NODDCLR:
-		    GEN(&$$)->header.dependency_control |= BRW_DEPENDENCY_NOTCLEARED;
-		    break;
-		  case MASK_DISABLE:
-		    GEN(&$$)->header.mask_control = BRW_MASK_DISABLE;
-		    break;
-		  case BREAKPOINT:
-		    GEN(&$$)->header.debug_control = BRW_DEBUG_BREAKPOINT;
-		    break;
-		  case ACCWRCTRL:
-		    GEN(&$$)->header.acc_wr_control = BRW_ACCUMULATOR_WRITE_ENABLE;
-		  }
+		  add_option(&$$, $3);
 		}
 		| instoption_list instoption
 		{
 		  $$ = $1;
-		  switch ($2) {
-		  case ALIGN1:
-		    GEN(&$$)->header.access_mode = BRW_ALIGN_1;
-		    break;
-		  case ALIGN16:
-		    GEN(&$$)->header.access_mode = BRW_ALIGN_16;
-		    break;
-		  case SECHALF:
-		    GEN(&$$)->header.compression_control |= BRW_COMPRESSION_2NDHALF;
-		    break;
-		  case COMPR:
-			if (!IS_GENp(6)) {
-		      GEN(&$$)->header.compression_control |=
-		        BRW_COMPRESSION_COMPRESSED;
-			}
-		    break;
-		  case SWITCH:
-		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
-		    break;
-		  case ATOMIC:
-		    GEN(&$$)->header.thread_control |= BRW_THREAD_ATOMIC;
-		    break;
-		  case NODDCHK:
-		    GEN(&$$)->header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
-		    break;
-		  case NODDCLR:
-		    GEN(&$$)->header.dependency_control |= BRW_DEPENDENCY_NOTCLEARED;
-		    break;
-		  case MASK_DISABLE:
-		    GEN(&$$)->header.mask_control = BRW_MASK_DISABLE;
-		    break;
-		  case BREAKPOINT:
-		    GEN(&$$)->header.debug_control = BRW_DEBUG_BREAKPOINT;
-		    break;
-		  case EOT:
-		    /* XXX: EOT shouldn't be an instoption, I don't think */
-		    GEN(&$$)->bits3.generic.end_of_thread = 1;
-		    break;
-		  }
+		  add_option(&$$, $2);
 		}
 		| /* empty, header defaults to zeroes. */
 		{
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 73/90] assembler: Isolate all the options in their own structure
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (71 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 72/90] assembler: Unify adding options to the header Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 74/90] assembler: Introduce set_instruction_opcode() Damien Lespiau
                   ` (17 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Like with the predicate fields before, there's no need to use the full
instruction to collect the list of options. This allows us to decouple
the list of options from a specific instruction encoding.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
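To illustrate the decoupling, here is a rough sketch under stated
assumptions: struct options mirrors the fields added by this patch,
while enc_a, enc_b and the two apply_options_*() helpers are
hypothetical, simplified encodings that only differ in where they
store end_of_thread.

```c
#include <assert.h>	/* only needed by callers checking the result */

/* Same fields as the struct options added to gen4asm.h by this patch. */
struct options {
    unsigned access_mode:1;
    unsigned compression_control:2;
    unsigned thread_control:2;
    unsigned dependency_control:2;
    unsigned mask_control:1;
    unsigned debug_control:1;
    unsigned acc_wr_control:1;
    unsigned end_of_thread:1;
};

/* Hypothetical encodings: the same option lands in a different field. */
struct enc_a {
    unsigned mask_control:1;
    unsigned bits3_end_of_thread:1;
};

struct enc_b {
    unsigned mask_control:1;
    unsigned bits2_end_of_thread:1;
};

/* The parser only fills struct options; each encoding decides where the
 * collected bits go. */
static void apply_options_a(struct enc_a *insn, struct options o)
{
    insn->mask_control = o.mask_control;
    insn->bits3_end_of_thread = o.end_of_thread;
}

static void apply_options_b(struct enc_b *insn, struct options o)
{
    insn->mask_control = o.mask_control;
    insn->bits2_end_of_thread = o.end_of_thread;
}
```

The option list is parsed once and can then be applied to whichever
encoding the instruction ends up using.
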
 assembler/gen4asm.h |   12 +++++++++
 assembler/gram.y    |   65 ++++++++++++++++++++++++++-------------------------
 2 files changed, 45 insertions(+), 32 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 00dd73a..3b98444 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -92,6 +92,18 @@ struct predicate {
     unsigned flag_subreg_nr:1;
 };
 
+struct options {
+    unsigned access_mode:1;
+    unsigned compression_control:2; /* gen6: quater control */
+    unsigned thread_control:2;
+    unsigned dependency_control:2;
+    unsigned mask_control:1;
+    unsigned debug_control:1;
+    unsigned acc_wr_control:1;
+
+    unsigned end_of_thread:1;
+};
+
 struct region {
     int vert_stride, width, horiz_stride;
     int is_default;        
diff --git a/assembler/gram.y b/assembler/gram.y
index 99376a2..c6ba086 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -99,7 +99,7 @@ static int set_instruction_src1_three_src(struct brw_program_instruction *instr,
 static int set_instruction_src2_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src);
 static void set_instruction_options(struct brw_program_instruction *instr,
-				    struct brw_program_instruction *options);
+				    struct options options);
 static void set_instruction_predicate(struct brw_program_instruction *instr,
 				      struct predicate *p);
 static void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
@@ -397,6 +397,7 @@ static void resolve_subnr(struct brw_reg *reg)
 	struct brw_reg reg;
 	struct condition condition;
 	struct predicate predicate;
+	struct options options;
 	struct declared_register symbol_reg;
 	imm32_t imm32;
 
@@ -476,7 +477,6 @@ static void resolve_subnr(struct brw_reg *reg)
 %type <instruction> binaryaccinstruction trinaryinstruction sendinstruction
 %type <instruction> syncinstruction
 %type <instruction> msgtarget
-%type <instruction> instoptions instoption_list
 %type <instruction> mathinstruction
 %type <instruction> nopinstruction
 %type <instruction> relocatableinstruction breakinstruction
@@ -489,6 +489,7 @@ static void resolve_subnr(struct brw_reg *reg)
 %type <integer> trinaryop
 %type <condition> conditionalmodifier 
 %type <predicate> predicate
+%type <options> instoptions instoption_list
 %type <integer> condition saturate negate abs chansel
 %type <integer> writemask_x writemask_y writemask_z writemask_w
 %type <integer> srcimmtype execsize dstregion immaddroffset
@@ -526,45 +527,45 @@ static void resolve_subnr(struct brw_reg *reg)
 	YYERROR;				\
     } while(0)
 
-static void add_option(struct brw_program_instruction *insn, int option)
+static void add_option(struct options *options, int option)
 {
     switch (option) {
     case ALIGN1:
-	GEN(insn)->header.access_mode = BRW_ALIGN_1;
+	options->access_mode = BRW_ALIGN_1;
 	break;
     case ALIGN16:
-	GEN(insn)->header.access_mode = BRW_ALIGN_16;
+	options->access_mode = BRW_ALIGN_16;
 	break;
     case SECHALF:
-	GEN(insn)->header.compression_control |= BRW_COMPRESSION_2NDHALF;
+	options->compression_control |= BRW_COMPRESSION_2NDHALF;
 	break;
     case COMPR:
 	if (!IS_GENp(6))
-	    GEN(insn)->header.compression_control |= BRW_COMPRESSION_COMPRESSED;
+	    options->compression_control |= BRW_COMPRESSION_COMPRESSED;
 	break;
     case SWITCH:
-	GEN(insn)->header.thread_control |= BRW_THREAD_SWITCH;
+	options->thread_control |= BRW_THREAD_SWITCH;
 	break;
     case ATOMIC:
-	GEN(insn)->header.thread_control |= BRW_THREAD_ATOMIC;
+	options->thread_control |= BRW_THREAD_ATOMIC;
 	break;
     case NODDCHK:
-	GEN(insn)->header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
+	options->dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
 	break;
     case NODDCLR:
-	GEN(insn)->header.dependency_control |= BRW_DEPENDENCY_NOTCLEARED;
+	options->dependency_control |= BRW_DEPENDENCY_NOTCLEARED;
 	break;
     case MASK_DISABLE:
-	GEN(insn)->header.mask_control = BRW_MASK_DISABLE;
+	options->mask_control = BRW_MASK_DISABLE;
 	break;
     case BREAKPOINT:
-	GEN(insn)->header.debug_control = BRW_DEBUG_BREAKPOINT;
+	options->debug_control = BRW_DEBUG_BREAKPOINT;
 	break;
     case ACCWRCTRL:
-	GEN(insn)->header.acc_wr_control = BRW_ACCUMULATOR_WRITE_ENABLE;
+	options->acc_wr_control = BRW_ACCUMULATOR_WRITE_ENABLE;
 	break;
     case EOT:
-	GEN(insn)->bits3.generic.end_of_thread = 1;
+	options->end_of_thread = 1;
 	break;
     }
 }
@@ -992,7 +993,7 @@ unaryinstruction:
 		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  $6.width = $5;
-		  set_instruction_options(&$$, &$8);
+		  set_instruction_options(&$$, $8);
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$6) != 0)
 		    YYERROR;
@@ -1030,7 +1031,7 @@ binaryinstruction:
 		  GEN(&$$)->header.opcode = $2;
 		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
-		  set_instruction_options(&$$, &$9);
+		  set_instruction_options(&$$, $9);
 		  set_instruction_predicate(&$$, &$1);
 		  $6.width = $5;
 		  if (set_instruction_dest(&$$, &$6) != 0)
@@ -1072,7 +1073,7 @@ binaryaccinstruction:
 		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  $6.width = $5;
-		  set_instruction_options(&$$, &$9);
+		  set_instruction_options(&$$, $9);
 		  set_instruction_predicate(&$$, &$1);
 		  if (set_instruction_dest(&$$, &$6) != 0)
 		    YYERROR;
@@ -1127,7 +1128,7 @@ trinaryinstruction:
 		    YYERROR;
 		  if (set_instruction_src2_three_src(&$$, &$9))
 		    YYERROR;
-		  set_instruction_options(&$$, &$10);
+		  set_instruction_options(&$$, $10);
 
 		  if ($3.flag_subreg_nr != -1) {
 		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
@@ -1189,21 +1190,19 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                       } else {
                           GEN(&$$)->header.destreg__conditionalmod = $4; /* msg reg index */
                           GEN(&$$)->bits2.send_gen5.sfid = GEN(&$7)->bits2.send_gen5.sfid;
-                          GEN(&$$)->bits2.send_gen5.end_of_thread = GEN(&$12)->bits3.generic_gen5.end_of_thread;
+                          GEN(&$$)->bits2.send_gen5.end_of_thread = $12.end_of_thread;
                       }
 
                       GEN(&$$)->bits3.generic_gen5 = GEN(&$7)->bits3.generic_gen5;
                       GEN(&$$)->bits3.generic_gen5.msg_length = $9;
                       GEN(&$$)->bits3.generic_gen5.response_length = $11;
-                      GEN(&$$)->bits3.generic_gen5.end_of_thread =
-                          GEN(&$12)->bits3.generic_gen5.end_of_thread;
+                      GEN(&$$)->bits3.generic_gen5.end_of_thread = $12.end_of_thread;
 		  } else {
                       GEN(&$$)->header.destreg__conditionalmod = $4; /* msg reg index */
                       GEN(&$$)->bits3.generic = GEN(&$7)->bits3.generic;
                       GEN(&$$)->bits3.generic.msg_length = $9;
                       GEN(&$$)->bits3.generic.response_length = $11;
-                      GEN(&$$)->bits3.generic.end_of_thread =
-                          GEN(&$12)->bits3.generic.end_of_thread;
+                      GEN(&$$)->bits3.generic.end_of_thread = $12.end_of_thread;
 		  }
 		}
 		| predicate SEND execsize dst sendleadreg payload directsrcoperand instoptions
@@ -1413,7 +1412,7 @@ mathinstruction: predicate MATH_INST execsize dst src srcimm math_function insto
 		  memset(&$$, 0, sizeof($$));
 		  GEN(&$$)->header.opcode = $2;
 		  GEN(&$$)->header.destreg__conditionalmod = $7;
-		  set_instruction_options(&$$, &$8);
+		  set_instruction_options(&$$, $8);
 		  set_instruction_predicate(&$$, &$1);
 		  $4.width = $3;
 		  if (set_instruction_dest(&$$, &$4) != 0)
@@ -3033,14 +3032,16 @@ static int set_instruction_src2_three_src(struct brw_program_instruction *instr,
 }
 
 static void set_instruction_options(struct brw_program_instruction *instr,
-				    struct brw_program_instruction *options)
+				    struct options options)
 {
-	/* XXX: more instr options */
-	GEN(instr)->header.access_mode = GEN(options)->header.access_mode;
-	GEN(instr)->header.mask_control = GEN(options)->header.mask_control;
-	GEN(instr)->header.dependency_control = GEN(options)->header.dependency_control;
-	GEN(instr)->header.compression_control =
-		GEN(options)->header.compression_control;
+	GEN(instr)->header.access_mode = options.access_mode;
+	GEN(instr)->header.compression_control = options.compression_control;
+	GEN(instr)->header.thread_control = options.thread_control;
+	GEN(instr)->header.dependency_control = options.dependency_control;
+	GEN(instr)->header.mask_control = options.mask_control;
+	GEN(instr)->header.debug_control = options.debug_control;
+	GEN(instr)->header.acc_wr_control = options.acc_wr_control;
+	GEN(instr)->bits3.generic.end_of_thread = options.end_of_thread;
 }
 
 static void set_instruction_predicate(struct brw_program_instruction *instr,
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 74/90] assembler: Introduce set_instruction_opcode()
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (72 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 73/90] assembler: Isolate all the options in their own structure Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 75/90] assembler: Introduce set_intruction_pred_cond() Damien Lespiau
                   ` (16 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   72 ++++++++++++++++++++++++++++++------------------------
 1 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index c6ba086..2d72037 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -82,6 +82,8 @@ static struct src_operand ip_src =
 };
 
 static int get_type_size(GLuint type);
+static void set_instruction_opcode(struct brw_program_instruction *instr,
+				   unsigned opcode);
 static int set_instruction_dest(struct brw_program_instruction *instr,
 				struct brw_reg *dest);
 static int set_instruction_src0(struct brw_program_instruction *instr,
@@ -757,7 +759,7 @@ ifelseinstruction: ENDIF
 		  if(IS_GENp(6)) // For gen6+.
 		    error(&@1, "should be 'ENDIF execsize relativelocation'\n");
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $1;
+		  set_instruction_opcode(&$$, $1);
 		  GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		  GEN(&$$)->bits1.da1.dest_horiz_stride = 1;
 		  GEN(&$$)->bits1.da1.src1_reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
@@ -770,7 +772,7 @@ ifelseinstruction: ENDIF
 		  if(!IS_GENp(6)) // for gen6-
 		    error(&@1, "ENDIF Syntax error: should be 'ENDIF'\n");
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $1;
+		  set_instruction_opcode(&$$, $1);
 		  GEN(&$$)->header.execution_size = $2;
 		  $$.reloc.first_reloc_target = $3.reloc_target;
 		  $$.reloc.first_reloc_offset = $3.imm32;
@@ -783,7 +785,7 @@ ifelseinstruction: ENDIF
 		    $3.imm32 |= (1 << 16);
 
 		    memset(&$$, 0, sizeof($$));
-		    GEN(&$$)->header.opcode = $1;
+		    set_instruction_opcode(&$$, $1);
 		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		    ip_dst.width = $2;
 		    set_instruction_dest(&$$, &ip_dst);
@@ -793,7 +795,7 @@ ifelseinstruction: ENDIF
 		    $$.reloc.first_reloc_offset = $3.imm32;
 		  } else if(IS_GENp(6)) {
 		    memset(&$$, 0, sizeof($$));
-		    GEN(&$$)->header.opcode = $1;
+		    set_instruction_opcode(&$$, $1);
 		    GEN(&$$)->header.execution_size = $2;
 		    $$.reloc.first_reloc_target = $3.reloc_target;
 		    $$.reloc.first_reloc_offset = $3.imm32;
@@ -813,7 +815,7 @@ ifelseinstruction: ENDIF
 
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  if(!IS_GENp(6)) {
 		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		    ip_dst.width = $3;
@@ -832,7 +834,7 @@ ifelseinstruction: ENDIF
 
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.execution_size = $3;
 		  $$.reloc.first_reloc_target = $4.reloc_target;
 		  $$.reloc.first_reloc_offset = $4.imm32;
@@ -853,7 +855,7 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		    set_instruction_dest(&$$, &ip_dst);
 		    memset(&$$, 0, sizeof($$));
 		    set_instruction_predicate(&$$, &$1);
-		    GEN(&$$)->header.opcode = $2;
+		    set_instruction_opcode(&$$, $2);
 		    GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		    set_instruction_src0(&$$, &ip_src, NULL);
 		    set_instruction_src1(&$$, &$4, NULL);
@@ -865,7 +867,7 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		         dest horizontal stride must be 1. */
 		    memset(&$$, 0, sizeof($$));
 		    set_instruction_predicate(&$$, &$1);
-		    GEN(&$$)->header.opcode = $2;
+		    set_instruction_opcode(&$$, $2);
 		    GEN(&$$)->header.execution_size = $3;
 		    $$.reloc.first_reloc_target = $4.reloc_target;
 		    $$.reloc.first_reloc_offset = $4.imm32;
@@ -877,7 +879,7 @@ loopinstruction: predicate WHILE execsize relativelocation instoptions
 		{
 		  // deprecated
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $1;
+		  set_instruction_opcode(&$$, $1);
 		};
 
 haltinstruction: predicate HALT execsize relativelocation relativelocation instoptions
@@ -886,7 +888,7 @@ haltinstruction: predicate HALT execsize relativelocation relativelocation insto
 		  /* Gen6, Gen7 bspec: dst and src0 must be the null reg. */
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  $$.reloc.first_reloc_target = $4.reloc_target;
 		  $$.reloc.first_reloc_offset = $4.imm32;
 		  $$.reloc.second_reloc_target = $5.reloc_target;
@@ -902,7 +904,7 @@ multibranchinstruction:
 		  /* Gen7 bspec: dest must be null. use Switch option */
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		  $$.reloc.first_reloc_target = $4.reloc_target;
 		  $$.reloc.first_reloc_offset = $4.imm32;
@@ -914,7 +916,7 @@ multibranchinstruction:
 		  /* Gen7 bspec: dest must be null. src0 must be null. use Switch option */
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.thread_control |= BRW_THREAD_SWITCH;
 		  $$.reloc.first_reloc_target = $4.reloc_target;
 		  $$.reloc.first_reloc_offset = $4.imm32;
@@ -945,7 +947,7 @@ subroutineinstruction:
 		   */
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 
 		  $4.type = BRW_REGISTER_TYPE_D; /* dest type should be DWORD */
 		  $4.width = 1; /* execution size must be 2. Here 1 is encoded 2. */
@@ -973,7 +975,7 @@ subroutineinstruction:
 		   */
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  dst_null_reg.width = 1; /* execution size of RET should be 2 */
 		  set_instruction_dest(&$$, &dst_null_reg);
 		  $5.reg.type = BRW_REGISTER_TYPE_D;
@@ -989,7 +991,7 @@ unaryinstruction:
 		dst srcaccimm instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  $6.width = $5;
@@ -1028,7 +1030,7 @@ binaryinstruction:
 		dst src srcimm instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  set_instruction_options(&$$, $9);
@@ -1069,7 +1071,7 @@ binaryaccinstruction:
 		dst srcacc srcimm instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  $6.width = $5;
@@ -1115,7 +1117,7 @@ trinaryinstruction:
 
 		  set_instruction_predicate(&$$, &$1);
 
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  GEN(&$$)->header.execution_size = $5;
@@ -1154,7 +1156,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		   * implicitly loaded if non-null.
 		   */
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  $5.width = $3;
 		  GEN(&$$)->header.destreg__conditionalmod = $4; /* msg reg index */
 		  set_instruction_predicate(&$$, &$1);
@@ -1208,7 +1210,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		| predicate SEND execsize dst sendleadreg payload directsrcoperand instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
@@ -1232,7 +1234,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 			   "type=%d\n", $7.reg.dw1.ud, $7.reg.type);
 		  }
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
@@ -1260,7 +1262,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  }
 
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
                   GEN(&$$)->header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
 		  set_instruction_predicate(&$$, &$1);
 
@@ -1303,7 +1305,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		  }
 
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
                   GEN(&$$)->header.destreg__conditionalmod = ($6 & EX_DESC_SFID_MASK); /* SFID */
 		  set_instruction_predicate(&$$, &$1);
 
@@ -1338,7 +1340,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 			  "type=%d\n", $8.reg.dw1.ud, $8.reg.type);
 		  }
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
@@ -1360,7 +1362,7 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		| predicate SEND execsize dst sendleadreg payload exp directsrcoperand instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.destreg__conditionalmod = $5.nr; /* msg reg index */
 
 		  set_instruction_predicate(&$$, &$1);
@@ -1394,7 +1396,7 @@ jumpinstruction: predicate JMPI execsize relativelocation2
 		   * is the post-incremented IP plus the offset.
 		   */
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  if(advanced_flag)
 			GEN(&$$)->header.mask_control = BRW_MASK_DISABLE;
 		  set_instruction_predicate(&$$, &$1);
@@ -1410,7 +1412,7 @@ jumpinstruction: predicate JMPI execsize relativelocation2
 mathinstruction: predicate MATH_INST execsize dst src srcimm math_function instoptions
 		{
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.destreg__conditionalmod = $7;
 		  set_instruction_options(&$$, $8);
 		  set_instruction_predicate(&$$, &$1);
@@ -1429,7 +1431,7 @@ breakinstruction: predicate breakop execsize relativelocation relativelocation i
 		  // for Gen6, Gen7
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  GEN(&$$)->header.execution_size = $3;
 		  $$.reloc.first_reloc_target = $4.reloc_target;
 		  $$.reloc.first_reloc_offset = $4.imm32;
@@ -1452,7 +1454,7 @@ syncinstruction: predicate WAIT notifyreg
 		  struct src_operand notify_src;
 
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $2;
+		  set_instruction_opcode(&$$, $2);
 		  set_direct_dst_operand(&notify_dst, &$3, BRW_REGISTER_TYPE_D);
 		  notify_dst.width = ffs(1) - 1;
 		  set_instruction_dest(&$$, &notify_dst);
@@ -1466,7 +1468,7 @@ syncinstruction: predicate WAIT notifyreg
 nopinstruction: NOP
 		{
 		  memset(&$$, 0, sizeof($$));
-		  GEN(&$$)->header.opcode = $1;
+		  set_instruction_opcode(&$$, $1);
 		};
 
 /* XXX! */
@@ -1678,7 +1680,7 @@ msgtarget:	NULL_TOKEN
 		  if (IS_GENp(5)) {
                       GEN(&$$)->bits2.send_gen5.sfid = BRW_SFID_URB;
                       GEN(&$$)->bits3.generic_gen5.header_present = 1;
-                      GEN(&$$)->bits3.urb_gen5.opcode = BRW_URB_OPCODE_WRITE;
+		      set_instruction_opcode(&$$, BRW_URB_OPCODE_WRITE);
                       GEN(&$$)->bits3.urb_gen5.offset = $2;
                       GEN(&$$)->bits3.urb_gen5.swizzle_control = $3;
                       GEN(&$$)->bits3.urb_gen5.pad = 0;
@@ -1687,7 +1689,7 @@ msgtarget:	NULL_TOKEN
                       GEN(&$$)->bits3.urb_gen5.complete = $6;
 		  } else {
                       GEN(&$$)->bits3.generic.msg_target = BRW_SFID_URB;
-                      GEN(&$$)->bits3.urb.opcode = BRW_URB_OPCODE_WRITE;
+		      set_instruction_opcode(&$$, BRW_URB_OPCODE_WRITE);
                       GEN(&$$)->bits3.urb.offset = $2;
                       GEN(&$$)->bits3.urb.swizzle_control = $3;
                       GEN(&$$)->bits3.urb.pad = 0;
@@ -2901,6 +2903,12 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
     }
 }
 
+static void set_instruction_opcode(struct brw_program_instruction *instr,
+				  unsigned opcode)
+{
+  GEN(instr)->header.opcode = opcode;
+}
+
 /**
  * Fills in the destination register information in instr from the bits in dst.
  */
-- 
1.7.7.5

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 75/90] assembler: Introduce set_intruction_pred_cond()
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (73 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 74/90] assembler: Introduce set_instruction_opcode() Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 76/90] assembler: Introduce set_instruction_saturate() Damien Lespiau
                   ` (15 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

This allows us to factor out the test that checks if, when using both
predicates and conditional modifiers, we are using the same flag
register.

Also get rid of a FIXME that we are now dealing with (the warning
mentioned above).

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
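A sketch of the factored-out check, modelled on the per-instruction
code this patch removes rather than on the new helper itself. The
struct layouts, fake_insn, PREDICATE_NONE and the return value are
assumptions made for illustration; a flag_subreg_nr of -1 is taken to
mean "no flag register given", as in the removed code.

```c
#include <assert.h>	/* only needed by callers checking the result */
#include <stdio.h>

#define PREDICATE_NONE 0	/* stand-in for BRW_PREDICATE_NONE */

/* Hypothetical, simplified versions of the assembler's structures. */
struct predicate { int pred_control, flag_reg_nr, flag_subreg_nr; };
struct condition { int cond, flag_reg_nr, flag_subreg_nr; };
struct fake_insn {
    int predicate_control, cond_modifier, flag_reg_nr, flag_subreg_nr;
};

/* Sets predicate and conditional modifier in one go. Returns 1 if the
 * "same flag register" warning was emitted, 0 otherwise. */
static int set_pred_cond(struct fake_insn *insn, const struct predicate *p,
			 const struct condition *c)
{
    int warned = 0;

    insn->predicate_control = p->pred_control;
    insn->flag_reg_nr = p->flag_reg_nr;
    insn->flag_subreg_nr = p->flag_subreg_nr;
    insn->cond_modifier = c->cond;

    /* No flag register on the conditional modifier: nothing to compare. */
    if (c->flag_subreg_nr == -1)
	return 0;

    if (p->pred_control != PREDICATE_NONE &&
	(p->flag_reg_nr != c->flag_reg_nr ||
	 p->flag_subreg_nr != c->flag_subreg_nr)) {
	fprintf(stderr, "warning: must use the same flag register if both "
		"prediction and conditional modifier are enabled\n");
	warned = 1;
    }

    /* As in the removed code, the conditional modifier's flag wins. */
    insn->flag_reg_nr = c->flag_reg_nr;
    insn->flag_subreg_nr = c->flag_subreg_nr;
    return warned;
}
```

Keeping the comparison in one function means every instruction type
that takes both a predicate and a conditional modifier gets the same
warning.
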
 assembler/gram.y |   88 +++++++++++++++++++-----------------------------------
 1 files changed, 31 insertions(+), 57 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 2d72037..917bccf 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -104,6 +104,10 @@ static void set_instruction_options(struct brw_program_instruction *instr,
 				    struct options options);
 static void set_instruction_predicate(struct brw_program_instruction *instr,
 				      struct predicate *p);
+static void set_instruction_pred_cond(struct brw_program_instruction *instr,
+				      struct predicate *p,
+				      struct condition *c,
+				      YYLTYPE *location);
 static void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
 				   int type);
 static void set_direct_src_operand(struct src_operand *src, struct brw_reg *reg,
@@ -992,28 +996,15 @@ unaryinstruction:
 		{
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_opcode(&$$, $2);
-		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  $6.width = $5;
 		  set_instruction_options(&$$, $8);
-		  set_instruction_predicate(&$$, &$1);
+		  set_instruction_pred_cond(&$$, &$1, &$3, &@3);
 		  if (set_instruction_dest(&$$, &$6) != 0)
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$7, &@7) != 0)
 		    YYERROR;
 
-		  if ($3.flag_subreg_nr != -1) {
-		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
-                        ($1.flag_reg_nr != $3.flag_reg_nr ||
-                         $1.flag_subreg_nr != $3.flag_subreg_nr))
-                        warn(ALWAYS, &@3, "must use the same flag register if "
-			     "both prediction and conditional modifier are "
-			     "enabled\n");
-
-		    GEN(&$$)->bits2.da1.flag_reg_nr = $3.flag_reg_nr;
-		    GEN(&$$)->bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
-		  }
-
 		  if (!IS_GENp(6) && 
 				get_type_size(GEN(&$$)->bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
 		    GEN(&$$)->header.compression_control = BRW_COMPRESSION_COMPRESSED;
@@ -1031,10 +1022,9 @@ binaryinstruction:
 		{
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_opcode(&$$, $2);
-		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  set_instruction_options(&$$, $9);
-		  set_instruction_predicate(&$$, &$1);
+		  set_instruction_pred_cond(&$$, &$1, &$3, &@3);
 		  $6.width = $5;
 		  if (set_instruction_dest(&$$, &$6) != 0)
 		    YYERROR;
@@ -1043,18 +1033,6 @@ binaryinstruction:
 		  if (set_instruction_src1(&$$, &$8, &@8) != 0)
 		    YYERROR;
 
-		  if ($3.flag_subreg_nr != -1) {
-		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
-                        ($1.flag_reg_nr != $3.flag_reg_nr ||
-                         $1.flag_subreg_nr != $3.flag_subreg_nr))
-                        warn(ALWAYS, &@3, "must use the same flag register if "
-			     "both prediction and conditional modifier are "
-			     "enabled\n");
-
-		    GEN(&$$)->bits2.da1.flag_reg_nr = $3.flag_reg_nr;
-		    GEN(&$$)->bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
-		  }
-
 		  if (!IS_GENp(6) && 
 				get_type_size(GEN(&$$)->bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
 		    GEN(&$$)->header.compression_control = BRW_COMPRESSION_COMPRESSED;
@@ -1072,11 +1050,10 @@ binaryaccinstruction:
 		{
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_opcode(&$$, $2);
-		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  $6.width = $5;
 		  set_instruction_options(&$$, $9);
-		  set_instruction_predicate(&$$, &$1);
+		  set_instruction_pred_cond(&$$, &$1, &$3, &@3);
 		  if (set_instruction_dest(&$$, &$6) != 0)
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$7, &@7) != 0)
@@ -1084,18 +1061,6 @@ binaryaccinstruction:
 		  if (set_instruction_src1(&$$, &$8, &@8) != 0)
 		    YYERROR;
 
-		  if ($3.flag_subreg_nr != -1) {
-		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
-                        ($1.flag_reg_nr != $3.flag_reg_nr ||
-                         $1.flag_subreg_nr != $3.flag_subreg_nr))
-                        warn(ALWAYS, &@3, "must use the same flag register if "
-			     "both prediction and conditional modifier are "
-			     "enabled\n");
-
-		    GEN(&$$)->bits2.da1.flag_reg_nr = $3.flag_reg_nr;
-		    GEN(&$$)->bits2.da1.flag_subreg_nr = $3.flag_subreg_nr;
-		  }
-
 		  if (!IS_GENp(6) && 
 				get_type_size(GEN(&$$)->bits1.da1.dest_reg_type) * (1 << $6.width) == 64)
 		    GEN(&$$)->header.compression_control = BRW_COMPRESSION_COMPRESSED;
@@ -1115,10 +1080,9 @@ trinaryinstruction:
 {
 		  memset(&$$, 0, sizeof($$));
 
-		  set_instruction_predicate(&$$, &$1);
+		  set_instruction_pred_cond(&$$, &$1, &$3, &@3);
 
 		  set_instruction_opcode(&$$, $2);
-		  GEN(&$$)->header.destreg__conditionalmod = $3.cond;
 		  GEN(&$$)->header.saturate = $4;
 		  GEN(&$$)->header.execution_size = $5;
 
@@ -1131,15 +1095,6 @@ trinaryinstruction:
 		  if (set_instruction_src2_three_src(&$$, &$9))
 		    YYERROR;
 		  set_instruction_options(&$$, $10);
-
-		  if ($3.flag_subreg_nr != -1) {
-		    if (GEN(&$$)->header.predicate_control != BRW_PREDICATE_NONE &&
-                        ($1.flag_reg_nr != $3.flag_reg_nr ||
-                         $1.flag_subreg_nr != $3.flag_subreg_nr))
-                        warn(ALWAYS, &@3, "must use the same flag register if "
-			     "both prediction and conditional modifier are "
-			     "enabled\n");
-		  }
 }
 ;
 
@@ -2670,10 +2625,6 @@ predicate:	/* empty */
 		| LPAREN predstate flagreg predctrl RPAREN
 		{
 		  $$.pred_control = $4;
-		  /* XXX: Should deal with erroring when the user tries to
-		   * set a predicate for one flag register and conditional
-		   * modification on the other flag register.
-		   */
 		  $$.flag_reg_nr = $3.nr;
 		  $$.flag_subreg_nr = $3.subnr;
 		  $$.pred_inverse = $2;
@@ -3061,6 +3012,29 @@ static void set_instruction_predicate(struct brw_program_instruction *instr,
 	GEN(instr)->bits2.da1.flag_subreg_nr = p->flag_subreg_nr;
 }
 
+static void set_instruction_pred_cond(struct brw_program_instruction *instr,
+				      struct predicate *p,
+				      struct condition *c,
+				      YYLTYPE *location)
+{
+    set_instruction_predicate(instr, p);
+    GEN(instr)->header.destreg__conditionalmod = c->cond;
+
+    if (c->flag_subreg_nr == -1)
+	return;
+
+    if (p->pred_control != BRW_PREDICATE_NONE &&
+	(p->flag_reg_nr != c->flag_reg_nr ||
+	 p->flag_subreg_nr != c->flag_subreg_nr))
+    {
+	warn(ALWAYS, location, "must use the same flag register if both "
+	     "prediction and conditional modifier are enabled\n");
+    }
+
+    GEN(instr)->bits2.da1.flag_reg_nr = c->flag_reg_nr;
+    GEN(instr)->bits2.da1.flag_subreg_nr = c->flag_subreg_nr;
+}
+
 static void set_direct_dst_operand(struct brw_reg *dst, struct brw_reg *reg,
 				   int type)
 {
-- 
1.7.7.5

* [PATCH 76/90] assembler: Introduce set_instruction_saturate()
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (74 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 75/90] assembler: Introduce set_instruction_pred_cond() Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 77/90] assembler: Expose setters for 3src operands Damien Lespiau
                   ` (14 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Also simplify the logic that was setting the saturate bit in the math
instruction.
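
For illustration only, a sketch of why the branch is redundant, assuming the
parsed saturate value is always 0 or 1. SATURATE_ON and struct hdr are
hypothetical stand-ins, not the real BRW_INSTRUCTION_SATURATE define or
struct brw_instruction.

```c
/* Hypothetical stand-in for BRW_INSTRUCTION_SATURATE; assumed to be 1. */
#define SATURATE_ON 1

/* Hypothetical one-bit field standing in for a saturate bit. */
struct hdr {
    unsigned saturate : 1;
};

/* The shape of the code being removed: an explicit if/else. */
static void set_saturate_branchy(struct hdr *h, int saturate)
{
    if (saturate == SATURATE_ON)
        h->saturate = 1;
    else
        h->saturate = 0;
}

/* The shape of the new helper: a plain assignment. */
static void set_saturate_direct(struct hdr *h, int saturate)
{
    h->saturate = saturate;
}
```

For values other than 0 and 1 the two forms can differ (the assignment keeps
only the low bit), so the equivalence relies on the assumption above.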

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   26 ++++++++++++++------------
 1 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 917bccf..43c34f6 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -100,6 +100,8 @@ static int set_instruction_src1_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src);
 static int set_instruction_src2_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src);
+static void set_instruction_saturate(struct brw_program_instruction *instr,
+				     int saturate);
 static void set_instruction_options(struct brw_program_instruction *instr,
 				    struct options options);
 static void set_instruction_predicate(struct brw_program_instruction *instr,
@@ -996,7 +998,7 @@ unaryinstruction:
 		{
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_opcode(&$$, $2);
-		  GEN(&$$)->header.saturate = $4;
+		  set_instruction_saturate(&$$, $4);
 		  $6.width = $5;
 		  set_instruction_options(&$$, $8);
 		  set_instruction_pred_cond(&$$, &$1, &$3, &@3);
@@ -1022,7 +1024,7 @@ binaryinstruction:
 		{
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_opcode(&$$, $2);
-		  GEN(&$$)->header.saturate = $4;
+		  set_instruction_saturate(&$$, $4);
 		  set_instruction_options(&$$, $9);
 		  set_instruction_pred_cond(&$$, &$1, &$3, &@3);
 		  $6.width = $5;
@@ -1050,7 +1052,7 @@ binaryaccinstruction:
 		{
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_opcode(&$$, $2);
-		  GEN(&$$)->header.saturate = $4;
+		  set_instruction_saturate(&$$, $4);
 		  $6.width = $5;
 		  set_instruction_options(&$$, $9);
 		  set_instruction_pred_cond(&$$, &$1, &$3, &@3);
@@ -1083,7 +1085,7 @@ trinaryinstruction:
 		  set_instruction_pred_cond(&$$, &$1, &$3, &@3);
 
 		  set_instruction_opcode(&$$, $2);
-		  GEN(&$$)->header.saturate = $4;
+		  set_instruction_saturate(&$$, $4);
 		  GEN(&$$)->header.execution_size = $5;
 
 		  if (set_instruction_dest_three_src(&$$, &$6))
@@ -1485,20 +1487,14 @@ msgtarget:	NULL_TOKEN
                       GEN(&$$)->bits2.send_gen5.sfid = BRW_SFID_MATH;
                       GEN(&$$)->bits3.generic_gen5.header_present = 0;
                       GEN(&$$)->bits3.math_gen5.function = $2;
-                      if ($3 == BRW_INSTRUCTION_SATURATE)
-                          GEN(&$$)->bits3.math_gen5.saturate = 1;
-                      else
-                          GEN(&$$)->bits3.math_gen5.saturate = 0;
+		      set_instruction_saturate(&$$, $3);
                       GEN(&$$)->bits3.math_gen5.int_type = $4;
                       GEN(&$$)->bits3.math_gen5.precision = BRW_MATH_PRECISION_FULL;
                       GEN(&$$)->bits3.math_gen5.data_type = $5;
 		  } else {
                       GEN(&$$)->bits3.generic.msg_target = BRW_SFID_MATH;
                       GEN(&$$)->bits3.math.function = $2;
-                      if ($3 == BRW_INSTRUCTION_SATURATE)
-                          GEN(&$$)->bits3.math.saturate = 1;
-                      else
-                          GEN(&$$)->bits3.math.saturate = 0;
+		      set_instruction_saturate(&$$, $3);
                       GEN(&$$)->bits3.math.int_type = $4;
                       GEN(&$$)->bits3.math.precision = BRW_MATH_PRECISION_FULL;
                       GEN(&$$)->bits3.math.data_type = $5;
@@ -2990,6 +2986,12 @@ static int set_instruction_src2_three_src(struct brw_program_instruction *instr,
 	return 0;
 }
 
+static void set_instruction_saturate(struct brw_program_instruction *instr,
+				     int saturate)
+{
+    GEN(instr)->header.saturate = saturate;
+}
+
 static void set_instruction_options(struct brw_program_instruction *instr,
 				    struct options options)
 {
-- 
1.7.7.5

* [PATCH 77/90] assembler: Expose setters for 3src operands
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (75 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 76/90] assembler: Introduce set_instruction_saturate() Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 78/90] assembler: Add support for D and UD in 3-src instructions Damien Lespiau
                   ` (13 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_eu.h      |   17 +++++++++++++++++
 assembler/brw_eu_emit.c |   43 +++++++++++++++++++++++++++++++++++--------
 2 files changed, 52 insertions(+), 8 deletions(-)

diff --git a/assembler/brw_eu.h b/assembler/brw_eu.h
index 6d656a4..20d4b82 100644
--- a/assembler/brw_eu.h
+++ b/assembler/brw_eu.h
@@ -385,6 +385,23 @@ void brw_set_uip_jip(struct brw_compile *p);
 
 uint32_t brw_swap_cmod(uint32_t cmod);
 
+void
+brw_set_3src_dest(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg dest);
+void
+brw_set_3src_src0(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg src0);
+void
+brw_set_3src_src1(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg src1);
+void
+brw_set_3src_src2(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg src2);
+
 /* brw_eu_compact.c */
 void brw_init_compaction_tables(struct intel_context *intel);
 void brw_compact_instructions(struct brw_compile *p);
diff --git a/assembler/brw_eu_emit.c b/assembler/brw_eu_emit.c
index c63f1fc..e6e3e10 100644
--- a/assembler/brw_eu_emit.c
+++ b/assembler/brw_eu_emit.c
@@ -813,15 +813,11 @@ get_3src_subreg_nr(struct brw_reg reg)
    }
 }
 
-static struct brw_instruction *brw_alu3(struct brw_compile *p,
-					GLuint opcode,
-					struct brw_reg dest,
-					struct brw_reg src0,
-					struct brw_reg src1,
-					struct brw_reg src2)
+void
+brw_set_3src_dest(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg dest)
 {
-   struct brw_instruction *insn = next_insn(p, opcode);
-
    gen7_convert_mrf_to_grf(p, &dest);
 
    assert(insn->header.access_mode == BRW_ALIGN_16);
@@ -836,7 +832,13 @@ static struct brw_instruction *brw_alu3(struct brw_compile *p,
    insn->bits1.da3src.dest_subreg_nr = dest.subnr / 16;
    insn->bits1.da3src.dest_writemask = dest.dw1.bits.writemask;
    guess_execution_size(p, insn, dest);
+}
 
+void
+brw_set_3src_src0(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg src0)
+{
    assert(src0.file == BRW_GENERAL_REGISTER_FILE);
    assert(src0.address_mode == BRW_ADDRESS_DIRECT);
    assert(src0.nr < 128);
@@ -847,7 +849,13 @@ static struct brw_instruction *brw_alu3(struct brw_compile *p,
    insn->bits1.da3src.src0_abs = src0.abs;
    insn->bits1.da3src.src0_negate = src0.negate;
    insn->bits2.da3src.src0_rep_ctrl = src0.vstride == BRW_VERTICAL_STRIDE_0;
+}
 
+void
+brw_set_3src_src1(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg src1)
+{
    assert(src1.file == BRW_GENERAL_REGISTER_FILE);
    assert(src1.address_mode == BRW_ADDRESS_DIRECT);
    assert(src1.nr < 128);
@@ -859,7 +867,13 @@ static struct brw_instruction *brw_alu3(struct brw_compile *p,
    insn->bits3.da3src.src1_reg_nr = src1.nr;
    insn->bits1.da3src.src1_abs = src1.abs;
    insn->bits1.da3src.src1_negate = src1.negate;
+}
 
+void
+brw_set_3src_src2(struct brw_compile *p,
+		  struct brw_instruction *insn,
+		  struct brw_reg src2)
+{
    assert(src2.file == BRW_GENERAL_REGISTER_FILE);
    assert(src2.address_mode == BRW_ADDRESS_DIRECT);
    assert(src2.nr < 128);
@@ -870,7 +884,20 @@ static struct brw_instruction *brw_alu3(struct brw_compile *p,
    insn->bits3.da3src.src2_reg_nr = src2.nr;
    insn->bits1.da3src.src2_abs = src2.abs;
    insn->bits1.da3src.src2_negate = src2.negate;
+}
 
+static struct brw_instruction *brw_alu3(struct brw_compile *p,
+					GLuint opcode,
+					struct brw_reg dest,
+					struct brw_reg src0,
+					struct brw_reg src1,
+					struct brw_reg src2)
+{
+   struct brw_instruction *insn = next_insn(p, opcode);
+   brw_set_3src_dest(p, insn, dest);
+   brw_set_3src_src0(p, insn, src0);
+   brw_set_3src_src1(p, insn, src1);
+   brw_set_3src_src2(p, insn, src2);
    return insn;
 }
 
-- 
1.7.7.5

* [PATCH 78/90] assembler: Add support for D and UD in 3-src instructions
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (76 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 77/90] assembler: Expose setters for 3src operands Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 79/90] assembler: Use brw_*() functions for " Damien Lespiau
                   ` (12 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_defines.h |    5 +++++
 assembler/brw_eu_emit.c |   23 +++++++++++++++++++----
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/assembler/brw_defines.h b/assembler/brw_defines.h
index 23402e3..98757da 100644
--- a/assembler/brw_defines.h
+++ b/assembler/brw_defines.h
@@ -785,6 +785,11 @@ enum opcode {
 #define BRW_REGISTER_TYPE_V   6	/* packed int vector, immediates only, uword dest only */
 #define BRW_REGISTER_TYPE_F   7
 
+#define BRW_REGISTER_3SRC_TYPE_F    0
+#define BRW_REGISTER_3SRC_TYPE_D    1
+#define BRW_REGISTER_3SRC_TYPE_UD   2
+#define BRW_REGISTER_3SRC_TYPE_DF   3
+
 #define BRW_ARF_NULL                  0x00
 #define BRW_ARF_ADDRESS               0x10
 #define BRW_ARF_ACCUMULATOR           0x20
diff --git a/assembler/brw_eu_emit.c b/assembler/brw_eu_emit.c
index e6e3e10..ae570c7 100644
--- a/assembler/brw_eu_emit.c
+++ b/assembler/brw_eu_emit.c
@@ -813,6 +813,21 @@ get_3src_subreg_nr(struct brw_reg reg)
    }
 }
 
+static int get_3src_type(int type)
+{
+   assert(type == BRW_REGISTER_TYPE_F ||
+	  type == BRW_REGISTER_TYPE_D ||
+	  type == BRW_REGISTER_TYPE_UD);
+
+   switch(type) {
+   case BRW_REGISTER_TYPE_F: return BRW_REGISTER_3SRC_TYPE_F;
+   case BRW_REGISTER_TYPE_D: return BRW_REGISTER_3SRC_TYPE_D;
+   case BRW_REGISTER_TYPE_UD: return BRW_REGISTER_3SRC_TYPE_UD;
+   }
+
+   return BRW_REGISTER_3SRC_TYPE_F;
+}
+
 void
 brw_set_3src_dest(struct brw_compile *p,
 		  struct brw_instruction *insn,
@@ -826,7 +841,7 @@ brw_set_3src_dest(struct brw_compile *p,
 	  dest.file == BRW_MESSAGE_REGISTER_FILE);
    assert(dest.nr < 128);
    assert(dest.address_mode == BRW_ADDRESS_DIRECT);
-   assert(dest.type == BRW_REGISTER_TYPE_F);
+   insn->bits1.da3src.dest_reg_type = get_3src_type(dest.type);
    insn->bits1.da3src.dest_reg_file = (dest.file == BRW_MESSAGE_REGISTER_FILE);
    insn->bits1.da3src.dest_reg_nr = dest.nr;
    insn->bits1.da3src.dest_subreg_nr = dest.subnr / 16;
@@ -842,7 +857,7 @@ brw_set_3src_src0(struct brw_compile *p,
    assert(src0.file == BRW_GENERAL_REGISTER_FILE);
    assert(src0.address_mode == BRW_ADDRESS_DIRECT);
    assert(src0.nr < 128);
-   assert(src0.type == BRW_REGISTER_TYPE_F);
+   insn->bits1.da3src.src_reg_type = get_3src_type(src0.type);
    insn->bits2.da3src.src0_swizzle = src0.dw1.bits.swizzle;
    insn->bits2.da3src.src0_subreg_nr = get_3src_subreg_nr(src0);
    insn->bits2.da3src.src0_reg_nr = src0.nr;
@@ -859,7 +874,7 @@ brw_set_3src_src1(struct brw_compile *p,
    assert(src1.file == BRW_GENERAL_REGISTER_FILE);
    assert(src1.address_mode == BRW_ADDRESS_DIRECT);
    assert(src1.nr < 128);
-   assert(src1.type == BRW_REGISTER_TYPE_F);
+   assert(src1.type == insn->bits1.da3src.src_reg_type);
    insn->bits2.da3src.src1_swizzle = src1.dw1.bits.swizzle;
    insn->bits2.da3src.src1_subreg_nr_low = get_3src_subreg_nr(src1) & 0x3;
    insn->bits3.da3src.src1_subreg_nr_high = get_3src_subreg_nr(src1) >> 2;
@@ -877,7 +892,7 @@ brw_set_3src_src2(struct brw_compile *p,
    assert(src2.file == BRW_GENERAL_REGISTER_FILE);
    assert(src2.address_mode == BRW_ADDRESS_DIRECT);
    assert(src2.nr < 128);
-   assert(src2.type == BRW_REGISTER_TYPE_F);
+   assert(src2.type == insn->bits1.da3src.src_reg_type);
    insn->bits3.da3src.src2_swizzle = src2.dw1.bits.swizzle;
    insn->bits3.da3src.src2_subreg_nr = get_3src_subreg_nr(src2);
    insn->bits3.da3src.src2_rep_ctrl = src2.vstride == BRW_VERTICAL_STRIDE_0;
-- 
1.7.7.5

* [PATCH 79/90] assembler: Use brw_*() functions for 3-src instructions
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (77 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 78/90] assembler: Add support for D and UD in 3-src instructions Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 80/90] assembler: Don't pollute the library files with gen4asm.h Damien Lespiau
                   ` (11 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   79 +++++++++++++++++++----------------------------------
 1 files changed, 28 insertions(+), 51 deletions(-)

diff --git a/assembler/gram.y b/assembler/gram.y
index 43c34f6..cd42004 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1086,8 +1086,8 @@ trinaryinstruction:
 
 		  set_instruction_opcode(&$$, $2);
 		  set_instruction_saturate(&$$, $4);
-		  GEN(&$$)->header.execution_size = $5;
 
+		  $6.width = $5;
 		  if (set_instruction_dest_three_src(&$$, &$6))
 		    YYERROR;
 		  if (set_instruction_src0_three_src(&$$, &$7))
@@ -2916,74 +2916,51 @@ static int set_instruction_src1(struct brw_program_instruction *instr,
 	return 0;
 }
 
-/* convert 2-src reg type to 3-src reg type
- *
- * 2-src reg type:
- *  000=UD 001=D 010=UW 011=W 100=UB 101=B 110=DF 111=F
- *
- * 3-src reg type:
- *  00=F  01=D  10=UD  11=DF
- */
-static int reg_type_2_to_3(int reg_type)
-{
-	int r = 0;
-	switch(reg_type) {
-		case 7: r = 0; break;
-		case 1: r = 1; break;
-		case 0: r = 2; break;
-		// TODO: supporting DF
-	}
-	return r;
-}
-
 static int set_instruction_dest_three_src(struct brw_program_instruction *instr,
 					  struct brw_reg *dest)
 {
-	GEN(instr)->bits1.da3src.dest_reg_file = dest->file;
-	GEN(instr)->bits1.da3src.dest_reg_nr = dest->nr;
-	GEN(instr)->bits1.da3src.dest_subreg_nr = get_subreg_address(dest->file, dest->type, dest->subnr, dest->address_mode) / 4; // in DWORD
-	GEN(instr)->bits1.da3src.dest_writemask = dest->dw1.bits.writemask;
-	GEN(instr)->bits1.da3src.dest_reg_type = reg_type_2_to_3(dest->type);
-	return 0;
+    resolve_subnr(dest);
+    brw_set_3src_dest(&genasm_compile, GEN(instr), *dest);
+    return 0;
 }
 
 static int set_instruction_src0_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src)
 {
-	if (advanced_flag) {
-		reset_instruction_src_region(GEN(instr), src);
-	}
-	// TODO: supporting src0 swizzle, src0 modifier, src0 rep_ctrl
-	GEN(instr)->bits1.da3src.src_reg_type = reg_type_2_to_3(src->reg.type);
-	GEN(instr)->bits2.da3src.src0_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
-	GEN(instr)->bits2.da3src.src0_reg_nr = src->reg.nr;
-	return 0;
+    if (advanced_flag)
+	reset_instruction_src_region(GEN(instr), src);
+
+    resolve_subnr(&src->reg);
+
+    // TODO: src0 modifier, src0 rep_ctrl
+    brw_set_3src_src0(&genasm_compile, GEN(instr), src->reg);
+    return 0;
 }
 
 static int set_instruction_src1_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src)
 {
-	if (advanced_flag) {
-		reset_instruction_src_region(GEN(instr), src);
-	}
-	// TODO: supporting src1 swizzle, src1 modifier, src1 rep_ctrl
-	int v = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
-	GEN(instr)->bits2.da3src.src1_subreg_nr_low = v % 4; // lower 2 bits
-	GEN(instr)->bits3.da3src.src1_subreg_nr_high = v / 4; // highest bit
-	GEN(instr)->bits3.da3src.src1_reg_nr = src->reg.nr;
-	return 0;
+    if (advanced_flag)
+	reset_instruction_src_region(GEN(instr), src);
+
+    resolve_subnr(&src->reg);
+
+    // TODO: src1 modifier, src1 rep_ctrl
+    brw_set_3src_src1(&genasm_compile, GEN(instr), src->reg);
+    return 0;
 }
 
 static int set_instruction_src2_three_src(struct brw_program_instruction *instr,
 					  struct src_operand *src)
 {
-	if (advanced_flag) {
-		reset_instruction_src_region(GEN(instr), src);
-	}
-	// TODO: supporting src2 swizzle, src2 modifier, src2 rep_ctrl
-	GEN(instr)->bits3.da3src.src2_subreg_nr = get_subreg_address(src->reg.file, src->reg.type, src->reg.subnr, src->reg.address_mode) / 4; // in DWORD
-	GEN(instr)->bits3.da3src.src2_reg_nr = src->reg.nr;
-	return 0;
+    if (advanced_flag)
+	reset_instruction_src_region(GEN(instr), src);
+
+    resolve_subnr(&src->reg);
+
+    // TODO: src2 modifier, src2 rep_ctrl
+    brw_set_3src_src2(&genasm_compile, GEN(instr), src->reg);
+    return 0;
 }
 
 static void set_instruction_saturate(struct brw_program_instruction *instr,
-- 
1.7.7.5

* [PATCH 80/90] assembler: Don't pollute the library files with gen4asm.h
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (78 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 79/90] assembler: Use brw_*() functions for " Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 81/90] assembler: Put struct opcode_desc back in brw_context.h Damien Lespiau
                   ` (10 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

gen4asm.h is assembler-specific, while we want the library files to be
something closer to a proper library.

This means that we have to redefine the GL* typedefs in brw_structs.h;
not using any of the GL typedefs at all is left for a future commit.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_disasm.c  |    1 -
 assembler/brw_eu.c      |    1 -
 assembler/brw_eu.h      |    1 -
 assembler/brw_structs.h |    8 ++++++++
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/assembler/brw_disasm.c b/assembler/brw_disasm.c
index 8eaeb78..8524d41 100644
--- a/assembler/brw_disasm.c
+++ b/assembler/brw_disasm.c
@@ -27,7 +27,6 @@
 #include <unistd.h>
 #include <stdarg.h>
 
-#include "gen4asm.h"
 #include "brw_eu.h"
 
 const struct opcode_desc opcode_descs[128] = {
diff --git a/assembler/brw_eu.c b/assembler/brw_eu.c
index 1641c95..69f088d 100644
--- a/assembler/brw_eu.c
+++ b/assembler/brw_eu.c
@@ -32,7 +32,6 @@
 
 #include <string.h>
 
-#include "gen4asm.h"
 #include "brw_context.h"
 #include "brw_defines.h"
 #include "brw_eu.h"
diff --git a/assembler/brw_eu.h b/assembler/brw_eu.h
index 20d4b82..5d623c0 100644
--- a/assembler/brw_eu.h
+++ b/assembler/brw_eu.h
@@ -35,7 +35,6 @@
 
 #include <stdbool.h>
 #include <stdio.h>
-#include "gen4asm.h"
 #include "brw_context.h"
 #include "brw_structs.h"
 #include "brw_defines.h"
diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index e650bf5..2f6aafb 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -33,6 +33,14 @@
 #ifndef BRW_STRUCTS_H
 #define BRW_STRUCTS_H
 
+#include <stdint.h>
+
+typedef unsigned char GLubyte;
+typedef short GLshort;
+typedef unsigned int GLuint;
+typedef int GLint;
+typedef float GLfloat;
+
 /* These seem to be passed around as function args, so it works out
  * better to keep them as #defines:
  */
-- 
1.7.7.5

* [PATCH 81/90] assembler: Put struct opcode_desc back in brw_context.h
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (79 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 80/90] assembler: Don't pollute the library files with gen4asm.h Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 82/90] assembler: Use set_instruction_src1() in send Damien Lespiau
                   ` (9 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

I originally moved struct opcode_desc from brw_context.h to brw_eu.h on
the Mesa side, but that was before realizing that we need struct
brw_context if we want to avoid touching the code too much.

So put it back there, now that the Mesa patch has been dropped.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_context.h |   14 ++++++++++++++
 assembler/brw_disasm.c  |   17 +++++++++--------
 assembler/brw_eu.h      |   11 -----------
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/assembler/brw_context.h b/assembler/brw_context.h
index 16a9f70..90e66f7 100644
--- a/assembler/brw_context.h
+++ b/assembler/brw_context.h
@@ -31,6 +31,9 @@
 #define __BRW_CONTEXT_H__
 
 #include <stdbool.h>
+#include <stdio.h>
+
+#include "brw_structs.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -57,6 +60,17 @@ struct brw_context
 bool
 brw_init_context(struct brw_context *brw, int gen);
 
+/* brw_disasm.c */
+struct opcode_desc {
+    char    *name;
+    int	    nsrc;
+    int	    ndst;
+};
+
+extern const struct opcode_desc opcode_descs[128];
+
+int brw_disasm (FILE *file, struct brw_instruction *inst, int gen);
+
 #ifdef __cplusplus
 } /* end of extern "C" */
 #endif
diff --git a/assembler/brw_disasm.c b/assembler/brw_disasm.c
index 8524d41..9781f6b 100644
--- a/assembler/brw_disasm.c
+++ b/assembler/brw_disasm.c
@@ -27,7 +27,8 @@
 #include <unistd.h>
 #include <stdarg.h>
 
-#include "brw_eu.h"
+#include "brw_context.h"
+#include "brw_defines.h"
 
 const struct opcode_desc opcode_descs[128] = {
     [BRW_OPCODE_MOV] = { .name = "mov", .nsrc = 1, .ndst = 1 },
@@ -99,7 +100,7 @@ static const char * const conditional_modifier[16] = {
     [BRW_CONDITIONAL_U] = ".u",
 };
 
-static const char * const negate_op[2] = {
+static const char * const negate[2] = {
     [0] = "",
     [1] = "-",
 };
@@ -602,7 +603,7 @@ static int src_da1 (FILE *file, GLuint type, GLuint _reg_file,
 		    GLuint reg_num, GLuint sub_reg_num, GLuint __abs, GLuint _negate)
 {
     int err = 0;
-    err |= control (file, "negate", negate_op, _negate, NULL);
+    err |= control (file, "negate", negate, _negate, NULL);
     err |= control (file, "abs", _abs, __abs, NULL);
 
     err |= reg (file, _reg_file, reg_num);
@@ -628,7 +629,7 @@ static int src_ia1 (FILE *file,
 		    GLuint _vert_stride)
 {
     int err = 0;
-    err |= control (file, "negate", negate_op, _negate, NULL);
+    err |= control (file, "negate", negate, _negate, NULL);
     err |= control (file, "abs", _abs, __abs, NULL);
 
     string (file, "g[a0");
@@ -656,7 +657,7 @@ static int src_da16 (FILE *file,
 		     GLuint swz_w)
 {
     int err = 0;
-    err |= control (file, "negate", negate_op, _negate, NULL);
+    err |= control (file, "negate", negate, _negate, NULL);
     err |= control (file, "abs", _abs, __abs, NULL);
 
     err |= reg (file, _reg_file, _reg_nr);
@@ -707,7 +708,7 @@ static int src0_3src (FILE *file, struct brw_instruction *inst)
     GLuint swz_z = (inst->bits2.da3src.src0_swizzle >> 4) & 0x3;
     GLuint swz_w = (inst->bits2.da3src.src0_swizzle >> 6) & 0x3;
 
-    err |= control (file, "negate", negate_op, inst->bits1.da3src.src0_negate, NULL);
+    err |= control (file, "negate", negate, inst->bits1.da3src.src0_negate, NULL);
     err |= control (file, "abs", _abs, inst->bits1.da3src.src0_abs, NULL);
 
     err |= reg (file, BRW_GENERAL_REGISTER_FILE, inst->bits2.da3src.src0_reg_nr);
@@ -757,7 +758,7 @@ static int src1_3src (FILE *file, struct brw_instruction *inst)
     GLuint src1_subreg_nr = (inst->bits2.da3src.src1_subreg_nr_low |
 			     (inst->bits3.da3src.src1_subreg_nr_high << 2));
 
-    err |= control (file, "negate", negate_op, inst->bits1.da3src.src1_negate,
+    err |= control (file, "negate", negate, inst->bits1.da3src.src1_negate,
 		    NULL);
     err |= control (file, "abs", _abs, inst->bits1.da3src.src1_abs, NULL);
 
@@ -808,7 +809,7 @@ static int src2_3src (FILE *file, struct brw_instruction *inst)
     GLuint swz_z = (inst->bits3.da3src.src2_swizzle >> 4) & 0x3;
     GLuint swz_w = (inst->bits3.da3src.src2_swizzle >> 6) & 0x3;
 
-    err |= control (file, "negate", negate_op, inst->bits1.da3src.src2_negate,
+    err |= control (file, "negate", negate, inst->bits1.da3src.src2_negate,
 		    NULL);
     err |= control (file, "abs", _abs, inst->bits1.da3src.src2_abs, NULL);
 
diff --git a/assembler/brw_eu.h b/assembler/brw_eu.h
index 5d623c0..83c82d1 100644
--- a/assembler/brw_eu.h
+++ b/assembler/brw_eu.h
@@ -420,17 +420,6 @@ void brw_optimize(struct brw_compile *p);
 void brw_remove_duplicate_mrf_moves(struct brw_compile *p);
 void brw_remove_grf_to_mrf_moves(struct brw_compile *p);
 
-/* brw_disasm.c */
-struct opcode_desc {
-    char    *name;
-    int	    nsrc;
-    int	    ndst;
-};
-
-extern const struct opcode_desc opcode_descs[128];
-
-int brw_disasm (FILE *file, struct brw_instruction *inst, int gen);
-
 #ifdef __cplusplus
 }
 #endif
-- 
1.7.7.5

* [PATCH 82/90] assembler: Use set_instruction_src1() in send
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (80 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 81/90] assembler: Put struct opcode_desc back in brw_context.h Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 83/90] assembler: Finish importing brw_eu_*c from mesa Damien Lespiau
                   ` (8 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

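The send rules were filling in the src1 fields by hand. Use
set_instruction_src1() instead, as is already done for src0 with
set_instruction_src0(). No reason not to!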

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   17 ++++++-----------
 1 files changed, 6 insertions(+), 11 deletions(-)
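
Note (not part of the commit): a minimal sketch of the refactoring pattern
used here, under stated assumptions. The struct and function names below
(sketch_operand, sketch_insn, sketch_set_src1) are hypothetical stand-ins,
not the assembler's real types, and the value 3 for the immediate register
file is an assumption. The real set_instruction_src1() takes more arguments
and does more checking; the sketch only shows how three open-coded
assignments collapse into one helper whose return value the grammar can
test before calling YYERROR.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding of the immediate "register file"; hypothetical name. */
enum { SKETCH_IMMEDIATE_VALUE = 3 };

/* Hypothetical, cut-down operand as collected by the parser. */
struct sketch_operand {
    unsigned file;
    unsigned type;
    uint32_t imm_ud;
};

/* Hypothetical, cut-down view of the src1 bits of an instruction. */
struct sketch_insn {
    unsigned src1_reg_file;
    unsigned src1_reg_type;
    uint32_t bits3_ud;
};

/*
 * One helper instead of three assignments repeated in every send rule.
 * Returns 0 on success and non-zero on error, so a caller can bail out.
 * This sketch only accepts immediates.
 */
static int sketch_set_src1(struct sketch_insn *insn,
                           const struct sketch_operand *op)
{
    if (op->file != SKETCH_IMMEDIATE_VALUE)
        return 1;

    insn->src1_reg_file = op->file;
    insn->src1_reg_type = op->type;
    insn->bits3_ud = op->imm_ud;
    return 0;
}
```

A grammar action would then read "if (sketch_set_src1(&insn, &op) != 0)
YYERROR;", which mirrors the shape of the new code in the hunks below.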

diff --git a/assembler/gram.y b/assembler/gram.y
index cd42004..d69d7b4 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -1200,9 +1200,8 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$6, &@6) != 0)
 		    YYERROR;
-		  GEN(&$$)->bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  GEN(&$$)->bits1.da1.src1_reg_type = $7.reg.type;
-		  GEN(&$$)->bits3.ud = $7.reg.dw1.ud;
+		  if (set_instruction_src1(&$$, &$7, &@7) != 0)
+		    YYERROR;
                 }
 		| predicate SEND execsize dst sendleadreg sndopr imm32reg instoptions
 		{
@@ -1241,10 +1240,8 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
                   src0.reg.nr = $5.nr;
                   src0.reg.subnr = 0;
                   set_instruction_src0(&$$, &src0, NULL);
+		  set_instruction_src1(&$$, &$7, NULL);
 
-		  GEN(&$$)->bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  GEN(&$$)->bits1.da1.src1_reg_type = $7.reg.type;
-                  GEN(&$$)->bits3.ud = $7.reg.dw1.ud;
                   GEN(&$$)->bits3.generic_gen5.end_of_thread = !!($6 & EX_DESC_EOT_MASK);
 		}
 		| predicate SEND execsize dst sendleadreg sndopr directsrcoperand instoptions
@@ -1306,15 +1303,13 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget
 		    YYERROR;
 		  if (set_instruction_src0(&$$, &$6, &@6) != 0)
 		    YYERROR;
-		  GEN(&$$)->bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE;
-		  GEN(&$$)->bits1.da1.src1_reg_type = $8.reg.type;
+		  if (set_instruction_src1(&$$, &$8, &@8) != 0)
+		    YYERROR;
+
 		  if (IS_GENx(5)) {
 		      GEN(&$$)->bits2.send_gen5.sfid = ($7 & EX_DESC_SFID_MASK);
-		      GEN(&$$)->bits3.ud = $8.reg.dw1.ud;
 		      GEN(&$$)->bits3.generic_gen5.end_of_thread = !!($7 & EX_DESC_EOT_MASK);
 		  }
-		  else
-		      GEN(&$$)->bits3.ud = $8.reg.dw1.ud;
 		}
 		| predicate SEND execsize dst sendleadreg payload exp directsrcoperand instoptions
 		{
-- 
1.7.7.5

* [PATCH 83/90] assembler: Finish importing brw_eu_*c from mesa
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (81 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 82/90] assembler: Use set_instruction_src1() in send Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 84/90] assembler: Merge declared_register's type into the reg structure Damien Lespiau
                   ` (7 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/Makefile.am    |    2 +
 assembler/brw_eu_debug.c |   92 ++++++++++++++++++++++++++++++++++
 assembler/brw_eu_util.c  |  125 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 219 insertions(+), 0 deletions(-)
 create mode 100644 assembler/brw_eu_debug.c
 create mode 100644 assembler/brw_eu_util.c
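
Note (not part of the commit): brw_print_reg() in the new brw_eu_debug.c
prints a region as <vstride;width,hstride> by decoding the hardware fields:
a stride field of 0 means 0, otherwise 1 << (field - 1), and the width field
is a plain log2. A minimal sketch of just that decoding follows. The
region_fields struct and the decode_*/format_region helpers are hypothetical
names introduced for illustration; only the arithmetic is taken from the
printf() arguments in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the region fields of struct brw_reg. */
struct region_fields {
    unsigned vstride;
    unsigned width;
    unsigned hstride;
};

/* Stride fields: 0 encodes a stride of 0, n encodes 1 << (n - 1). */
static unsigned decode_stride(unsigned enc)
{
    return enc ? 1u << (enc - 1) : 0;
}

/* Width field: log2 of the number of elements. */
static unsigned decode_width(unsigned enc)
{
    return 1u << enc;
}

/* Format the region the same way the last branch of brw_print_reg() does. */
static int format_region(char *buf, size_t len, struct region_fields r)
{
    return snprintf(buf, len, "<%u;%u,%u>",
                    decode_stride(r.vstride),
                    decode_width(r.width),
                    decode_stride(r.hstride));
}
```

With the fields set to 4, 3 and 1 this yields "<8;8,1>", and with all three
fields at 0 it yields "<0;1,0>".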

diff --git a/assembler/Makefile.am b/assembler/Makefile.am
index 113e550..95ba08d 100644
--- a/assembler/Makefile.am
+++ b/assembler/Makefile.am
@@ -13,7 +13,9 @@ libbrw_la_SOURCES =		\
 	brw_eu.h		\
 	brw_eu.c		\
 	brw_eu_compact.c	\
+	brw_eu_debug.c		\
 	brw_eu_emit.c		\
+	brw_eu_util.c		\
 	brw_reg.h		\
 	brw_structs.h		\
 	ralloc.c		\
diff --git a/assembler/brw_eu_debug.c b/assembler/brw_eu_debug.c
new file mode 100644
index 0000000..1e4f933
--- /dev/null
+++ b/assembler/brw_eu_debug.c
@@ -0,0 +1,92 @@
+/*
+ Copyright (C) Intel Corp.  2006.  All Rights Reserved.
+ Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
+ develop this 3D driver.
+ 
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+ 
+ The above copyright notice and this permission notice (including the
+ next paragraph) shall be included in all copies or substantial
+ portions of the Software.
+ 
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ 
+ **********************************************************************/
+ /*
+  * Authors:
+  *   Keith Whitwell <keith@tungstengraphics.com>
+  */
+    
+#include "brw_eu.h"
+
+void brw_print_reg( struct brw_reg hwreg )
+{
+   static const char *file[] = {
+      "arf",
+      "grf",
+      "msg",
+      "imm"
+   };
+
+   static const char *type[] = {
+      "ud",
+      "d",
+      "uw",
+      "w",
+      "ub",
+      "vf",
+      "hf",
+      "f"
+   };
+
+   printf("%s%s", 
+	  hwreg.abs ? "abs/" : "",
+	  hwreg.negate ? "-" : "");
+     
+   if (hwreg.file == BRW_GENERAL_REGISTER_FILE &&
+       hwreg.nr % 2 == 0 &&
+       hwreg.subnr == 0 &&
+       hwreg.vstride == BRW_VERTICAL_STRIDE_8 &&
+       hwreg.width == BRW_WIDTH_8 &&
+       hwreg.hstride == BRW_HORIZONTAL_STRIDE_1 &&
+       hwreg.type == BRW_REGISTER_TYPE_F) {
+      /* vector register */
+      printf("vec%d", hwreg.nr);
+   }
+   else if (hwreg.file == BRW_GENERAL_REGISTER_FILE &&
+	    hwreg.vstride == BRW_VERTICAL_STRIDE_0 &&
+	    hwreg.width == BRW_WIDTH_1 &&
+	    hwreg.hstride == BRW_HORIZONTAL_STRIDE_0 &&
+	    hwreg.type == BRW_REGISTER_TYPE_F) {      
+      /* "scalar" register */
+      printf("scl%d.%d", hwreg.nr, hwreg.subnr / 4);
+   }
+   else if (hwreg.file == BRW_IMMEDIATE_VALUE) {
+      printf("imm %f", hwreg.dw1.f);
+   }
+   else {
+      printf("%s%d.%d<%d;%d,%d>:%s", 
+		   file[hwreg.file],
+		   hwreg.nr,
+		   hwreg.subnr / type_sz(hwreg.type),
+		   hwreg.vstride ? (1<<(hwreg.vstride-1)) : 0,
+		   1<<hwreg.width,
+		   hwreg.hstride ? (1<<(hwreg.hstride-1)) : 0,		
+		   type[hwreg.type]);
+   }
+}
+
+
+
diff --git a/assembler/brw_eu_util.c b/assembler/brw_eu_util.c
new file mode 100644
index 0000000..2037634
--- /dev/null
+++ b/assembler/brw_eu_util.c
@@ -0,0 +1,125 @@
+/*
+ Copyright (C) Intel Corp.  2006.  All Rights Reserved.
+ Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
+ develop this 3D driver.
+ 
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+ 
+ The above copyright notice and this permission notice (including the
+ next paragraph) shall be included in all copies or substantial
+ portions of the Software.
+ 
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ 
+ **********************************************************************/
+ /*
+  * Authors:
+  *   Keith Whitwell <keith@tungstengraphics.com>
+  */
+      
+
+#include "brw_context.h"
+#include "brw_defines.h"
+#include "brw_eu.h"
+
+
+void brw_math_invert( struct brw_compile *p, 
+			     struct brw_reg dst,
+			     struct brw_reg src)
+{
+   brw_math( p, 
+	     dst,
+	     BRW_MATH_FUNCTION_INV, 
+	     0,
+	     src,
+	     BRW_MATH_PRECISION_FULL, 
+	     BRW_MATH_DATA_VECTOR );
+}
+
+
+
+void brw_copy4(struct brw_compile *p,
+	       struct brw_reg dst,
+	       struct brw_reg src,
+	       GLuint count)
+{
+   GLuint i;
+
+   dst = vec4(dst);
+   src = vec4(src);
+
+   for (i = 0; i < count; i++)
+   {
+      GLuint delta = i*32;
+      brw_MOV(p, byte_offset(dst, delta),    byte_offset(src, delta));
+      brw_MOV(p, byte_offset(dst, delta+16), byte_offset(src, delta+16));
+   }
+}
+
+
+void brw_copy8(struct brw_compile *p,
+	       struct brw_reg dst,
+	       struct brw_reg src,
+	       GLuint count)
+{
+   GLuint i;
+
+   dst = vec8(dst);
+   src = vec8(src);
+
+   for (i = 0; i < count; i++)
+   {
+      GLuint delta = i*32;
+      brw_MOV(p, byte_offset(dst, delta),    byte_offset(src, delta));
+   }
+}
+
+
+void brw_copy_indirect_to_indirect(struct brw_compile *p,
+				   struct brw_indirect dst_ptr,
+				   struct brw_indirect src_ptr,
+				   GLuint count)
+{
+   GLuint i;
+
+   for (i = 0; i < count; i++)
+   {
+      GLuint delta = i*32;
+      brw_MOV(p, deref_4f(dst_ptr, delta),    deref_4f(src_ptr, delta));
+      brw_MOV(p, deref_4f(dst_ptr, delta+16), deref_4f(src_ptr, delta+16));
+   }
+}
+
+
+void brw_copy_from_indirect(struct brw_compile *p,
+			    struct brw_reg dst,
+			    struct brw_indirect ptr,
+			    GLuint count)
+{
+   GLuint i;
+
+   dst = vec4(dst);
+
+   for (i = 0; i < count; i++)
+   {
+      GLuint delta = i*32;
+      brw_MOV(p, byte_offset(dst, delta),    deref_4f(ptr, delta));
+      brw_MOV(p, byte_offset(dst, delta+16), deref_4f(ptr, delta+16));
+   }
+}
+
+
+
+
-- 
1.7.7.5

* [PATCH 84/90] assembler: Merge declared_register's type into the reg structure
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (82 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 83/90] assembler: Finish importing brw_eu_*c from mesa Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 85/90] assembler: Use defines for width Damien Lespiau
                   ` (6 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    1 -
 assembler/gram.y    |   12 +++++-------
 2 files changed, 5 insertions(+), 8 deletions(-)
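
Note (not part of the commit): a rough sketch of why the separate type
field becomes redundant. Once the type lives in the embedded reg, copying
the reg (as the dstoperand rule does with "$$ = $1.reg") carries the type
along, so the extra "$$.type = $1.type" can go. All sketch_* names are
hypothetical, cut-down stand-ins for the assembler's structures. The second
helper mirrors the advanced_flag arithmetic visible in the symbol_reg_p
hunk, assuming a 32-byte register and a sub-register number counted in
elements.

```c
#include <assert.h>

#define SKETCH_GRF_BYTES 32   /* the literal 32 used in symbol_reg_p */

/* Hypothetical, cut-down register description. */
struct sketch_reg {
    unsigned type;
    unsigned nr;
    unsigned subnr;
};

/* After the patch: no separate 'type' member next to 'reg'. */
struct sketch_declared_register {
    struct sketch_reg reg;   /* the type now lives in reg.type */
    int element_size;
    int dst_region;
};

/* Copying the embedded reg brings the type with it. */
static struct sketch_reg
sketch_dst_from_symbol(const struct sketch_declared_register *dcl)
{
    return dcl->reg;
}

/* Step forward by n elements of elem_bytes each, wrapping into reg.nr. */
static void sketch_advance(struct sketch_reg *reg, unsigned n,
                           unsigned elem_bytes)
{
    unsigned per_reg = SKETCH_GRF_BYTES / elem_bytes;
    unsigned total = reg->subnr + n;

    reg->nr += total / per_reg;
    reg->subnr = total % per_reg;
}
```

For example, starting at register 10, element 6, and advancing by 5
four-byte elements lands on register 11, element 3, since a register holds
8 such elements.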

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 3b98444..49baf9d 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -220,7 +220,6 @@ struct declared_register {
     int element_size;
     struct region src_region;
     int dst_region;
-    int type;
 };
 struct declared_register *find_register(char *name);
 void insert_register(struct declared_register *reg);
diff --git a/assembler/gram.y b/assembler/gram.y
index d69d7b4..aa6d709 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -181,8 +181,7 @@ static bool declared_register_equal(struct declared_register *r1,
 	return false;
 
     if (r1->element_size != r2->element_size ||
-        r1->dst_region != r2->dst_region ||
-	r1->type != r2->type)
+        r1->dst_region != r2->dst_region)
 	return false;
 
     return true;
@@ -650,7 +649,7 @@ declare_pragma:	DECLARE_PRAGMA STRING declare_base declare_elementsize declare_s
 		    reg.element_size = $4;
 		    reg.src_region = $5;
 		    reg.dst_region = $6;
-		    reg.type = $7;
+		    reg.reg.type = $7;
 
 		    found = find_register($2);
 		    if (found) {
@@ -1771,7 +1770,6 @@ dstoperand:	symbol_reg dstregion
 		{
 		  $$ = $1.reg;
 	          $$.hstride = resolve_dst_region(&$1, $2);
-		  $$.type = $1.type;
 		}
 		| dstreg dstregion writemask regtype
 		{
@@ -1860,7 +1858,7 @@ symbol_reg_p: STRING LPAREN exp RPAREN
 		    memcpy(&$$, dcl_reg, sizeof(*dcl_reg));
 		    $$.reg.nr += $3;
 		    if(advanced_flag) {
-			int size = get_type_size(dcl_reg->type);
+			int size = get_type_size(dcl_reg->reg.type);
 		        $$.reg.nr += ($$.reg.subnr + $5) / (32 / size);
 		        $$.reg.subnr = ($$.reg.subnr + $5) % (32 / size);
 		    } else {
@@ -2047,7 +2045,7 @@ directsrcoperand:	negate abs symbol_reg region regtype
 		  $$.reg.nr = $3.reg.nr;
 		  $$.reg.subnr = $3.reg.subnr;
 		  if ($5.is_default) {
-		    $$.reg.type = $3.type;
+		    $$.reg.type = $3.reg.type;
 		  } else {
 		    $$.reg.type = $5.type;
 		  }
@@ -2434,7 +2432,7 @@ relativelocation2:
 		  $$.reg.file = $1.reg.file;
 		  $$.reg.nr = $1.reg.nr;
 		  $$.reg.subnr = $1.reg.subnr;
-		  $$.reg.type = $1.type;
+		  $$.reg.type = $1.reg.type;
 		  $$.reg.vstride = $1.src_region.vert_stride;
 		  $$.reg.width = $1.src_region.width;
 		  $$.reg.hstride = $1.src_region.horiz_stride;
-- 
1.7.7.5

* [PATCH 85/90] assembler: Use defines for width
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (83 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 84/90] assembler: Merge declared_register's type into the reg structure Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 86/90] assembler: Remove trailing white space Damien Lespiau
                   ` (5 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Use the BRW_WIDTH_* defines for region widths instead of hardcoded numbers
or ffs() expressions.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gram.y |   22 +++++++++++-----------
 1 files changed, 11 insertions(+), 11 deletions(-)
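
Note (not part of the commit): a small sketch of the encoding behind these
substitutions. The width field holds log2 of the element count, which is
why "ffs(1) - 1" and "ffs(8) - 1" could stand in for BRW_WIDTH_1 and
BRW_WIDTH_8, and why the old code wrote 1 where it meant a width of 2. The
enum values below are assumed to match brw_defines.h and are inferred from
the replacements in this patch. encode_width() and encode_stride() are
hypothetical helpers, not functions from the assembler.

```c
#include <assert.h>
#include <strings.h>   /* ffs() */

/* Width encodings, assumed to mirror brw_defines.h. */
enum {
    BRW_WIDTH_1  = 0,
    BRW_WIDTH_2  = 1,
    BRW_WIDTH_4  = 2,
    BRW_WIDTH_8  = 3,
    BRW_WIDTH_16 = 4,
};

/* Encode a width given as a power-of-two element count. */
static int encode_width(int elements)
{
    return ffs(elements) - 1;
}

/*
 * Strides use a different scheme: 0 stays 0, and a power-of-two stride s
 * is stored as ffs(s), which is why the grammar keeps plain ffs() there.
 */
static int encode_stride(int stride)
{
    return ffs(stride);
}
```

So encode_width(8) equals BRW_WIDTH_8 while encode_stride(8) is 4; the two
fields must not be mixed up, and the named defines make that harder to do.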

diff --git a/assembler/gram.y b/assembler/gram.y
index aa6d709..9d58fe6 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -618,7 +618,7 @@ declare_srcregion: /* empty */
 		  /* XXX is this default correct?*/
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.vert_stride = ffs(0);
-		  $$.width = ffs(1) - 1;
+		  $$.width = BRW_WIDTH_1;
 		  $$.horiz_stride = ffs(0);
 		}
 		| SRCREGION EQ region
@@ -955,7 +955,7 @@ subroutineinstruction:
 		  set_instruction_opcode(&$$, $2);
 
 		  $4.type = BRW_REGISTER_TYPE_D; /* dest type should be DWORD */
-		  $4.width = 1; /* execution size must be 2. Here 1 is encoded 2. */
+		  $4.width = BRW_WIDTH_2; /* execution size must be 2. */
 		  set_instruction_dest(&$$, &$4);
 
 		  struct src_operand src0;
@@ -963,7 +963,7 @@ subroutineinstruction:
 		  src0.reg.type = BRW_REGISTER_TYPE_D; /* source type should be DWORD */
 		  /* source0 region control must be <2,2,1>. */
 		  src0.reg.hstride = 1; /*encoded 1*/
-		  src0.reg.width = 1; /*encoded 2*/
+		  src0.reg.width = BRW_WIDTH_2;
 		  src0.reg.vstride = 2; /*encoded 2*/
 		  set_instruction_src0(&$$, &src0, NULL);
 
@@ -981,11 +981,11 @@ subroutineinstruction:
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_predicate(&$$, &$1);
 		  set_instruction_opcode(&$$, $2);
-		  dst_null_reg.width = 1; /* execution size of RET should be 2 */
+		  dst_null_reg.width = BRW_WIDTH_2; /* execution size of RET should be 2 */
 		  set_instruction_dest(&$$, &dst_null_reg);
 		  $5.reg.type = BRW_REGISTER_TYPE_D;
 		  $5.reg.hstride = 1; /*encoded 1*/
-		  $5.reg.width = 1; /*encoded 2*/
+		  $5.reg.width = BRW_WIDTH_2;
 		  $5.reg.vstride = 2; /*encoded 2*/
 		  set_instruction_src0(&$$, &$5, NULL);
 		}
@@ -1351,7 +1351,7 @@ jumpinstruction: predicate JMPI execsize relativelocation2
 		  if(advanced_flag)
 			GEN(&$$)->header.mask_control = BRW_MASK_DISABLE;
 		  set_instruction_predicate(&$$, &$1);
-		  ip_dst.width = ffs(1) - 1;
+		  ip_dst.width = BRW_WIDTH_1;
 		  set_instruction_dest(&$$, &ip_dst);
 		  set_instruction_src0(&$$, &ip_src, NULL);
 		  set_instruction_src1(&$$, &$4, NULL);
@@ -1407,7 +1407,7 @@ syncinstruction: predicate WAIT notifyreg
 		  memset(&$$, 0, sizeof($$));
 		  set_instruction_opcode(&$$, $2);
 		  set_direct_dst_operand(&notify_dst, &$3, BRW_REGISTER_TYPE_D);
-		  notify_dst.width = ffs(1) - 1;
+		  notify_dst.width = BRW_WIDTH_1;
 		  set_instruction_dest(&$$, &notify_dst);
 		  set_direct_src_operand(&notify_src, &$3, BRW_REGISTER_TYPE_D);
 		  set_instruction_src0(&$$, &notify_src, NULL);
@@ -2473,7 +2473,7 @@ region:		/* empty */
 		  /* XXX is this default value correct?*/
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.vert_stride = ffs(0);
-		  $$.width = ffs(1) - 1;
+		  $$.width = BRW_WIDTH_1;
 		  $$.horiz_stride = ffs(0);
 		  $$.is_default = 1;
 		}
@@ -2482,7 +2482,7 @@ region:		/* empty */
 		  /* XXX is this default value correct for accreg?*/
 		  memset (&$$, '\0', sizeof ($$));
 		  $$.vert_stride = ffs($2);
-		  $$.width = ffs(1) - 1;
+		  $$.width = BRW_WIDTH_1;
 		  $$.horiz_stride = ffs(0);
 		}
 		|LANGLE exp COMMA exp COMMA exp RANGLE
@@ -2783,7 +2783,7 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
     if (src->reg.file == BRW_ARCHITECTURE_REGISTER_FILE && 
         ((src->reg.nr & 0xF0) == BRW_ARF_ADDRESS)) {
         src->reg.vstride = ffs(0);
-        src->reg.width = ffs(1) - 1;
+        src->reg.width = BRW_WIDTH_1;
         src->reg.hstride = ffs(0);
     } else if (src->reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
                ((src->reg.nr & 0xF0) == BRW_ARF_ACCUMULATOR)) {
@@ -2805,7 +2805,7 @@ static void reset_instruction_src_region(struct brw_instruction *instr,
                (src->reg.nr == BRW_ARF_NULL) &&
                (instr->header.opcode == BRW_OPCODE_SEND)) {
         src->reg.vstride = ffs(8);
-        src->reg.width = ffs(8) - 1;
+        src->reg.width = BRW_WIDTH_8;
         src->reg.hstride = ffs(1);
     } else {
 
-- 
1.7.7.5

* [PATCH 86/90] assembler: Remove trailing white space
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (84 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 85/90] assembler: Use defines for width Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 87/90] assembler: Don't use GL types Damien Lespiau
                   ` (4 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_eu.c       |   16 +++++++-------
 assembler/brw_eu_debug.c |   20 +++++++++---------
 assembler/brw_eu_emit.c  |   48 +++++++++++++++++++++++-----------------------
 assembler/brw_eu_util.c  |   18 ++++++++--------
 assembler/disasm-main.c  |    2 +-
 assembler/gen4asm.h      |    4 +-
 assembler/main.c         |    2 +-
 7 files changed, 55 insertions(+), 55 deletions(-)

diff --git a/assembler/brw_eu.c b/assembler/brw_eu.c
index 69f088d..a9afc82 100644
--- a/assembler/brw_eu.c
+++ b/assembler/brw_eu.c
@@ -2,7 +2,7 @@
  Copyright (C) Intel Corp.  2006.  All Rights Reserved.
  Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
  develop this 3D driver.
- 
+
  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
@@ -10,11 +10,11 @@
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:
- 
+
  The above copyright notice and this permission notice (including the
  next paragraph) shall be included in all copies or substantial
  portions of the Software.
- 
+
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -22,13 +22,13 @@
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- 
+
  **********************************************************************/
  /*
   * Authors:
   *   Keith Whitwell <keith@tungstengraphics.com>
   */
-  
+
 
 #include <string.h>
 
@@ -78,7 +78,7 @@ void brw_set_predicate_control_flag_value( struct brw_compile *p, GLuint value )
       }
 
       p->current->header.predicate_control = BRW_PREDICATE_NORMAL;
-   }   
+   }
 }
 
 void brw_set_predicate_control( struct brw_compile *p, GLuint pc )
@@ -165,7 +165,7 @@ void brw_push_insn_state( struct brw_compile *p )
    assert(p->current != &p->stack[BRW_EU_MAX_INSN_STACK-1]);
    memcpy(p->current+1, p->current, sizeof(struct brw_instruction));
    p->compressed_stack[p->current - p->stack] = p->compressed;
-   p->current++;   
+   p->current++;
 }
 
 void brw_pop_insn_state( struct brw_compile *p )
@@ -203,7 +203,7 @@ brw_init_compile(struct brw_context *brw, struct brw_compile *p, void *mem_ctx)
    brw_set_mask_control(p, BRW_MASK_ENABLE); /* what does this do? */
    brw_set_saturate(p, 0);
    brw_set_compression_control(p, BRW_COMPRESSION_NONE);
-   brw_set_predicate_control_flag_value(p, 0xff); 
+   brw_set_predicate_control_flag_value(p, 0xff);
 
    /* Set up control flow stack */
    p->if_stack_depth = 0;
diff --git a/assembler/brw_eu_debug.c b/assembler/brw_eu_debug.c
index 1e4f933..b446007 100644
--- a/assembler/brw_eu_debug.c
+++ b/assembler/brw_eu_debug.c
@@ -2,7 +2,7 @@
  Copyright (C) Intel Corp.  2006.  All Rights Reserved.
  Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
  develop this 3D driver.
- 
+
  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
@@ -10,11 +10,11 @@
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:
- 
+
  The above copyright notice and this permission notice (including the
  next paragraph) shall be included in all copies or substantial
  portions of the Software.
- 
+
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -22,13 +22,13 @@
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- 
+
  **********************************************************************/
  /*
   * Authors:
   *   Keith Whitwell <keith@tungstengraphics.com>
   */
-    
+
 #include "brw_eu.h"
 
 void brw_print_reg( struct brw_reg hwreg )
@@ -51,10 +51,10 @@ void brw_print_reg( struct brw_reg hwreg )
       "f"
    };
 
-   printf("%s%s", 
+   printf("%s%s",
 	  hwreg.abs ? "abs/" : "",
 	  hwreg.negate ? "-" : "");
-     
+
    if (hwreg.file == BRW_GENERAL_REGISTER_FILE &&
        hwreg.nr % 2 == 0 &&
        hwreg.subnr == 0 &&
@@ -69,7 +69,7 @@ void brw_print_reg( struct brw_reg hwreg )
 	    hwreg.vstride == BRW_VERTICAL_STRIDE_0 &&
 	    hwreg.width == BRW_WIDTH_1 &&
 	    hwreg.hstride == BRW_HORIZONTAL_STRIDE_0 &&
-	    hwreg.type == BRW_REGISTER_TYPE_F) {      
+	    hwreg.type == BRW_REGISTER_TYPE_F) {
       /* "scalar" register */
       printf("scl%d.%d", hwreg.nr, hwreg.subnr / 4);
    }
@@ -77,13 +77,13 @@ void brw_print_reg( struct brw_reg hwreg )
       printf("imm %f", hwreg.dw1.f);
    }
    else {
-      printf("%s%d.%d<%d;%d,%d>:%s", 
+      printf("%s%d.%d<%d;%d,%d>:%s",
 		   file[hwreg.file],
 		   hwreg.nr,
 		   hwreg.subnr / type_sz(hwreg.type),
 		   hwreg.vstride ? (1<<(hwreg.vstride-1)) : 0,
 		   1<<hwreg.width,
-		   hwreg.hstride ? (1<<(hwreg.hstride-1)) : 0,		
+		   hwreg.hstride ? (1<<(hwreg.hstride-1)) : 0,
 		   type[hwreg.type]);
    }
 }
diff --git a/assembler/brw_eu_emit.c b/assembler/brw_eu_emit.c
index ae570c7..a1e96d1 100644
--- a/assembler/brw_eu_emit.c
+++ b/assembler/brw_eu_emit.c
@@ -2,7 +2,7 @@
  Copyright (C) Intel Corp.  2006.  All Rights Reserved.
  Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
  develop this 3D driver.
- 
+
  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
@@ -10,11 +10,11 @@
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:
- 
+
  The above copyright notice and this permission notice (including the
  next paragraph) shall be included in all copies or substantial
  portions of the Software.
- 
+
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -22,13 +22,13 @@
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- 
+
  **********************************************************************/
  /*
   * Authors:
   *   Keith Whitwell <keith@tungstengraphics.com>
   */
-     
+
 #include <string.h>
 
 #include "brw_context.h"
@@ -115,7 +115,7 @@ brw_set_dest(struct brw_compile *p, struct brw_instruction *insn,
    insn->bits1.da1.dest_reg_type = dest.type;
    insn->bits1.da1.dest_address_mode = dest.address_mode;
 
-   if (dest.address_mode == BRW_ADDRESS_DIRECT) {   
+   if (dest.address_mode == BRW_ADDRESS_DIRECT) {
       insn->bits1.da1.dest_reg_nr = dest.nr;
 
       if (insn->header.access_mode == BRW_ALIGN_1) {
@@ -276,7 +276,7 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
 
    if (reg.file == BRW_IMMEDIATE_VALUE) {
       insn->bits3.ud = reg.dw1.ud;
-   
+
       /* Required to set some fields in src1 as well:
        */
 
@@ -288,7 +288,7 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
       insn->bits1.da1.src1_reg_type = reg.type;
        */
    }
-   else 
+   else
    {
       if (reg.address_mode == BRW_ADDRESS_DIRECT) {
 	 if (insn->header.access_mode == BRW_ALIGN_1) {
@@ -304,7 +304,7 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
 	 insn->bits2.ia1.src0_subreg_nr = reg.subnr;
 
 	 if (insn->header.access_mode == BRW_ALIGN_1) {
-	    insn->bits2.ia1.src0_indirect_offset = reg.dw1.bits.indirect_offset; 
+	    insn->bits2.ia1.src0_indirect_offset = reg.dw1.bits.indirect_offset;
 	 }
 	 else {
 	    insn->bits2.ia16.src0_subreg_nr = reg.dw1.bits.indirect_offset;
@@ -316,7 +316,7 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
 	 /* FIXME: While this is correct, if the assembler uses that code path
 	  * the opcode generated are different and thus needs a validation
 	  * pass.
-	 if (reg.width == BRW_WIDTH_1 && 
+	 if (reg.width == BRW_WIDTH_1 &&
 	     insn->header.execution_size == BRW_EXECUTE_1) {
 	    insn->bits2.da1.src0_horiz_stride = BRW_HORIZONTAL_STRIDE_0;
 	    insn->bits2.da1.src0_width = BRW_WIDTH_1;
@@ -404,7 +404,7 @@ void brw_set_src1(struct brw_compile *p,
 	 /* FIXME: While this is correct, if the assembler uses that code path
 	  * the opcode generated are different and thus needs a validation
 	  * pass.
-	 if (reg.width == BRW_WIDTH_1 && 
+	 if (reg.width == BRW_WIDTH_1 &&
 	     insn->header.execution_size == BRW_EXECUTE_1) {
 	    insn->bits3.da1.src1_horiz_stride = BRW_HORIZONTAL_STRIDE_0;
 	    insn->bits3.da1.src1_width = BRW_WIDTH_1;
@@ -766,7 +766,7 @@ brw_next_insn(struct brw_compile *p, GLuint opcode)
    insn = &p->store[p->nr_insn++];
    memcpy(insn, p->current, sizeof(*insn));
 
-   /* Reset this one-shot flag: 
+   /* Reset this one-shot flag:
     */
 
    if (p->current->header.destreg__conditionalmod) {
@@ -795,7 +795,7 @@ static struct brw_instruction *brw_alu2(struct brw_compile *p,
 					struct brw_reg src0,
 					struct brw_reg src1 )
 {
-   struct brw_instruction *insn = next_insn(p, opcode);   
+   struct brw_instruction *insn = next_insn(p, opcode);
    brw_set_dest(p, insn, dest);
    brw_set_src0(p, insn, src0);
    brw_set_src1(p, insn, src1);
@@ -1084,7 +1084,7 @@ struct brw_instruction *brw_MUL(struct brw_compile *p,
 
 void brw_NOP(struct brw_compile *p)
 {
-   struct brw_instruction *insn = next_insn(p, BRW_OPCODE_NOP);   
+   struct brw_instruction *insn = next_insn(p, BRW_OPCODE_NOP);
    brw_set_dest(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
    brw_set_src0(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
    brw_set_src1(p, insn, brw_imm_ud(0x0));
@@ -1098,7 +1098,7 @@ void brw_NOP(struct brw_compile *p)
  * Comparisons, if/else/endif
  */
 
-struct brw_instruction *brw_JMPI(struct brw_compile *p, 
+struct brw_instruction *brw_JMPI(struct brw_compile *p,
                                  struct brw_reg dest,
                                  struct brw_reg src0,
                                  struct brw_reg src1)
@@ -1736,7 +1736,7 @@ void brw_CMP(struct brw_compile *p,
 
    /* Make it so that future instructions will use the computed flag
     * value until brw_set_predicate_control_flag_value() is called
-    * again.  
+    * again.
     */
    if (dest.file == BRW_ARCHITECTURE_REGISTER_FILE &&
        dest.nr == 0) {
@@ -2211,7 +2211,7 @@ void brw_SAMPLE(struct brw_compile *p,
       /*printf("%s: zero writemask??\n", __FUNCTION__); */
       return;
    }
-   
+
    /* Hardware doesn't do destination dependency checking on send
     * instructions properly.  Add a workaround which generates the
     * dependency by other means.  In practice it seems like this bug
@@ -2260,11 +2260,11 @@ void brw_SAMPLE(struct brw_compile *p,
 
 	 brw_MOV(p, retype(m1, BRW_REGISTER_TYPE_UD),
 		 retype(brw_vec8_grf(0,0), BRW_REGISTER_TYPE_UD));
-  	 brw_MOV(p, get_element_ud(m1, 2), brw_imm_ud(newmask << 12)); 
+  	 brw_MOV(p, get_element_ud(m1, 2), brw_imm_ud(newmask << 12));
 
 	 brw_pop_insn_state(p);
 
-  	 src0 = retype(brw_null_reg(), BRW_REGISTER_TYPE_UW); 
+  	 src0 = retype(brw_null_reg(), BRW_REGISTER_TYPE_UW);
 	 dest = offset(dest, dst_offset);
 
 	 /* For 16-wide dispatch, masked channels are skipped in the
@@ -2278,7 +2278,7 @@ void brw_SAMPLE(struct brw_compile *p,
 
    {
       struct brw_instruction *insn;
-   
+
       gen6_resolve_implied_move(p, &src0, msg_reg_nr);
 
       insn = next_insn(p, BRW_OPCODE_SEND);
@@ -2293,7 +2293,7 @@ void brw_SAMPLE(struct brw_compile *p,
 			      binding_table_index,
 			      sampler,
 			      msg_type,
-			      response_length, 
+			      response_length,
 			      msg_length,
 			      header_present,
 			      simd_mode,
@@ -2363,9 +2363,9 @@ void brw_urb_WRITE(struct brw_compile *p,
 		       allocate,
 		       used,
 		       msg_length,
-		       response_length, 
-		       eot, 
-		       writes_complete, 
+		       response_length,
+		       eot,
+		       writes_complete,
 		       offset,
 		       swizzle);
 }
diff --git a/assembler/brw_eu_util.c b/assembler/brw_eu_util.c
index 2037634..e3bfbc7 100644
--- a/assembler/brw_eu_util.c
+++ b/assembler/brw_eu_util.c
@@ -2,7 +2,7 @@
  Copyright (C) Intel Corp.  2006.  All Rights Reserved.
  Intel funded Tungsten Graphics (http://www.tungstengraphics.com) to
  develop this 3D driver.
- 
+
  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
@@ -10,11 +10,11 @@
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:
- 
+
  The above copyright notice and this permission notice (including the
  next paragraph) shall be included in all copies or substantial
  portions of the Software.
- 
+
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@@ -22,29 +22,29 @@
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- 
+
  **********************************************************************/
  /*
   * Authors:
   *   Keith Whitwell <keith@tungstengraphics.com>
   */
-      
+
 
 #include "brw_context.h"
 #include "brw_defines.h"
 #include "brw_eu.h"
 
 
-void brw_math_invert( struct brw_compile *p, 
+void brw_math_invert( struct brw_compile *p,
 			     struct brw_reg dst,
 			     struct brw_reg src)
 {
-   brw_math( p, 
+   brw_math( p,
 	     dst,
-	     BRW_MATH_FUNCTION_INV, 
+	     BRW_MATH_FUNCTION_INV,
 	     0,
 	     src,
-	     BRW_MATH_PRECISION_FULL, 
+	     BRW_MATH_PRECISION_FULL,
 	     BRW_MATH_DATA_VECTOR );
 }
 
diff --git a/assembler/disasm-main.c b/assembler/disasm-main.c
index 87e6737..5bc75af 100644
--- a/assembler/disasm-main.c
+++ b/assembler/disasm-main.c
@@ -165,7 +165,7 @@ int main(int argc, char **argv)
 	    exit(1);
 	}
     }
-	    
+
     for (inst = program->first; inst; inst = inst->next)
 	brw_disasm (output, &inst->insn.gen, gen);
     exit (0);
diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 49baf9d..a708c52 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -106,7 +106,7 @@ struct options {
 
 struct region {
     int vert_stride, width, horiz_stride;
-    int is_default;        
+    int is_default;
 };
 struct regtype {
     int type;
@@ -114,7 +114,7 @@ struct regtype {
 };
 
 /**
- * This structure is the internal representation of source operands in the 
+ * This structure is the internal representation of source operands in the
  * parser.
  */
 struct src_operand {
diff --git a/assembler/main.c b/assembler/main.c
index f1d78d0..05ca337 100644
--- a/assembler/main.c
+++ b/assembler/main.c
@@ -457,7 +457,7 @@ int main(int argc, char **argv)
 		// this is a branch instruction with one offset argument
 		int offset = reloc->first_reloc_offset;
 		/* bspec: Unlike other flow control instructions, the offset used by JMPI is relative to the incremented instruction pointer rather than the IP value for the instruction itself. */
-		
+
 		int is_jmpi = inst->header.opcode == BRW_OPCODE_JMPI; // target relative to the post-incremented IP, so delta == 1 if JMPI
 		if(is_jmpi)
 		    offset --;
-- 
1.7.7.5


* [PATCH 87/90] assembler: Don't use GL types
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (85 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 86/90] assembler: Remove trailing white space Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 88/90] assembler: Group the header inclusions together Damien Lespiau
                   ` (3 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

sed -i -e 's/GLuint/unsigned/g' -e 's/GLint/int/g' \
       -e 's/GLfloat/float/g' -e 's/GLubyte/uint8_t/g' \
       -e 's/GLshort/int16_t/g' assembler/*.[ch]

Drop the GL types here; they don't bring anything to the table. For
instance, GLuint is not guaranteed to be 32 bits wide, so it does not
make much sense to use it in structures describing hardware tables and
opcodes.

Of course, some bikeshedding could be applied to use uint32_t instead. I
figured that some of the GLuints are used without size constraints, so
a sed to uint32_t did not seem the right thing to do. On top of that
initial sed, someone bothered enough could change the structures with
size constraints to actually use uint32_t.
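
As a minimal sketch of the size concern above (not part of this patch;
the struct and helper names are hypothetical and not taken from
brw_structs.h), a structure with a size constraint could use uint32_t
and check its width at compile time, whereas the C standard only
guarantees that plain unsigned is at least 16 bits wide:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical 32-bit hardware word; illustration only. */
struct hw_word_fixed {
    uint32_t bits;  /* exactly 32 bits wherever uint32_t is provided */
};

/* uint32_t has no padding bits, so this holds on any conforming target. */
static_assert(sizeof(uint32_t) * CHAR_BIT == 32,
              "uint32_t must be exactly 32 bits wide");

/* Width of plain unsigned on the current target, in bits.
 * The C standard only guarantees a minimum of 16. */
static inline unsigned unsigned_width_bits(void)
{
    return (unsigned)(sizeof(unsigned) * CHAR_BIT);
}

/* Extract one 2-bit swizzle channel (0 = x, 1 = y, 2 = z, 3 = w)
 * from a packed 8-bit swizzle field using shift and mask. */
static inline uint32_t swizzle_channel(uint32_t swizzle, unsigned channel)
{
    return (swizzle >> (2u * channel)) & 0x3u;
}
```

The swizzle helper follows the same shift-and-mask pattern as the
swz_x/swz_y/swz_z/swz_w extraction in brw_disasm.c; with a packed value
of 0xE4 it yields channels 0, 1, 2 and 3 in order.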

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_disasm.c  |   82 ++--
 assembler/brw_eu.c      |   18 +-
 assembler/brw_eu.h      |  142 ++--
 assembler/brw_eu_emit.c |  148 ++--
 assembler/brw_eu_util.c |   24 +-
 assembler/brw_structs.h | 1698 +++++++++++++++++++++++------------------------
 assembler/gen4asm.h     |    8 +-
 assembler/gram.y        |    8 +-
 8 files changed, 1058 insertions(+), 1070 deletions(-)

diff --git a/assembler/brw_disasm.c b/assembler/brw_disasm.c
index 9781f6b..de121e6 100644
--- a/assembler/brw_disasm.c
+++ b/assembler/brw_disasm.c
@@ -428,7 +428,7 @@ static int pad (FILE *f, int c)
 }
 
 static int control (FILE *file, const char *name, const char * const ctrl[],
-                    GLuint id, int *space)
+                    unsigned id, int *space)
 {
     if (!ctrl[id]) {
 	fprintf (file, "*** invalid %s value %d ",
@@ -456,7 +456,7 @@ static int print_opcode (FILE *file, int id)
     return 0;
 }
 
-static int reg (FILE *file, GLuint _reg_file, GLuint _reg_nr)
+static int reg (FILE *file, unsigned _reg_file, unsigned _reg_nr)
 {
     int	err = 0;
 
@@ -585,7 +585,7 @@ static int dest_3src (FILE *file, struct brw_instruction *inst)
 }
 
 static int src_align1_region (FILE *file,
-			      GLuint _vert_stride, GLuint _width, GLuint _horiz_stride)
+			      unsigned _vert_stride, unsigned _width, unsigned _horiz_stride)
 {
     int err = 0;
     string (file, "<");
@@ -598,9 +598,9 @@ static int src_align1_region (FILE *file,
     return err;
 }
 
-static int src_da1 (FILE *file, GLuint type, GLuint _reg_file,
-		    GLuint _vert_stride, GLuint _width, GLuint _horiz_stride,
-		    GLuint reg_num, GLuint sub_reg_num, GLuint __abs, GLuint _negate)
+static int src_da1 (FILE *file, unsigned type, unsigned _reg_file,
+		    unsigned _vert_stride, unsigned _width, unsigned _horiz_stride,
+		    unsigned reg_num, unsigned sub_reg_num, unsigned __abs, unsigned _negate)
 {
     int err = 0;
     err |= control (file, "negate", negate, _negate, NULL);
@@ -617,16 +617,16 @@ static int src_da1 (FILE *file, GLuint type, GLuint _reg_file,
 }
 
 static int src_ia1 (FILE *file,
-		    GLuint type,
-		    GLuint _reg_file,
-		    GLint _addr_imm,
-		    GLuint _addr_subreg_nr,
-		    GLuint _negate,
-		    GLuint __abs,
-		    GLuint _addr_mode,
-		    GLuint _horiz_stride,
-		    GLuint _width,
-		    GLuint _vert_stride)
+		    unsigned type,
+		    unsigned _reg_file,
+		    int _addr_imm,
+		    unsigned _addr_subreg_nr,
+		    unsigned _negate,
+		    unsigned __abs,
+		    unsigned _addr_mode,
+		    unsigned _horiz_stride,
+		    unsigned _width,
+		    unsigned _vert_stride)
 {
     int err = 0;
     err |= control (file, "negate", negate, _negate, NULL);
@@ -644,17 +644,17 @@ static int src_ia1 (FILE *file,
 }
 
 static int src_da16 (FILE *file,
-		     GLuint _reg_type,
-		     GLuint _reg_file,
-		     GLuint _vert_stride,
-		     GLuint _reg_nr,
-		     GLuint _subreg_nr,
-		     GLuint __abs,
-		     GLuint _negate,
-		     GLuint swz_x,
-		     GLuint swz_y,
-		     GLuint swz_z,
-		     GLuint swz_w)
+		     unsigned _reg_type,
+		     unsigned _reg_file,
+		     unsigned _vert_stride,
+		     unsigned _reg_nr,
+		     unsigned _subreg_nr,
+		     unsigned __abs,
+		     unsigned _negate,
+		     unsigned swz_x,
+		     unsigned swz_y,
+		     unsigned swz_z,
+		     unsigned swz_w)
 {
     int err = 0;
     err |= control (file, "negate", negate, _negate, NULL);
@@ -703,10 +703,10 @@ static int src_da16 (FILE *file,
 static int src0_3src (FILE *file, struct brw_instruction *inst)
 {
     int err = 0;
-    GLuint swz_x = (inst->bits2.da3src.src0_swizzle >> 0) & 0x3;
-    GLuint swz_y = (inst->bits2.da3src.src0_swizzle >> 2) & 0x3;
-    GLuint swz_z = (inst->bits2.da3src.src0_swizzle >> 4) & 0x3;
-    GLuint swz_w = (inst->bits2.da3src.src0_swizzle >> 6) & 0x3;
+    unsigned swz_x = (inst->bits2.da3src.src0_swizzle >> 0) & 0x3;
+    unsigned swz_y = (inst->bits2.da3src.src0_swizzle >> 2) & 0x3;
+    unsigned swz_z = (inst->bits2.da3src.src0_swizzle >> 4) & 0x3;
+    unsigned swz_w = (inst->bits2.da3src.src0_swizzle >> 6) & 0x3;
 
     err |= control (file, "negate", negate, inst->bits1.da3src.src0_negate, NULL);
     err |= control (file, "abs", _abs, inst->bits1.da3src.src0_abs, NULL);
@@ -751,11 +751,11 @@ static int src0_3src (FILE *file, struct brw_instruction *inst)
 static int src1_3src (FILE *file, struct brw_instruction *inst)
 {
     int err = 0;
-    GLuint swz_x = (inst->bits2.da3src.src1_swizzle >> 0) & 0x3;
-    GLuint swz_y = (inst->bits2.da3src.src1_swizzle >> 2) & 0x3;
-    GLuint swz_z = (inst->bits2.da3src.src1_swizzle >> 4) & 0x3;
-    GLuint swz_w = (inst->bits2.da3src.src1_swizzle >> 6) & 0x3;
-    GLuint src1_subreg_nr = (inst->bits2.da3src.src1_subreg_nr_low |
+    unsigned swz_x = (inst->bits2.da3src.src1_swizzle >> 0) & 0x3;
+    unsigned swz_y = (inst->bits2.da3src.src1_swizzle >> 2) & 0x3;
+    unsigned swz_z = (inst->bits2.da3src.src1_swizzle >> 4) & 0x3;
+    unsigned swz_w = (inst->bits2.da3src.src1_swizzle >> 6) & 0x3;
+    unsigned src1_subreg_nr = (inst->bits2.da3src.src1_subreg_nr_low |
 			     (inst->bits3.da3src.src1_subreg_nr_high << 2));
 
     err |= control (file, "negate", negate, inst->bits1.da3src.src1_negate,
@@ -804,10 +804,10 @@ static int src1_3src (FILE *file, struct brw_instruction *inst)
 static int src2_3src (FILE *file, struct brw_instruction *inst)
 {
     int err = 0;
-    GLuint swz_x = (inst->bits3.da3src.src2_swizzle >> 0) & 0x3;
-    GLuint swz_y = (inst->bits3.da3src.src2_swizzle >> 2) & 0x3;
-    GLuint swz_z = (inst->bits3.da3src.src2_swizzle >> 4) & 0x3;
-    GLuint swz_w = (inst->bits3.da3src.src2_swizzle >> 6) & 0x3;
+    unsigned swz_x = (inst->bits3.da3src.src2_swizzle >> 0) & 0x3;
+    unsigned swz_y = (inst->bits3.da3src.src2_swizzle >> 2) & 0x3;
+    unsigned swz_z = (inst->bits3.da3src.src2_swizzle >> 4) & 0x3;
+    unsigned swz_w = (inst->bits3.da3src.src2_swizzle >> 6) & 0x3;
 
     err |= control (file, "negate", negate, inst->bits1.da3src.src2_negate,
 		    NULL);
@@ -851,7 +851,7 @@ static int src2_3src (FILE *file, struct brw_instruction *inst)
     return err;
 }
 
-static int imm (FILE *file, GLuint type, struct brw_instruction *inst) {
+static int imm (FILE *file, unsigned type, struct brw_instruction *inst) {
     switch (type) {
     case BRW_REGISTER_TYPE_UD:
 	format (file, "0x%08xUD", inst->bits3.ud);
diff --git a/assembler/brw_eu.c b/assembler/brw_eu.c
index a9afc82..d874b79 100644
--- a/assembler/brw_eu.c
+++ b/assembler/brw_eu.c
@@ -65,7 +65,7 @@ brw_swap_cmod(uint32_t cmod)
 /* How does predicate control work when execution_size != 8?  Do I
  * need to test/set for 0xffff when execution_size is 16?
  */
-void brw_set_predicate_control_flag_value( struct brw_compile *p, GLuint value )
+void brw_set_predicate_control_flag_value( struct brw_compile *p, unsigned value )
 {
    p->current->header.predicate_control = BRW_PREDICATE_NONE;
 
@@ -81,7 +81,7 @@ void brw_set_predicate_control_flag_value( struct brw_compile *p, GLuint value )
    }
 }
 
-void brw_set_predicate_control( struct brw_compile *p, GLuint pc )
+void brw_set_predicate_control( struct brw_compile *p, unsigned pc )
 {
    p->current->header.predicate_control = pc;
 }
@@ -91,7 +91,7 @@ void brw_set_predicate_inverse(struct brw_compile *p, bool predicate_inverse)
    p->current->header.predicate_inverse = predicate_inverse;
 }
 
-void brw_set_conditionalmod( struct brw_compile *p, GLuint conditional )
+void brw_set_conditionalmod( struct brw_compile *p, unsigned conditional )
 {
    p->current->header.destreg__conditionalmod = conditional;
 }
@@ -102,7 +102,7 @@ void brw_set_flag_reg(struct brw_compile *p, int reg, int subreg)
    p->current->bits2.da1.flag_subreg_nr = subreg;
 }
 
-void brw_set_access_mode( struct brw_compile *p, GLuint access_mode )
+void brw_set_access_mode( struct brw_compile *p, unsigned access_mode )
 {
    p->current->header.access_mode = access_mode;
 }
@@ -144,7 +144,7 @@ brw_set_compression_control(struct brw_compile *p,
    }
 }
 
-void brw_set_mask_control( struct brw_compile *p, GLuint value )
+void brw_set_mask_control( struct brw_compile *p, unsigned value )
 {
    p->current->header.mask_control = value;
 }
@@ -154,7 +154,7 @@ void brw_set_saturate( struct brw_compile *p, bool enable )
    p->current->header.saturate = enable;
 }
 
-void brw_set_acc_write_control(struct brw_compile *p, GLuint value)
+void brw_set_acc_write_control(struct brw_compile *p, unsigned value)
 {
    if (p->brw->intel.gen >= 6)
       p->current->header.acc_wr_control = value;
@@ -219,13 +219,13 @@ brw_init_compile(struct brw_context *brw, struct brw_compile *p, void *mem_ctx)
 }
 
 
-const GLuint *brw_get_program( struct brw_compile *p,
-			       GLuint *sz )
+const unsigned *brw_get_program( struct brw_compile *p,
+			       unsigned *sz )
 {
    brw_compact_instructions(p);
 
    *sz = p->next_insn_offset;
-   return (const GLuint *)p->store;
+   return (const unsigned *)p->store;
 }
 
 void
diff --git a/assembler/brw_eu.h b/assembler/brw_eu.h
index 83c82d1..427db37 100644
--- a/assembler/brw_eu.h
+++ b/assembler/brw_eu.h
@@ -49,7 +49,7 @@ extern "C" {
 struct brw_compile {
    struct brw_instruction *store;
    int store_size;
-   GLuint nr_insn;
+   unsigned nr_insn;
    unsigned int next_insn_offset;
 
    void *mem_ctx;
@@ -60,7 +60,7 @@ struct brw_compile {
    bool compressed_stack[BRW_EU_MAX_INSN_STACK];
    struct brw_instruction *current;
 
-   GLuint flag_value;
+   unsigned flag_value;
    bool single_program_flow;
    bool compressed;
    struct brw_context *brw;
@@ -98,23 +98,23 @@ static inline struct brw_instruction *current_insn( struct brw_compile *p)
 
 void brw_pop_insn_state( struct brw_compile *p );
 void brw_push_insn_state( struct brw_compile *p );
-void brw_set_mask_control( struct brw_compile *p, GLuint value );
+void brw_set_mask_control( struct brw_compile *p, unsigned value );
 void brw_set_saturate( struct brw_compile *p, bool enable );
-void brw_set_access_mode( struct brw_compile *p, GLuint access_mode );
+void brw_set_access_mode( struct brw_compile *p, unsigned access_mode );
 void brw_set_compression_control(struct brw_compile *p, enum brw_compression c);
-void brw_set_predicate_control_flag_value( struct brw_compile *p, GLuint value );
-void brw_set_predicate_control( struct brw_compile *p, GLuint pc );
+void brw_set_predicate_control_flag_value( struct brw_compile *p, unsigned value );
+void brw_set_predicate_control( struct brw_compile *p, unsigned pc );
 void brw_set_predicate_inverse(struct brw_compile *p, bool predicate_inverse);
-void brw_set_conditionalmod( struct brw_compile *p, GLuint conditional );
+void brw_set_conditionalmod( struct brw_compile *p, unsigned conditional );
 void brw_set_flag_reg(struct brw_compile *p, int reg, int subreg);
-void brw_set_acc_write_control(struct brw_compile *p, GLuint value);
+void brw_set_acc_write_control(struct brw_compile *p, unsigned value);
 
 void brw_init_compile(struct brw_context *, struct brw_compile *p,
 		      void *mem_ctx);
 void brw_dump_compile(struct brw_compile *p, FILE *out, int start, int end);
-const GLuint *brw_get_program( struct brw_compile *p, GLuint *sz );
+const unsigned *brw_get_program( struct brw_compile *p, unsigned *sz );
 
-struct brw_instruction *brw_next_insn(struct brw_compile *p, GLuint opcode);
+struct brw_instruction *brw_next_insn(struct brw_compile *p, unsigned opcode);
 void brw_set_dest(struct brw_compile *p, struct brw_instruction *insn,
 		  struct brw_reg dest);
 void brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
@@ -122,7 +122,7 @@ void brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
 
 void gen6_resolve_implied_move(struct brw_compile *p,
 			       struct brw_reg *src,
-			       GLuint msg_reg_nr);
+			       unsigned msg_reg_nr);
 
 /* Helpers for regular instructions:
  */
@@ -188,101 +188,101 @@ ROUND(RNDE)
  */
 void brw_set_sampler_message(struct brw_compile *p,
                              struct brw_instruction *insn,
-                             GLuint binding_table_index,
-                             GLuint sampler,
-                             GLuint msg_type,
-                             GLuint response_length,
-                             GLuint msg_length,
-                             GLuint header_present,
-                             GLuint simd_mode,
-                             GLuint return_format);
+                             unsigned binding_table_index,
+                             unsigned sampler,
+                             unsigned msg_type,
+                             unsigned response_length,
+                             unsigned msg_length,
+                             unsigned header_present,
+                             unsigned simd_mode,
+                             unsigned return_format);
 
 void brw_set_dp_read_message(struct brw_compile *p,
 			     struct brw_instruction *insn,
-			     GLuint binding_table_index,
-			     GLuint msg_control,
-			     GLuint msg_type,
-			     GLuint target_cache,
-			     GLuint msg_length,
+			     unsigned binding_table_index,
+			     unsigned msg_control,
+			     unsigned msg_type,
+			     unsigned target_cache,
+			     unsigned msg_length,
                              bool header_present,
-			     GLuint response_length);
+			     unsigned response_length);
 
 void brw_set_dp_write_message(struct brw_compile *p,
 			      struct brw_instruction *insn,
-			      GLuint binding_table_index,
-			      GLuint msg_control,
-			      GLuint msg_type,
-			      GLuint msg_length,
+			      unsigned binding_table_index,
+			      unsigned msg_control,
+			      unsigned msg_type,
+			      unsigned msg_length,
 			      bool header_present,
-			      GLuint last_render_target,
-			      GLuint response_length,
-			      GLuint end_of_thread,
-			      GLuint send_commit_msg);
+			      unsigned last_render_target,
+			      unsigned response_length,
+			      unsigned end_of_thread,
+			      unsigned send_commit_msg);
 
 void brw_urb_WRITE(struct brw_compile *p,
 		   struct brw_reg dest,
-		   GLuint msg_reg_nr,
+		   unsigned msg_reg_nr,
 		   struct brw_reg src0,
 		   bool allocate,
 		   bool used,
-		   GLuint msg_length,
-		   GLuint response_length,
+		   unsigned msg_length,
+		   unsigned response_length,
 		   bool eot,
 		   bool writes_complete,
-		   GLuint offset,
-		   GLuint swizzle);
+		   unsigned offset,
+		   unsigned swizzle);
 
 void brw_ff_sync(struct brw_compile *p,
 		   struct brw_reg dest,
-		   GLuint msg_reg_nr,
+		   unsigned msg_reg_nr,
 		   struct brw_reg src0,
 		   bool allocate,
-		   GLuint response_length,
+		   unsigned response_length,
 		   bool eot);
 
 void brw_svb_write(struct brw_compile *p,
                    struct brw_reg dest,
-                   GLuint msg_reg_nr,
+                   unsigned msg_reg_nr,
                    struct brw_reg src0,
-                   GLuint binding_table_index,
+                   unsigned binding_table_index,
                    bool   send_commit_msg);
 
 void brw_fb_WRITE(struct brw_compile *p,
 		  int dispatch_width,
-		   GLuint msg_reg_nr,
+		   unsigned msg_reg_nr,
 		   struct brw_reg src0,
-		   GLuint msg_control,
-		   GLuint binding_table_index,
-		   GLuint msg_length,
-		   GLuint response_length,
+		   unsigned msg_control,
+		   unsigned binding_table_index,
+		   unsigned msg_length,
+		   unsigned response_length,
 		   bool eot,
 		   bool header_present);
 
 void brw_SAMPLE(struct brw_compile *p,
 		struct brw_reg dest,
-		GLuint msg_reg_nr,
+		unsigned msg_reg_nr,
 		struct brw_reg src0,
-		GLuint binding_table_index,
-		GLuint sampler,
-		GLuint writemask,
-		GLuint msg_type,
-		GLuint response_length,
-		GLuint msg_length,
-		GLuint header_present,
-		GLuint simd_mode,
-		GLuint return_format);
+		unsigned binding_table_index,
+		unsigned sampler,
+		unsigned writemask,
+		unsigned msg_type,
+		unsigned response_length,
+		unsigned msg_length,
+		unsigned header_present,
+		unsigned simd_mode,
+		unsigned return_format);
 
 void brw_math( struct brw_compile *p,
 	       struct brw_reg dest,
-	       GLuint function,
-	       GLuint msg_reg_nr,
+	       unsigned function,
+	       unsigned msg_reg_nr,
 	       struct brw_reg src,
-	       GLuint data_type,
-	       GLuint precision );
+	       unsigned data_type,
+	       unsigned precision );
 
 void brw_math2(struct brw_compile *p,
 	       struct brw_reg dest,
-	       GLuint function,
+	       unsigned function,
 	       struct brw_reg src0,
 	       struct brw_reg src1);
 
@@ -296,12 +296,12 @@ void brw_oword_block_read_scratch(struct brw_compile *p,
 				  struct brw_reg dest,
 				  struct brw_reg mrf,
 				  int num_regs,
-				  GLuint offset);
+				  unsigned offset);
 
 void brw_oword_block_write_scratch(struct brw_compile *p,
 				   struct brw_reg mrf,
 				   int num_regs,
-				   GLuint offset);
+				   unsigned offset);
 
 void brw_shader_time_add(struct brw_compile *p,
                          int mrf,
@@ -311,7 +311,7 @@ void brw_shader_time_add(struct brw_compile *p,
  * channel.
  */
 struct brw_instruction *brw_IF(struct brw_compile *p,
-			       GLuint execute_size);
+			       unsigned execute_size);
 struct brw_instruction *gen6_IF(struct brw_compile *p, uint32_t conditional,
 				struct brw_reg src0, struct brw_reg src1);
 
@@ -321,7 +321,7 @@ void brw_ENDIF(struct brw_compile *p);
 /* DO/WHILE loops:
  */
 struct brw_instruction *brw_DO(struct brw_compile *p,
-			       GLuint execute_size);
+			       unsigned execute_size);
 
 struct brw_instruction *brw_WHILE(struct brw_compile *p);
 
@@ -344,7 +344,7 @@ void brw_WAIT(struct brw_compile *p);
  */
 void brw_CMP(struct brw_compile *p,
 	     struct brw_reg dest,
-	     GLuint conditional,
+	     unsigned conditional,
 	     struct brw_reg src0,
 	     struct brw_reg src1);
 
@@ -355,22 +355,22 @@ void brw_CMP(struct brw_compile *p,
 void brw_copy_indirect_to_indirect(struct brw_compile *p,
 				   struct brw_indirect dst_ptr,
 				   struct brw_indirect src_ptr,
-				   GLuint count);
+				   unsigned count);
 
 void brw_copy_from_indirect(struct brw_compile *p,
 			    struct brw_reg dst,
 			    struct brw_indirect ptr,
-			    GLuint count);
+			    unsigned count);
 
 void brw_copy4(struct brw_compile *p,
 	       struct brw_reg dst,
 	       struct brw_reg src,
-	       GLuint count);
+	       unsigned count);
 
 void brw_copy8(struct brw_compile *p,
 	       struct brw_reg dst,
 	       struct brw_reg src,
-	       GLuint count);
+	       unsigned count);
 
 void brw_math_invert( struct brw_compile *p,
 		      struct brw_reg dst,
diff --git a/assembler/brw_eu_emit.c b/assembler/brw_eu_emit.c
index a1e96d1..23f0da5 100644
--- a/assembler/brw_eu_emit.c
+++ b/assembler/brw_eu_emit.c
@@ -62,7 +62,7 @@ static void guess_execution_size(struct brw_compile *p,
 void
 gen6_resolve_implied_move(struct brw_compile *p,
 			  struct brw_reg *src,
-			  GLuint msg_reg_nr)
+			  unsigned msg_reg_nr)
 {
    struct intel_context *intel = &p->brw->intel;
    if (intel->gen < 6)
@@ -478,10 +478,10 @@ brw_set_message_descriptor(struct brw_compile *p,
 
 static void brw_set_math_message( struct brw_compile *p,
 				  struct brw_instruction *insn,
-				  GLuint function,
-				  GLuint integer_type,
+				  unsigned function,
+				  unsigned integer_type,
 				  bool low_precision,
-				  GLuint dataType )
+				  unsigned dataType )
 {
    struct brw_context *brw = p->brw;
    struct intel_context *intel = &brw->intel;
@@ -536,7 +536,7 @@ static void brw_set_math_message( struct brw_compile *p,
 static void brw_set_ff_sync_message(struct brw_compile *p,
 				    struct brw_instruction *insn,
 				    bool allocate,
-				    GLuint response_length,
+				    unsigned response_length,
 				    bool end_of_thread)
 {
    brw_set_message_descriptor(p, insn, BRW_SFID_URB,
@@ -553,12 +553,12 @@ static void brw_set_urb_message( struct brw_compile *p,
 				 struct brw_instruction *insn,
 				 bool allocate,
 				 bool used,
-				 GLuint msg_length,
-				 GLuint response_length,
+				 unsigned msg_length,
+				 unsigned response_length,
 				 bool end_of_thread,
 				 bool complete,
-				 GLuint offset,
-				 GLuint swizzle_control )
+				 unsigned offset,
+				 unsigned swizzle_control )
 {
    struct brw_context *brw = p->brw;
    struct intel_context *intel = &brw->intel;
@@ -593,15 +593,15 @@ static void brw_set_urb_message( struct brw_compile *p,
 void
 brw_set_dp_write_message(struct brw_compile *p,
 			 struct brw_instruction *insn,
-			 GLuint binding_table_index,
-			 GLuint msg_control,
-			 GLuint msg_type,
-			 GLuint msg_length,
+			 unsigned binding_table_index,
+			 unsigned msg_control,
+			 unsigned msg_type,
+			 unsigned msg_length,
 			 bool header_present,
-			 GLuint last_render_target,
-			 GLuint response_length,
-			 GLuint end_of_thread,
-			 GLuint send_commit_msg)
+			 unsigned last_render_target,
+			 unsigned response_length,
+			 unsigned end_of_thread,
+			 unsigned send_commit_msg)
 {
    struct brw_context *brw = p->brw;
    struct intel_context *intel = &brw->intel;
@@ -652,13 +652,13 @@ brw_set_dp_write_message(struct brw_compile *p,
 void
 brw_set_dp_read_message(struct brw_compile *p,
 			struct brw_instruction *insn,
-			GLuint binding_table_index,
-			GLuint msg_control,
-			GLuint msg_type,
-			GLuint target_cache,
-			GLuint msg_length,
+			unsigned binding_table_index,
+			unsigned msg_control,
+			unsigned msg_type,
+			unsigned target_cache,
+			unsigned msg_length,
                         bool header_present,
-			GLuint response_length)
+			unsigned response_length)
 {
    struct brw_context *brw = p->brw;
    struct intel_context *intel = &brw->intel;
@@ -708,14 +708,14 @@ brw_set_dp_read_message(struct brw_compile *p,
 void
 brw_set_sampler_message(struct brw_compile *p,
                         struct brw_instruction *insn,
-                        GLuint binding_table_index,
-                        GLuint sampler,
-                        GLuint msg_type,
-                        GLuint response_length,
-                        GLuint msg_length,
-                        GLuint header_present,
-                        GLuint simd_mode,
-                        GLuint return_format)
+                        unsigned binding_table_index,
+                        unsigned sampler,
+                        unsigned msg_type,
+                        unsigned response_length,
+                        unsigned msg_length,
+                        unsigned header_present,
+                        unsigned simd_mode,
+                        unsigned return_format)
 {
    struct brw_context *brw = p->brw;
    struct intel_context *intel = &brw->intel;
@@ -748,7 +748,7 @@ brw_set_sampler_message(struct brw_compile *p,
 
 #define next_insn brw_next_insn
 struct brw_instruction *
-brw_next_insn(struct brw_compile *p, GLuint opcode)
+brw_next_insn(struct brw_compile *p, unsigned opcode)
 {
    struct brw_instruction *insn;
 
@@ -779,7 +779,7 @@ brw_next_insn(struct brw_compile *p, GLuint opcode)
 }
 
 static struct brw_instruction *brw_alu1( struct brw_compile *p,
-					 GLuint opcode,
+					 unsigned opcode,
 					 struct brw_reg dest,
 					 struct brw_reg src )
 {
@@ -790,7 +790,7 @@ static struct brw_instruction *brw_alu1( struct brw_compile *p,
 }
 
 static struct brw_instruction *brw_alu2(struct brw_compile *p,
-					GLuint opcode,
+					unsigned opcode,
 					struct brw_reg dest,
 					struct brw_reg src0,
 					struct brw_reg src1 )
@@ -902,7 +902,7 @@ brw_set_3src_src2(struct brw_compile *p,
 }
 
 static struct brw_instruction *brw_alu3(struct brw_compile *p,
-					GLuint opcode,
+					unsigned opcode,
 					struct brw_reg dest,
 					struct brw_reg src0,
 					struct brw_reg src1,
@@ -1170,7 +1170,7 @@ get_inner_do_insn(struct brw_compile *p)
  * popped off.  If the stack is now empty, normal execution resumes.
  */
 struct brw_instruction *
-brw_IF(struct brw_compile *p, GLuint execute_size)
+brw_IF(struct brw_compile *p, unsigned execute_size)
 {
    struct intel_context *intel = &p->brw->intel;
    struct brw_instruction *insn;
@@ -1572,7 +1572,7 @@ struct brw_instruction *gen6_HALT(struct brw_compile *p)
  * For gen6, there's no more mask stack, so no need for DO.  WHILE
  * just points back to the first instruction of the loop.
  */
-struct brw_instruction *brw_DO(struct brw_compile *p, GLuint execute_size)
+struct brw_instruction *brw_DO(struct brw_compile *p, unsigned execute_size)
 {
    struct intel_context *intel = &p->brw->intel;
 
@@ -1634,7 +1634,7 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
 {
    struct intel_context *intel = &p->brw->intel;
    struct brw_instruction *insn, *do_insn;
-   GLuint br = 1;
+   unsigned br = 1;
 
    if (intel->gen >= 5)
       br = 2;
@@ -1701,7 +1701,7 @@ void brw_land_fwd_jump(struct brw_compile *p, int jmp_insn_idx)
 {
    struct intel_context *intel = &p->brw->intel;
    struct brw_instruction *jmp_insn = &p->store[jmp_insn_idx];
-   GLuint jmpi = 1;
+   unsigned jmpi = 1;
 
    if (intel->gen >= 5)
       jmpi = 2;
@@ -1720,7 +1720,7 @@ void brw_land_fwd_jump(struct brw_compile *p, int jmp_insn_idx)
  */
 void brw_CMP(struct brw_compile *p,
 	     struct brw_reg dest,
-	     GLuint conditional,
+	     unsigned conditional,
 	     struct brw_reg src0,
 	     struct brw_reg src1)
 {
@@ -1769,11 +1769,11 @@ void brw_WAIT (struct brw_compile *p)
  */
 void brw_math( struct brw_compile *p,
 	       struct brw_reg dest,
-	       GLuint function,
-	       GLuint msg_reg_nr,
+	       unsigned function,
+	       unsigned msg_reg_nr,
 	       struct brw_reg src,
-	       GLuint data_type,
-	       GLuint precision )
+	       unsigned data_type,
+	       unsigned precision )
 {
    struct intel_context *intel = &p->brw->intel;
 
@@ -1833,7 +1833,7 @@ void brw_math( struct brw_compile *p,
  */
 void brw_math2(struct brw_compile *p,
 	       struct brw_reg dest,
-	       GLuint function,
+	       unsigned function,
 	       struct brw_reg src0,
 	       struct brw_reg src1)
 {
@@ -1893,7 +1893,7 @@ void brw_math2(struct brw_compile *p,
 void brw_oword_block_write_scratch(struct brw_compile *p,
 				   struct brw_reg mrf,
 				   int num_regs,
-				   GLuint offset)
+				   unsigned offset)
 {
    struct intel_context *intel = &p->brw->intel;
    uint32_t msg_control, msg_type;
@@ -2005,7 +2005,7 @@ brw_oword_block_read_scratch(struct brw_compile *p,
 			     struct brw_reg dest,
 			     struct brw_reg mrf,
 			     int num_regs,
-			     GLuint offset)
+			     unsigned offset)
 {
    struct intel_context *intel = &p->brw->intel;
    uint32_t msg_control;
@@ -2130,18 +2130,18 @@ void brw_oword_block_read(struct brw_compile *p,
 
 void brw_fb_WRITE(struct brw_compile *p,
 		  int dispatch_width,
-                  GLuint msg_reg_nr,
+                  unsigned msg_reg_nr,
                   struct brw_reg src0,
-                  GLuint msg_control,
-                  GLuint binding_table_index,
-                  GLuint msg_length,
-                  GLuint response_length,
+                  unsigned msg_control,
+                  unsigned binding_table_index,
+                  unsigned msg_length,
+                  unsigned response_length,
                   bool eot,
                   bool header_present)
 {
    struct intel_context *intel = &p->brw->intel;
    struct brw_instruction *insn;
-   GLuint msg_type;
+   unsigned msg_type;
    struct brw_reg dest;
 
    if (dispatch_width == 16)
@@ -2192,17 +2192,17 @@ void brw_fb_WRITE(struct brw_compile *p,
  */
 void brw_SAMPLE(struct brw_compile *p,
 		struct brw_reg dest,
-		GLuint msg_reg_nr,
+		unsigned msg_reg_nr,
 		struct brw_reg src0,
-		GLuint binding_table_index,
-		GLuint sampler,
-		GLuint writemask,
-		GLuint msg_type,
-		GLuint response_length,
-		GLuint msg_length,
-		GLuint header_present,
-		GLuint simd_mode,
-		GLuint return_format)
+		unsigned binding_table_index,
+		unsigned sampler,
+		unsigned writemask,
+		unsigned msg_type,
+		unsigned response_length,
+		unsigned msg_length,
+		unsigned header_present,
+		unsigned simd_mode,
+		unsigned return_format)
 {
    struct intel_context *intel = &p->brw->intel;
    bool need_stall = 0;
@@ -2223,8 +2223,8 @@ void brw_SAMPLE(struct brw_compile *p,
     * needed.
     */
    if (writemask != BRW_WRITEMASK_XYZW) {
-      GLuint dst_offset = 0;
-      GLuint i, newmask = 0, len = 0;
+      unsigned dst_offset = 0;
+      unsigned i, newmask = 0, len = 0;
 
       for (i = 0; i < 4; i++) {
 	 if (writemask & (1<<i))
@@ -2320,16 +2320,16 @@ void brw_SAMPLE(struct brw_compile *p,
  */
 void brw_urb_WRITE(struct brw_compile *p,
 		   struct brw_reg dest,
-		   GLuint msg_reg_nr,
+		   unsigned msg_reg_nr,
 		   struct brw_reg src0,
 		   bool allocate,
 		   bool used,
-		   GLuint msg_length,
-		   GLuint response_length,
+		   unsigned msg_length,
+		   unsigned response_length,
 		   bool eot,
 		   bool writes_complete,
-		   GLuint offset,
-		   GLuint swizzle)
+		   unsigned offset,
+		   unsigned swizzle)
 {
    struct intel_context *intel = &p->brw->intel;
    struct brw_instruction *insn;
@@ -2509,10 +2509,10 @@ brw_set_uip_jip(struct brw_compile *p)
 
 void brw_ff_sync(struct brw_compile *p,
 		   struct brw_reg dest,
-		   GLuint msg_reg_nr,
+		   unsigned msg_reg_nr,
 		   struct brw_reg src0,
 		   bool allocate,
-		   GLuint response_length,
+		   unsigned response_length,
 		   bool eot)
 {
    struct intel_context *intel = &p->brw->intel;
@@ -2549,9 +2549,9 @@ void brw_ff_sync(struct brw_compile *p,
 void
 brw_svb_write(struct brw_compile *p,
               struct brw_reg dest,
-              GLuint msg_reg_nr,
+              unsigned msg_reg_nr,
               struct brw_reg src0,
-              GLuint binding_table_index,
+              unsigned binding_table_index,
               bool   send_commit_msg)
 {
    struct brw_instruction *insn;
diff --git a/assembler/brw_eu_util.c b/assembler/brw_eu_util.c
index e3bfbc7..f9126ab 100644
--- a/assembler/brw_eu_util.c
+++ b/assembler/brw_eu_util.c
@@ -53,16 +53,16 @@ void brw_math_invert( struct brw_compile *p,
 void brw_copy4(struct brw_compile *p,
 	       struct brw_reg dst,
 	       struct brw_reg src,
-	       GLuint count)
+	       unsigned count)
 {
-   GLuint i;
+   unsigned i;
 
    dst = vec4(dst);
    src = vec4(src);
 
    for (i = 0; i < count; i++)
    {
-      GLuint delta = i*32;
+      unsigned delta = i*32;
       brw_MOV(p, byte_offset(dst, delta),    byte_offset(src, delta));
       brw_MOV(p, byte_offset(dst, delta+16), byte_offset(src, delta+16));
    }
@@ -72,16 +72,16 @@ void brw_copy4(struct brw_compile *p,
 void brw_copy8(struct brw_compile *p,
 	       struct brw_reg dst,
 	       struct brw_reg src,
-	       GLuint count)
+	       unsigned count)
 {
-   GLuint i;
+   unsigned i;
 
    dst = vec8(dst);
    src = vec8(src);
 
    for (i = 0; i < count; i++)
    {
-      GLuint delta = i*32;
+      unsigned delta = i*32;
       brw_MOV(p, byte_offset(dst, delta),    byte_offset(src, delta));
    }
 }
@@ -90,13 +90,13 @@ void brw_copy8(struct brw_compile *p,
 void brw_copy_indirect_to_indirect(struct brw_compile *p,
 				   struct brw_indirect dst_ptr,
 				   struct brw_indirect src_ptr,
-				   GLuint count)
+				   unsigned count)
 {
-   GLuint i;
+   unsigned i;
 
    for (i = 0; i < count; i++)
    {
-      GLuint delta = i*32;
+      unsigned delta = i*32;
       brw_MOV(p, deref_4f(dst_ptr, delta),    deref_4f(src_ptr, delta));
       brw_MOV(p, deref_4f(dst_ptr, delta+16), deref_4f(src_ptr, delta+16));
    }
@@ -106,15 +106,15 @@ void brw_copy_indirect_to_indirect(struct brw_compile *p,
 void brw_copy_from_indirect(struct brw_compile *p,
 			    struct brw_reg dst,
 			    struct brw_indirect ptr,
-			    GLuint count)
+			    unsigned count)
 {
-   GLuint i;
+   unsigned i;
 
    dst = vec4(dst);
 
    for (i = 0; i < count; i++)
    {
-      GLuint delta = i*32;
+      unsigned delta = i*32;
       brw_MOV(p, byte_offset(dst, delta),    deref_4f(ptr, delta));
       brw_MOV(p, byte_offset(dst, delta+16), deref_4f(ptr, delta+16));
    }
diff --git a/assembler/brw_structs.h b/assembler/brw_structs.h
index 2f6aafb..8c2d2b9 100644
--- a/assembler/brw_structs.h
+++ b/assembler/brw_structs.h
@@ -35,12 +35,6 @@
 
 #include <stdint.h>
 
-typedef unsigned char GLubyte;
-typedef short GLshort;
-typedef unsigned int GLuint;
-typedef int GLint;
-typedef float GLfloat;
-
 /* These seem to be passed around as function args, so it works out
  * better to keep them as #defines:
  */
@@ -53,31 +47,31 @@ struct brw_urb_fence
 {
    struct
    {
-      GLuint length:8;
-      GLuint vs_realloc:1;
-      GLuint gs_realloc:1;
-      GLuint clp_realloc:1;
-      GLuint sf_realloc:1;
-      GLuint vfe_realloc:1;
-      GLuint cs_realloc:1;
-      GLuint pad:2;
-      GLuint opcode:16;
+      unsigned length:8;
+      unsigned vs_realloc:1;
+      unsigned gs_realloc:1;
+      unsigned clp_realloc:1;
+      unsigned sf_realloc:1;
+      unsigned vfe_realloc:1;
+      unsigned cs_realloc:1;
+      unsigned pad:2;
+      unsigned opcode:16;
    } header;
 
    struct
    {
-      GLuint vs_fence:10;
-      GLuint gs_fence:10;
-      GLuint clp_fence:10;
-      GLuint pad:2;
+      unsigned vs_fence:10;
+      unsigned gs_fence:10;
+      unsigned clp_fence:10;
+      unsigned pad:2;
    } bits0;
 
    struct
    {
-      GLuint sf_fence:10;
-      GLuint vf_fence:10;
-      GLuint cs_fence:11;
-      GLuint pad:1;
+      unsigned sf_fence:10;
+      unsigned vf_fence:10;
+      unsigned cs_fence:11;
+      unsigned pad:1;
    } bits1;
 };
 
@@ -87,48 +81,48 @@ struct brw_urb_fence
 
 struct thread0
 {
-   GLuint pad0:1;
-   GLuint grf_reg_count:3;
-   GLuint pad1:2;
-   GLuint kernel_start_pointer:26; /* Offset from GENERAL_STATE_BASE */
+   unsigned pad0:1;
+   unsigned grf_reg_count:3;
+   unsigned pad1:2;
+   unsigned kernel_start_pointer:26; /* Offset from GENERAL_STATE_BASE */
 };
 
 struct thread1
 {
-   GLuint ext_halt_exception_enable:1;
-   GLuint sw_exception_enable:1;
-   GLuint mask_stack_exception_enable:1;
-   GLuint timeout_exception_enable:1;
-   GLuint illegal_op_exception_enable:1;
-   GLuint pad0:3;
-   GLuint depth_coef_urb_read_offset:6;	/* WM only */
-   GLuint pad1:2;
-   GLuint floating_point_mode:1;
-   GLuint thread_priority:1;
-   GLuint binding_table_entry_count:8;
-   GLuint pad3:5;
-   GLuint single_program_flow:1;
+   unsigned ext_halt_exception_enable:1;
+   unsigned sw_exception_enable:1;
+   unsigned mask_stack_exception_enable:1;
+   unsigned timeout_exception_enable:1;
+   unsigned illegal_op_exception_enable:1;
+   unsigned pad0:3;
+   unsigned depth_coef_urb_read_offset:6;	/* WM only */
+   unsigned pad1:2;
+   unsigned floating_point_mode:1;
+   unsigned thread_priority:1;
+   unsigned binding_table_entry_count:8;
+   unsigned pad3:5;
+   unsigned single_program_flow:1;
 };
 
 struct thread2
 {
-   GLuint per_thread_scratch_space:4;
-   GLuint pad0:6;
-   GLuint scratch_space_base_pointer:22;
+   unsigned per_thread_scratch_space:4;
+   unsigned pad0:6;
+   unsigned scratch_space_base_pointer:22;
 };
 
 
 struct thread3
 {
-   GLuint dispatch_grf_start_reg:4;
-   GLuint urb_entry_read_offset:6;
-   GLuint pad0:1;
-   GLuint urb_entry_read_length:6;
-   GLuint pad1:1;
-   GLuint const_urb_entry_read_offset:6;
-   GLuint pad2:1;
-   GLuint const_urb_entry_read_length:6;
-   GLuint pad3:1;
+   unsigned dispatch_grf_start_reg:4;
+   unsigned urb_entry_read_offset:6;
+   unsigned pad0:1;
+   unsigned urb_entry_read_length:6;
+   unsigned pad1:1;
+   unsigned const_urb_entry_read_offset:6;
+   unsigned pad2:1;
+   unsigned const_urb_entry_read_length:6;
+   unsigned pad3:1;
 };
 
 
@@ -138,18 +132,18 @@ struct brw_clip_unit_state
    struct thread0 thread0;
    struct
    {
-      GLuint pad0:7;
-      GLuint sw_exception_enable:1;
-      GLuint pad1:3;
-      GLuint mask_stack_exception_enable:1;
-      GLuint pad2:1;
-      GLuint illegal_op_exception_enable:1;
-      GLuint pad3:2;
-      GLuint floating_point_mode:1;
-      GLuint thread_priority:1;
-      GLuint binding_table_entry_count:8;
-      GLuint pad4:5;
-      GLuint single_program_flow:1;
+      unsigned pad0:7;
+      unsigned sw_exception_enable:1;
+      unsigned pad1:3;
+      unsigned mask_stack_exception_enable:1;
+      unsigned pad2:1;
+      unsigned illegal_op_exception_enable:1;
+      unsigned pad3:2;
+      unsigned floating_point_mode:1;
+      unsigned thread_priority:1;
+      unsigned binding_table_entry_count:8;
+      unsigned pad4:5;
+      unsigned single_program_flow:1;
    } thread1;
 
    struct thread2 thread2;
@@ -157,142 +151,142 @@ struct brw_clip_unit_state
 
    struct
    {
-      GLuint pad0:9;
-      GLuint gs_output_stats:1; /* not always */
-      GLuint stats_enable:1;
-      GLuint nr_urb_entries:7;
-      GLuint pad1:1;
-      GLuint urb_entry_allocation_size:5;
-      GLuint pad2:1;
-      GLuint max_threads:5; 	/* may be less */
-      GLuint pad3:2;
+      unsigned pad0:9;
+      unsigned gs_output_stats:1; /* not always */
+      unsigned stats_enable:1;
+      unsigned nr_urb_entries:7;
+      unsigned pad1:1;
+      unsigned urb_entry_allocation_size:5;
+      unsigned pad2:1;
+      unsigned max_threads:5; 	/* may be less */
+      unsigned pad3:2;
    } thread4;
 
    struct
    {
-      GLuint pad0:13;
-      GLuint clip_mode:3;
-      GLuint userclip_enable_flags:8;
-      GLuint userclip_must_clip:1;
-      GLuint negative_w_clip_test:1;
-      GLuint guard_band_enable:1;
-      GLuint viewport_z_clip_enable:1;
-      GLuint viewport_xy_clip_enable:1;
-      GLuint vertex_position_space:1;
-      GLuint api_mode:1;
-      GLuint pad2:1;
+      unsigned pad0:13;
+      unsigned clip_mode:3;
+      unsigned userclip_enable_flags:8;
+      unsigned userclip_must_clip:1;
+      unsigned negative_w_clip_test:1;
+      unsigned guard_band_enable:1;
+      unsigned viewport_z_clip_enable:1;
+      unsigned viewport_xy_clip_enable:1;
+      unsigned vertex_position_space:1;
+      unsigned api_mode:1;
+      unsigned pad2:1;
    } clip5;
 
    struct
    {
-      GLuint pad0:5;
-      GLuint clipper_viewport_state_ptr:27;
+      unsigned pad0:5;
+      unsigned clipper_viewport_state_ptr:27;
    } clip6;
 
 
-   GLfloat viewport_xmin;
-   GLfloat viewport_xmax;
-   GLfloat viewport_ymin;
-   GLfloat viewport_ymax;
+   float viewport_xmin;
+   float viewport_xmax;
+   float viewport_ymin;
+   float viewport_ymax;
 };
 
 struct gen6_blend_state
 {
    struct {
-      GLuint dest_blend_factor:5;
-      GLuint source_blend_factor:5;
-      GLuint pad3:1;
-      GLuint blend_func:3;
-      GLuint pad2:1;
-      GLuint ia_dest_blend_factor:5;
-      GLuint ia_source_blend_factor:5;
-      GLuint pad1:1;
-      GLuint ia_blend_func:3;
-      GLuint pad0:1;
-      GLuint ia_blend_enable:1;
-      GLuint blend_enable:1;
+      unsigned dest_blend_factor:5;
+      unsigned source_blend_factor:5;
+      unsigned pad3:1;
+      unsigned blend_func:3;
+      unsigned pad2:1;
+      unsigned ia_dest_blend_factor:5;
+      unsigned ia_source_blend_factor:5;
+      unsigned pad1:1;
+      unsigned ia_blend_func:3;
+      unsigned pad0:1;
+      unsigned ia_blend_enable:1;
+      unsigned blend_enable:1;
    } blend0;
 
    struct {
-      GLuint post_blend_clamp_enable:1;
-      GLuint pre_blend_clamp_enable:1;
-      GLuint clamp_range:2;
-      GLuint pad0:4;
-      GLuint x_dither_offset:2;
-      GLuint y_dither_offset:2;
-      GLuint dither_enable:1;
-      GLuint alpha_test_func:3;
-      GLuint alpha_test_enable:1;
-      GLuint pad1:1;
-      GLuint logic_op_func:4;
-      GLuint logic_op_enable:1;
-      GLuint pad2:1;
-      GLuint write_disable_b:1;
-      GLuint write_disable_g:1;
-      GLuint write_disable_r:1;
-      GLuint write_disable_a:1;
-      GLuint pad3:1;
-      GLuint alpha_to_coverage_dither:1;
-      GLuint alpha_to_one:1;
-      GLuint alpha_to_coverage:1;
+      unsigned post_blend_clamp_enable:1;
+      unsigned pre_blend_clamp_enable:1;
+      unsigned clamp_range:2;
+      unsigned pad0:4;
+      unsigned x_dither_offset:2;
+      unsigned y_dither_offset:2;
+      unsigned dither_enable:1;
+      unsigned alpha_test_func:3;
+      unsigned alpha_test_enable:1;
+      unsigned pad1:1;
+      unsigned logic_op_func:4;
+      unsigned logic_op_enable:1;
+      unsigned pad2:1;
+      unsigned write_disable_b:1;
+      unsigned write_disable_g:1;
+      unsigned write_disable_r:1;
+      unsigned write_disable_a:1;
+      unsigned pad3:1;
+      unsigned alpha_to_coverage_dither:1;
+      unsigned alpha_to_one:1;
+      unsigned alpha_to_coverage:1;
    } blend1;
 };
 
 struct gen6_color_calc_state
 {
    struct {
-      GLuint alpha_test_format:1;
-      GLuint pad0:14;
-      GLuint round_disable:1;
-      GLuint bf_stencil_ref:8;
-      GLuint stencil_ref:8;
+      unsigned alpha_test_format:1;
+      unsigned pad0:14;
+      unsigned round_disable:1;
+      unsigned bf_stencil_ref:8;
+      unsigned stencil_ref:8;
    } cc0;
 
    union {
-      GLfloat alpha_ref_f;
+      float alpha_ref_f;
       struct {
-	 GLuint ui:8;
-	 GLuint pad0:24;
+	 unsigned ui:8;
+	 unsigned pad0:24;
       } alpha_ref_fi;
    } cc1;
 
-   GLfloat constant_r;
-   GLfloat constant_g;
-   GLfloat constant_b;
-   GLfloat constant_a;
+   float constant_r;
+   float constant_g;
+   float constant_b;
+   float constant_a;
 };
 
 struct gen6_depth_stencil_state
 {
    struct {
-      GLuint pad0:3;
-      GLuint bf_stencil_pass_depth_pass_op:3;
-      GLuint bf_stencil_pass_depth_fail_op:3;
-      GLuint bf_stencil_fail_op:3;
-      GLuint bf_stencil_func:3;
-      GLuint bf_stencil_enable:1;
-      GLuint pad1:2;
-      GLuint stencil_write_enable:1;
-      GLuint stencil_pass_depth_pass_op:3;
-      GLuint stencil_pass_depth_fail_op:3;
-      GLuint stencil_fail_op:3;
-      GLuint stencil_func:3;
-      GLuint stencil_enable:1;
+      unsigned pad0:3;
+      unsigned bf_stencil_pass_depth_pass_op:3;
+      unsigned bf_stencil_pass_depth_fail_op:3;
+      unsigned bf_stencil_fail_op:3;
+      unsigned bf_stencil_func:3;
+      unsigned bf_stencil_enable:1;
+      unsigned pad1:2;
+      unsigned stencil_write_enable:1;
+      unsigned stencil_pass_depth_pass_op:3;
+      unsigned stencil_pass_depth_fail_op:3;
+      unsigned stencil_fail_op:3;
+      unsigned stencil_func:3;
+      unsigned stencil_enable:1;
    } ds0;
 
    struct {
-      GLuint bf_stencil_write_mask:8;
-      GLuint bf_stencil_test_mask:8;
-      GLuint stencil_write_mask:8;
-      GLuint stencil_test_mask:8;
+      unsigned bf_stencil_write_mask:8;
+      unsigned bf_stencil_test_mask:8;
+      unsigned stencil_write_mask:8;
+      unsigned stencil_test_mask:8;
    } ds1;
 
    struct {
-      GLuint pad0:26;
-      GLuint depth_write_enable:1;
-      GLuint depth_test_func:3;
-      GLuint pad1:1;
-      GLuint depth_test_enable:1;
+      unsigned pad0:26;
+      unsigned depth_write_enable:1;
+      unsigned depth_test_func:3;
+      unsigned pad1:1;
+      unsigned depth_test_enable:1;
    } ds2;
 };
 
@@ -300,90 +294,90 @@ struct brw_cc_unit_state
 {
    struct
    {
-      GLuint pad0:3;
-      GLuint bf_stencil_pass_depth_pass_op:3;
-      GLuint bf_stencil_pass_depth_fail_op:3;
-      GLuint bf_stencil_fail_op:3;
-      GLuint bf_stencil_func:3;
-      GLuint bf_stencil_enable:1;
-      GLuint pad1:2;
-      GLuint stencil_write_enable:1;
-      GLuint stencil_pass_depth_pass_op:3;
-      GLuint stencil_pass_depth_fail_op:3;
-      GLuint stencil_fail_op:3;
-      GLuint stencil_func:3;
-      GLuint stencil_enable:1;
+      unsigned pad0:3;
+      unsigned bf_stencil_pass_depth_pass_op:3;
+      unsigned bf_stencil_pass_depth_fail_op:3;
+      unsigned bf_stencil_fail_op:3;
+      unsigned bf_stencil_func:3;
+      unsigned bf_stencil_enable:1;
+      unsigned pad1:2;
+      unsigned stencil_write_enable:1;
+      unsigned stencil_pass_depth_pass_op:3;
+      unsigned stencil_pass_depth_fail_op:3;
+      unsigned stencil_fail_op:3;
+      unsigned stencil_func:3;
+      unsigned stencil_enable:1;
    } cc0;
 
 
    struct
    {
-      GLuint bf_stencil_ref:8;
-      GLuint stencil_write_mask:8;
-      GLuint stencil_test_mask:8;
-      GLuint stencil_ref:8;
+      unsigned bf_stencil_ref:8;
+      unsigned stencil_write_mask:8;
+      unsigned stencil_test_mask:8;
+      unsigned stencil_ref:8;
    } cc1;
 
 
    struct
    {
-      GLuint logicop_enable:1;
-      GLuint pad0:10;
-      GLuint depth_write_enable:1;
-      GLuint depth_test_function:3;
-      GLuint depth_test:1;
-      GLuint bf_stencil_write_mask:8;
-      GLuint bf_stencil_test_mask:8;
+      unsigned logicop_enable:1;
+      unsigned pad0:10;
+      unsigned depth_write_enable:1;
+      unsigned depth_test_function:3;
+      unsigned depth_test:1;
+      unsigned bf_stencil_write_mask:8;
+      unsigned bf_stencil_test_mask:8;
    } cc2;
 
 
    struct
    {
-      GLuint pad0:8;
-      GLuint alpha_test_func:3;
-      GLuint alpha_test:1;
-      GLuint blend_enable:1;
-      GLuint ia_blend_enable:1;
-      GLuint pad1:1;
-      GLuint alpha_test_format:1;
-      GLuint pad2:16;
+      unsigned pad0:8;
+      unsigned alpha_test_func:3;
+      unsigned alpha_test:1;
+      unsigned blend_enable:1;
+      unsigned ia_blend_enable:1;
+      unsigned pad1:1;
+      unsigned alpha_test_format:1;
+      unsigned pad2:16;
    } cc3;
 
    struct
    {
-      GLuint pad0:5;
-      GLuint cc_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
+      unsigned pad0:5;
+      unsigned cc_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
    } cc4;
 
    struct
    {
-      GLuint pad0:2;
-      GLuint ia_dest_blend_factor:5;
-      GLuint ia_src_blend_factor:5;
-      GLuint ia_blend_function:3;
-      GLuint statistics_enable:1;
-      GLuint logicop_func:4;
-      GLuint pad1:11;
-      GLuint dither_enable:1;
+      unsigned pad0:2;
+      unsigned ia_dest_blend_factor:5;
+      unsigned ia_src_blend_factor:5;
+      unsigned ia_blend_function:3;
+      unsigned statistics_enable:1;
+      unsigned logicop_func:4;
+      unsigned pad1:11;
+      unsigned dither_enable:1;
    } cc5;
 
    struct
    {
-      GLuint clamp_post_alpha_blend:1;
-      GLuint clamp_pre_alpha_blend:1;
-      GLuint clamp_range:2;
-      GLuint pad0:11;
-      GLuint y_dither_offset:2;
-      GLuint x_dither_offset:2;
-      GLuint dest_blend_factor:5;
-      GLuint src_blend_factor:5;
-      GLuint blend_function:3;
+      unsigned clamp_post_alpha_blend:1;
+      unsigned clamp_pre_alpha_blend:1;
+      unsigned clamp_range:2;
+      unsigned pad0:11;
+      unsigned y_dither_offset:2;
+      unsigned x_dither_offset:2;
+      unsigned dest_blend_factor:5;
+      unsigned src_blend_factor:5;
+      unsigned blend_function:3;
    } cc6;
 
    struct {
       union {
-	 GLfloat f;
-	 GLubyte ub[4];
+	 float f;
+	 uint8_t ub[4];
       } alpha_ref;
    } cc7;
 };
@@ -397,62 +391,62 @@ struct brw_sf_unit_state
 
    struct
    {
-      GLuint pad0:10;
-      GLuint stats_enable:1;
-      GLuint nr_urb_entries:7;
-      GLuint pad1:1;
-      GLuint urb_entry_allocation_size:5;
-      GLuint pad2:1;
-      GLuint max_threads:6;
-      GLuint pad3:1;
+      unsigned pad0:10;
+      unsigned stats_enable:1;
+      unsigned nr_urb_entries:7;
+      unsigned pad1:1;
+      unsigned urb_entry_allocation_size:5;
+      unsigned pad2:1;
+      unsigned max_threads:6;
+      unsigned pad3:1;
    } thread4;
 
    struct
    {
-      GLuint front_winding:1;
-      GLuint viewport_transform:1;
-      GLuint pad0:3;
-      GLuint sf_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
+      unsigned front_winding:1;
+      unsigned viewport_transform:1;
+      unsigned pad0:3;
+      unsigned sf_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
    } sf5;
 
    struct
    {
-      GLuint pad0:9;
-      GLuint dest_org_vbias:4;
-      GLuint dest_org_hbias:4;
-      GLuint scissor:1;
-      GLuint disable_2x2_trifilter:1;
-      GLuint disable_zero_pix_trifilter:1;
-      GLuint point_rast_rule:2;
-      GLuint line_endcap_aa_region_width:2;
-      GLuint line_width:4;
-      GLuint fast_scissor_disable:1;
-      GLuint cull_mode:2;
-      GLuint aa_enable:1;
+      unsigned pad0:9;
+      unsigned dest_org_vbias:4;
+      unsigned dest_org_hbias:4;
+      unsigned scissor:1;
+      unsigned disable_2x2_trifilter:1;
+      unsigned disable_zero_pix_trifilter:1;
+      unsigned point_rast_rule:2;
+      unsigned line_endcap_aa_region_width:2;
+      unsigned line_width:4;
+      unsigned fast_scissor_disable:1;
+      unsigned cull_mode:2;
+      unsigned aa_enable:1;
    } sf6;
 
    struct
    {
-      GLuint point_size:11;
-      GLuint use_point_size_state:1;
-      GLuint subpixel_precision:1;
-      GLuint sprite_point:1;
-      GLuint pad0:10;
-      GLuint aa_line_distance_mode:1;
-      GLuint trifan_pv:2;
-      GLuint linestrip_pv:2;
-      GLuint tristrip_pv:2;
-      GLuint line_last_pixel_enable:1;
+      unsigned point_size:11;
+      unsigned use_point_size_state:1;
+      unsigned subpixel_precision:1;
+      unsigned sprite_point:1;
+      unsigned pad0:10;
+      unsigned aa_line_distance_mode:1;
+      unsigned trifan_pv:2;
+      unsigned linestrip_pv:2;
+      unsigned tristrip_pv:2;
+      unsigned line_last_pixel_enable:1;
    } sf7;
 
 };
 
 struct gen6_scissor_rect
 {
-   GLuint xmin:16;
-   GLuint ymin:16;
-   GLuint xmax:16;
-   GLuint ymax:16;
+   unsigned xmin:16;
+   unsigned ymin:16;
+   unsigned xmax:16;
+   unsigned ymax:16;
 };
 
 struct brw_gs_unit_state
@@ -464,37 +458,37 @@ struct brw_gs_unit_state
 
    struct
    {
-      GLuint pad0:8;
-      GLuint rendering_enable:1; /* for Ironlake */
-      GLuint pad4:1;
-      GLuint stats_enable:1;
-      GLuint nr_urb_entries:7;
-      GLuint pad1:1;
-      GLuint urb_entry_allocation_size:5;
-      GLuint pad2:1;
-      GLuint max_threads:5;
-      GLuint pad3:2;
+      unsigned pad0:8;
+      unsigned rendering_enable:1; /* for Ironlake */
+      unsigned pad4:1;
+      unsigned stats_enable:1;
+      unsigned nr_urb_entries:7;
+      unsigned pad1:1;
+      unsigned urb_entry_allocation_size:5;
+      unsigned pad2:1;
+      unsigned max_threads:5;
+      unsigned pad3:2;
    } thread4;
 
    struct
    {
-      GLuint sampler_count:3;
-      GLuint pad0:2;
-      GLuint sampler_state_pointer:27;
+      unsigned sampler_count:3;
+      unsigned pad0:2;
+      unsigned sampler_state_pointer:27;
    } gs5;
 
 
    struct
    {
-      GLuint max_vp_index:4;
-      GLuint pad0:12;
-      GLuint svbi_post_inc_value:10;
-      GLuint pad1:1;
-      GLuint svbi_post_inc_enable:1;
-      GLuint svbi_payload:1;
-      GLuint discard_adjaceny:1;
-      GLuint reorder_enable:1;
-      GLuint pad2:1;
+      unsigned max_vp_index:4;
+      unsigned pad0:12;
+      unsigned svbi_post_inc_value:10;
+      unsigned pad1:1;
+      unsigned svbi_post_inc_enable:1;
+      unsigned svbi_payload:1;
+      unsigned discard_adjaceny:1;
+      unsigned reorder_enable:1;
+      unsigned pad2:1;
    } gs6;
 };
 
@@ -508,28 +502,28 @@ struct brw_vs_unit_state
 
    struct
    {
-      GLuint pad0:10;
-      GLuint stats_enable:1;
-      GLuint nr_urb_entries:7;
-      GLuint pad1:1;
-      GLuint urb_entry_allocation_size:5;
-      GLuint pad2:1;
-      GLuint max_threads:6;
-      GLuint pad3:1;
+      unsigned pad0:10;
+      unsigned stats_enable:1;
+      unsigned nr_urb_entries:7;
+      unsigned pad1:1;
+      unsigned urb_entry_allocation_size:5;
+      unsigned pad2:1;
+      unsigned max_threads:6;
+      unsigned pad3:1;
    } thread4;
 
    struct
    {
-      GLuint sampler_count:3;
-      GLuint pad0:2;
-      GLuint sampler_state_pointer:27;
+      unsigned sampler_count:3;
+      unsigned pad0:2;
+      unsigned sampler_state_pointer:27;
    } vs5;
 
    struct
    {
-      GLuint vs_enable:1;
-      GLuint vert_cache_disable:1;
-      GLuint pad0:30;
+      unsigned vs_enable:1;
+      unsigned vert_cache_disable:1;
+      unsigned pad0:30;
    } vs6;
 };
 
@@ -542,71 +536,71 @@ struct brw_wm_unit_state
    struct thread3 thread3;
 
    struct {
-      GLuint stats_enable:1;
-      GLuint depth_buffer_clear:1;
-      GLuint sampler_count:3;
-      GLuint sampler_state_pointer:27;
+      unsigned stats_enable:1;
+      unsigned depth_buffer_clear:1;
+      unsigned sampler_count:3;
+      unsigned sampler_state_pointer:27;
    } wm4;
 
    struct
    {
-      GLuint enable_8_pix:1;
-      GLuint enable_16_pix:1;
-      GLuint enable_32_pix:1;
-      GLuint enable_con_32_pix:1;
-      GLuint enable_con_64_pix:1;
-      GLuint pad0:1;
+      unsigned enable_8_pix:1;
+      unsigned enable_16_pix:1;
+      unsigned enable_32_pix:1;
+      unsigned enable_con_32_pix:1;
+      unsigned enable_con_64_pix:1;
+      unsigned pad0:1;
 
       /* These next four bits are for Ironlake+ */
-      GLuint fast_span_coverage_enable:1;
-      GLuint depth_buffer_clear:1;
-      GLuint depth_buffer_resolve_enable:1;
-      GLuint hierarchical_depth_buffer_resolve_enable:1;
-
-      GLuint legacy_global_depth_bias:1;
-      GLuint line_stipple:1;
-      GLuint depth_offset:1;
-      GLuint polygon_stipple:1;
-      GLuint line_aa_region_width:2;
-      GLuint line_endcap_aa_region_width:2;
-      GLuint early_depth_test:1;
-      GLuint thread_dispatch_enable:1;
-      GLuint program_uses_depth:1;
-      GLuint program_computes_depth:1;
-      GLuint program_uses_killpixel:1;
-      GLuint legacy_line_rast: 1;
-      GLuint transposed_urb_read_enable:1;
-      GLuint max_threads:7;
+      unsigned fast_span_coverage_enable:1;
+      unsigned depth_buffer_clear:1;
+      unsigned depth_buffer_resolve_enable:1;
+      unsigned hierarchical_depth_buffer_resolve_enable:1;
+
+      unsigned legacy_global_depth_bias:1;
+      unsigned line_stipple:1;
+      unsigned depth_offset:1;
+      unsigned polygon_stipple:1;
+      unsigned line_aa_region_width:2;
+      unsigned line_endcap_aa_region_width:2;
+      unsigned early_depth_test:1;
+      unsigned thread_dispatch_enable:1;
+      unsigned program_uses_depth:1;
+      unsigned program_computes_depth:1;
+      unsigned program_uses_killpixel:1;
+      unsigned legacy_line_rast: 1;
+      unsigned transposed_urb_read_enable:1;
+      unsigned max_threads:7;
    } wm5;
 
-   GLfloat global_depth_offset_constant;
-   GLfloat global_depth_offset_scale;
+   float global_depth_offset_constant;
+   float global_depth_offset_scale;
 
    /* for Ironlake only */
    struct {
-      GLuint pad0:1;
-      GLuint grf_reg_count_1:3;
-      GLuint pad1:2;
-      GLuint kernel_start_pointer_1:26;
+      unsigned pad0:1;
+      unsigned grf_reg_count_1:3;
+      unsigned pad1:2;
+      unsigned kernel_start_pointer_1:26;
    } wm8;
 
    struct {
-      GLuint pad0:1;
-      GLuint grf_reg_count_2:3;
-      GLuint pad1:2;
-      GLuint kernel_start_pointer_2:26;
+      unsigned pad0:1;
+      unsigned grf_reg_count_2:3;
+      unsigned pad1:2;
+      unsigned kernel_start_pointer_2:26;
    } wm9;
 
    struct {
-      GLuint pad0:1;
-      GLuint grf_reg_count_3:3;
-      GLuint pad1:2;
-      GLuint kernel_start_pointer_3:26;
+      unsigned pad0:1;
+      unsigned grf_reg_count_3:3;
+      unsigned pad1:2;
+      unsigned kernel_start_pointer_3:26;
    } wm10;
 };
 
 struct brw_sampler_default_color {
-   GLfloat color[4];
+   float color[4];
 };
 
 struct gen5_sampler_default_color {
@@ -623,48 +617,48 @@ struct brw_sampler_state
 
    struct
    {
-      GLuint shadow_function:3;
-      GLuint lod_bias:11;
-      GLuint min_filter:3;
-      GLuint mag_filter:3;
-      GLuint mip_filter:2;
-      GLuint base_level:5;
-      GLuint min_mag_neq:1;
-      GLuint lod_preclamp:1;
-      GLuint default_color_mode:1;
-      GLuint pad0:1;
-      GLuint disable:1;
+      unsigned shadow_function:3;
+      unsigned lod_bias:11;
+      unsigned min_filter:3;
+      unsigned mag_filter:3;
+      unsigned mip_filter:2;
+      unsigned base_level:5;
+      unsigned min_mag_neq:1;
+      unsigned lod_preclamp:1;
+      unsigned default_color_mode:1;
+      unsigned pad0:1;
+      unsigned disable:1;
    } ss0;
 
    struct
    {
-      GLuint r_wrap_mode:3;
-      GLuint t_wrap_mode:3;
-      GLuint s_wrap_mode:3;
-      GLuint cube_control_mode:1;
-      GLuint pad:2;
-      GLuint max_lod:10;
-      GLuint min_lod:10;
+      unsigned r_wrap_mode:3;
+      unsigned t_wrap_mode:3;
+      unsigned s_wrap_mode:3;
+      unsigned cube_control_mode:1;
+      unsigned pad:2;
+      unsigned max_lod:10;
+      unsigned min_lod:10;
    } ss1;
 
 
    struct
    {
-      GLuint pad:5;
-      GLuint default_color_pointer:27;
+      unsigned pad:5;
+      unsigned default_color_pointer:27;
    } ss2;
 
    struct
    {
-      GLuint non_normalized_coord:1;
-      GLuint pad:12;
-      GLuint address_round:6;
-      GLuint max_aniso:3;
-      GLuint chroma_key_mode:1;
-      GLuint chroma_key_index:2;
-      GLuint chroma_key_enable:1;
-      GLuint monochrome_filter_width:3;
-      GLuint monochrome_filter_height:3;
+      unsigned non_normalized_coord:1;
+      unsigned pad:12;
+      unsigned address_round:6;
+      unsigned max_aniso:3;
+      unsigned chroma_key_mode:1;
+      unsigned chroma_key_index:2;
+      unsigned chroma_key_enable:1;
+      unsigned monochrome_filter_width:3;
+      unsigned monochrome_filter_height:3;
    } ss3;
 };
 
@@ -672,152 +666,152 @@ struct gen7_sampler_state
 {
    struct
    {
-      GLuint aniso_algorithm:1;
-      GLuint lod_bias:13;
-      GLuint min_filter:3;
-      GLuint mag_filter:3;
-      GLuint mip_filter:2;
-      GLuint base_level:5;
-      GLuint pad1:1;
-      GLuint lod_preclamp:1;
-      GLuint default_color_mode:1;
-      GLuint pad0:1;
-      GLuint disable:1;
+      unsigned aniso_algorithm:1;
+      unsigned lod_bias:13;
+      unsigned min_filter:3;
+      unsigned mag_filter:3;
+      unsigned mip_filter:2;
+      unsigned base_level:5;
+      unsigned pad1:1;
+      unsigned lod_preclamp:1;
+      unsigned default_color_mode:1;
+      unsigned pad0:1;
+      unsigned disable:1;
    } ss0;
 
    struct
    {
-      GLuint cube_control_mode:1;
-      GLuint shadow_function:3;
-      GLuint pad:4;
-      GLuint max_lod:12;
-      GLuint min_lod:12;
+      unsigned cube_control_mode:1;
+      unsigned shadow_function:3;
+      unsigned pad:4;
+      unsigned max_lod:12;
+      unsigned min_lod:12;
    } ss1;
 
    struct
    {
-      GLuint pad:5;
-      GLuint default_color_pointer:27;
+      unsigned pad:5;
+      unsigned default_color_pointer:27;
    } ss2;
 
    struct
    {
-      GLuint r_wrap_mode:3;
-      GLuint t_wrap_mode:3;
-      GLuint s_wrap_mode:3;
-      GLuint pad:1;
-      GLuint non_normalized_coord:1;
-      GLuint trilinear_quality:2;
-      GLuint address_round:6;
-      GLuint max_aniso:3;
-      GLuint chroma_key_mode:1;
-      GLuint chroma_key_index:2;
-      GLuint chroma_key_enable:1;
-      GLuint pad0:6;
+      unsigned r_wrap_mode:3;
+      unsigned t_wrap_mode:3;
+      unsigned s_wrap_mode:3;
+      unsigned pad:1;
+      unsigned non_normalized_coord:1;
+      unsigned trilinear_quality:2;
+      unsigned address_round:6;
+      unsigned max_aniso:3;
+      unsigned chroma_key_mode:1;
+      unsigned chroma_key_index:2;
+      unsigned chroma_key_enable:1;
+      unsigned pad0:6;
    } ss3;
 };
 
 struct brw_clipper_viewport
 {
-   GLfloat xmin;
-   GLfloat xmax;
-   GLfloat ymin;
-   GLfloat ymax;
+   float xmin;
+   float xmax;
+   float ymin;
+   float ymax;
 };
 
 struct brw_cc_viewport
 {
-   GLfloat min_depth;
-   GLfloat max_depth;
+   float min_depth;
+   float max_depth;
 };
 
 struct brw_sf_viewport
 {
    struct {
-      GLfloat m00;
-      GLfloat m11;
-      GLfloat m22;
-      GLfloat m30;
-      GLfloat m31;
-      GLfloat m32;
+      float m00;
+      float m11;
+      float m22;
+      float m30;
+      float m31;
+      float m32;
    } viewport;
 
    /* scissor coordinates are inclusive */
    struct {
-      GLshort xmin;
-      GLshort ymin;
-      GLshort xmax;
-      GLshort ymax;
+      int16_t xmin;
+      int16_t ymin;
+      int16_t xmax;
+      int16_t ymax;
    } scissor;
 };
 
 struct gen6_sf_viewport {
-   GLfloat m00;
-   GLfloat m11;
-   GLfloat m22;
-   GLfloat m30;
-   GLfloat m31;
-   GLfloat m32;
+   float m00;
+   float m11;
+   float m22;
+   float m30;
+   float m31;
+   float m32;
 };
 
 struct gen7_sf_clip_viewport {
    struct {
-      GLfloat m00;
-      GLfloat m11;
-      GLfloat m22;
-      GLfloat m30;
-      GLfloat m31;
-      GLfloat m32;
+      float m00;
+      float m11;
+      float m22;
+      float m30;
+      float m31;
+      float m32;
    } viewport;
 
-   GLuint pad0[2];
+   unsigned pad0[2];
 
    struct {
-      GLfloat xmin;
-      GLfloat xmax;
-      GLfloat ymin;
-      GLfloat ymax;
+      float xmin;
+      float xmax;
+      float ymin;
+      float ymax;
    } guardband;
 
-   GLfloat pad1[4];
+   float pad1[4];
 };
 
 struct brw_vertex_element_state
 {
    struct
    {
-      GLuint src_offset:11;
-      GLuint pad:5;
-      GLuint src_format:9;
-      GLuint pad0:1;
-      GLuint valid:1;
-      GLuint vertex_buffer_index:5;
+      unsigned src_offset:11;
+      unsigned pad:5;
+      unsigned src_format:9;
+      unsigned pad0:1;
+      unsigned valid:1;
+      unsigned vertex_buffer_index:5;
    } ve0;
 
    struct
    {
-      GLuint dst_offset:8;
-      GLuint pad:8;
-      GLuint vfcomponent3:4;
-      GLuint vfcomponent2:4;
-      GLuint vfcomponent1:4;
-      GLuint vfcomponent0:4;
+      unsigned dst_offset:8;
+      unsigned pad:8;
+      unsigned vfcomponent3:4;
+      unsigned vfcomponent2:4;
+      unsigned vfcomponent1:4;
+      unsigned vfcomponent0:4;
    } ve1;
 };
 
 struct brw_urb_immediate {
-   GLuint opcode:4;
-   GLuint offset:6;
-   GLuint swizzle_control:2;
-   GLuint pad:1;
-   GLuint allocate:1;
-   GLuint used:1;
-   GLuint complete:1;
-   GLuint response_length:4;
-   GLuint msg_length:4;
-   GLuint msg_target:4;
-   GLuint pad1:3;
-   GLuint end_of_thread:1;
+   unsigned opcode:4;
+   unsigned offset:6;
+   unsigned swizzle_control:2;
+   unsigned pad:1;
+   unsigned allocate:1;
+   unsigned used:1;
+   unsigned complete:1;
+   unsigned response_length:4;
+   unsigned msg_length:4;
+   unsigned msg_target:4;
+   unsigned pad1:3;
+   unsigned end_of_thread:1;
 };
 
 /* Instruction format for the execution units:
@@ -827,119 +821,119 @@ struct brw_instruction
 {
    struct
    {
-      GLuint opcode:7;
-      GLuint pad:1;
-      GLuint access_mode:1;
-      GLuint mask_control:1;
-      GLuint dependency_control:2;
-      GLuint compression_control:2; /* gen6: quater control */
-      GLuint thread_control:2;
-      GLuint predicate_control:4;
-      GLuint predicate_inverse:1;
-      GLuint execution_size:3;
+      unsigned opcode:7;
+      unsigned pad:1;
+      unsigned access_mode:1;
+      unsigned mask_control:1;
+      unsigned dependency_control:2;
+      unsigned compression_control:2; /* gen6: quater control */
+      unsigned thread_control:2;
+      unsigned predicate_control:4;
+      unsigned predicate_inverse:1;
+      unsigned execution_size:3;
       /**
        * Conditional Modifier for most instructions.  On Gen6+, this is also
        * used for the SEND instruction's Message Target/SFID.
        */
-      GLuint destreg__conditionalmod:4;
-      GLuint acc_wr_control:1;
-      GLuint cmpt_control:1;
-      GLuint debug_control:1;
-      GLuint saturate:1;
+      unsigned destreg__conditionalmod:4;
+      unsigned acc_wr_control:1;
+      unsigned cmpt_control:1;
+      unsigned debug_control:1;
+      unsigned saturate:1;
    } header;
 
    union {
       struct
       {
-	 GLuint dest_reg_file:2;
-	 GLuint dest_reg_type:3;
-	 GLuint src0_reg_file:2;
-	 GLuint src0_reg_type:3;
-	 GLuint src1_reg_file:2;
-	 GLuint src1_reg_type:3;
-	 GLuint pad:1;
-	 GLuint dest_subreg_nr:5;
-	 GLuint dest_reg_nr:8;
-	 GLuint dest_horiz_stride:2;
-	 GLuint dest_address_mode:1;
+	 unsigned dest_reg_file:2;
+	 unsigned dest_reg_type:3;
+	 unsigned src0_reg_file:2;
+	 unsigned src0_reg_type:3;
+	 unsigned src1_reg_file:2;
+	 unsigned src1_reg_type:3;
+	 unsigned pad:1;
+	 unsigned dest_subreg_nr:5;
+	 unsigned dest_reg_nr:8;
+	 unsigned dest_horiz_stride:2;
+	 unsigned dest_address_mode:1;
       } da1;
 
       struct
       {
-	 GLuint dest_reg_file:2;
-	 GLuint dest_reg_type:3;
-	 GLuint src0_reg_file:2;
-	 GLuint src0_reg_type:3;
-	 GLuint src1_reg_file:2;        /* 0x00000c00 */
-	 GLuint src1_reg_type:3;        /* 0x00007000 */
-	 GLuint pad:1;
-	 GLint dest_indirect_offset:10;	/* offset against the deref'd address reg */
-	 GLuint dest_subreg_nr:3; /* subnr for the address reg a0.x */
-	 GLuint dest_horiz_stride:2;
-	 GLuint dest_address_mode:1;
+	 unsigned dest_reg_file:2;
+	 unsigned dest_reg_type:3;
+	 unsigned src0_reg_file:2;
+	 unsigned src0_reg_type:3;
+	 unsigned src1_reg_file:2;        /* 0x00000c00 */
+	 unsigned src1_reg_type:3;        /* 0x00007000 */
+	 unsigned pad:1;
+	 int dest_indirect_offset:10;	/* offset against the deref'd address reg */
+	 unsigned dest_subreg_nr:3; /* subnr for the address reg a0.x */
+	 unsigned dest_horiz_stride:2;
+	 unsigned dest_address_mode:1;
       } ia1;
 
       struct
       {
-	 GLuint dest_reg_file:2;
-	 GLuint dest_reg_type:3;
-	 GLuint src0_reg_file:2;
-	 GLuint src0_reg_type:3;
-	 GLuint src1_reg_file:2;
-	 GLuint src1_reg_type:3;
-	 GLuint pad:1;
-	 GLuint dest_writemask:4;
-	 GLuint dest_subreg_nr:1;
-	 GLuint dest_reg_nr:8;
-	 GLuint dest_horiz_stride:2;
-	 GLuint dest_address_mode:1;
+	 unsigned dest_reg_file:2;
+	 unsigned dest_reg_type:3;
+	 unsigned src0_reg_file:2;
+	 unsigned src0_reg_type:3;
+	 unsigned src1_reg_file:2;
+	 unsigned src1_reg_type:3;
+	 unsigned pad:1;
+	 unsigned dest_writemask:4;
+	 unsigned dest_subreg_nr:1;
+	 unsigned dest_reg_nr:8;
+	 unsigned dest_horiz_stride:2;
+	 unsigned dest_address_mode:1;
       } da16;
 
       struct
       {
-	 GLuint dest_reg_file:2;
-	 GLuint dest_reg_type:3;
-	 GLuint src0_reg_file:2;
-	 GLuint src0_reg_type:3;
-	 GLuint pad0:6;
-	 GLuint dest_writemask:4;
-	 GLint dest_indirect_offset:6;
-	 GLuint dest_subreg_nr:3;
-	 GLuint dest_horiz_stride:2;
-	 GLuint dest_address_mode:1;
+	 unsigned dest_reg_file:2;
+	 unsigned dest_reg_type:3;
+	 unsigned src0_reg_file:2;
+	 unsigned src0_reg_type:3;
+	 unsigned pad0:6;
+	 unsigned dest_writemask:4;
+	 int dest_indirect_offset:6;
+	 unsigned dest_subreg_nr:3;
+	 unsigned dest_horiz_stride:2;
+	 unsigned dest_address_mode:1;
       } ia16;
 
       struct {
-	 GLuint dest_reg_file:2;
-	 GLuint dest_reg_type:3;
-	 GLuint src0_reg_file:2;
-	 GLuint src0_reg_type:3;
-	 GLuint src1_reg_file:2;
-	 GLuint src1_reg_type:3;
-	 GLuint pad:1;
-
-	 GLint jump_count:16;
+	 unsigned dest_reg_file:2;
+	 unsigned dest_reg_type:3;
+	 unsigned src0_reg_file:2;
+	 unsigned src0_reg_type:3;
+	 unsigned src1_reg_file:2;
+	 unsigned src1_reg_type:3;
+	 unsigned pad:1;
+
+	 int jump_count:16;
       } branch_gen6;
 
       struct {
-	 GLuint dest_reg_file:1;
-	 GLuint flag_subreg_nr:1;
-	 GLuint flag_reg_nr:1;
-	 GLuint pad0:1;
-	 GLuint src0_abs:1;
-	 GLuint src0_negate:1;
-	 GLuint src1_abs:1;
-	 GLuint src1_negate:1;
-	 GLuint src2_abs:1;
-	 GLuint src2_negate:1;
-	 GLuint src_reg_type:2;
-	 GLuint dest_reg_type:2;
-	 GLuint pad1:1;
-	 GLuint nib_ctrl:1;
-	 GLuint pad2:1;
-	 GLuint dest_writemask:4;
-	 GLuint dest_subreg_nr:3;
-	 GLuint dest_reg_nr:8;
+	 unsigned dest_reg_file:1;
+	 unsigned flag_subreg_nr:1;
+	 unsigned flag_reg_nr:1;
+	 unsigned pad0:1;
+	 unsigned src0_abs:1;
+	 unsigned src0_negate:1;
+	 unsigned src1_abs:1;
+	 unsigned src1_negate:1;
+	 unsigned src2_abs:1;
+	 unsigned src2_negate:1;
+	 unsigned src_reg_type:2;
+	 unsigned dest_reg_type:2;
+	 unsigned pad1:1;
+	 unsigned nib_ctrl:1;
+	 unsigned pad2:1;
+	 unsigned dest_writemask:4;
+	 unsigned dest_subreg_nr:3;
+	 unsigned dest_reg_nr:8;
       } da3src;
 
       uint32_t ud;
@@ -949,68 +943,68 @@ struct brw_instruction
    union {
       struct
       {
-	 GLuint src0_subreg_nr:5;
-	 GLuint src0_reg_nr:8;
-	 GLuint src0_abs:1;
-	 GLuint src0_negate:1;
-	 GLuint src0_address_mode:1;
-	 GLuint src0_horiz_stride:2;
-	 GLuint src0_width:3;
-	 GLuint src0_vert_stride:4;
-	 GLuint flag_subreg_nr:1;
-	 GLuint flag_reg_nr:1;
-	 GLuint pad:5;
+	 unsigned src0_subreg_nr:5;
+	 unsigned src0_reg_nr:8;
+	 unsigned src0_abs:1;
+	 unsigned src0_negate:1;
+	 unsigned src0_address_mode:1;
+	 unsigned src0_horiz_stride:2;
+	 unsigned src0_width:3;
+	 unsigned src0_vert_stride:4;
+	 unsigned flag_subreg_nr:1;
+	 unsigned flag_reg_nr:1;
+	 unsigned pad:5;
       } da1;
 
       struct
       {
-	 GLint src0_indirect_offset:10;
-	 GLuint src0_subreg_nr:3;
-	 GLuint src0_abs:1;
-	 GLuint src0_negate:1;
-	 GLuint src0_address_mode:1;
-	 GLuint src0_horiz_stride:2;
-	 GLuint src0_width:3;
-	 GLuint src0_vert_stride:4;
-	 GLuint flag_subreg_nr:1;
-	 GLuint flag_reg_nr:1;
-	 GLuint pad:5;
+	 int src0_indirect_offset:10;
+	 unsigned src0_subreg_nr:3;
+	 unsigned src0_abs:1;
+	 unsigned src0_negate:1;
+	 unsigned src0_address_mode:1;
+	 unsigned src0_horiz_stride:2;
+	 unsigned src0_width:3;
+	 unsigned src0_vert_stride:4;
+	 unsigned flag_subreg_nr:1;
+	 unsigned flag_reg_nr:1;
+	 unsigned pad:5;
       } ia1;
 
       struct
       {
-	 GLuint src0_swz_x:2;
-	 GLuint src0_swz_y:2;
-	 GLuint src0_subreg_nr:1;
-	 GLuint src0_reg_nr:8;
-	 GLuint src0_abs:1;
-	 GLuint src0_negate:1;
-	 GLuint src0_address_mode:1;
-	 GLuint src0_swz_z:2;
-	 GLuint src0_swz_w:2;
-	 GLuint pad0:1;
-	 GLuint src0_vert_stride:4;
-	 GLuint flag_subreg_nr:1;
-	 GLuint flag_reg_nr:1;
-	 GLuint pad1:5;
+	 unsigned src0_swz_x:2;
+	 unsigned src0_swz_y:2;
+	 unsigned src0_subreg_nr:1;
+	 unsigned src0_reg_nr:8;
+	 unsigned src0_abs:1;
+	 unsigned src0_negate:1;
+	 unsigned src0_address_mode:1;
+	 unsigned src0_swz_z:2;
+	 unsigned src0_swz_w:2;
+	 unsigned pad0:1;
+	 unsigned src0_vert_stride:4;
+	 unsigned flag_subreg_nr:1;
+	 unsigned flag_reg_nr:1;
+	 unsigned pad1:5;
       } da16;
 
       struct
       {
-	 GLuint src0_swz_x:2;
-	 GLuint src0_swz_y:2;
-	 GLint src0_indirect_offset:6;
-	 GLuint src0_subreg_nr:3;
-	 GLuint src0_abs:1;
-	 GLuint src0_negate:1;
-	 GLuint src0_address_mode:1;
-	 GLuint src0_swz_z:2;
-	 GLuint src0_swz_w:2;
-	 GLuint pad0:1;
-	 GLuint src0_vert_stride:4;
-	 GLuint flag_subreg_nr:1;
-	 GLuint flag_reg_nr:1;
-	 GLuint pad1:5;
+	 unsigned src0_swz_x:2;
+	 unsigned src0_swz_y:2;
+	 int src0_indirect_offset:6;
+	 unsigned src0_subreg_nr:3;
+	 unsigned src0_abs:1;
+	 unsigned src0_negate:1;
+	 unsigned src0_address_mode:1;
+	 unsigned src0_swz_z:2;
+	 unsigned src0_swz_w:2;
+	 unsigned pad0:1;
+	 unsigned src0_vert_stride:4;
+	 unsigned flag_subreg_nr:1;
+	 unsigned flag_reg_nr:1;
+	 unsigned pad1:5;
       } ia16;
 
       /* Extended Message Descriptor for Ironlake (Gen5) SEND instruction.
@@ -1020,21 +1014,21 @@ struct brw_instruction
        */
        struct
        {
-           GLuint pad:26;
-           GLuint end_of_thread:1;
-           GLuint pad1:1;
-           GLuint sfid:4;
+           unsigned pad:26;
+           unsigned end_of_thread:1;
+           unsigned pad1:1;
+           unsigned sfid:4;
        } send_gen5;  /* for Ironlake only */
 
       struct {
-	 GLuint src0_rep_ctrl:1;
-	 GLuint src0_swizzle:8;
-	 GLuint src0_subreg_nr:3;
-	 GLuint src0_reg_nr:8;
-	 GLuint pad0:1;
-	 GLuint src1_rep_ctrl:1;
-	 GLuint src1_swizzle:8;
-	 GLuint src1_subreg_nr_low:2;
+	 unsigned src0_rep_ctrl:1;
+	 unsigned src0_swizzle:8;
+	 unsigned src0_subreg_nr:3;
+	 unsigned src0_reg_nr:8;
+	 unsigned pad0:1;
+	 unsigned src1_rep_ctrl:1;
+	 unsigned src1_swizzle:8;
+	 unsigned src1_subreg_nr_low:2;
       } da3src;
 
       uint32_t ud;
@@ -1044,68 +1038,68 @@ struct brw_instruction
    {
       struct
       {
-	 GLuint src1_subreg_nr:5;
-	 GLuint src1_reg_nr:8;
-	 GLuint src1_abs:1;
-	 GLuint src1_negate:1;
-	 GLuint src1_address_mode:1;
-	 GLuint src1_horiz_stride:2;
-	 GLuint src1_width:3;
-	 GLuint src1_vert_stride:4;
-	 GLuint pad0:7;
+	 unsigned src1_subreg_nr:5;
+	 unsigned src1_reg_nr:8;
+	 unsigned src1_abs:1;
+	 unsigned src1_negate:1;
+	 unsigned src1_address_mode:1;
+	 unsigned src1_horiz_stride:2;
+	 unsigned src1_width:3;
+	 unsigned src1_vert_stride:4;
+	 unsigned pad0:7;
       } da1;
 
       struct
       {
-	 GLuint src1_swz_x:2;
-	 GLuint src1_swz_y:2;
-	 GLuint src1_subreg_nr:1;
-	 GLuint src1_reg_nr:8;
-	 GLuint src1_abs:1;
-	 GLuint src1_negate:1;
-	 GLuint src1_address_mode:1;
-	 GLuint src1_swz_z:2;
-	 GLuint src1_swz_w:2;
-	 GLuint pad1:1;
-	 GLuint src1_vert_stride:4;
-	 GLuint pad2:7;
+	 unsigned src1_swz_x:2;
+	 unsigned src1_swz_y:2;
+	 unsigned src1_subreg_nr:1;
+	 unsigned src1_reg_nr:8;
+	 unsigned src1_abs:1;
+	 unsigned src1_negate:1;
+	 unsigned src1_address_mode:1;
+	 unsigned src1_swz_z:2;
+	 unsigned src1_swz_w:2;
+	 unsigned pad1:1;
+	 unsigned src1_vert_stride:4;
+	 unsigned pad2:7;
       } da16;
 
       struct
       {
-	 GLint  src1_indirect_offset:10;
-	 GLuint src1_subreg_nr:3;
-	 GLuint src1_abs:1;
-	 GLuint src1_negate:1;
-	 GLuint src1_address_mode:1;
-	 GLuint src1_horiz_stride:2;
-	 GLuint src1_width:3;
-	 GLuint src1_vert_stride:4;
-	 GLuint pad1:7;
+	 int  src1_indirect_offset:10;
+	 unsigned src1_subreg_nr:3;
+	 unsigned src1_abs:1;
+	 unsigned src1_negate:1;
+	 unsigned src1_address_mode:1;
+	 unsigned src1_horiz_stride:2;
+	 unsigned src1_width:3;
+	 unsigned src1_vert_stride:4;
+	 unsigned pad1:7;
       } ia1;
 
       struct
       {
-	 GLuint src1_swz_x:2;
-	 GLuint src1_swz_y:2;
-	 GLint  src1_indirect_offset:6;
-	 GLuint src1_subreg_nr:3;
-	 GLuint src1_abs:1;
-	 GLuint src1_negate:1;
-	 GLuint src1_address_mode:1;
-	 GLuint src1_swz_z:2;
-	 GLuint src1_swz_w:2;
-	 GLuint pad1:1;
-	 GLuint src1_vert_stride:4;
-	 GLuint pad2:7;
+	 unsigned src1_swz_x:2;
+	 unsigned src1_swz_y:2;
+	 int  src1_indirect_offset:6;
+	 unsigned src1_subreg_nr:3;
+	 unsigned src1_abs:1;
+	 unsigned src1_negate:1;
+	 unsigned src1_address_mode:1;
+	 unsigned src1_swz_z:2;
+	 unsigned src1_swz_w:2;
+	 unsigned pad1:1;
+	 unsigned src1_vert_stride:4;
+	 unsigned pad2:7;
       } ia16;
 
 
       struct
       {
-	 GLint  jump_count:16;	/* note: signed */
-	 GLuint  pop_count:4;
-	 GLuint  pad0:12;
+	 int  jump_count:16;	/* note: signed */
+	 unsigned  pop_count:4;
+	 unsigned  pad0:12;
       } if_else;
 
       /* This is also used for gen7 IF/ELSE instructions */
@@ -1124,7 +1118,7 @@ struct brw_instruction
 	 int uip:16;
       } break_cont;
 
-      GLint JIP; /* used by Gen6 CALL instructions; Gen7 JMPI */
+      int JIP; /* used by Gen6 CALL instructions; Gen7 JMPI */
 
       /**
        * \defgroup SEND instructions / Message Descriptors
@@ -1140,12 +1134,12 @@ struct brw_instruction
        * See the G45 PRM, Volume 4, Table 14-15.
        */
       struct {
-	 GLuint function_control:16;
-	 GLuint response_length:4;
-	 GLuint msg_length:4;
-	 GLuint msg_target:4;
-	 GLuint pad1:3;
-	 GLuint end_of_thread:1;
+	 unsigned function_control:16;
+	 unsigned response_length:4;
+	 unsigned msg_length:4;
+	 unsigned msg_target:4;
+	 unsigned pad1:3;
+	 unsigned end_of_thread:1;
       } generic;
 
       /**
@@ -1161,238 +1155,238 @@ struct brw_instruction
        *  bit 127 of the instruction word"...which is bit 31 of this field.
        */
       struct {
-	 GLuint function_control:19;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned function_control:19;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } generic_gen5;
 
       struct {
-	 GLuint opcode:1;
-	 GLuint requester_type:1;
-	 GLuint pad:2;
-	 GLuint resource_select:1;
-	 GLuint pad1:11;
-	 GLuint response_length:4;
-	 GLuint msg_length:4;
-	 GLuint msg_target:4;
-	 GLuint pad2:3;
-	 GLuint end_of_thread:1;
+	 unsigned opcode:1;
+	 unsigned requester_type:1;
+	 unsigned pad:2;
+	 unsigned resource_select:1;
+	 unsigned pad1:11;
+	 unsigned response_length:4;
+	 unsigned msg_length:4;
+	 unsigned msg_target:4;
+	 unsigned pad2:3;
+	 unsigned end_of_thread:1;
       } thread_spawner;
 
        struct {
-	 GLuint opcode:1;
-	 GLuint requester_type:1;
-	 GLuint pad0:2;
-	 GLuint resource_select:1;
-	 GLuint pad1:14;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad2:2;
-	 GLuint end_of_thread:1;
+	 unsigned opcode:1;
+	 unsigned requester_type:1;
+	 unsigned pad0:2;
+	 unsigned resource_select:1;
+	 unsigned pad1:14;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad2:2;
+	 unsigned end_of_thread:1;
       } thread_spawner_gen5;
 
       /** G45 PRM, Volume 4, Section 6.1.1.1 */
       struct {
-	 GLuint function:4;
-	 GLuint int_type:1;
-	 GLuint precision:1;
-	 GLuint saturate:1;
-	 GLuint data_type:1;
-	 GLuint pad0:8;
-	 GLuint response_length:4;
-	 GLuint msg_length:4;
-	 GLuint msg_target:4;
-	 GLuint pad1:3;
-	 GLuint end_of_thread:1;
+	 unsigned function:4;
+	 unsigned int_type:1;
+	 unsigned precision:1;
+	 unsigned saturate:1;
+	 unsigned data_type:1;
+	 unsigned pad0:8;
+	 unsigned response_length:4;
+	 unsigned msg_length:4;
+	 unsigned msg_target:4;
+	 unsigned pad1:3;
+	 unsigned end_of_thread:1;
       } math;
 
       /** Ironlake PRM, Volume 4 Part 1, Section 6.1.1.1 */
       struct {
-	 GLuint function:4;
-	 GLuint int_type:1;
-	 GLuint precision:1;
-	 GLuint saturate:1;
-	 GLuint data_type:1;
-	 GLuint snapshot:1;
-	 GLuint pad0:10;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned function:4;
+	 unsigned int_type:1;
+	 unsigned precision:1;
+	 unsigned saturate:1;
+	 unsigned data_type:1;
+	 unsigned snapshot:1;
+	 unsigned pad0:10;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } math_gen5;
 
       /** G45 PRM, Volume 4, Section 4.8.1.1.1 [DevBW] and [DevCL] */
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint sampler:4;
-	 GLuint return_format:2;
-	 GLuint msg_type:2;
-	 GLuint response_length:4;
-	 GLuint msg_length:4;
-	 GLuint msg_target:4;
-	 GLuint pad1:3;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned sampler:4;
+	 unsigned return_format:2;
+	 unsigned msg_type:2;
+	 unsigned response_length:4;
+	 unsigned msg_length:4;
+	 unsigned msg_target:4;
+	 unsigned pad1:3;
+	 unsigned end_of_thread:1;
       } sampler;
 
       /** G45 PRM, Volume 4, Section 4.8.1.1.2 [DevCTG] */
       struct {
-         GLuint binding_table_index:8;
-         GLuint sampler:4;
-         GLuint msg_type:4;
-         GLuint response_length:4;
-         GLuint msg_length:4;
-         GLuint msg_target:4;
-         GLuint pad1:3;
-         GLuint end_of_thread:1;
+         unsigned binding_table_index:8;
+         unsigned sampler:4;
+         unsigned msg_type:4;
+         unsigned response_length:4;
+         unsigned msg_length:4;
+         unsigned msg_target:4;
+         unsigned pad1:3;
+         unsigned end_of_thread:1;
       } sampler_g4x;
 
       /** Ironlake PRM, Volume 4 Part 1, Section 4.11.1.1.3 */
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint sampler:4;
-	 GLuint msg_type:4;
-	 GLuint simd_mode:2;
-	 GLuint pad0:1;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned sampler:4;
+	 unsigned msg_type:4;
+	 unsigned simd_mode:2;
+	 unsigned pad0:1;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } sampler_gen5;
 
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint sampler:4;
-	 GLuint msg_type:5;
-	 GLuint simd_mode:2;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned sampler:4;
+	 unsigned msg_type:5;
+	 unsigned simd_mode:2;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } sampler_gen7;
 
       struct brw_urb_immediate urb;
 
       struct {
-	 GLuint opcode:4;
-	 GLuint offset:6;
-	 GLuint swizzle_control:2;
-	 GLuint pad:1;
-	 GLuint allocate:1;
-	 GLuint used:1;
-	 GLuint complete:1;
-	 GLuint pad0:3;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned opcode:4;
+	 unsigned offset:6;
+	 unsigned swizzle_control:2;
+	 unsigned pad:1;
+	 unsigned allocate:1;
+	 unsigned used:1;
+	 unsigned complete:1;
+	 unsigned pad0:3;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } urb_gen5;
 
       struct {
-	 GLuint opcode:3;
-	 GLuint offset:11;
-	 GLuint swizzle_control:1;
-	 GLuint complete:1;
-	 GLuint per_slot_offset:1;
-	 GLuint pad0:2;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned opcode:3;
+	 unsigned offset:11;
+	 unsigned swizzle_control:1;
+	 unsigned complete:1;
+	 unsigned per_slot_offset:1;
+	 unsigned pad0:2;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } urb_gen7;
 
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint search_path_index:3;
-	 GLuint lut_subindex:2;
-	 GLuint message_type:2;
-	 GLuint pad0:4;
-	 GLuint header_present:1;
+	 unsigned binding_table_index:8;
+	 unsigned search_path_index:3;
+	 unsigned lut_subindex:2;
+	 unsigned message_type:2;
+	 unsigned pad0:4;
+	 unsigned header_present:1;
       } vme_gen6;
 
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint pad0:5;
-	 GLuint message_type:2;
-	 GLuint pad1:4;
-	 GLuint header_present:1;
+	 unsigned binding_table_index:8;
+	 unsigned pad0:5;
+	 unsigned message_type:2;
+	 unsigned pad1:4;
+	 unsigned header_present:1;
       } cre_gen75;
 
       /** 965 PRM, Volume 4, Section 5.10.1.1: Message Descriptor */
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint msg_control:4;
-	 GLuint msg_type:2;
-	 GLuint target_cache:2;
-	 GLuint response_length:4;
-	 GLuint msg_length:4;
-	 GLuint msg_target:4;
-	 GLuint pad1:3;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned msg_control:4;
+	 unsigned msg_type:2;
+	 unsigned target_cache:2;
+	 unsigned response_length:4;
+	 unsigned msg_length:4;
+	 unsigned msg_target:4;
+	 unsigned pad1:3;
+	 unsigned end_of_thread:1;
       } dp_read;
 
       /** G45 PRM, Volume 4, Section 5.10.1.1.2 */
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint msg_control:3;
-	 GLuint msg_type:3;
-	 GLuint target_cache:2;
-	 GLuint response_length:4;
-	 GLuint msg_length:4;
-	 GLuint msg_target:4;
-	 GLuint pad1:3;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned msg_control:3;
+	 unsigned msg_type:3;
+	 unsigned target_cache:2;
+	 unsigned response_length:4;
+	 unsigned msg_length:4;
+	 unsigned msg_target:4;
+	 unsigned pad1:3;
+	 unsigned end_of_thread:1;
       } dp_read_g4x;
 
       /** Ironlake PRM, Volume 4 Part 1, Section 5.10.2.1.2. */
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint msg_control:4;
-	 GLuint msg_type:2;
-	 GLuint target_cache:2;
-	 GLuint pad0:3;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned msg_control:4;
+	 unsigned msg_type:2;
+	 unsigned target_cache:2;
+	 unsigned pad0:3;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } dp_read_gen5;
 
       /** G45 PRM, Volume 4, Section 5.10.1.1.2.  For both Gen4 and G45. */
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint msg_control:3;
-	 GLuint last_render_target:1;
-	 GLuint msg_type:3;
-	 GLuint send_commit_msg:1;
-	 GLuint response_length:4;
-	 GLuint msg_length:4;
-	 GLuint msg_target:4;
-	 GLuint pad1:3;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned msg_control:3;
+	 unsigned last_render_target:1;
+	 unsigned msg_type:3;
+	 unsigned send_commit_msg:1;
+	 unsigned response_length:4;
+	 unsigned msg_length:4;
+	 unsigned msg_target:4;
+	 unsigned pad1:3;
+	 unsigned end_of_thread:1;
       } dp_write;
 
       /** Ironlake PRM, Volume 4 Part 1, Section 5.10.2.1.2. */
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint msg_control:3;
-	 GLuint last_render_target:1;
-	 GLuint msg_type:3;
-	 GLuint send_commit_msg:1;
-	 GLuint pad0:3;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned msg_control:3;
+	 unsigned last_render_target:1;
+	 unsigned msg_type:3;
+	 unsigned send_commit_msg:1;
+	 unsigned pad0:3;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } dp_write_gen5;
 
       /**
@@ -1401,15 +1395,15 @@ struct brw_instruction
        * See the Sandybridge PRM, Volume 4 Part 1, Section 3.9.2.1.1.
        **/
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint msg_control:5;
-	 GLuint msg_type:3;
-	 GLuint pad0:3;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned msg_control:5;
+	 unsigned msg_type:3;
+	 unsigned pad0:3;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } gen6_dp_sampler_const_cache;
 
       /**
@@ -1423,16 +1417,16 @@ struct brw_instruction
        * Section 3.9.9.2.1 of the same volume.
        */
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint msg_control:5;
-	 GLuint msg_type:4;
-	 GLuint send_commit_msg:1;
-	 GLuint pad0:1;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad1:2;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned msg_control:5;
+	 unsigned msg_type:4;
+	 unsigned send_commit_msg:1;
+	 unsigned pad0:1;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad1:2;
+	 unsigned end_of_thread:1;
       } gen6_dp;
 
       /**
@@ -1444,31 +1438,31 @@ struct brw_instruction
        * control for Render Target Writes.
        */
       struct {
-	 GLuint binding_table_index:8;
-	 GLuint msg_control:6;
-	 GLuint msg_type:4;
-	 GLuint category:1;
-	 GLuint header_present:1;
-	 GLuint response_length:5;
-	 GLuint msg_length:4;
-	 GLuint pad2:2;
-	 GLuint end_of_thread:1;
+	 unsigned binding_table_index:8;
+	 unsigned msg_control:6;
+	 unsigned msg_type:4;
+	 unsigned category:1;
+	 unsigned header_present:1;
+	 unsigned response_length:5;
+	 unsigned msg_length:4;
+	 unsigned pad2:2;
+	 unsigned end_of_thread:1;
       } gen7_dp;
       /** @} */
 
       struct {
-	 GLuint src1_subreg_nr_high:1;
-	 GLuint src1_reg_nr:8;
-	 GLuint pad0:1;
-	 GLuint src2_rep_ctrl:1;
-	 GLuint src2_swizzle:8;
-	 GLuint src2_subreg_nr:3;
-	 GLuint src2_reg_nr:8;
-	 GLuint pad1:2;
+	 unsigned src1_subreg_nr_high:1;
+	 unsigned src1_reg_nr:8;
+	 unsigned pad0:1;
+	 unsigned src2_rep_ctrl:1;
+	 unsigned src2_swizzle:8;
+	 unsigned src2_subreg_nr:3;
+	 unsigned src2_reg_nr:8;
+	 unsigned pad1:2;
       } da3src;
 
-      GLint d;
-      GLuint ud;
+      int d;
+      unsigned ud;
       float f;
    } bits3;
 };
diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index a708c52..8bfbcfe 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -35,12 +35,6 @@
 
 #include "brw_reg.h"
 
-typedef unsigned char GLubyte;
-typedef short GLshort;
-typedef unsigned int GLuint;
-typedef int GLint;
-typedef float GLfloat;
-
 extern long int gen_level;
 extern int advanced_flag;
 extern int errors;
@@ -147,7 +141,7 @@ struct label_instruction {
 
 struct relocation {
     char *first_reloc_target, *second_reloc_target; // JIP and UIP respectively
-    GLint first_reloc_offset, second_reloc_offset; // in number of instructions
+    int first_reloc_offset, second_reloc_offset; // in number of instructions
 };
 
 /**
diff --git a/assembler/gram.y b/assembler/gram.y
index 9d58fe6..50d71d1 100644
--- a/assembler/gram.y
+++ b/assembler/gram.y
@@ -81,7 +81,7 @@ static struct src_operand ip_src =
     .reg.dw1.bits.swizzle = BRW_SWIZZLE_NOOP,
 };
 
-static int get_type_size(GLuint type);
+static int get_type_size(unsigned type);
 static void set_instruction_opcode(struct brw_program_instruction *instr,
 				   unsigned opcode);
 static int set_instruction_dest(struct brw_program_instruction *instr,
@@ -339,7 +339,7 @@ static bool validate_src_reg(struct brw_instruction *insn,
     return true;
 }
 
-static int get_subreg_address(GLuint regfile, GLuint type, GLuint subreg, GLuint address_mode)
+static int get_subreg_address(unsigned regfile, unsigned type, unsigned subreg, unsigned address_mode)
 {
     int unit_size = 1;
 
@@ -370,7 +370,7 @@ static int get_subreg_address(GLuint regfile, GLuint type, GLuint subreg, GLuint
  *  a0.12            6                  invalid input
  *  a0.14            7                  invalid input
  */
-static int get_indirect_subreg_address(GLuint subreg)
+static int get_indirect_subreg_address(unsigned subreg)
 {
     return advanced_flag == 0 ? subreg / 2 : subreg;
 }
@@ -2744,7 +2744,7 @@ void yyerror (char *msg)
 	++errors;
 }
 
-static int get_type_size(GLuint type)
+static int get_type_size(unsigned type)
 {
     int size = 1;
 
-- 
1.7.7.5


* [PATCH 88/90] assembler: Group the header inclusions together
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (86 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 87/90] assembler: Don't use GL types Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 89/90] assembler: Fix the decoding of the destination horizontal stride Damien Lespiau
                   ` (2 subsequent siblings)
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/gen4asm.h |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/assembler/gen4asm.h b/assembler/gen4asm.h
index 8bfbcfe..dca7f0f 100644
--- a/assembler/gen4asm.h
+++ b/assembler/gen4asm.h
@@ -34,6 +34,8 @@
 #include <assert.h>
 
 #include "brw_reg.h"
+#include "brw_defines.h"
+#include "brw_structs.h"
 
 extern long int gen_level;
 extern int advanced_flag;
@@ -57,9 +59,6 @@ extern struct brw_compile genasm_compile;
 /* Predicate to match Haswell processors */
 #define IS_HASWELL(x) (gen_level == 75)
 
-#include "brw_defines.h"
-#include "brw_structs.h"
-
 void yyerror (char *msg);
 
 #define STRUCT_SIZE_ASSERT(TYPE, SIZE) \
-- 
1.7.7.5


* [PATCH 89/90] assembler: Fix the decoding of the destination horizontal stride
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (87 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 88/90] assembler: Group the header inclusions together Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-04 15:28 ` [PATCH 90/90] assembler: Mark format() as PRINTFLIKE in the disassembler Damien Lespiau
  2013-02-14 19:18 ` Sync the assembler with Mesa's opcode emission code Damien Lespiau
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

dest_horizontal_stride needs to go through the horiz_stride[] indirection
to pick up the right stride when its value is 11b (4 elements).
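
To illustrate the point above, here is a minimal sketch of the decoding,
not the disassembler's actual code. The table and function names
(stride_names, format_dest_stride) are hypothetical stand-ins for
horiz_stride[] and the format() call in dest(); the sketch assumes the
2-bit field encodes strides of 0, 1, 2 and 4 elements, so printing the raw
field value would show 3 instead of 4 for 11b.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the disassembler's horiz_stride[] table:
 * the index is the 2-bit encoded field, the entry is the stride in
 * elements. Only the 11b encoding differs from its raw value. */
static const char *const stride_names[4] = { "0", "1", "2", "4" };

/* Write "<stride>" for an encoded destination horizontal stride into buf.
 * Returns the number of characters snprintf() would have written. */
int format_dest_stride(char *buf, size_t len, unsigned encoded)
{
    assert(encoded < 4); /* the field is only 2 bits wide */
    return snprintf(buf, len, "<%s>", stride_names[encoded]);
}
```

Calling format_dest_stride(buf, sizeof(buf), 3) on a small buffer yields
"<4>", whereas formatting the field with "<%d>" would have produced "<3>".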

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_disasm.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/assembler/brw_disasm.c b/assembler/brw_disasm.c
index de121e6..3fee682 100644
--- a/assembler/brw_disasm.c
+++ b/assembler/brw_disasm.c
@@ -522,7 +522,7 @@ static int dest (FILE *file, struct brw_instruction *inst)
 	    if (inst->bits1.da1.dest_subreg_nr)
 		format (file, ".%d", inst->bits1.da1.dest_subreg_nr /
 				     reg_type_size[inst->bits1.da1.dest_reg_type]);
-	    format (file, "<%d>", inst->bits1.da1.dest_horiz_stride);
+	    format (file, "<%s>", horiz_stride[inst->bits1.da1.dest_horiz_stride]);
 	    err |= control (file, "dest reg encoding", reg_encoding, inst->bits1.da1.dest_reg_type, NULL);
 	}
 	else
@@ -534,7 +534,7 @@ static int dest (FILE *file, struct brw_instruction *inst)
 	    if (inst->bits1.ia1.dest_indirect_offset)
 		format (file, " %d", inst->bits1.ia1.dest_indirect_offset);
 	    string (file, "]");
-	    format (file, "<%d>", inst->bits1.ia1.dest_horiz_stride);
+	    format (file, "<%s>", horiz_stride[inst->bits1.ia1.dest_horiz_stride]);
 	    err |= control (file, "dest reg encoding", reg_encoding, inst->bits1.ia1.dest_reg_type, NULL);
 	}
     }
-- 
1.7.7.5


* [PATCH 90/90] assembler: Mark format() as PRINTFLIKE in the disassembler
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (88 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 89/90] assembler: Fix the decoding of the destination horizontal stride Damien Lespiau
@ 2013-02-04 15:28 ` Damien Lespiau
  2013-02-14 19:18 ` Sync the assembler with Mesa's opcode emission code Damien Lespiau
  90 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-02-04 15:28 UTC (permalink / raw)
  To: intel-gfx

This way, when making changes to code that uses this function, we get
warnings about mismatches between the format string and its arguments.
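
As a rough sketch of what such an annotation does: the macro below is an
assumption about how a PRINTFLIKE-style helper is commonly defined on
GCC-compatible compilers, not a copy of brw_compat.h, and the names
MY_PRINTFLIKE and format_to are hypothetical. The two numbers are the
1-based positions of the format string and of the first variadic argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Assumed definition: expands to GCC's printf format attribute where
 * available and to nothing elsewhere. */
#if defined(__GNUC__)
#define MY_PRINTFLIKE(f, a) __attribute__((format(printf, f, a)))
#else
#define MY_PRINTFLIKE(f, a)
#endif

/* The attribute goes on a declaration: argument 3 is the format string,
 * argument 4 is where the variadic arguments start. */
int format_to(char *buf, size_t len, const char *fmt, ...) MY_PRINTFLIKE(3, 4);

int format_to(char *buf, size_t len, const char *fmt, ...)
{
    va_list args;
    int n;

    assert(buf != NULL && fmt != NULL);
    va_start(args, fmt);
    n = vsnprintf(buf, len, fmt, args);
    va_end(args);
    return n;
}
```

With the declaration annotated, a call such as format_to(buf, len, "<%s>",
some_int) should be flagged at compile time when format warnings are
enabled (for example with -Wall on GCC), which is the kind of mismatch the
"<%d>" to "<%s>" change in the previous patch could otherwise introduce
silently.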

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
---
 assembler/brw_disasm.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/assembler/brw_disasm.c b/assembler/brw_disasm.c
index 3fee682..4dec829 100644
--- a/assembler/brw_disasm.c
+++ b/assembler/brw_disasm.c
@@ -27,6 +27,7 @@
 #include <unistd.h>
 #include <stdarg.h>
 
+#include "brw_compat.h"
 #include "brw_context.h"
 #include "brw_defines.h"
 
@@ -400,6 +401,7 @@ static int string (FILE *file, const char *string)
     return 0;
 }
 
+static int format (FILE *f, const char *format, ...) PRINTFLIKE(2, 3);
 static int format (FILE *f, const char *format, ...)
 {
     char    buf[1024];
-- 
1.7.7.5


* Re: Sync the assembler with Mesa's opcode emission code
  2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
                   ` (89 preceding siblings ...)
  2013-02-04 15:28 ` [PATCH 90/90] assembler: Mark format() as PRINTFLIKE in the disassembler Damien Lespiau
@ 2013-02-14 19:18 ` Damien Lespiau
  2013-03-04 16:38   ` Damien Lespiau
  90 siblings, 1 reply; 93+ messages in thread
From: Damien Lespiau @ 2013-02-14 19:18 UTC (permalink / raw)
  To: intel-gfx

On Mon, Feb 04, 2013 at 03:26:55PM +0000, Damien Lespiau wrote:
> Hey,
> 
> Some time ago, Daniel mentioned merging the assembler into intel-gpu-tools to
> lower maintenance cost and have more eyes on the code.
> 
> This series is the aftermath of that with an effort to sync the opcode emission
> from Mesa with the assembler. It's also available in my fdo i-g-t tree:
> 
> http://cgit.freedesktop.org/~damien/intel-gpu-tools/log/?h=wip/mesa-sync

I asked Haihao if he could take a look at that branch; he kindly tested
it and found a shader in the xorg driver that this new branch did not
assemble the same way the old one did.

It turned out to be a gen7 shader still using an MRF register, and I
posted a fix:
  http://lists.freedesktop.org/archives/intel-gfx/2013-February/024728.html

With this fixed, I think the assembler is good to go. I've prepared two
branches (resolving a small conflict in configure.ac), one merging the
branch into master and one rebasing the whole assembler branch (both
keep the whole history of the assembler):

http://cgit.freedesktop.org/~damien/intel-gpu-tools/log/?h=assembler-merged

or

http://cgit.freedesktop.org/~damien/intel-gpu-tools/log/?h=assembler-rebased

-- 
Damien


* Re: Sync the assembler with Mesa's opcode emission code
  2013-02-14 19:18 ` Sync the assembler with Mesa's opcode emission code Damien Lespiau
@ 2013-03-04 16:38   ` Damien Lespiau
  0 siblings, 0 replies; 93+ messages in thread
From: Damien Lespiau @ 2013-03-04 16:38 UTC (permalink / raw)
  To: intel-gfx

On Thu, Feb 14, 2013 at 07:18:41PM +0000, Damien Lespiau wrote:
> http://cgit.freedesktop.org/~damien/intel-gpu-tools/log/?h=assembler-rebased

As agreed on IRC and after making sure the rework doesn't introduce
regressions in our existing shaders, I've finally merged this assembler
branch and marked the old assembler repository as deprecated.

Some remaining little tasks (small issues that were in the initial
assembler as well):

  - Fix make check
  - Have a look at Homer's patches
  - Rename the assembler to drop the '4', nothing specific to gen4
    anymore.

-- 
Damien



Thread overview: 93+ messages
2013-02-04 15:26 Sync the assembler with Mesa's opcode emission code Damien Lespiau
2013-02-04 15:26 ` [PATCH 01/90] build: Add CAIRO_FLAGS to the debugger compilation Damien Lespiau
2013-02-04 15:26 ` [PATCH 02/90] gitignore: Ignore TAGS files Damien Lespiau
2013-02-04 15:26 ` [PATCH 03/90] build: Don't use AM_MAINTAINER_MODE Damien Lespiau
2013-02-04 15:26 ` [PATCH 04/90] build: Only build the assembler if flex and bison are found Damien Lespiau
2013-02-04 15:27 ` [PATCH 05/90] build: Add the debugger compilation status to the summary Damien Lespiau
2013-02-04 15:27 ` [PATCH 06/90] assembler: Sync brw_instruction's header with mesa's Damien Lespiau
2013-02-04 15:27 ` [PATCH 07/90] assembler: Rename three_src_gen6 to da3src Damien Lespiau
2013-02-04 15:27 ` [PATCH 08/90] assembler: Rename dp_read_gen6 to gen6_dp_sampler_const_cache Damien Lespiau
2013-02-04 15:27 ` [PATCH 09/90] assembler: Rename dp_gen6 to gen6_dp and sync with Mesa's Damien Lespiau
2013-02-04 15:27 ` [PATCH 10/90] assembler: Rename dp_gen7 to gen7_dp and sync it " Damien Lespiau
2013-02-04 15:27 ` [PATCH 11/90] assembler: Remove struct dp_write_gen6 and struct use gen6_dp Damien Lespiau
2013-02-04 15:27 ` [PATCH 12/90] assembler: Rename gen5 DP pixel_scoreboard_clear to last_render_target Damien Lespiau
2013-02-04 15:27 ` [PATCH 13/90] assembler: Rename branch to branch_gen6 Damien Lespiau
2013-02-04 15:27 ` [PATCH 14/90] assembler: Rename branch_2_offset to break_cont Damien Lespiau
2013-02-04 15:27 ` [PATCH 15/90] assembler: Rename bits3.id and bits3.fd Damien Lespiau
2013-02-04 15:27 ` [PATCH 16/90] assembler: Adopt brw_structs.h from mesa Damien Lespiau
2013-02-04 15:27 ` [PATCH 17/90] assembler: Remove trailing white spaces from brw_structs.h Damien Lespiau
2013-02-04 15:27 ` [PATCH 18/90] assembler: Adopt enum brw_message_target from mesa Damien Lespiau
2013-02-04 15:27 ` [PATCH 19/90] assembler: Rename BRW_ACCWRCTRL_ACCWRCTRL Damien Lespiau
2013-02-04 15:27 ` [PATCH 20/90] assembler: Import brw_defines.h from Mesa Damien Lespiau
2013-02-04 15:27 ` [PATCH 21/90] assembler: Remove trailing white space from brw_defines.h Damien Lespiau
2013-02-04 15:27 ` [PATCH 22/90] assembler: Update the disassembler code Damien Lespiau
2013-02-04 15:27 ` [PATCH 23/90] assembler: Import ralloc from Mesa Damien Lespiau
2013-02-04 15:27 ` [PATCH 24/90] assembler: Remove white space from brw_eu.h Damien Lespiau
2013-02-04 15:27 ` [PATCH 25/90] assembler: Introduce struct brw_context Damien Lespiau
2013-02-04 15:27 ` [PATCH 26/90] assembler: Make an libbrw library Damien Lespiau
2013-02-04 15:27 ` [PATCH 27/90] assembler: Protect gen4asm.h from multiple inclusions Damien Lespiau
2013-02-04 15:27 ` [PATCH 28/90] assembler: Import brw_eu_compact.c Damien Lespiau
2013-02-04 15:27 ` [PATCH 29/90] assembler: Import brw_eu.c Damien Lespiau
2013-02-04 15:27 ` [PATCH 30/90] assembler: Don't use -Wpointer-arith Damien Lespiau
2013-02-04 15:27 ` [PATCH 31/90] assembler: Import brw_eu_emit.c Damien Lespiau
2013-02-04 15:27 ` [PATCH 32/90] assembler: Use BRW_WRITEMASK_XYZW instead of the 0xf constant Damien Lespiau
2013-02-04 15:27 ` [PATCH 33/90] assembler: Remove the writemask_set field of struct dest_operand Damien Lespiau
2013-02-04 15:27 ` [PATCH 34/90] assembler: Use subreg_nr to store the address register subreg Damien Lespiau
2013-02-04 15:27 ` [PATCH 35/90] assembler: Simplify get_subreg_address() Damien Lespiau
2013-02-04 15:27 ` [PATCH 36/90] assembler: Make print_instruction() take an instruction Damien Lespiau
2013-02-04 15:27 ` [PATCH 37/90] assembler: Refactor the code adding instructions and labels Damien Lespiau
2013-02-04 15:27 ` [PATCH 38/90] assembler: Make explicit that labels are part of the instructions list Damien Lespiau
2013-02-04 15:27 ` [PATCH 39/90] assembler: Don't change the size of opcodes! Damien Lespiau
2013-02-04 15:27 ` [PATCH 40/90] assembler: Make sure nobody adds a field back to struct brw_instruction Damien Lespiau
2013-02-04 15:27 ` [PATCH 41/90] assembler: Don't expose functions only used in main.c Damien Lespiau
2013-02-04 15:27 ` [PATCH 42/90] assembler: Make struct declared_register use struct brw_reg Damien Lespiau
2013-02-04 15:27 ` [PATCH 43/90] assembler: Replace struct direct_reg by " Damien Lespiau
2013-02-04 15:27 ` [PATCH 44/90] assembler: Replace struct indirect_reg " Damien Lespiau
2013-02-04 15:27 ` [PATCH 45/90] assembler: Unify the direct and indirect register type Damien Lespiau
2013-02-04 15:27 ` [PATCH 46/90] assembler: Replace struct dst_operand by struct brw_reg Damien Lespiau
2013-02-04 15:27 ` [PATCH 47/90] assembler: Consolidate the swizzling configuration on 8 bits Damien Lespiau
2013-02-04 15:27 ` [PATCH 48/90] assembler: Get rid of src operand's swizzle_set Damien Lespiau
2013-02-04 15:27 ` [PATCH 49/90] assembler: Use brw_reg in the source operand Damien Lespiau
2013-02-04 15:27 ` [PATCH 50/90] assembler: Factor out the destination register validation Damien Lespiau
2013-02-04 15:27 ` [PATCH 51/90] assembler: Use brw_set_dest() to encode the destination Damien Lespiau
2013-02-04 15:27 ` [PATCH 52/90] assembler: Factor out the source register validation Damien Lespiau
2013-02-04 15:27 ` [PATCH 53/90] assembler: ExecSize can be as big as 32 channels Damien Lespiau
2013-02-04 15:27 ` [PATCH 54/90] assembler: Fix comparisons between reg.type and Architecture registers Damien Lespiau
2013-02-04 15:27 ` [PATCH 55/90] assembler: Store immediate values in reg.dw1.ud Damien Lespiau
2013-02-04 15:27 ` [PATCH 56/90] assembler: Don't warn if identical declared registers are redefined Damien Lespiau
2013-02-04 15:27 ` [PATCH 57/90] assembler: Add location support Damien Lespiau
2013-02-04 15:27 ` [PATCH 58/90] assembler: Add error() and warn() shorthands and use them in set_src[01] Damien Lespiau
2013-02-04 15:27 ` [PATCH 59/90] assembler: Add a check for when width is 1 and hstride is not 0 Damien Lespiau
2013-02-04 15:27 ` [PATCH 60/90] assembler: Add a check for when ExecSize and width are 1 Damien Lespiau
2013-02-04 15:27 ` [PATCH 61/90] assembler: Add the input filename to the error/warning messages Damien Lespiau
2013-02-04 15:27 ` [PATCH 62/90] assembler: Use brw_set_src0() Damien Lespiau
2013-02-04 15:27 ` [PATCH 63/90] assembler: Port the warning and error reporting to warn()/error() Damien Lespiau
2013-02-04 15:27 ` [PATCH 64/90] assembler: Cleanup visibility of a few global variables/functions Damien Lespiau
2013-02-04 15:28 ` [PATCH 65/90] assembler: Fix ')' placement in condition Damien Lespiau
2013-02-04 15:28 ` [PATCH 66/90] assembler: Implement register-indirect addressing mode in brw_set_src1() Damien Lespiau
2013-02-04 15:28 ` [PATCH 67/90] assembler: Use brw_set_src1() Damien Lespiau
2013-02-04 15:28 ` [PATCH 68/90] assembler: Renamed the instruction field to insn Damien Lespiau
2013-02-04 15:28 ` [PATCH 69/90] assembler: Unify all instructions to be brw_program_instructions Damien Lespiau
2013-02-04 15:28 ` [PATCH 70/90] assembler: Move struct relocation out of relocatable instructions Damien Lespiau
2013-02-04 15:28 ` [PATCH 71/90] assembler: Gather all predicate data in its own structure Damien Lespiau
2013-02-04 15:28 ` [PATCH 72/90] assembler: Unify adding options to the header Damien Lespiau
2013-02-04 15:28 ` [PATCH 73/90] assembler: Isolate all the options in their own structure Damien Lespiau
2013-02-04 15:28 ` [PATCH 74/90] assembler: Introduce set_instruction_opcode() Damien Lespiau
2013-02-04 15:28 ` [PATCH 75/90] assembler: Introduce set_intruction_pred_cond() Damien Lespiau
2013-02-04 15:28 ` [PATCH 76/90] assembler: Introduce set_instruction_saturate() Damien Lespiau
2013-02-04 15:28 ` [PATCH 77/90] assembler: Expose setters for 3src operands Damien Lespiau
2013-02-04 15:28 ` [PATCH 78/90] assembler: Add support for D and UD in 3-src instructions Damien Lespiau
2013-02-04 15:28 ` [PATCH 79/90] assembler: Use brw_*() functions for " Damien Lespiau
2013-02-04 15:28 ` [PATCH 80/90] assembler: Don't pollute the library files with gen4asm.h Damien Lespiau
2013-02-04 15:28 ` [PATCH 81/90] assembler: Put struct opcode_desc back in brw_context.h Damien Lespiau
2013-02-04 15:28 ` [PATCH 82/90] assembler: Use set_instruction_src1() in send Damien Lespiau
2013-02-04 15:28 ` [PATCH 83/90] assembler: Finish importing brw_eu_*c from mesa Damien Lespiau
2013-02-04 15:28 ` [PATCH 84/90] assembler: Merge declared_register's type into the reg structure Damien Lespiau
2013-02-04 15:28 ` [PATCH 85/90] assembler: Use defines for width Damien Lespiau
2013-02-04 15:28 ` [PATCH 86/90] assembler: Remove trailing white space Damien Lespiau
2013-02-04 15:28 ` [PATCH 87/90] assembler: Don't use GL types Damien Lespiau
2013-02-04 15:28 ` [PATCH 88/90] assembler: Group the header inclusions together Damien Lespiau
2013-02-04 15:28 ` [PATCH 89/90] assembler: Fix the decoding of the destination horizontal stride Damien Lespiau
2013-02-04 15:28 ` [PATCH 90/90] assembler: Mark format() as PRINTFLIKE in the disassembler Damien Lespiau
2013-02-14 19:18 ` Sync the assembler with Mesa's opcode emission code Damien Lespiau
2013-03-04 16:38   ` Damien Lespiau
