All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] First batch of gm45 clipping/interpolation fixes
@ 2012-06-30 18:50 Olivier Galibert
  2012-06-30 18:50 ` [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE Olivier Galibert
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Olivier Galibert @ 2012-06-30 18:50 UTC (permalink / raw)
  To: intel-gfx, mesa-dev

  Hi,

This is the first part of the fixes I've done to make my gm45 work
correctly w.r.t clipping and interpolation.  There's a fair chance
they work for everything gen 4/5, but I have no way to be sure.

[PATCH 1/5] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
[PATCH 2/5] intel gen4-5: Compute the interpolation status for every
[PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.
[PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the
[PATCH 5/5] intel gen4-5: Make noperspective clipping work.

After this batch every piglit interpolation test involving no clipping
or fixed clipping passes.  Vertex clipping clearly never worked
(VERT_RESULT_CLIP_VERTEX is not used, so...) and clipdistance isn't
implemented.  These will be the topic of the second batch, whenever it
exists.

Best,

  OG.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
  2012-06-30 18:50 [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
@ 2012-06-30 18:50 ` Olivier Galibert
  2012-07-17  3:43   ` [Mesa-dev] " Paul Berry
  2012-06-30 18:50 ` [PATCH 2/5] intel gen4-5: Compute the interpolation status for every variable in one place Olivier Galibert
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-06-30 18:50 UTC (permalink / raw)
  To: intel-gfx, mesa-dev; +Cc: Olivier Galibert

There was... confusion about which register goes where.  With that
patch urb_setup is in line with the vue setup, even when these
annoying backcolor slots are used.  And in addition the stray mov into
lalaland is avoided when only one of the front/back slots is used and
the backface is looking at you.  The code instead picks whatever slot
was written to by the vertex shader.  That makes most of the generated
piglit tests useless to test the backface selection though.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
---
 src/mesa/drivers/dri/i965/brw_fs.cpp     |   18 +++++-
 src/mesa/drivers/dri/i965/brw_sf.c       |    3 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c  |   93 +++++++++++++++++-------------
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |   19 +++++-
 4 files changed, 89 insertions(+), 44 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6cef08a..710f2ff 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -721,8 +721,24 @@ fs_visitor::calculate_urb_setup()
 	 if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
 	    int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
 
+            /* Special case: two-sided vertex option, vertex program
+             * only writes to the back color.  Map it to the
+             * associated front color location.
+             */
+            if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
+                ctx->VertexProgram._TwoSideEnabled &&
+                urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
+               fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
+	    /* The back color slot is skipped when the front color is
+	     * also written to.  In addition, some slots can be
+	     * written in the vertex shader and not read in the
+	     * fragment shader.  So the register number must always be
+	     * incremented, mapped or not.
+	     */
 	    if (fp_index >= 0)
-	       urb_setup[fp_index] = urb_next++;
+	       urb_setup[fp_index] = urb_next;
+	    urb_next++;
 	 }
       }
 
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 23a874a..7867ab5 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -192,7 +192,8 @@ brw_upload_sf_prog(struct brw_context *brw)
 
    /* _NEW_LIGHT */
    key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
-   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide);
+   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||
+     ctx->VertexProgram._TwoSideEnabled;
 
    /* _NEW_POLYGON */
    if (key.do_twoside_color) {
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index ff6383b..9d8aa38 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -79,24 +79,9 @@ have_attr(struct brw_sf_compile *c, GLuint attr)
 /*********************************************************************** 
  * Twoside lighting
  */
-static void copy_bfc( struct brw_sf_compile *c,
-		      struct brw_reg vert )
-{
-   struct brw_compile *p = &c->func;
-   GLuint i;
-
-   for (i = 0; i < 2; i++) {
-      if (have_attr(c, VERT_RESULT_COL0+i) &&
-	  have_attr(c, VERT_RESULT_BFC0+i))
-	 brw_MOV(p, 
-		 get_vert_result(c, vert, VERT_RESULT_COL0+i),
-		 get_vert_result(c, vert, VERT_RESULT_BFC0+i));
-   }
-}
-
-
 static void do_twoside_color( struct brw_sf_compile *c )
 {
+   GLuint i, need_0, need_1;
    struct brw_compile *p = &c->func;
    GLuint backface_conditional = c->key.frontface_ccw ? BRW_CONDITIONAL_G : BRW_CONDITIONAL_L;
 
@@ -105,12 +90,14 @@ static void do_twoside_color( struct brw_sf_compile *c )
    if (c->key.primitive == SF_UNFILLED_TRIS)
       return;
 
-   /* XXX: What happens if BFC isn't present?  This could only happen
-    * for user-supplied vertex programs, as t_vp_build.c always does
-    * the right thing.
+   /* If the vertex shader provides both front and backface color, do
+    * the selection.  Otherwise the generated code will pick up
+    * whichever there is.
     */
-   if (!(have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0)) &&
-       !(have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1)))
+   need_0 = have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0);
+   need_1 = have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1);
+
+   if (!need_0 && !need_1)
       return;
    
    /* Need to use BRW_EXECUTE_4 and also do an 4-wide compare in order
@@ -121,12 +108,15 @@ static void do_twoside_color( struct brw_sf_compile *c )
    brw_push_insn_state(p);
    brw_CMP(p, vec4(brw_null_reg()), backface_conditional, c->det, brw_imm_f(0));
    brw_IF(p, BRW_EXECUTE_4);
-   {
-      switch (c->nr_verts) {
-      case 3: copy_bfc(c, c->vert[2]);
-      case 2: copy_bfc(c, c->vert[1]);
-      case 1: copy_bfc(c, c->vert[0]);
-      }
+   for (i=0; i<c->nr_verts; i++) {
+      if (need_0)
+	 brw_MOV(p, 
+		 get_vert_result(c, c->vert[i], VERT_RESULT_COL0),
+		 get_vert_result(c, c->vert[i], VERT_RESULT_BFC0));
+      if (need_1)
+	 brw_MOV(p, 
+		 get_vert_result(c, c->vert[i], VERT_RESULT_COL1),
+		 get_vert_result(c, c->vert[i], VERT_RESULT_BFC1));
    }
    brw_ENDIF(p);
    brw_pop_insn_state(p);
@@ -139,20 +129,27 @@ static void do_twoside_color( struct brw_sf_compile *c )
  */
 
 #define VERT_RESULT_COLOR_BITS (BITFIELD64_BIT(VERT_RESULT_COL0) | \
-				BITFIELD64_BIT(VERT_RESULT_COL1))
+                                BITFIELD64_BIT(VERT_RESULT_COL1))
 
 static void copy_colors( struct brw_sf_compile *c,
 		     struct brw_reg dst,
-		     struct brw_reg src)
+                     struct brw_reg src,
+                     int allow_twoside)
 {
    struct brw_compile *p = &c->func;
    GLuint i;
 
    for (i = VERT_RESULT_COL0; i <= VERT_RESULT_COL1; i++) {
-      if (have_attr(c,i))
+      if (have_attr(c,i)) {
 	 brw_MOV(p, 
 		 get_vert_result(c, dst, i),
 		 get_vert_result(c, src, i));
+
+      } else if(allow_twoside && have_attr(c, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0)) {
+	 brw_MOV(p, 
+		 get_vert_result(c, dst, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0),
+		 get_vert_result(c, src, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0));
+      }
    }
 }
 
@@ -167,9 +164,19 @@ static void do_flatshade_triangle( struct brw_sf_compile *c )
    struct brw_compile *p = &c->func;
    struct intel_context *intel = &p->brw->intel;
    struct brw_reg ip = brw_ip_reg();
-   GLuint nr = _mesa_bitcount_64(c->key.attrs & VERT_RESULT_COLOR_BITS);
    GLuint jmpi = 1;
 
+   GLuint nr;
+
+   if (c->key.do_twoside_color) {
+      nr = ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL0) | BITFIELD64_BIT(VERT_RESULT_BFC0))) != 0) +
+         ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL1) | BITFIELD64_BIT(VERT_RESULT_BFC1))) != 0);
+
+   } else {
+      nr = ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL0)) != 0) +
+         ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL1)) != 0);
+   }
+
    if (!nr)
       return;
 
@@ -186,16 +193,16 @@ static void do_flatshade_triangle( struct brw_sf_compile *c )
    brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr*2+1)));
    brw_JMPI(p, ip, ip, c->pv);
 
-   copy_colors(c, c->vert[1], c->vert[0]);
-   copy_colors(c, c->vert[2], c->vert[0]);
+   copy_colors(c, c->vert[1], c->vert[0], c->key.do_twoside_color);
+   copy_colors(c, c->vert[2], c->vert[0], c->key.do_twoside_color);
    brw_JMPI(p, ip, ip, brw_imm_d(jmpi*(nr*4+1)));
 
-   copy_colors(c, c->vert[0], c->vert[1]);
-   copy_colors(c, c->vert[2], c->vert[1]);
+   copy_colors(c, c->vert[0], c->vert[1], c->key.do_twoside_color);
+   copy_colors(c, c->vert[2], c->vert[1], c->key.do_twoside_color);
    brw_JMPI(p, ip, ip, brw_imm_d(jmpi*nr*2));
 
-   copy_colors(c, c->vert[0], c->vert[2]);
-   copy_colors(c, c->vert[1], c->vert[2]);
+   copy_colors(c, c->vert[0], c->vert[2], c->key.do_twoside_color);
+   copy_colors(c, c->vert[1], c->vert[2], c->key.do_twoside_color);
 
    brw_pop_insn_state(p);
 }
@@ -224,10 +231,10 @@ static void do_flatshade_line( struct brw_sf_compile *c )
    
    brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr+1)));
    brw_JMPI(p, ip, ip, c->pv);
-   copy_colors(c, c->vert[1], c->vert[0]);
+   copy_colors(c, c->vert[1], c->vert[0], 0);
 
    brw_JMPI(p, ip, ip, brw_imm_ud(jmpi*nr));
-   copy_colors(c, c->vert[0], c->vert[1]);
+   copy_colors(c, c->vert[0], c->vert[1], 0);
 
    brw_pop_insn_state(p);
 }
@@ -337,13 +344,17 @@ calculate_masks(struct brw_sf_compile *c,
    if (c->key.do_flat_shading)
       persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS) |
                                     BITFIELD64_BIT(VERT_RESULT_COL0) |
-                                    BITFIELD64_BIT(VERT_RESULT_COL1));
+                                    BITFIELD64_BIT(VERT_RESULT_COL1) |
+                                    BITFIELD64_BIT(VERT_RESULT_BFC0) |
+                                    BITFIELD64_BIT(VERT_RESULT_BFC1));
    else
       persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS));
 
    if (c->key.do_flat_shading)
       linear_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_COL0) |
-                                     BITFIELD64_BIT(VERT_RESULT_COL1));
+                                     BITFIELD64_BIT(VERT_RESULT_COL1) |
+                                     BITFIELD64_BIT(VERT_RESULT_BFC0) |
+                                     BITFIELD64_BIT(VERT_RESULT_BFC1));
    else
       linear_mask = c->key.attrs;
 
diff --git a/src/mesa/drivers/dri/i965/brw_wm_pass2.c b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
index 27c0a94..48143f3 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_pass2.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
@@ -96,9 +96,26 @@ static void init_registers( struct brw_wm_compile *c )
 	 if (c->key.vp_outputs_written & BITFIELD64_BIT(j)) {
 	    int fp_index = _mesa_vert_result_to_frag_attrib(j);
 
+            /* Special case: two-sided vertex option, vertex program
+             * only writes to the back color.  Map it to the
+             * associated front color location.
+             */
+            if (j >= VERT_RESULT_BFC0 && j <= VERT_RESULT_BFC1 &&
+                intel->ctx.VertexProgram._TwoSideEnabled &&
+                !(c->key.vp_outputs_written & BITFIELD64_BIT(j - VERT_RESULT_BFC0 + VERT_RESULT_COL0)))
+               fp_index = j - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
 	    nr_interp_regs++;
+
+	    /* The back color slot is skipped when the front color is
+	     * also written to.  In addition, some slots can be
+	     * written in the vertex shader and not read in the
+	     * fragment shader.  So the register number must always be
+	     * incremented, mapped or not.
+	     */
 	    if (fp_index >= 0)
-	       prealloc_reg(c, &c->payload.input_interp[fp_index], i++);
+	       prealloc_reg(c, &c->payload.input_interp[fp_index], i);
+            i++;
 	 }
       }
       assert(nr_interp_regs >= 1);
-- 
1.7.10.rc3.1.gb306

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/5] intel gen4-5: Compute the interpolation status for every variable in one place.
  2012-06-30 18:50 [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
  2012-06-30 18:50 ` [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE Olivier Galibert
@ 2012-06-30 18:50 ` Olivier Galibert
  2012-07-17  4:24   ` Paul Berry
  2012-06-30 18:50 ` [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf Olivier Galibert
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-06-30 18:50 UTC (permalink / raw)
  To: intel-gfx, mesa-dev; +Cc: Olivier Galibert

The program keys are updated accordingly, but the values are not used
yet.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
---
 src/mesa/drivers/dri/i965/brw_clip.c    |   82 ++++++++++++++++++++++++++++++-
 src/mesa/drivers/dri/i965/brw_clip.h    |    1 +
 src/mesa/drivers/dri/i965/brw_context.h |   59 ++++++++++++++++++++++
 src/mesa/drivers/dri/i965/brw_sf.c      |    3 +-
 src/mesa/drivers/dri/i965/brw_sf.h      |    1 +
 src/mesa/drivers/dri/i965/brw_wm.c      |    4 ++
 src/mesa/drivers/dri/i965/brw_wm.h      |    1 +
 7 files changed, 149 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index d411208..52e8c47 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -47,6 +47,83 @@
 #define FRONT_UNFILLED_BIT  0x1
 #define BACK_UNFILLED_BIT   0x2
 
+/**
+ * Lookup the interpolation mode information for every element in the
+ * vue.
+ */
+static void
+brw_lookup_interpolation(struct brw_context *brw)
+{
+   /* pprog means "previous program", i.e. the last program before the
+    * fragment shader.  It can only be the vertex shader for now, but
+    * it may be a geometry shader in the future.
+    */
+   const struct gl_program *pprog = &brw->vertex_program->Base;
+   const struct gl_fragment_program *fprog = brw->fragment_program;
+   struct brw_vue_map *vue_map = &brw->vs.prog_data->vue_map;
+
+   /* Default everything to INTERP_QUALIFIER_NONE */
+   brw_clear_interpolation_modes(brw);
+
+   /* If there is no fragment shader, interpolation won't be needed,
+    * so defaulting to none is good.
+    */
+   if (!fprog)
+      return;
+
+   for (int i = 0; i < vue_map->num_slots; i++) {
+      /* First lookup the vert result, skip if there isn't one */
+      int vert_result = vue_map->slot_to_vert_result[i];
+      if (vert_result == BRW_VERT_RESULT_MAX)
+         continue;
+
+      /* HPOS is special, it must be linear
+       */
+      if (vert_result == VERT_RESULT_HPOS) {
+         brw_set_interpolation_mode(brw, i, INTERP_QUALIFIER_NOPERSPECTIVE);
+         continue;
+      }
+
+      /* There is a 1-1 mapping of vert result to frag attrib except
+       * for BackColor and vars
+       */
+      int frag_attrib = vert_result;
+      if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+         frag_attrib = vert_result - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+      else if(vert_result >= VERT_RESULT_VAR0)
+         frag_attrib = vert_result - VERT_RESULT_VAR0 + FRAG_ATTRIB_VAR0;
+
+      /* If the output is not used by the fragment shader, skip it. */
+      if (!(fprog->Base.InputsRead & BITFIELD64_BIT(frag_attrib)))
+         continue;
+
+      /* Lookup the interpolation mode */
+      enum glsl_interp_qualifier interpolation_mode = fprog->InterpQualifier[frag_attrib];
+
+      /* If the mode is not specified, then the default varies.  Color
+       * values follow the shader model, while all the rest uses
+       * smooth.
+       */
+      if (interpolation_mode == INTERP_QUALIFIER_NONE) {
+         if (frag_attrib >= FRAG_ATTRIB_COL0 && frag_attrib <= FRAG_ATTRIB_COL1)
+            interpolation_mode = brw->intel.ctx.Light.ShadeModel == GL_FLAT ? INTERP_QUALIFIER_FLAT : INTERP_QUALIFIER_SMOOTH;
+         else
+            interpolation_mode = INTERP_QUALIFIER_SMOOTH;
+      }
+
+      /* Finally, if we have both a front color and a back color for
+       * the same channel, the selection will be done before
+       * interpolation and the back color copied over the front color
+       * if necessary.  So interpolating the back color is
+       * unnecessary.
+       */
+      if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+         if (pprog->OutputsWritten & BITFIELD64_BIT(vert_result - VERT_RESULT_BFC0 + VERT_RESULT_COL0))
+            interpolation_mode = INTERP_QUALIFIER_NONE;
+
+      brw_set_interpolation_mode(brw, i, interpolation_mode);
+   }
+}
 
 static void compile_clip_prog( struct brw_context *brw,
 			     struct brw_clip_prog_key *key )
@@ -141,6 +218,8 @@ brw_upload_clip_prog(struct brw_context *brw)
 
    memset(&key, 0, sizeof(key));
 
+   brw_lookup_interpolation(brw);
+
    /* Populate the key:
     */
    /* BRW_NEW_REDUCED_PRIMITIVE */
@@ -150,6 +229,7 @@ brw_upload_clip_prog(struct brw_context *brw)
    /* _NEW_LIGHT */
    key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
    key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
+   brw_copy_interpolation_modes(brw, key.interpolation_mode);
    /* _NEW_TRANSFORM (also part of VUE map)*/
    key.nr_userclip = _mesa_bitcount_64(ctx->Transform.ClipPlanesEnabled);
 
@@ -258,7 +338,7 @@ const struct brw_tracked_state brw_clip_prog = {
 		_NEW_TRANSFORM |
 		_NEW_POLYGON | 
 		_NEW_BUFFERS),
-      .brw   = (BRW_NEW_REDUCED_PRIMITIVE),
+      .brw   = (BRW_NEW_FRAGMENT_PROGRAM|BRW_NEW_REDUCED_PRIMITIVE),
       .cache = CACHE_NEW_VS_PROG
    },
    .emit = brw_upload_clip_prog
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h
index 9185651..6f811ae 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -43,6 +43,7 @@
  */
 struct brw_clip_prog_key {
    GLbitfield64 attrs;
+   GLbitfield64 interpolation_mode[2]; /* copy of the main context */
    GLuint primitive:4;
    GLuint nr_userclip:4;
    GLuint do_flat_shading:1;
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 69659c4e..1a417e5 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1041,6 +1041,17 @@ struct brw_context
    uint32_t render_target_format[MESA_FORMAT_COUNT];
    bool format_supported_as_render_target[MESA_FORMAT_COUNT];
 
+   /* Interpolation modes, two bits per vue slot, values equal to
+    * glsl_interp_qualifier.
+    *
+    * Used on gen4/5 by the clipper, sf and wm stages.  Given the
+    * update order, the clipper is resposible to update it.
+    *
+    * Ignored on gen 6+
+    */
+
+   GLbitfield64 interpolation_mode[2]; /* two bits per vue slot */
+
    /* PrimitiveRestart */
    struct {
       bool in_progress;
@@ -1262,6 +1273,54 @@ brw_program_reloc(struct brw_context *brw, uint32_t state_offset,
 
 bool brw_do_cubemap_normalize(struct exec_list *instructions);
 
+/**
+ * Pre-gen6, interpolation mode has to be resolved for code generation
+ * in clipper, sf and wm.  The resolution is done once by the clipper
+ * and stored in the interpolation_mode array and in the keys.  These
+ * functions allow access to that packed array.
+ */
+
+/* INTERP_QUALIFIER_NONE is required to be 0, so initialization is easy */
+static inline
+void brw_clear_interpolation_modes(struct brw_context *brw)
+{
+   brw->interpolation_mode[0] = brw->interpolation_mode[1] = 0;
+}
+
+static inline
+void brw_clear_interpolation_modes_key(GLbitfield64 *interpolation_mode)
+{
+   interpolation_mode[0] = interpolation_mode[1] = 0;
+}
+
+/* Store an interpolation mode */
+static inline
+void brw_set_interpolation_mode(struct brw_context *brw, int slot, enum glsl_interp_qualifier mode)
+{
+   int idx = slot >> 5;
+   int shift = 2*(slot & 31);
+   brw->interpolation_mode[idx] =
+      (brw->interpolation_mode[idx] & ~(3ULL << shift))
+      | (((GLbitfield64)mode) << shift);
+}
+
+/* Retrieve an interpolation mode */
+static inline
+enum glsl_interp_qualifier brw_get_interpolation_mode(struct brw_context *brw, int slot)
+{
+   int idx = slot >> 5;
+   int shift = 2*(slot & 31);
+   return (enum glsl_interp_qualifier)((brw->interpolation_mode[idx] >> shift) & 3);
+}
+
+/* Copy the interpolation mode information to a key */
+static inline
+void brw_copy_interpolation_modes(struct brw_context *brw, GLbitfield64 *dest)
+{
+   dest[0] = brw->interpolation_mode[0];
+   dest[1] = brw->interpolation_mode[1];
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 7867ab5..0cc4fc7 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -194,6 +194,7 @@ brw_upload_sf_prog(struct brw_context *brw)
    key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
    key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||
      ctx->VertexProgram._TwoSideEnabled;
+   brw_copy_interpolation_modes(brw, key.interpolation_mode);
 
    /* _NEW_POLYGON */
    if (key.do_twoside_color) {
@@ -216,7 +217,7 @@ const struct brw_tracked_state brw_sf_prog = {
    .dirty = {
       .mesa  = (_NEW_HINT | _NEW_LIGHT | _NEW_POLYGON | _NEW_POINT |
                 _NEW_TRANSFORM | _NEW_BUFFERS),
-      .brw   = (BRW_NEW_REDUCED_PRIMITIVE),
+      .brw   = (BRW_NEW_FRAGMENT_PROGRAM|BRW_NEW_REDUCED_PRIMITIVE),
       .cache = CACHE_NEW_VS_PROG
    },
    .emit = brw_upload_sf_prog
diff --git a/src/mesa/drivers/dri/i965/brw_sf.h b/src/mesa/drivers/dri/i965/brw_sf.h
index f908fc0..0a8135c 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.h
+++ b/src/mesa/drivers/dri/i965/brw_sf.h
@@ -46,6 +46,7 @@
 
 struct brw_sf_prog_key {
    GLbitfield64 attrs;
+   GLbitfield64 interpolation_mode[2]; /* copy of the main context */
    uint8_t point_sprite_coord_replace;
    GLuint primitive:2;
    GLuint do_twoside_color:1;
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
index 4a7225c..c7c1c45 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -504,6 +504,10 @@ static void brw_wm_populate_key( struct brw_context *brw,
 
    /* _NEW_LIGHT */
    key->flat_shade = (ctx->Light.ShadeModel == GL_FLAT);
+   if (intel->gen < 6)
+      brw_copy_interpolation_modes(brw, key->interpolation_mode);
+   else
+      brw_clear_interpolation_modes_key(key->interpolation_mode);
 
    /* _NEW_FRAG_CLAMP | _NEW_BUFFERS */
    key->clamp_fragment_color = ctx->Color._ClampFragmentColor;
diff --git a/src/mesa/drivers/dri/i965/brw_wm.h b/src/mesa/drivers/dri/i965/brw_wm.h
index 2cde2a0..53b83f8 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.h
+++ b/src/mesa/drivers/dri/i965/brw_wm.h
@@ -60,6 +60,7 @@
 #define AA_ALWAYS    2
 
 struct brw_wm_prog_key {
+   GLbitfield64 interpolation_mode[2]; /* copy of the main context for gen4-5, 0 for gen6+ */
    uint8_t iz_lookup;
    GLuint stats_wm:1;
    GLuint flat_shade:1;
-- 
1.7.10.rc3.1.gb306

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.
  2012-06-30 18:50 [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
  2012-06-30 18:50 ` [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE Olivier Galibert
  2012-06-30 18:50 ` [PATCH 2/5] intel gen4-5: Compute the interpolation status for every variable in one place Olivier Galibert
@ 2012-06-30 18:50 ` Olivier Galibert
  2012-07-17 12:50   ` [Mesa-dev] " Paul Berry
  2012-06-30 18:50 ` [PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the clipper Olivier Galibert
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-06-30 18:50 UTC (permalink / raw)
  To: intel-gfx, mesa-dev; +Cc: Olivier Galibert

This patch also correct a couple of problems with noperspective
interpolation.

At that point all the glsl 1.1/1.3 interpolation tests that do not
clip pass (the -none ones).

The fs code does not use the pre-resolved interpolation modes in order
not to mess with gen6+.  Sharing the resolution would require putting
brw_wm_prog before brw_clip_prog and brw_sf_prog.  This may be a good
thing, but it could have unexpected consequences, so it's better be
done independently in any case.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
---
 src/mesa/drivers/dri/i965/brw_fs.cpp         |    2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |    5 +
 src/mesa/drivers/dri/i965/brw_sf.c           |    9 +-
 src/mesa/drivers/dri/i965/brw_sf.h           |    2 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c      |  164 +++++++++++++-------------
 5 files changed, 95 insertions(+), 87 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 710f2ff..b142f2b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -506,7 +506,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
 		  struct brw_reg interp = interp_reg(location, k);
                   emit_linterp(attr, fs_reg(interp), interpolation_mode,
                                ir->centroid);
-		  if (intel->gen < 6) {
+		  if (intel->gen < 6 && interpolation_mode == INTERP_QUALIFIER_SMOOTH) {
 		     emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w);
 		  }
 	       }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9bd1e67..ab83a95 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1924,6 +1924,11 @@ fs_visitor::emit_interpolation_setup_gen4()
    emit(BRW_OPCODE_ADD, this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
 	this->pixel_y, fs_reg(negate(brw_vec1_grf(1, 1))));
 
+   this->delta_x[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+     this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+   this->delta_y[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+     this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+
    this->current_annotation = "compute pos.w and 1/pos.w";
    /* Compute wpos.w.  It's always in our setup, since it's needed to
     * interpolate the other attributes.
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 0cc4fc7..85f5f51 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -139,6 +139,7 @@ brw_upload_sf_prog(struct brw_context *brw)
    struct brw_sf_prog_key key;
    /* _NEW_BUFFERS */
    bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
+   int i;
 
    memset(&key, 0, sizeof(key));
 
@@ -191,7 +192,13 @@ brw_upload_sf_prog(struct brw_context *brw)
       key.sprite_origin_lower_left = true;
 
    /* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
+         key.has_flat_shading = 1;
+         break;
+      }
+   }
    key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||
      ctx->VertexProgram._TwoSideEnabled;
    brw_copy_interpolation_modes(brw, key.interpolation_mode);
diff --git a/src/mesa/drivers/dri/i965/brw_sf.h b/src/mesa/drivers/dri/i965/brw_sf.h
index 0a8135c..c718072 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.h
+++ b/src/mesa/drivers/dri/i965/brw_sf.h
@@ -50,7 +50,7 @@ struct brw_sf_prog_key {
    uint8_t point_sprite_coord_replace;
    GLuint primitive:2;
    GLuint do_twoside_color:1;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
    GLuint frontface_ccw:1;
    GLuint do_point_sprite:1;
    GLuint do_point_coord:1;
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index 9d8aa38..387685a 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -44,6 +44,17 @@
 
 
 /**
+ * Determine the vue slot corresponding to the given half of the given
+ * register.  half=0 means the first half of a register, half=1 means the
+ * second half.
+ */
+static inline int vert_reg_to_vue_slot(struct brw_sf_compile *c, GLuint reg,
+                                       int half)
+{
+   return (reg + c->urb_entry_read_offset) * 2 + half;
+}
+
+/**
  * Determine the vert_result corresponding to the given half of the given
  * register.  half=0 means the first half of a register, half=1 means the
  * second half.
@@ -51,11 +62,24 @@
 static inline int vert_reg_to_vert_result(struct brw_sf_compile *c, GLuint reg,
                                           int half)
 {
-   int vue_slot = (reg + c->urb_entry_read_offset) * 2 + half;
+   int vue_slot = vert_reg_to_vue_slot(c, reg, half);
    return c->vue_map.slot_to_vert_result[vue_slot];
 }
 
 /**
+ * Determine the register corresponding to the given vue slot.
+ */
+static struct brw_reg get_vue_slot(struct brw_sf_compile *c,
+                                   struct brw_reg vert,
+                                   int vue_slot)
+{
+   GLuint off = vue_slot / 2 - c->urb_entry_read_offset;
+   GLuint sub = vue_slot % 2;
+
+   return brw_vec4_grf(vert.nr + off, sub * 4);
+}
+
+/**
  * Determine the register corresponding to the given vert_result.
  */
 static struct brw_reg get_vert_result(struct brw_sf_compile *c,
@@ -64,10 +88,7 @@ static struct brw_reg get_vert_result(struct brw_sf_compile *c,
 {
    int vue_slot = c->vue_map.vert_result_to_slot[vert_result];
    assert (vue_slot >= c->urb_entry_read_offset);
-   GLuint off = vue_slot / 2 - c->urb_entry_read_offset;
-   GLuint sub = vue_slot % 2;
-
-   return brw_vec4_grf(vert.nr + off, sub * 4);
+   return get_vue_slot(c, vert, vue_slot);
 }
 
 static bool
@@ -128,31 +149,37 @@ static void do_twoside_color( struct brw_sf_compile *c )
  * Flat shading
  */
 
-#define VERT_RESULT_COLOR_BITS (BITFIELD64_BIT(VERT_RESULT_COL0) | \
-                                BITFIELD64_BIT(VERT_RESULT_COL1))
-
-static void copy_colors( struct brw_sf_compile *c,
-		     struct brw_reg dst,
-                     struct brw_reg src,
-                     int allow_twoside)
+static void copy_flatshaded_attributes( struct brw_sf_compile *c,
+                                        struct brw_reg dst,
+                                        struct brw_reg src)
 {
    struct brw_compile *p = &c->func;
+   struct brw_context *brw = p->brw;
    GLuint i;
 
-   for (i = VERT_RESULT_COL0; i <= VERT_RESULT_COL1; i++) {
-      if (have_attr(c,i)) {
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
 	 brw_MOV(p, 
-		 get_vert_result(c, dst, i),
-		 get_vert_result(c, src, i));
+		 get_vue_slot(c, dst, i),
+		 get_vue_slot(c, src, i));
 
-      } else if(allow_twoside && have_attr(c, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0)) {
-	 brw_MOV(p, 
-		 get_vert_result(c, dst, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0),
-		 get_vert_result(c, src, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0));
       }
    }
 }
 
+static GLuint count_flatshaded_attributes(struct brw_sf_compile *c )
+{
+   struct brw_compile *p = &c->func;
+   struct brw_context *brw = p->brw;
+   GLuint count = 0;
+   GLuint i;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT)
+         count++;
+   }
+   return count;
+}
+
 
 
 /* Need to use a computed jump to copy flatshaded attributes as the
@@ -168,18 +195,6 @@ static void do_flatshade_triangle( struct brw_sf_compile *c )
 
    GLuint nr;
 
-   if (c->key.do_twoside_color) {
-      nr = ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL0) | BITFIELD64_BIT(VERT_RESULT_BFC0))) != 0) +
-         ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL1) | BITFIELD64_BIT(VERT_RESULT_BFC1))) != 0);
-
-   } else {
-      nr = ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL0)) != 0) +
-         ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL1)) != 0);
-   }
-
-   if (!nr)
-      return;
-
    /* Already done in clip program:
     */
    if (c->key.primitive == SF_UNFILLED_TRIS)
@@ -188,21 +203,23 @@ static void do_flatshade_triangle( struct brw_sf_compile *c )
    if (intel->gen == 5)
        jmpi = 2;
 
+   nr = count_flatshaded_attributes(c);
+
    brw_push_insn_state(p);
    
    brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr*2+1)));
    brw_JMPI(p, ip, ip, c->pv);
 
-   copy_colors(c, c->vert[1], c->vert[0], c->key.do_twoside_color);
-   copy_colors(c, c->vert[2], c->vert[0], c->key.do_twoside_color);
+   copy_flatshaded_attributes(c, c->vert[1], c->vert[0]);
+   copy_flatshaded_attributes(c, c->vert[2], c->vert[0]);
    brw_JMPI(p, ip, ip, brw_imm_d(jmpi*(nr*4+1)));
 
-   copy_colors(c, c->vert[0], c->vert[1], c->key.do_twoside_color);
-   copy_colors(c, c->vert[2], c->vert[1], c->key.do_twoside_color);
+   copy_flatshaded_attributes(c, c->vert[0], c->vert[1]);
+   copy_flatshaded_attributes(c, c->vert[2], c->vert[1]);
    brw_JMPI(p, ip, ip, brw_imm_d(jmpi*nr*2));
 
-   copy_colors(c, c->vert[0], c->vert[2], c->key.do_twoside_color);
-   copy_colors(c, c->vert[1], c->vert[2], c->key.do_twoside_color);
+   copy_flatshaded_attributes(c, c->vert[0], c->vert[2]);
+   copy_flatshaded_attributes(c, c->vert[1], c->vert[2]);
 
    brw_pop_insn_state(p);
 }
@@ -213,12 +230,9 @@ static void do_flatshade_line( struct brw_sf_compile *c )
    struct brw_compile *p = &c->func;
    struct intel_context *intel = &p->brw->intel;
    struct brw_reg ip = brw_ip_reg();
-   GLuint nr = _mesa_bitcount_64(c->key.attrs & VERT_RESULT_COLOR_BITS);
+   GLuint nr;
    GLuint jmpi = 1;
 
-   if (!nr)
-      return;
-
    /* Already done in clip program: 
     */
    if (c->key.primitive == SF_UNFILLED_TRIS)
@@ -227,14 +241,16 @@ static void do_flatshade_line( struct brw_sf_compile *c )
    if (intel->gen == 5)
        jmpi = 2;
 
+   nr = count_flatshaded_attributes(c);
+
    brw_push_insn_state(p);
    
    brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr+1)));
    brw_JMPI(p, ip, ip, c->pv);
-   copy_colors(c, c->vert[1], c->vert[0], 0);
+   copy_flatshaded_attributes(c, c->vert[1], c->vert[0]);
 
    brw_JMPI(p, ip, ip, brw_imm_ud(jmpi*nr));
-   copy_colors(c, c->vert[0], c->vert[1], 0);
+   copy_flatshaded_attributes(c, c->vert[0], c->vert[1]);
 
    brw_pop_insn_state(p);
 }
@@ -332,40 +348,25 @@ static void invert_det( struct brw_sf_compile *c)
 
 static bool
 calculate_masks(struct brw_sf_compile *c,
-	        GLuint reg,
-		GLushort *pc,
-		GLushort *pc_persp,
-		GLushort *pc_linear)
+                GLuint reg,
+                GLushort *pc,
+                GLushort *pc_persp,
+                GLushort *pc_linear)
 {
+   struct brw_compile *p = &c->func;
+   struct brw_context *brw = p->brw;
+   enum glsl_interp_qualifier interp;
    bool is_last_attr = (reg == c->nr_setup_regs - 1);
-   GLbitfield64 persp_mask;
-   GLbitfield64 linear_mask;
-
-   if (c->key.do_flat_shading)
-      persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS) |
-                                    BITFIELD64_BIT(VERT_RESULT_COL0) |
-                                    BITFIELD64_BIT(VERT_RESULT_COL1) |
-                                    BITFIELD64_BIT(VERT_RESULT_BFC0) |
-                                    BITFIELD64_BIT(VERT_RESULT_BFC1));
-   else
-      persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS));
-
-   if (c->key.do_flat_shading)
-      linear_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_COL0) |
-                                     BITFIELD64_BIT(VERT_RESULT_COL1) |
-                                     BITFIELD64_BIT(VERT_RESULT_BFC0) |
-                                     BITFIELD64_BIT(VERT_RESULT_BFC1));
-   else
-      linear_mask = c->key.attrs;
 
    *pc_persp = 0;
    *pc_linear = 0;
    *pc = 0xf;
-      
-   if (persp_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg, 0)))
-      *pc_persp = 0xf;
 
-   if (linear_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg, 0)))
+   interp = brw_get_interpolation_mode(brw, vert_reg_to_vue_slot(c, reg, 0));
+   if (interp == INTERP_QUALIFIER_SMOOTH) {
+      *pc_linear = 0xf;
+      *pc_persp = 0xf;
+   } else if(interp == INTERP_QUALIFIER_NOPERSPECTIVE)
       *pc_linear = 0xf;
 
    /* Maybe only processs one attribute on the final round:
@@ -373,11 +374,12 @@ calculate_masks(struct brw_sf_compile *c,
    if (vert_reg_to_vert_result(c, reg, 1) != BRW_VERT_RESULT_MAX) {
       *pc |= 0xf0;
 
-      if (persp_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg, 1)))
-	 *pc_persp |= 0xf0;
-
-      if (linear_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg, 1)))
-	 *pc_linear |= 0xf0;
+      interp = brw_get_interpolation_mode(brw, vert_reg_to_vue_slot(c, reg, 1));
+      if (interp == INTERP_QUALIFIER_SMOOTH) {
+         *pc_linear |= 0xf0;
+         *pc_persp |= 0xf0;
+      } else if(interp == INTERP_QUALIFIER_NOPERSPECTIVE)
+         *pc_linear |= 0xf0;
    }
 
    return is_last_attr;
@@ -430,7 +432,7 @@ void brw_emit_tri_setup(struct brw_sf_compile *c, bool allocate)
    if (c->key.do_twoside_color) 
       do_twoside_color(c);
 
-   if (c->key.do_flat_shading)
+   if (c->key.has_flat_shading)
       do_flatshade_triangle(c);
       
    
@@ -443,7 +445,6 @@ void brw_emit_tri_setup(struct brw_sf_compile *c, bool allocate)
       struct brw_reg a2 = offset(c->vert[2], i);
       GLushort pc, pc_persp, pc_linear;
       bool last = calculate_masks(c, i, &pc, &pc_persp, &pc_linear);
-
       if (pc_persp)
       {
 	 brw_set_predicate_control_flag_value(p, pc_persp);
@@ -507,7 +508,6 @@ void brw_emit_line_setup(struct brw_sf_compile *c, bool allocate)
    struct brw_compile *p = &c->func;
    GLuint i;
 
-
    c->nr_verts = 2;
 
    if (allocate)
@@ -516,7 +516,7 @@ void brw_emit_line_setup(struct brw_sf_compile *c, bool allocate)
    invert_det(c);
    copy_z_inv_w(c);
 
-   if (c->key.do_flat_shading)
+   if (c->key.has_flat_shading)
       do_flatshade_line(c);
 
    for (i = 0; i < c->nr_setup_regs; i++)
@@ -799,7 +799,3 @@ void brw_emit_anyprim_setup( struct brw_sf_compile *c )
 
    brw_emit_point_setup( c, false );
 }
-
-
-
-
-- 
1.7.10.rc3.1.gb306

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the clipper.
  2012-06-30 18:50 [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
                   ` (2 preceding siblings ...)
  2012-06-30 18:50 ` [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf Olivier Galibert
@ 2012-06-30 18:50 ` Olivier Galibert
  2012-07-17 12:55   ` [Mesa-dev] " Paul Berry
  2012-06-30 18:50 ` [PATCH 5/5] intel gen4-5: Make noperspective clipping work Olivier Galibert
  2012-07-13  7:20 ` [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
  5 siblings, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-06-30 18:50 UTC (permalink / raw)
  To: intel-gfx, mesa-dev; +Cc: Olivier Galibert

At that point, all interpolation piglit tests involving fixed clipping
work as long as there's no noperspective.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
---
 src/mesa/drivers/dri/i965/brw_clip.c          |   10 ++++-
 src/mesa/drivers/dri/i965/brw_clip.h          |    6 +--
 src/mesa/drivers/dri/i965/brw_clip_line.c     |    6 +--
 src/mesa/drivers/dri/i965/brw_clip_tri.c      |   20 ++++-----
 src/mesa/drivers/dri/i965/brw_clip_unfilled.c |    2 +-
 src/mesa/drivers/dri/i965/brw_clip_util.c     |   56 +++++++------------------
 6 files changed, 41 insertions(+), 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index 52e8c47..952eb4a 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -215,7 +215,7 @@ brw_upload_clip_prog(struct brw_context *brw)
    struct intel_context *intel = &brw->intel;
    struct gl_context *ctx = &intel->ctx;
    struct brw_clip_prog_key key;
-
+   int i;
    memset(&key, 0, sizeof(key));
 
    brw_lookup_interpolation(brw);
@@ -227,7 +227,13 @@ brw_upload_clip_prog(struct brw_context *brw)
    /* CACHE_NEW_VS_PROG (also part of VUE map) */
    key.attrs = brw->vs.prog_data->outputs_written;
    /* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
+         key.has_flat_shading = 1;
+         break;
+      }
+   }
    key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
    brw_copy_interpolation_modes(brw, key.interpolation_mode);
    /* _NEW_TRANSFORM (also part of VUE map)*/
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h
index 6f811ae..0ea0394 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -46,7 +46,7 @@ struct brw_clip_prog_key {
    GLbitfield64 interpolation_mode[2]; /* copy of the main context */
    GLuint primitive:4;
    GLuint nr_userclip:4;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
    GLuint pv_first:1;
    GLuint do_unfilled:1;
    GLuint fill_cw:2;		/* includes cull information */
@@ -166,8 +166,8 @@ void brw_clip_kill_thread(struct brw_clip_compile *c);
 struct brw_reg brw_clip_plane_stride( struct brw_clip_compile *c );
 struct brw_reg brw_clip_plane0_address( struct brw_clip_compile *c );
 
-void brw_clip_copy_colors( struct brw_clip_compile *c,
-			   GLuint to, GLuint from );
+void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c,
+                                          GLuint to, GLuint from );
 
 void brw_clip_init_clipmask( struct brw_clip_compile *c );
 
diff --git a/src/mesa/drivers/dri/i965/brw_clip_line.c b/src/mesa/drivers/dri/i965/brw_clip_line.c
index 6cf2bd2..729d8c0 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_line.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_line.c
@@ -271,11 +271,11 @@ void brw_emit_line_clip( struct brw_clip_compile *c )
    brw_clip_line_alloc_regs(c);
    brw_clip_init_ff_sync(c);
 
-   if (c->key.do_flat_shading) {
+   if (c->key.has_flat_shading) {
       if (c->key.pv_first)
-         brw_clip_copy_colors(c, 1, 0);
+         brw_clip_copy_flatshaded_attributes(c, 1, 0);
       else
-         brw_clip_copy_colors(c, 0, 1);
+         brw_clip_copy_flatshaded_attributes(c, 0, 1);
    }
                 
    clip_and_emit_line(c);
diff --git a/src/mesa/drivers/dri/i965/brw_clip_tri.c b/src/mesa/drivers/dri/i965/brw_clip_tri.c
index a29f8e0..71225f5 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_tri.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_tri.c
@@ -187,8 +187,8 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 
    brw_IF(p, BRW_EXECUTE_1);
    {
-      brw_clip_copy_colors(c, 1, 0);
-      brw_clip_copy_colors(c, 2, 0);
+      brw_clip_copy_flatshaded_attributes(c, 1, 0);
+      brw_clip_copy_flatshaded_attributes(c, 2, 0);
    }
    brw_ELSE(p);
    {
@@ -200,19 +200,19 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 		 brw_imm_ud(_3DPRIM_TRIFAN));
 	 brw_IF(p, BRW_EXECUTE_1);
 	 {
-	    brw_clip_copy_colors(c, 0, 1);
-	    brw_clip_copy_colors(c, 2, 1);
+	    brw_clip_copy_flatshaded_attributes(c, 0, 1);
+	    brw_clip_copy_flatshaded_attributes(c, 2, 1);
 	 }
 	 brw_ELSE(p);
 	 {
-	    brw_clip_copy_colors(c, 1, 0);
-	    brw_clip_copy_colors(c, 2, 0);
+	    brw_clip_copy_flatshaded_attributes(c, 1, 0);
+	    brw_clip_copy_flatshaded_attributes(c, 2, 0);
 	 }
 	 brw_ENDIF(p);
       }
       else {
-         brw_clip_copy_colors(c, 0, 2);
-         brw_clip_copy_colors(c, 1, 2);
+         brw_clip_copy_flatshaded_attributes(c, 0, 2);
+         brw_clip_copy_flatshaded_attributes(c, 1, 2);
       }
    }
    brw_ENDIF(p);
@@ -606,8 +606,8 @@ void brw_emit_tri_clip( struct brw_clip_compile *c )
     * flatshading, need to apply the flatshade here because we don't
     * respect the PV when converting to trifan for emit:
     */
-   if (c->key.do_flat_shading) 
-      brw_clip_tri_flat_shade(c); 
+   if (c->key.has_flat_shading) 
+      brw_clip_tri_flat_shade(c);
       
    if ((c->key.clip_mode == BRW_CLIPMODE_NORMAL) ||
        (c->key.clip_mode == BRW_CLIPMODE_KERNEL_CLIP))
diff --git a/src/mesa/drivers/dri/i965/brw_clip_unfilled.c b/src/mesa/drivers/dri/i965/brw_clip_unfilled.c
index 03c7d42..96f9a84 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_unfilled.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_unfilled.c
@@ -502,7 +502,7 @@ void brw_emit_unfilled_clip( struct brw_clip_compile *c )
 
    /* Need to do this whether we clip or not:
     */
-   if (c->key.do_flat_shading)
+   if (c->key.has_flat_shading)
       brw_clip_tri_flat_shade(c);
    
    brw_clip_init_clipmask(c);
diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c b/src/mesa/drivers/dri/i965/brw_clip_util.c
index bf8cc3a..7b0205a 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_util.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
@@ -165,7 +165,7 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
                  vert_result == VERT_RESULT_CLIP_DIST1) {
 	 /* PSIZ doesn't need interpolation because it isn't used by the
           * fragment shader.  CLIP_DIST0 and CLIP_DIST1 don't need
-          * intepolation because on pre-GEN6, these are just placeholder VUE
+          * interpolation because on pre-GEN6, these are just placeholder VUE
           * slots that don't perform any action.
           */
       } else if (vert_result < VERT_RESULT_MAX) {
@@ -291,49 +291,25 @@ struct brw_reg brw_clip_plane_stride( struct brw_clip_compile *c )
 }
 
 
-/* If flatshading, distribute color from provoking vertex prior to
+/* Distribute flatshaded attributes from provoking vertex prior to
  * clipping.
  */
-void brw_clip_copy_colors( struct brw_clip_compile *c,
-			   GLuint to, GLuint from )
+void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c,
+                                          GLuint to, GLuint from )
 {
    struct brw_compile *p = &c->func;
-
-   if (brw_clip_have_vert_result(c, VERT_RESULT_COL0))
-      brw_MOV(p, 
-	      byte_offset(c->reg.vertex[to],
-                          brw_vert_result_to_offset(&c->vue_map,
-                                                    VERT_RESULT_COL0)),
-	      byte_offset(c->reg.vertex[from],
-                          brw_vert_result_to_offset(&c->vue_map,
-                                                    VERT_RESULT_COL0)));
-
-   if (brw_clip_have_vert_result(c, VERT_RESULT_COL1))
-      brw_MOV(p, 
-	      byte_offset(c->reg.vertex[to],
-                          brw_vert_result_to_offset(&c->vue_map,
-                                                    VERT_RESULT_COL1)),
-	      byte_offset(c->reg.vertex[from],
-                          brw_vert_result_to_offset(&c->vue_map,
-                                                    VERT_RESULT_COL1)));
-
-   if (brw_clip_have_vert_result(c, VERT_RESULT_BFC0))
-      brw_MOV(p, 
-	      byte_offset(c->reg.vertex[to],
-                          brw_vert_result_to_offset(&c->vue_map,
-                                                    VERT_RESULT_BFC0)),
-	      byte_offset(c->reg.vertex[from],
-                          brw_vert_result_to_offset(&c->vue_map,
-                                                    VERT_RESULT_BFC0)));
-
-   if (brw_clip_have_vert_result(c, VERT_RESULT_BFC1))
-      brw_MOV(p, 
-	      byte_offset(c->reg.vertex[to],
-                          brw_vert_result_to_offset(&c->vue_map,
-                                                    VERT_RESULT_BFC1)),
-	      byte_offset(c->reg.vertex[from],
-                          brw_vert_result_to_offset(&c->vue_map,
-                                                    VERT_RESULT_BFC1)));
+   struct brw_context *brw = p->brw;
+   GLuint i;
+
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
+	 brw_MOV(p, 
+                 byte_offset(c->reg.vertex[to],
+                             brw_vue_slot_to_offset(i)),
+                 byte_offset(c->reg.vertex[from],
+                             brw_vue_slot_to_offset(i)));
+      }
+   }
 }
 
 
-- 
1.7.10.rc3.1.gb306

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 5/5] intel gen4-5: Make noperspective clipping work.
  2012-06-30 18:50 [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
                   ` (3 preceding siblings ...)
  2012-06-30 18:50 ` [PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the clipper Olivier Galibert
@ 2012-06-30 18:50 ` Olivier Galibert
  2012-07-17 14:07   ` [Mesa-dev] " Paul Berry
  2012-07-13  7:20 ` [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
  5 siblings, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-06-30 18:50 UTC (permalink / raw)
  To: intel-gfx, mesa-dev; +Cc: Olivier Galibert

At this point all interpolation tests with fixed clipping work.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
---
 src/mesa/drivers/dri/i965/brw_clip.c      |    9 ++
 src/mesa/drivers/dri/i965/brw_clip.h      |    1 +
 src/mesa/drivers/dri/i965/brw_clip_util.c |  133 ++++++++++++++++++++++++++---
 3 files changed, 132 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index 952eb4a..6bfdf24 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -234,6 +234,15 @@ brw_upload_clip_prog(struct brw_context *brw)
          break;
       }
    }
+   key.has_noperspective_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_NOPERSPECTIVE &&
+          brw->vs.prog_data->vue_map.slot_to_vert_result[i] != VERT_RESULT_HPOS) {
+         key.has_noperspective_shading = 1;
+         break;
+      }
+   }
+
    key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
    brw_copy_interpolation_modes(brw, key.interpolation_mode);
    /* _NEW_TRANSFORM (also part of VUE map)*/
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h
index 0ea0394..2a7245a 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -47,6 +47,7 @@ struct brw_clip_prog_key {
    GLuint primitive:4;
    GLuint nr_userclip:4;
    GLuint has_flat_shading:1;
+   GLuint has_noperspective_shading:1;
    GLuint pv_first:1;
    GLuint do_unfilled:1;
    GLuint fill_cw:2;		/* includes cull information */
diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c b/src/mesa/drivers/dri/i965/brw_clip_util.c
index 7b0205a..5bdcef8 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_util.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
@@ -129,6 +129,8 @@ static void brw_clip_project_vertex( struct brw_clip_compile *c,
 
 /* Interpolate between two vertices and put the result into a0.0.  
  * Increment a0.0 accordingly.
+ *
+ * Beware that dest_ptr can be equal to v0_ptr.
  */
 void brw_clip_interp_vertex( struct brw_clip_compile *c,
 			     struct brw_indirect dest_ptr,
@@ -138,8 +140,9 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 			     bool force_edgeflag)
 {
    struct brw_compile *p = &c->func;
-   struct brw_reg tmp = get_tmp(c);
-   GLuint slot;
+   struct brw_context *brw = p->brw;
+   struct brw_reg tmp, t_nopersp, v0_ndc_copy;
+   GLuint slot, delta;
 
    /* Just copy the vertex header:
     */
@@ -148,13 +151,119 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
     * back on Ironlake, so needn't change it
     */
    brw_copy_indirect_to_indirect(p, dest_ptr, v0_ptr, 1);
-      
-   /* Iterate over each attribute (could be done in pairs?)
+
+   /*
+    * First handle the 3D and NDC positioning, in case we need
+    * noperspective interpolation.  Doing it early has no performance
+    * impact in any case.
+    */
+
+   /* Start by picking up the v0 NDC coordinates, because that vertex
+    * may be shared with the destination.
+    */
+   if (c->key.has_noperspective_shading) {
+      v0_ndc_copy = get_tmp(c);
+      brw_MOV(p, v0_ndc_copy, deref_4f(v0_ptr,
+                                       brw_vert_result_to_offset(&c->vue_map,
+                                                                 BRW_VERT_RESULT_NDC)));
+   }      
+
+   /*
+    * Compute the new 3D position
+    */
+
+   delta = brw_vert_result_to_offset(&c->vue_map, VERT_RESULT_HPOS);
+   tmp = get_tmp(c);
+   brw_MUL(p, 
+           vec4(brw_null_reg()),
+           deref_4f(v1_ptr, delta),
+           t0);
+
+   brw_MAC(p, 
+           tmp,	      
+           negate(deref_4f(v0_ptr, delta)),
+           t0); 
+	      
+   brw_ADD(p,
+           deref_4f(dest_ptr, delta), 
+           deref_4f(v0_ptr, delta),
+           tmp);
+   release_tmp(c, tmp);
+
+   /* Then recreate the projected (NDC) coordinate in the new vertex
+    * header
     */
+   brw_clip_project_vertex(c, dest_ptr);
+
+   /*
+    * If we have noperspective attributes, we now need to compute the
+    * screen-space t.
+    */
+   if (c->key.has_noperspective_shading) {
+      delta = brw_vert_result_to_offset(&c->vue_map, BRW_VERT_RESULT_NDC);
+      t_nopersp = get_tmp(c);
+      tmp = get_tmp(c);
+
+      /* Build a register with coordinates from the second and new vertices */
+      brw_MOV(p, t_nopersp, deref_4f(v1_ptr, delta));
+      brw_MOV(p, tmp, deref_4f(dest_ptr, delta));
+      brw_set_access_mode(p, BRW_ALIGN_16);
+      brw_MOV(p,
+              brw_writemask(t_nopersp, WRITEMASK_ZW),
+              brw_swizzle(tmp, 0,1,0,1));
+
+      /* Subtract the coordinates of the first vertex */
+      brw_ADD(p, t_nopersp, t_nopersp, negate(brw_swizzle(v0_ndc_copy, 0,1,0,1)));
+
+      /* Add the absolute value of the X and Y deltas so that if the
+       * points aren't in the same place on the screen we get non-zero
+       * values to divide.
+       *
+       * After that we have vert1-vert0 in t_nopersp.x and vertnew-vert0 in t_nopersp.y.
+       */
+      brw_ADD(p,
+              brw_writemask(t_nopersp, WRITEMASK_XY),
+              brw_abs(brw_swizzle(t_nopersp, 0,2,0,0)),
+              brw_abs(brw_swizzle(t_nopersp, 1,3,0,0)));
+      brw_set_access_mode(p, BRW_ALIGN_1);
+
+      /* If the points are in the same place (vert1-vert0 == 0), just
+       * substitute a value that will ensure that we don't divide by
+       * 0.
+       */
+      brw_CMP(p, vec1(brw_null_reg()), BRW_CONDITIONAL_EQ,
+              vec1(t_nopersp),
+              brw_imm_f(1));
+      brw_IF(p, BRW_EXECUTE_1);
+      brw_MOV(p, t_nopersp, brw_imm_vf4(VF_ONE, VF_ZERO, VF_ZERO, VF_ZERO));
+      brw_ENDIF(p);
+
+      /* Now compute t_nopersp = t_nopersp.y/t_nopersp.x and broadcast it */
+      brw_math_invert(p, get_element(t_nopersp, 0), get_element(t_nopersp, 0));
+      brw_MUL(p,
+              vec1(t_nopersp),
+              vec1(t_nopersp),
+              vec1(suboffset(t_nopersp, 1)));
+      brw_set_access_mode(p, BRW_ALIGN_16);
+      brw_MOV(p, t_nopersp, brw_swizzle(t_nopersp, 0,0,0,0));
+      brw_set_access_mode(p, BRW_ALIGN_1);
+
+      release_tmp(c, tmp);
+      release_tmp(c, v0_ndc_copy);
+   }
+
+   /* Now we can iterate over each attribute
+    * (could be done in pairs?)
+    */
+   tmp = get_tmp(c);
    for (slot = 0; slot < c->vue_map.num_slots; slot++) {
       int vert_result = c->vue_map.slot_to_vert_result[slot];
       GLuint delta = brw_vue_slot_to_offset(slot);
 
+      /* HPOS is already handled */
+      if(vert_result == VERT_RESULT_HPOS)
+         continue;
+
       if (vert_result == VERT_RESULT_EDGE) {
 	 if (force_edgeflag) 
 	    brw_MOV(p, deref_4f(dest_ptr, delta), brw_imm_f(1));
@@ -174,15 +283,20 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 	  *
 	  *        New = attr0 + t*attr1 - t*attr0
 	  */
+
+         struct brw_reg t =
+            brw_get_interpolation_mode(brw, slot) == INTERP_QUALIFIER_NOPERSPECTIVE ?
+            t_nopersp : t0;
+
 	 brw_MUL(p, 
 		 vec4(brw_null_reg()),
 		 deref_4f(v1_ptr, delta),
-		 t0);
+		 t);
 
 	 brw_MAC(p, 
 		 tmp,	      
 		 negate(deref_4f(v0_ptr, delta)),
-		 t0); 
+		 t); 
 	      
 	 brw_ADD(p,
 		 deref_4f(dest_ptr, delta), 
@@ -198,11 +312,8 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
    }
 
    release_tmp(c, tmp);
-
-   /* Recreate the projected (NDC) coordinate in the new vertex
-    * header:
-    */
-   brw_clip_project_vertex(c, dest_ptr );
+   if (c->key.has_noperspective_shading)
+      release_tmp(c, t_nopersp);
 }
 
 void brw_clip_emit_vue(struct brw_clip_compile *c, 
-- 
1.7.10.rc3.1.gb306

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes
  2012-06-30 18:50 [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
                   ` (4 preceding siblings ...)
  2012-06-30 18:50 ` [PATCH 5/5] intel gen4-5: Make noperspective clipping work Olivier Galibert
@ 2012-07-13  7:20 ` Olivier Galibert
  2012-07-13 21:45   ` Kenneth Graunke
  5 siblings, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-07-13  7:20 UTC (permalink / raw)
  To: intel-gfx, mesa-dev

On Sat, Jun 30, 2012 at 08:50:10PM +0200, Olivier Galibert wrote:
> This is the first part of the fixes I've done to make my gm45 work
> correctly w.r.t clipping and interpolation.  There's a fair chance
> they work for everything gen 4/5, but I have no way to be sure.

So, not even one comment, nothing?

  OG.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes
  2012-07-13  7:20 ` [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
@ 2012-07-13 21:45   ` Kenneth Graunke
  2012-07-14  9:21     ` Olivier Galibert
  0 siblings, 1 reply; 24+ messages in thread
From: Kenneth Graunke @ 2012-07-13 21:45 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx

On 07/13/2012 12:20 AM, Olivier Galibert wrote:
> On Sat, Jun 30, 2012 at 08:50:10PM +0200, Olivier Galibert wrote:
>> This is the first part of the fixes I've done to make my gm45 work
>> correctly w.r.t clipping and interpolation.  There's a fair chance
>> they work for everything gen 4/5, but I have no way to be sure.
> 
> So, not even one comment, nothing?
> 
>   OG.

Sorry...been really busy, and most of us haven't actually spent much if
any time in the clipper shaders.  I'll try and review it within a week.

Despite the lack of response, I am really excited to see that you're
working on this---this is a huge step toward bringing GL 3.x back to
Gen4/5, and we're all really glad to see it happen!

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes
  2012-07-13 21:45   ` Kenneth Graunke
@ 2012-07-14  9:21     ` Olivier Galibert
  2012-07-17  2:33       ` Paul Berry
  0 siblings, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-07-14  9:21 UTC (permalink / raw)
  To: Kenneth Graunke; +Cc: mesa-dev, intel-gfx

On Fri, Jul 13, 2012 at 02:45:10PM -0700, Kenneth Graunke wrote:
> Sorry...been really busy, and most of us haven't actually spent much if
> any time in the clipper shaders.  I'll try and review it within a week.

Ok cool, lack of time is something I completely understand :-)


> Despite the lack of response, I am really excited to see that you're
> working on this---this is a huge step toward bringing GL 3.x back to
> Gen4/5, and we're all really glad to see it happen!

Excellent.  I was starting to wonder if gen4/5 was abandoned (by lack
of resources if anything), nice to see it isn't.

  OG.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes
  2012-07-14  9:21     ` Olivier Galibert
@ 2012-07-17  2:33       ` Paul Berry
  2012-07-17 14:42         ` [Intel-gfx] " Paul Berry
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Berry @ 2012-07-17  2:33 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 1074 bytes --]

On 14 July 2012 02:21, Olivier Galibert <galibert@pobox.com> wrote:

> On Fri, Jul 13, 2012 at 02:45:10PM -0700, Kenneth Graunke wrote:
> > Sorry...been really busy, and most of us haven't actually spent much if
> > any time in the clipper shaders.  I'll try and review it within a week.
>
> Ok cool, lack of time is something I completely understand :-)
>
>
> > Despite the lack of response, I am really excited to see that you're
> > working on this---this is a huge step toward bringing GL 3.x back to
> > Gen4/5, and we're all really glad to see it happen!
>
> Excellent.  I was starting to wonder if gen4/5 was abandoned (by lack
> of resources if anything), nice to see it isn't.
>

I'm starting to look at these patches too, since I'm the one who wrote the
VUE code, and I have some familiarity with the Gen4/5 clipper.  I'll try to
get some feedback to you within the next 24 hours.


>
>   OG.
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>

[-- Attachment #1.2: Type: text/html, Size: 1864 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
  2012-06-30 18:50 ` [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE Olivier Galibert
@ 2012-07-17  3:43   ` Paul Berry
  2012-07-17  8:33     ` [Intel-gfx] " Olivier Galibert
  2012-07-17  8:57     ` [Mesa-dev] " Olivier Galibert
  0 siblings, 2 replies; 24+ messages in thread
From: Paul Berry @ 2012-07-17  3:43 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 14902 bytes --]

On 30 June 2012 11:50, Olivier Galibert <galibert@pobox.com> wrote:

> There was... confusion about which register goes where.  With that
> patch urb_setup is in line with the vue setup, even when these
> annoying backcolor slots are used.  And in addition the stray mov into
> lalaland is avoided when only one of the front/back slots is used and
> the backface is looking at you.  The code instead picks whatever slot
> was written to by the vertex shader.  That makes most of the generated
> piglit tests useless to test the backface selection though.
>

It looks like this patch does three separate things:

1. modify brw_fs.cpp and brw_wm_pass2.c to properly account for the fact
that prior to Gen6, the SF stage of the pipeline doesn't remove backfacing
colors from the URB entry, so the fragment shader compiler needs to skip
over them when interpreting the contents of the thread payload.

2. rework the do_twoside_color() function so that it's easier to follow,
and doesn't emit unnecessary code when there's nothing to do.

3. ensure that if the vertex program writes to gl_BackColor but not
gl_FrontColor, then the gl_BackColor value shows up in the fragment shader,
regardless of whether the polygon is front-facing or back-facing.

Can you split this into three separate patches?  That will make it easier
to troubleshoot in case we find bugs with these patches in the future.

Also, I'm not convinced that #3 is necessary.  Is there something in the
spec that dictates this behaviour?  My reading of the spec is that if the
vertex shader writes to gl_BackColor but not glFrontColor, then the
contents of gl_Color in the fragment shader is undefined.

If we *do* decide that #3 is necessary, then I think a better way to
accomplish it is to handle it in the GLSL vertex shader front-end, by
replacing gl_BackColor with gl_FrontColor in cases where gl_FrontColor is
not written to.  That way our special case code to handle this situation
would be in just one place, rather than in three places (both fragment
shader back-ends, and the SF program).  Also then the fix would apply to
all hardware, not just Intel Gen4-5.

Finally, I couldn't figure out what you meant by "the stray mov into
lalaland".  Can you elaborate on which piece of code used to generate that
stray mov, and why it doesn't anymore?  Thanks.

I have a few additional comments below.


>
> Signed-off-by: Olivier Galibert <galibert@pobox.com>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp     |   18 +++++-
>  src/mesa/drivers/dri/i965/brw_sf.c       |    3 +-
>  src/mesa/drivers/dri/i965/brw_sf_emit.c  |   93
> +++++++++++++++++-------------
>  src/mesa/drivers/dri/i965/brw_wm_pass2.c |   19 +++++-
>  4 files changed, 89 insertions(+), 44 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 6cef08a..710f2ff 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -721,8 +721,24 @@ fs_visitor::calculate_urb_setup()
>          if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
>             int fp_index =
> _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
>
> +            /* Special case: two-sided vertex option, vertex program
> +             * only writes to the back color.  Map it to the
> +             * associated front color location.
> +             */
> +            if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
> +                ctx->VertexProgram._TwoSideEnabled &&
> +                urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
> +               fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
> +
> +           /* The back color slot is skipped when the front color is
> +            * also written to.  In addition, some slots can be
> +            * written in the vertex shader and not read in the
> +            * fragment shader.  So the register number must always be
> +            * incremented, mapped or not.
> +            */
>             if (fp_index >= 0)
> -              urb_setup[fp_index] = urb_next++;
> +              urb_setup[fp_index] = urb_next;
> +           urb_next++;
>          }
>        }
>
> diff --git a/src/mesa/drivers/dri/i965/brw_sf.c
> b/src/mesa/drivers/dri/i965/brw_sf.c
> index 23a874a..7867ab5 100644
> --- a/src/mesa/drivers/dri/i965/brw_sf.c
> +++ b/src/mesa/drivers/dri/i965/brw_sf.c
> @@ -192,7 +192,8 @@ brw_upload_sf_prog(struct brw_context *brw)
>
>     /* _NEW_LIGHT */
>     key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
> -   key.do_twoside_color = (ctx->Light.Enabled &&
> ctx->Light.Model.TwoSide);
> +   key.do_twoside_color = (ctx->Light.Enabled &&
> ctx->Light.Model.TwoSide) ||
> +     ctx->VertexProgram._TwoSideEnabled;
>

You're right that this needs fixing, but I think it should just be

key.do_twoside_color = ctx->VertexProgram._TwoSideEnabled;

My reasoning is: _TwoSideEnabled is derived state.  It is set by
src/mesa/main/state.c's update_twoside() function to either (a)
ctx->VertexProgram.TwoSideEnabled, if there is a vertex program, or (b)
ctx->Light.Enabled && ctx->Light.Model.TwoSide, if there isn't.  Since it
already accounts correctly for both ways the user can enable two-sided
lighting, there's no reason for us to consult ctx->Light.Model.TwoSide in
the driver.

Note: brw_clip.c has some references to Light.Model.TwoSide as well.  I
suspect they are also wrong, but I'm not exactly sure what the correct code
is.  I suppose we should address that in a later patch :)


>
>     /* _NEW_POLYGON */
>     if (key.do_twoside_color) {
> diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c
> b/src/mesa/drivers/dri/i965/brw_sf_emit.c
> index ff6383b..9d8aa38 100644
> --- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
> @@ -79,24 +79,9 @@ have_attr(struct brw_sf_compile *c, GLuint attr)
>  /***********************************************************************
>   * Twoside lighting
>   */
> -static void copy_bfc( struct brw_sf_compile *c,
> -                     struct brw_reg vert )
> -{
> -   struct brw_compile *p = &c->func;
> -   GLuint i;
> -
> -   for (i = 0; i < 2; i++) {
> -      if (have_attr(c, VERT_RESULT_COL0+i) &&
> -         have_attr(c, VERT_RESULT_BFC0+i))
> -        brw_MOV(p,
> -                get_vert_result(c, vert, VERT_RESULT_COL0+i),
> -                get_vert_result(c, vert, VERT_RESULT_BFC0+i));
> -   }
> -}
> -
> -
>  static void do_twoside_color( struct brw_sf_compile *c )
>  {
> +   GLuint i, need_0, need_1;
>     struct brw_compile *p = &c->func;
>     GLuint backface_conditional = c->key.frontface_ccw ? BRW_CONDITIONAL_G
> : BRW_CONDITIONAL_L;
>
> @@ -105,12 +90,14 @@ static void do_twoside_color( struct brw_sf_compile
> *c )
>     if (c->key.primitive == SF_UNFILLED_TRIS)
>        return;
>
> -   /* XXX: What happens if BFC isn't present?  This could only happen
> -    * for user-supplied vertex programs, as t_vp_build.c always does
> -    * the right thing.
> +   /* If the vertex shader provides both front and backface color, do
> +    * the selection.  Otherwise the generated code will pick up
> +    * whichever there is.
>      */
> -   if (!(have_attr(c, VERT_RESULT_COL0) && have_attr(c,
> VERT_RESULT_BFC0)) &&
> -       !(have_attr(c, VERT_RESULT_COL1) && have_attr(c,
> VERT_RESULT_BFC1)))
> +   need_0 = have_attr(c, VERT_RESULT_COL0) && have_attr(c,
> VERT_RESULT_BFC0);
> +   need_1 = have_attr(c, VERT_RESULT_COL1) && have_attr(c,
> VERT_RESULT_BFC1);
> +
> +   if (!need_0 && !need_1)
>        return;
>
>     /* Need to use BRW_EXECUTE_4 and also do an 4-wide compare in order
> @@ -121,12 +108,15 @@ static void do_twoside_color( struct brw_sf_compile
> *c )
>     brw_push_insn_state(p);
>     brw_CMP(p, vec4(brw_null_reg()), backface_conditional, c->det,
> brw_imm_f(0));
>     brw_IF(p, BRW_EXECUTE_4);
> -   {
> -      switch (c->nr_verts) {
> -      case 3: copy_bfc(c, c->vert[2]);
> -      case 2: copy_bfc(c, c->vert[1]);
> -      case 1: copy_bfc(c, c->vert[0]);
> -      }
> +   for (i=0; i<c->nr_verts; i++) {
> +      if (need_0)
> +        brw_MOV(p,
> +                get_vert_result(c, c->vert[i], VERT_RESULT_COL0),
> +                get_vert_result(c, c->vert[i], VERT_RESULT_BFC0));
> +      if (need_1)
> +        brw_MOV(p,
> +                get_vert_result(c, c->vert[i], VERT_RESULT_COL1),
> +                get_vert_result(c, c->vert[i], VERT_RESULT_BFC1));
>

This is much easier to follow as a loop.  Thanks for cleaning this up.


>     }
>     brw_ENDIF(p);
>     brw_pop_insn_state(p);
> @@ -139,20 +129,27 @@ static void do_twoside_color( struct brw_sf_compile
> *c )
>   */
>
>  #define VERT_RESULT_COLOR_BITS (BITFIELD64_BIT(VERT_RESULT_COL0) | \
> -                               BITFIELD64_BIT(VERT_RESULT_COL1))
> +                                BITFIELD64_BIT(VERT_RESULT_COL1))
>
>  static void copy_colors( struct brw_sf_compile *c,
>                      struct brw_reg dst,
> -                    struct brw_reg src)
> +                     struct brw_reg src,
> +                     int allow_twoside)
>  {
>     struct brw_compile *p = &c->func;
>     GLuint i;
>
>     for (i = VERT_RESULT_COL0; i <= VERT_RESULT_COL1; i++) {
> -      if (have_attr(c,i))
> +      if (have_attr(c,i)) {
>          brw_MOV(p,
>                  get_vert_result(c, dst, i),
>                  get_vert_result(c, src, i));
> +
> +      } else if(allow_twoside && have_attr(c, i - VERT_RESULT_COL0 +
> VERT_RESULT_BFC0)) {
> +        brw_MOV(p,
> +                get_vert_result(c, dst, i - VERT_RESULT_COL0 +
> VERT_RESULT_BFC0),
> +                get_vert_result(c, src, i - VERT_RESULT_COL0 +
> VERT_RESULT_BFC0));
> +      }
>
    }
>  }
>
> @@ -167,9 +164,19 @@ static void do_flatshade_triangle( struct
> brw_sf_compile *c )
>     struct brw_compile *p = &c->func;
>     struct intel_context *intel = &p->brw->intel;
>     struct brw_reg ip = brw_ip_reg();
> -   GLuint nr = _mesa_bitcount_64(c->key.attrs & VERT_RESULT_COLOR_BITS);
>     GLuint jmpi = 1;
>
> +   GLuint nr;
> +
> +   if (c->key.do_twoside_color) {
> +      nr = ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL0) |
> BITFIELD64_BIT(VERT_RESULT_BFC0))) != 0) +
> +         ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL1) |
> BITFIELD64_BIT(VERT_RESULT_BFC1))) != 0);
> +
> +   } else {
> +      nr = ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL0)) != 0) +
> +         ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL1)) != 0);
> +   }
> +
>     if (!nr)
>        return;
>
> @@ -186,16 +193,16 @@ static void do_flatshade_triangle( struct
> brw_sf_compile *c )
>     brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr*2+1)));
>     brw_JMPI(p, ip, ip, c->pv);
>
> -   copy_colors(c, c->vert[1], c->vert[0]);
> -   copy_colors(c, c->vert[2], c->vert[0]);
> +   copy_colors(c, c->vert[1], c->vert[0], c->key.do_twoside_color);
> +   copy_colors(c, c->vert[2], c->vert[0], c->key.do_twoside_color);
>     brw_JMPI(p, ip, ip, brw_imm_d(jmpi*(nr*4+1)));
>
> -   copy_colors(c, c->vert[0], c->vert[1]);
> -   copy_colors(c, c->vert[2], c->vert[1]);
> +   copy_colors(c, c->vert[0], c->vert[1], c->key.do_twoside_color);
> +   copy_colors(c, c->vert[2], c->vert[1], c->key.do_twoside_color);
>     brw_JMPI(p, ip, ip, brw_imm_d(jmpi*nr*2));
>
> -   copy_colors(c, c->vert[0], c->vert[2]);
> -   copy_colors(c, c->vert[1], c->vert[2]);
> +   copy_colors(c, c->vert[0], c->vert[2], c->key.do_twoside_color);
> +   copy_colors(c, c->vert[1], c->vert[2], c->key.do_twoside_color);
>
>     brw_pop_insn_state(p);
>  }
> @@ -224,10 +231,10 @@ static void do_flatshade_line( struct brw_sf_compile
> *c )
>
>     brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr+1)));
>     brw_JMPI(p, ip, ip, c->pv);
> -   copy_colors(c, c->vert[1], c->vert[0]);
> +   copy_colors(c, c->vert[1], c->vert[0], 0);
>
>     brw_JMPI(p, ip, ip, brw_imm_ud(jmpi*nr));
> -   copy_colors(c, c->vert[0], c->vert[1]);
> +   copy_colors(c, c->vert[0], c->vert[1], 0);
>
>     brw_pop_insn_state(p);
>  }
> @@ -337,13 +344,17 @@ calculate_masks(struct brw_sf_compile *c,
>     if (c->key.do_flat_shading)
>        persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS) |
>                                      BITFIELD64_BIT(VERT_RESULT_COL0) |
> -                                    BITFIELD64_BIT(VERT_RESULT_COL1));
> +                                    BITFIELD64_BIT(VERT_RESULT_COL1) |
> +                                    BITFIELD64_BIT(VERT_RESULT_BFC0) |
> +                                    BITFIELD64_BIT(VERT_RESULT_BFC1));
>     else
>        persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS));
>
>     if (c->key.do_flat_shading)
>        linear_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_COL0) |
> -                                     BITFIELD64_BIT(VERT_RESULT_COL1));
> +                                     BITFIELD64_BIT(VERT_RESULT_COL1) |
> +                                     BITFIELD64_BIT(VERT_RESULT_BFC0) |
> +                                     BITFIELD64_BIT(VERT_RESULT_BFC1));
>     else
>        linear_mask = c->key.attrs;
>
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_pass2.c
> b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
> index 27c0a94..48143f3 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_pass2.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
> @@ -96,9 +96,26 @@ static void init_registers( struct brw_wm_compile *c )
>          if (c->key.vp_outputs_written & BITFIELD64_BIT(j)) {
>             int fp_index = _mesa_vert_result_to_frag_attrib(j);
>
> +            /* Special case: two-sided vertex option, vertex program
> +             * only writes to the back color.  Map it to the
> +             * associated front color location.
> +             */
> +            if (j >= VERT_RESULT_BFC0 && j <= VERT_RESULT_BFC1 &&
> +                intel->ctx.VertexProgram._TwoSideEnabled &&
> +                !(c->key.vp_outputs_written & BITFIELD64_BIT(j -
> VERT_RESULT_BFC0 + VERT_RESULT_COL0)))
> +               fp_index = j - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
> +
>             nr_interp_regs++;
> +
> +           /* The back color slot is skipped when the front color is
> +            * also written to.  In addition, some slots can be
> +            * written in the vertex shader and not read in the
> +            * fragment shader.  So the register number must always be
> +            * incremented, mapped or not.
> +            */
>             if (fp_index >= 0)
> -              prealloc_reg(c, &c->payload.input_interp[fp_index], i++);
> +              prealloc_reg(c, &c->payload.input_interp[fp_index], i);
> +            i++;
>          }
>        }
>        assert(nr_interp_regs >= 1);
> --
> 1.7.10.rc3.1.gb306
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

[-- Attachment #1.2: Type: text/html, Size: 17486 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/5] intel gen4-5: Compute the interpolation status for every variable in one place.
  2012-06-30 18:50 ` [PATCH 2/5] intel gen4-5: Compute the interpolation status for every variable in one place Olivier Galibert
@ 2012-07-17  4:24   ` Paul Berry
  0 siblings, 0 replies; 24+ messages in thread
From: Paul Berry @ 2012-07-17  4:24 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 12742 bytes --]

On 30 June 2012 11:50, Olivier Galibert <galibert@pobox.com> wrote:

> The program keys are updated accordingly, but the values are not used
> yet.
>

> Signed-off-by: Olivier Galibert <galibert@pobox.com>
> ---
>  src/mesa/drivers/dri/i965/brw_clip.c    |   82
> ++++++++++++++++++++++++++++++-
>  src/mesa/drivers/dri/i965/brw_clip.h    |    1 +
>  src/mesa/drivers/dri/i965/brw_context.h |   59 ++++++++++++++++++++++
>  src/mesa/drivers/dri/i965/brw_sf.c      |    3 +-
>  src/mesa/drivers/dri/i965/brw_sf.h      |    1 +
>  src/mesa/drivers/dri/i965/brw_wm.c      |    4 ++
>  src/mesa/drivers/dri/i965/brw_wm.h      |    1 +
>  7 files changed, 149 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_clip.c
> b/src/mesa/drivers/dri/i965/brw_clip.c
> index d411208..52e8c47 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip.c
> +++ b/src/mesa/drivers/dri/i965/brw_clip.c
> @@ -47,6 +47,83 @@
>  #define FRONT_UNFILLED_BIT  0x1
>  #define BACK_UNFILLED_BIT   0x2
>
> +/**
> + * Lookup the interpolation mode information for every element in the
> + * vue.
> + */
> +static void
> +brw_lookup_interpolation(struct brw_context *brw)
> +{
> +   /* pprog means "previous program", i.e. the last program before the
> +    * fragment shader.  It can only be the vertex shader for now, but
> +    * it may be a geometry shader in the future.
> +    */
> +   const struct gl_program *pprog = &brw->vertex_program->Base;
> +   const struct gl_fragment_program *fprog = brw->fragment_program;
> +   struct brw_vue_map *vue_map = &brw->vs.prog_data->vue_map;
> +
> +   /* Default everything to INTERP_QUALIFIER_NONE */
> +   brw_clear_interpolation_modes(brw);
>

I'm not sure that inline functions like this one add any clarity, since
they force the reader to go look up the function to know what's going on.
I would recommend replacing this with

memset(&brw->interpolation_mode, 0, sizeof(brw->interpolation_mode));


> +
> +   /* If there is no fragment shader, interpolation won't be needed,
> +    * so defaulting to none is good.
> +    */
> +   if (!fprog)
> +      return;
> +
> +   for (int i = 0; i < vue_map->num_slots; i++) {
> +      /* First lookup the vert result, skip if there isn't one */
> +      int vert_result = vue_map->slot_to_vert_result[i];
> +      if (vert_result == BRW_VERT_RESULT_MAX)
> +         continue;
> +
> +      /* HPOS is special, it must be linear
> +       */
>

I believe this is correct, but it's counterintuitive.  Can you add an
explanation as to why?


> +      if (vert_result == VERT_RESULT_HPOS) {
> +         brw_set_interpolation_mode(brw, i,
> INTERP_QUALIFIER_NOPERSPECTIVE);
> +         continue;
> +      }
> +
> +      /* There is a 1-1 mapping of vert result to frag attrib except
> +       * for BackColor and vars
> +       */
> +      int frag_attrib = vert_result;
> +      if (vert_result >= VERT_RESULT_BFC0 && vert_result <=
> VERT_RESULT_BFC1)
> +         frag_attrib = vert_result - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
> +      else if(vert_result >= VERT_RESULT_VAR0)
> +         frag_attrib = vert_result - VERT_RESULT_VAR0 + FRAG_ATTRIB_VAR0;
> +
> +      /* If the output is not used by the fragment shader, skip it. */
> +      if (!(fprog->Base.InputsRead & BITFIELD64_BIT(frag_attrib)))
> +         continue;
> +
> +      /* Lookup the interpolation mode */
> +      enum glsl_interp_qualifier interpolation_mode =
> fprog->InterpQualifier[frag_attrib];
> +
> +      /* If the mode is not specified, then the default varies.  Color
> +       * values follow the shader model, while all the rest uses
> +       * smooth.
> +       */
> +      if (interpolation_mode == INTERP_QUALIFIER_NONE) {
> +         if (frag_attrib >= FRAG_ATTRIB_COL0 && frag_attrib <=
> FRAG_ATTRIB_COL1)
> +            interpolation_mode = brw->intel.ctx.Light.ShadeModel ==
> GL_FLAT ? INTERP_QUALIFIER_FLAT : INTERP_QUALIFIER_SMOOTH;
> +         else
> +            interpolation_mode = INTERP_QUALIFIER_SMOOTH;
> +      }
> +
> +      /* Finally, if we have both a front color and a back color for
> +       * the same channel, the selection will be done before
> +       * interpolation and the back color copied over the front color
> +       * if necessary.  So interpolating the back color is
> +       * unnecessary.
> +       */
> +      if (vert_result >= VERT_RESULT_BFC0 && vert_result <=
> VERT_RESULT_BFC1)
> +         if (pprog->OutputsWritten & BITFIELD64_BIT(vert_result -
> VERT_RESULT_BFC0 + VERT_RESULT_COL0))
> +            interpolation_mode = INTERP_QUALIFIER_NONE;
> +
> +      brw_set_interpolation_mode(brw, i, interpolation_mode);
> +   }
> +}
>
>  static void compile_clip_prog( struct brw_context *brw,
>                              struct brw_clip_prog_key *key )
> @@ -141,6 +218,8 @@ brw_upload_clip_prog(struct brw_context *brw)
>
>     memset(&key, 0, sizeof(key));
>
> +   brw_lookup_interpolation(brw);
>

Can you add a comment "/* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */" above
this line?  That will help remind us that this function consults the
fragment program and the lighting state.


> +
>     /* Populate the key:
>      */
>     /* BRW_NEW_REDUCED_PRIMITIVE */
> @@ -150,6 +229,7 @@ brw_upload_clip_prog(struct brw_context *brw)
>     /* _NEW_LIGHT */
>     key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
>     key.pv_first = (ctx->Light.ProvokingVertex ==
> GL_FIRST_VERTEX_CONVENTION);
> +   brw_copy_interpolation_modes(brw, key.interpolation_mode);
>

Similar to my comment above about brw_clear_interpolation_modes(), I would
recommend replacing this with a simple call to memcpy().


>     /* _NEW_TRANSFORM (also part of VUE map)*/
>     key.nr_userclip = _mesa_bitcount_64(ctx->Transform.ClipPlanesEnabled);
>
> @@ -258,7 +338,7 @@ const struct brw_tracked_state brw_clip_prog = {
>                 _NEW_TRANSFORM |
>                 _NEW_POLYGON |
>                 _NEW_BUFFERS),
> -      .brw   = (BRW_NEW_REDUCED_PRIMITIVE),
> +      .brw   = (BRW_NEW_FRAGMENT_PROGRAM|BRW_NEW_REDUCED_PRIMITIVE),
>        .cache = CACHE_NEW_VS_PROG
>     },
>     .emit = brw_upload_clip_prog
> diff --git a/src/mesa/drivers/dri/i965/brw_clip.h
> b/src/mesa/drivers/dri/i965/brw_clip.h
> index 9185651..6f811ae 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip.h
> +++ b/src/mesa/drivers/dri/i965/brw_clip.h
> @@ -43,6 +43,7 @@
>   */
>  struct brw_clip_prog_key {
>     GLbitfield64 attrs;
> +   GLbitfield64 interpolation_mode[2]; /* copy of the main context */
>     GLuint primitive:4;
>     GLuint nr_userclip:4;
>     GLuint do_flat_shading:1;
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 69659c4e..1a417e5 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1041,6 +1041,17 @@ struct brw_context
>     uint32_t render_target_format[MESA_FORMAT_COUNT];
>     bool format_supported_as_render_target[MESA_FORMAT_COUNT];
>
> +   /* Interpolation modes, two bits per vue slot, values equal to
> +    * glsl_interp_qualifier.
> +    *
> +    * Used on gen4/5 by the clipper, sf and wm stages.  Given the
> +    * update order, the clipper is resposible to update it.
>
+    *
> +    * Ignored on gen 6+
> +    */
> +
> +   GLbitfield64 interpolation_mode[2]; /* two bits per vue slot */
> +
>

I don't think the space savings is worth the complication here.  How about
changing this (and the other new interpolation_mode arrays) to:

unsigned char interpolation_mode[BRW_VERT_RESULT_MAX];

Then we can replace brw_set_interpolation_mode() and
brw_get_interpolation_mode() by simply indexing into the array.


>     /* PrimitiveRestart */
>     struct {
>        bool in_progress;
> @@ -1262,6 +1273,54 @@ brw_program_reloc(struct brw_context *brw, uint32_t
> state_offset,
>
>  bool brw_do_cubemap_normalize(struct exec_list *instructions);
>
> +/**
> + * Pre-gen6, interpolation mode has to be resolved for code generation
> + * in clipper, sf and wm.  The resolution is done once by the clipper
> + * and stored in the interpolation_mode array and in the keys.  These
> + * functions allow access to that packed array.
> + */
> +
> +/* INTERP_QUALIFIER_NONE is required to be 0, so initialization is easy */
> +static inline
> +void brw_clear_interpolation_modes(struct brw_context *brw)
> +{
> +   brw->interpolation_mode[0] = brw->interpolation_mode[1] = 0;
> +}
> +
> +static inline
> +void brw_clear_interpolation_modes_key(GLbitfield64 *interpolation_mode)
> +{
> +   interpolation_mode[0] = interpolation_mode[1] = 0;
> +}
> +
> +/* Store an interpolation mode */
> +static inline
> +void brw_set_interpolation_mode(struct brw_context *brw, int slot, enum
> glsl_interp_qualifier mode)
> +{
> +   int idx = slot >> 5;
> +   int shift = 2*(slot & 31);
> +   brw->interpolation_mode[idx] =
> +      (brw->interpolation_mode[idx] & ~(3ULL << shift))
> +      | (((GLbitfield64)mode) << shift);
> +}
> +
> +/* Retrieve an interpolation mode */
> +static inline
> +enum glsl_interp_qualifier brw_get_interpolation_mode(struct brw_context
> *brw, int slot)
> +{
> +   int idx = slot >> 5;
> +   int shift = 2*(slot & 31);
> +   return (enum glsl_interp_qualifier)((brw->interpolation_mode[idx] >>
> shift) & 3);
> +}
> +
> +/* Copy the interpolation mode information to a key */
> +static inline
> +void brw_copy_interpolation_modes(struct brw_context *brw, GLbitfield64
> *dest)
> +{
> +   dest[0] = brw->interpolation_mode[0];
> +   dest[1] = brw->interpolation_mode[1];
> +}
> +
>

If we change the type of interpolation_mode, and use memset() and memcpy()
where they make sense, I think we can eliminate the need for all these
inline functions.


>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/src/mesa/drivers/dri/i965/brw_sf.c
> b/src/mesa/drivers/dri/i965/brw_sf.c
> index 7867ab5..0cc4fc7 100644
> --- a/src/mesa/drivers/dri/i965/brw_sf.c
> +++ b/src/mesa/drivers/dri/i965/brw_sf.c
> @@ -194,6 +194,7 @@ brw_upload_sf_prog(struct brw_context *brw)
>     key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
>     key.do_twoside_color = (ctx->Light.Enabled &&
> ctx->Light.Model.TwoSide) ||
>       ctx->VertexProgram._TwoSideEnabled;
> +   brw_copy_interpolation_modes(brw, key.interpolation_mode);
>

This can also be replaced with a call to memcpy().  Also, we should put a
comment "/* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */" above it.


>
>     /* _NEW_POLYGON */
>     if (key.do_twoside_color) {
> @@ -216,7 +217,7 @@ const struct brw_tracked_state brw_sf_prog = {
>     .dirty = {
>        .mesa  = (_NEW_HINT | _NEW_LIGHT | _NEW_POLYGON | _NEW_POINT |
>                  _NEW_TRANSFORM | _NEW_BUFFERS),
> -      .brw   = (BRW_NEW_REDUCED_PRIMITIVE),
> +      .brw   = (BRW_NEW_FRAGMENT_PROGRAM|BRW_NEW_REDUCED_PRIMITIVE),
>        .cache = CACHE_NEW_VS_PROG
>     },
>     .emit = brw_upload_sf_prog
> diff --git a/src/mesa/drivers/dri/i965/brw_sf.h
> b/src/mesa/drivers/dri/i965/brw_sf.h
> index f908fc0..0a8135c 100644
> --- a/src/mesa/drivers/dri/i965/brw_sf.h
> +++ b/src/mesa/drivers/dri/i965/brw_sf.h
> @@ -46,6 +46,7 @@
>
>  struct brw_sf_prog_key {
>     GLbitfield64 attrs;
> +   GLbitfield64 interpolation_mode[2]; /* copy of the main context */
>     uint8_t point_sprite_coord_replace;
>     GLuint primitive:2;
>     GLuint do_twoside_color:1;
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c
> b/src/mesa/drivers/dri/i965/brw_wm.c
> index 4a7225c..c7c1c45 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -504,6 +504,10 @@ static void brw_wm_populate_key( struct brw_context
> *brw,
>
>     /* _NEW_LIGHT */
>     key->flat_shade = (ctx->Light.ShadeModel == GL_FLAT);
> +   if (intel->gen < 6)
> +      brw_copy_interpolation_modes(brw, key->interpolation_mode);
> +   else
> +      brw_clear_interpolation_modes_key(key->interpolation_mode);
>

The "else" part isn't necessary, since we memset() the key to zero at the
top of the function.


>
>     /* _NEW_FRAG_CLAMP | _NEW_BUFFERS */
>     key->clamp_fragment_color = ctx->Color._ClampFragmentColor;
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.h
> b/src/mesa/drivers/dri/i965/brw_wm.h
> index 2cde2a0..53b83f8 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.h
> +++ b/src/mesa/drivers/dri/i965/brw_wm.h
> @@ -60,6 +60,7 @@
>  #define AA_ALWAYS    2
>
>  struct brw_wm_prog_key {
> +   GLbitfield64 interpolation_mode[2]; /* copy of the main context for
> gen4-5, 0 for gen6+ */
>     uint8_t iz_lookup;
>     GLuint stats_wm:1;
>     GLuint flat_shade:1;
> --
> 1.7.10.rc3.1.gb306
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

[-- Attachment #1.2: Type: text/html, Size: 15829 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Intel-gfx] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
  2012-07-17  3:43   ` [Mesa-dev] " Paul Berry
@ 2012-07-17  8:33     ` Olivier Galibert
  2012-07-17  8:57     ` [Mesa-dev] " Olivier Galibert
  1 sibling, 0 replies; 24+ messages in thread
From: Olivier Galibert @ 2012-07-17  8:33 UTC (permalink / raw)
  To: Paul Berry; +Cc: mesa-dev, intel-gfx

On Mon, Jul 16, 2012 at 08:43:17PM -0700, Paul Berry wrote:
> Can you split this into three separate patches?  That will make it easier
> to troubleshoot in case we find bugs with these patches in the future.

I'm going to try.


> Also, I'm not convinced that #3 is necessary.  Is there something in the
> spec that dictates this behaviour?  My reading of the spec is that if the
> vertex shader writes to gl_BackColor but not glFrontColor, then the
> contents of gl_Color in the fragment shader is undefined.

Given the number of security issues/information leaks that happen due
to reads out of place, I'm always extremely wary of reads from
nowhere.  So one pretty much has a choice between forcing a specific
value (like 0) or reading from someplace else that makes sense.  In
that particular case I considered reading from the other color slot
the easy way out.


> If we *do* decide that #3 is necessary, then I think a better way to
> accomplish it is to handle it in the GLSL vertex shader front-end, by
> replacing gl_BackColor with gl_FrontColor in cases where gl_FrontColor is
> not written to.  That way our special case code to handle this situation
> would be in just one place, rather than in three places (both fragment
> shader back-ends, and the SF program).  Also then the fix would apply to
> all hardware, not just Intel Gen4-5.

You'd have to switch off two-sided lighting too, but why not.


> Finally, I couldn't figure out what you meant by "the stray mov into
> lalaland".  Can you elaborate on which piece of code used to generate that
> stray mov, and why it doesn't anymore?  Thanks.

Looking at it again, I was wrong, it was protected.

  OG.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
  2012-07-17  3:43   ` [Mesa-dev] " Paul Berry
  2012-07-17  8:33     ` [Intel-gfx] " Olivier Galibert
@ 2012-07-17  8:57     ` Olivier Galibert
  2012-07-17 14:37       ` Paul Berry
  1 sibling, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-07-17  8:57 UTC (permalink / raw)
  To: Paul Berry; +Cc: mesa-dev, intel-gfx

On Mon, Jul 16, 2012 at 08:43:17PM -0700, Paul Berry wrote:
> Also, I'm not convinced that #3 is necessary.  Is there something in the
> spec that dictates this behaviour?  My reading of the spec is that if the
> vertex shader writes to gl_BackColor but not glFrontColor, then the
> contents of gl_Color in the fragment shader is undefined.

Oh, I remember why I did that in the first place.  All the front/back
piglit tests only write the appropriate color slot and not the other
one.

The language is annoying:
  The following built-in vertex output variables are available, but deprecated. A particular one should be
  written to if any functionality in a corresponding fragment shader or fixed pipeline uses it or state derived
  from it. Otherwise, behavior is undefined.
  out vec4 gl_FrontColor;
  out vec4 gl_BackColor;
  out vec4 gl_FrontSecondaryColor;
  out vec4 gl_BackSecondaryColor;
  [...]

One could argue that you don't "use" gl_FrontColor if all your
polygons are back-facing.  Dunno.  Do you consider all of the twoside
piglit tests buggy?  We can fix *that*.

  OG.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.
  2012-06-30 18:50 ` [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf Olivier Galibert
@ 2012-07-17 12:50   ` Paul Berry
  2012-07-17 13:07     ` Paul Berry
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Berry @ 2012-07-17 12:50 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 16865 bytes --]

On 30 June 2012 11:50, Olivier Galibert <galibert@pobox.com> wrote:

> This patch also correct a couple of problems with noperspective
> interpolation.
>
> At that point all the glsl 1.1/1.3 interpolation tests that do not
> clip pass (the -none ones).
>
> The fs code does not use the pre-resolved interpolation modes in order
> not to mess with gen6+.  Sharing the resolution would require putting
> brw_wm_prog before brw_clip_prog and brw_sf_prog.  This may be a good
> thing, but it could have unexpected consequences, so it's better be
> done independently in any case.
>
> Signed-off-by: Olivier Galibert <galibert@pobox.com>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp         |    2 +-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |    5 +
>  src/mesa/drivers/dri/i965/brw_sf.c           |    9 +-
>  src/mesa/drivers/dri/i965/brw_sf.h           |    2 +-
>  src/mesa/drivers/dri/i965/brw_sf_emit.c      |  164
> +++++++++++++-------------
>  5 files changed, 95 insertions(+), 87 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 710f2ff..b142f2b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -506,7 +506,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
>                   struct brw_reg interp = interp_reg(location, k);
>                    emit_linterp(attr, fs_reg(interp), interpolation_mode,
>                                 ir->centroid);
> -                 if (intel->gen < 6) {
> +                 if (intel->gen < 6 && interpolation_mode ==
> INTERP_QUALIFIER_SMOOTH) {
>                      emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w);
>                   }
>                }
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 9bd1e67..ab83a95 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1924,6 +1924,11 @@ fs_visitor::emit_interpolation_setup_gen4()
>     emit(BRW_OPCODE_ADD,
> this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
>         this->pixel_y, fs_reg(negate(brw_vec1_grf(1, 1))));
>
> +   this->delta_x[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
> +     this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
> +   this->delta_y[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
> +     this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
> +
>

Can we add a comment explaining why this is correct?  Something like this
maybe:

"On Gen4-5, we accomplish perspective-correct interpolation by dividing the
attribute values by w in the vertex shader, interpolating the result
linearly in screen space, and then multiplying by w in the fragment
shader.  So the interpolation step is always linear in screen space,
regardless of whether the attribute is perspective or non-perspective.
Accordingly, we use the same delta_x and delta_y values for both kinds of
interpolation."


>     this->current_annotation = "compute pos.w and 1/pos.w";
>     /* Compute wpos.w.  It's always in our setup, since it's needed to
>      * interpolate the other attributes.
> diff --git a/src/mesa/drivers/dri/i965/brw_sf.c
> b/src/mesa/drivers/dri/i965/brw_sf.c
> index 0cc4fc7..85f5f51 100644
> --- a/src/mesa/drivers/dri/i965/brw_sf.c
> +++ b/src/mesa/drivers/dri/i965/brw_sf.c
> @@ -139,6 +139,7 @@ brw_upload_sf_prog(struct brw_context *brw)
>     struct brw_sf_prog_key key;
>     /* _NEW_BUFFERS */
>     bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
> +   int i;
>
>     memset(&key, 0, sizeof(key));
>
> @@ -191,7 +192,13 @@ brw_upload_sf_prog(struct brw_context *brw)
>        key.sprite_origin_lower_left = true;
>
>     /* _NEW_LIGHT */
> -   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
> +   key.has_flat_shading = 0;
> +   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
> +      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
> +         key.has_flat_shading = 1;
> +         break;
> +      }
> +   }
>     key.do_twoside_color = (ctx->Light.Enabled &&
> ctx->Light.Model.TwoSide) ||
>       ctx->VertexProgram._TwoSideEnabled;
>     brw_copy_interpolation_modes(brw, key.interpolation_mode);
> diff --git a/src/mesa/drivers/dri/i965/brw_sf.h
> b/src/mesa/drivers/dri/i965/brw_sf.h
> index 0a8135c..c718072 100644
> --- a/src/mesa/drivers/dri/i965/brw_sf.h
> +++ b/src/mesa/drivers/dri/i965/brw_sf.h
> @@ -50,7 +50,7 @@ struct brw_sf_prog_key {
>     uint8_t point_sprite_coord_replace;
>     GLuint primitive:2;
>     GLuint do_twoside_color:1;
> -   GLuint do_flat_shading:1;
> +   GLuint has_flat_shading:1;
>     GLuint frontface_ccw:1;
>     GLuint do_point_sprite:1;
>     GLuint do_point_coord:1;
> diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c
> b/src/mesa/drivers/dri/i965/brw_sf_emit.c
> index 9d8aa38..387685a 100644
> --- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
> @@ -44,6 +44,17 @@
>
>
>  /**
> + * Determine the vue slot corresponding to the given half of the given
> + * register.  half=0 means the first half of a register, half=1 means the
> + * second half.
> + */
> +static inline int vert_reg_to_vue_slot(struct brw_sf_compile *c, GLuint
> reg,
> +                                       int half)
> +{
> +   return (reg + c->urb_entry_read_offset) * 2 + half;
> +}
> +
> +/**
>   * Determine the vert_result corresponding to the given half of the given
>   * register.  half=0 means the first half of a register, half=1 means the
>   * second half.
> @@ -51,11 +62,24 @@
>  static inline int vert_reg_to_vert_result(struct brw_sf_compile *c,
> GLuint reg,
>                                            int half)
>  {
> -   int vue_slot = (reg + c->urb_entry_read_offset) * 2 + half;
> +   int vue_slot = vert_reg_to_vue_slot(c, reg, half);
>     return c->vue_map.slot_to_vert_result[vue_slot];
>  }
>
>  /**
> + * Determine the register corresponding to the given vue slot.
> + */
> +static struct brw_reg get_vue_slot(struct brw_sf_compile *c,
> +                                   struct brw_reg vert,
> +                                   int vue_slot)
> +{
> +   GLuint off = vue_slot / 2 - c->urb_entry_read_offset;
> +   GLuint sub = vue_slot % 2;
> +
> +   return brw_vec4_grf(vert.nr + off, sub * 4);
>
+}
> +
> +/**
>   * Determine the register corresponding to the given vert_result.
>   */
>  static struct brw_reg get_vert_result(struct brw_sf_compile *c,
> @@ -64,10 +88,7 @@ static struct brw_reg get_vert_result(struct
> brw_sf_compile *c,
>  {
>     int vue_slot = c->vue_map.vert_result_to_slot[vert_result];
>     assert (vue_slot >= c->urb_entry_read_offset);
> -   GLuint off = vue_slot / 2 - c->urb_entry_read_offset;
> -   GLuint sub = vue_slot % 2;
> -
> -   return brw_vec4_grf(vert.nr + off, sub * 4);
> +   return get_vue_slot(c, vert, vue_slot);
>  }
>
>  static bool
> @@ -128,31 +149,37 @@ static void do_twoside_color( struct brw_sf_compile
> *c )
>   * Flat shading
>   */
>
> -#define VERT_RESULT_COLOR_BITS (BITFIELD64_BIT(VERT_RESULT_COL0) | \
> -                                BITFIELD64_BIT(VERT_RESULT_COL1))
> -
> -static void copy_colors( struct brw_sf_compile *c,
> -                    struct brw_reg dst,
> -                     struct brw_reg src,
> -                     int allow_twoside)
> +static void copy_flatshaded_attributes( struct brw_sf_compile *c,
> +                                        struct brw_reg dst,
> +                                        struct brw_reg src)
>  {
>     struct brw_compile *p = &c->func;
> +   struct brw_context *brw = p->brw;
>     GLuint i;
>
> -   for (i = VERT_RESULT_COL0; i <= VERT_RESULT_COL1; i++) {
> -      if (have_attr(c,i)) {
> +   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
> +      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
>          brw_MOV(p,
> -                get_vert_result(c, dst, i),
> -                get_vert_result(c, src, i));
> +                get_vue_slot(c, dst, i),
> +                get_vue_slot(c, src, i));
>
> -      } else if(allow_twoside && have_attr(c, i - VERT_RESULT_COL0 +
> VERT_RESULT_BFC0)) {
> -        brw_MOV(p,
> -                get_vert_result(c, dst, i - VERT_RESULT_COL0 +
> VERT_RESULT_BFC0),
> -                get_vert_result(c, src, i - VERT_RESULT_COL0 +
> VERT_RESULT_BFC0));
>        }
>     }
>  }
>
> +static GLuint count_flatshaded_attributes(struct brw_sf_compile *c )
> +{
> +   struct brw_compile *p = &c->func;
> +   struct brw_context *brw = p->brw;
> +   GLuint count = 0;
> +   GLuint i;
> +   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
> +      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT)
> +         count++;
> +   }
> +   return count;
> +}
> +
>
>
>  /* Need to use a computed jump to copy flatshaded attributes as the
> @@ -168,18 +195,6 @@ static void do_flatshade_triangle( struct
> brw_sf_compile *c )
>
>     GLuint nr;
>
> -   if (c->key.do_twoside_color) {
> -      nr = ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL0) |
> BITFIELD64_BIT(VERT_RESULT_BFC0))) != 0) +
> -         ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL1) |
> BITFIELD64_BIT(VERT_RESULT_BFC1))) != 0);
> -
> -   } else {
> -      nr = ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL0)) != 0) +
> -         ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL1)) != 0);
> -   }
> -
> -   if (!nr)
> -      return;
> -
>     /* Already done in clip program:
>      */
>     if (c->key.primitive == SF_UNFILLED_TRIS)
> @@ -188,21 +203,23 @@ static void do_flatshade_triangle( struct
> brw_sf_compile *c )
>     if (intel->gen == 5)
>         jmpi = 2;
>
> +   nr = count_flatshaded_attributes(c);
> +
>

We used to have this optimization:

if (!nr)
   return;

I understand why you removed it: because now this code should only be
called if c->key.has_flat_shading is true.  However, for the sake of safety
(and documentation), can we add something like this:

   /* This code should only be called if c->key.has_flat_shading is true,
    * meaning there is at least one flatshaded attribute.
    */
   assert(nr != 0);

If you would prefer to put this assertion inside
count_flatshaded_attributes(), I would be happy with that too.


>     brw_push_insn_state(p);
>
>     brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr*2+1)));
>     brw_JMPI(p, ip, ip, c->pv);
>
> -   copy_colors(c, c->vert[1], c->vert[0], c->key.do_twoside_color);
> -   copy_colors(c, c->vert[2], c->vert[0], c->key.do_twoside_color);
> +   copy_flatshaded_attributes(c, c->vert[1], c->vert[0]);
> +   copy_flatshaded_attributes(c, c->vert[2], c->vert[0]);
>     brw_JMPI(p, ip, ip, brw_imm_d(jmpi*(nr*4+1)));
>
> -   copy_colors(c, c->vert[0], c->vert[1], c->key.do_twoside_color);
> -   copy_colors(c, c->vert[2], c->vert[1], c->key.do_twoside_color);
> +   copy_flatshaded_attributes(c, c->vert[0], c->vert[1]);
> +   copy_flatshaded_attributes(c, c->vert[2], c->vert[1]);
>     brw_JMPI(p, ip, ip, brw_imm_d(jmpi*nr*2));
>
> -   copy_colors(c, c->vert[0], c->vert[2], c->key.do_twoside_color);
> -   copy_colors(c, c->vert[1], c->vert[2], c->key.do_twoside_color);
> +   copy_flatshaded_attributes(c, c->vert[0], c->vert[2]);
> +   copy_flatshaded_attributes(c, c->vert[1], c->vert[2]);
>
>     brw_pop_insn_state(p);
>  }
> @@ -213,12 +230,9 @@ static void do_flatshade_line( struct brw_sf_compile
> *c )
>     struct brw_compile *p = &c->func;
>     struct intel_context *intel = &p->brw->intel;
>     struct brw_reg ip = brw_ip_reg();
> -   GLuint nr = _mesa_bitcount_64(c->key.attrs & VERT_RESULT_COLOR_BITS);
> +   GLuint nr;
>     GLuint jmpi = 1;
>
> -   if (!nr)
> -      return;
> -
>     /* Already done in clip program:
>      */
>     if (c->key.primitive == SF_UNFILLED_TRIS)
> @@ -227,14 +241,16 @@ static void do_flatshade_line( struct brw_sf_compile
> *c )
>     if (intel->gen == 5)
>         jmpi = 2;
>
> +   nr = count_flatshaded_attributes(c);
> +
>

A similar assertion should go here.


>     brw_push_insn_state(p);
>
>     brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr+1)));
>     brw_JMPI(p, ip, ip, c->pv);
> -   copy_colors(c, c->vert[1], c->vert[0], 0);
> +   copy_flatshaded_attributes(c, c->vert[1], c->vert[0]);
>
>     brw_JMPI(p, ip, ip, brw_imm_ud(jmpi*nr));
> -   copy_colors(c, c->vert[0], c->vert[1], 0);
> +   copy_flatshaded_attributes(c, c->vert[0], c->vert[1]);
>
>     brw_pop_insn_state(p);
>  }
> @@ -332,40 +348,25 @@ static void invert_det( struct brw_sf_compile *c)
>
>  static bool
>  calculate_masks(struct brw_sf_compile *c,
> -               GLuint reg,
> -               GLushort *pc,
> -               GLushort *pc_persp,
> -               GLushort *pc_linear)
> +                GLuint reg,
> +                GLushort *pc,
> +                GLushort *pc_persp,
> +                GLushort *pc_linear)
>

I like the way you've rewritten this function.  Nicely done.


>  {
> +   struct brw_compile *p = &c->func;
> +   struct brw_context *brw = p->brw;
> +   enum glsl_interp_qualifier interp;
>     bool is_last_attr = (reg == c->nr_setup_regs - 1);
> -   GLbitfield64 persp_mask;
> -   GLbitfield64 linear_mask;
> -
> -   if (c->key.do_flat_shading)
> -      persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS) |
> -                                    BITFIELD64_BIT(VERT_RESULT_COL0) |
> -                                    BITFIELD64_BIT(VERT_RESULT_COL1) |
> -                                    BITFIELD64_BIT(VERT_RESULT_BFC0) |
> -                                    BITFIELD64_BIT(VERT_RESULT_BFC1));
> -   else
> -      persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS));
> -
> -   if (c->key.do_flat_shading)
> -      linear_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_COL0) |
> -                                     BITFIELD64_BIT(VERT_RESULT_COL1) |
> -                                     BITFIELD64_BIT(VERT_RESULT_BFC0) |
> -                                     BITFIELD64_BIT(VERT_RESULT_BFC1));
> -   else
> -      linear_mask = c->key.attrs;
>
>     *pc_persp = 0;
>     *pc_linear = 0;
>     *pc = 0xf;
> -
> -   if (persp_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg, 0)))
> -      *pc_persp = 0xf;
>
> -   if (linear_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg, 0)))
> +   interp = brw_get_interpolation_mode(brw, vert_reg_to_vue_slot(c, reg,
> 0));
> +   if (interp == INTERP_QUALIFIER_SMOOTH) {
> +      *pc_linear = 0xf;
> +      *pc_persp = 0xf;
> +   } else if(interp == INTERP_QUALIFIER_NOPERSPECTIVE)
>        *pc_linear = 0xf;
>
>     /* Maybe only processs one attribute on the final round:
> @@ -373,11 +374,12 @@ calculate_masks(struct brw_sf_compile *c,
>     if (vert_reg_to_vert_result(c, reg, 1) != BRW_VERT_RESULT_MAX) {
>        *pc |= 0xf0;
>
> -      if (persp_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg, 1)))
> -        *pc_persp |= 0xf0;
> -
> -      if (linear_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg,
> 1)))
> -        *pc_linear |= 0xf0;
> +      interp = brw_get_interpolation_mode(brw, vert_reg_to_vue_slot(c,
> reg, 1));
> +      if (interp == INTERP_QUALIFIER_SMOOTH) {
> +         *pc_linear |= 0xf0;
> +         *pc_persp |= 0xf0;
> +      } else if(interp == INTERP_QUALIFIER_NOPERSPECTIVE)
> +         *pc_linear |= 0xf0;
>     }
>
>     return is_last_attr;
> @@ -430,7 +432,7 @@ void brw_emit_tri_setup(struct brw_sf_compile *c, bool
> allocate)
>     if (c->key.do_twoside_color)
>        do_twoside_color(c);
>
> -   if (c->key.do_flat_shading)
> +   if (c->key.has_flat_shading)
>        do_flatshade_triangle(c);
>
>
> @@ -443,7 +445,6 @@ void brw_emit_tri_setup(struct brw_sf_compile *c, bool
> allocate)
>        struct brw_reg a2 = offset(c->vert[2], i);
>        GLushort pc, pc_persp, pc_linear;
>        bool last = calculate_masks(c, i, &pc, &pc_persp, &pc_linear);
> -
>        if (pc_persp)
>        {
>          brw_set_predicate_control_flag_value(p, pc_persp);
> @@ -507,7 +508,6 @@ void brw_emit_line_setup(struct brw_sf_compile *c,
> bool allocate)
>     struct brw_compile *p = &c->func;
>     GLuint i;
>
> -
>     c->nr_verts = 2;
>
>     if (allocate)
> @@ -516,7 +516,7 @@ void brw_emit_line_setup(struct brw_sf_compile *c,
> bool allocate)
>     invert_det(c);
>     copy_z_inv_w(c);
>
> -   if (c->key.do_flat_shading)
> +   if (c->key.has_flat_shading)
>        do_flatshade_line(c);
>
>     for (i = 0; i < c->nr_setup_regs; i++)
> @@ -799,7 +799,3 @@ void brw_emit_anyprim_setup( struct brw_sf_compile *c )
>
>     brw_emit_point_setup( c, false );
>  }
> -
> -
> -
> -
> --
> 1.7.10.rc3.1.gb306
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

With the minor changes noted above, this patch is:

Reviewed-by: Paul Berry <stereotype441@gmail.com>

[-- Attachment #1.2: Type: text/html, Size: 19959 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the clipper.
  2012-06-30 18:50 ` [PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the clipper Olivier Galibert
@ 2012-07-17 12:55   ` Paul Berry
  0 siblings, 0 replies; 24+ messages in thread
From: Paul Berry @ 2012-07-17 12:55 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 10349 bytes --]

On 30 June 2012 11:50, Olivier Galibert <galibert@pobox.com> wrote:

> At that point, all interpolation piglit tests involving fixed clipping
> work as long as there's no noperspective.
>
> Signed-off-by: Olivier Galibert <galibert@pobox.com>
> ---
>  src/mesa/drivers/dri/i965/brw_clip.c          |   10 ++++-
>  src/mesa/drivers/dri/i965/brw_clip.h          |    6 +--
>  src/mesa/drivers/dri/i965/brw_clip_line.c     |    6 +--
>  src/mesa/drivers/dri/i965/brw_clip_tri.c      |   20 ++++-----
>  src/mesa/drivers/dri/i965/brw_clip_unfilled.c |    2 +-
>  src/mesa/drivers/dri/i965/brw_clip_util.c     |   56
> +++++++------------------
>  6 files changed, 41 insertions(+), 59 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_clip.c
> b/src/mesa/drivers/dri/i965/brw_clip.c
> index 52e8c47..952eb4a 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip.c
> +++ b/src/mesa/drivers/dri/i965/brw_clip.c
> @@ -215,7 +215,7 @@ brw_upload_clip_prog(struct brw_context *brw)
>     struct intel_context *intel = &brw->intel;
>     struct gl_context *ctx = &intel->ctx;
>     struct brw_clip_prog_key key;
> -
> +   int i;
>     memset(&key, 0, sizeof(key));
>
>     brw_lookup_interpolation(brw);
> @@ -227,7 +227,13 @@ brw_upload_clip_prog(struct brw_context *brw)
>     /* CACHE_NEW_VS_PROG (also part of VUE map) */
>     key.attrs = brw->vs.prog_data->outputs_written;
>     /* _NEW_LIGHT */
> -   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
> +   key.has_flat_shading = 0;
> +   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
> +      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
> +         key.has_flat_shading = 1;
> +         break;
> +      }
> +   }
>     key.pv_first = (ctx->Light.ProvokingVertex ==
> GL_FIRST_VERTEX_CONVENTION);
>     brw_copy_interpolation_modes(brw, key.interpolation_mode);
>     /* _NEW_TRANSFORM (also part of VUE map)*/
> diff --git a/src/mesa/drivers/dri/i965/brw_clip.h
> b/src/mesa/drivers/dri/i965/brw_clip.h
> index 6f811ae..0ea0394 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip.h
> +++ b/src/mesa/drivers/dri/i965/brw_clip.h
> @@ -46,7 +46,7 @@ struct brw_clip_prog_key {
>     GLbitfield64 interpolation_mode[2]; /* copy of the main context */
>     GLuint primitive:4;
>     GLuint nr_userclip:4;
> -   GLuint do_flat_shading:1;
> +   GLuint has_flat_shading:1;
>     GLuint pv_first:1;
>     GLuint do_unfilled:1;
>     GLuint fill_cw:2;           /* includes cull information */
> @@ -166,8 +166,8 @@ void brw_clip_kill_thread(struct brw_clip_compile *c);
>  struct brw_reg brw_clip_plane_stride( struct brw_clip_compile *c );
>  struct brw_reg brw_clip_plane0_address( struct brw_clip_compile *c );
>
> -void brw_clip_copy_colors( struct brw_clip_compile *c,
> -                          GLuint to, GLuint from );
> +void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c,
> +                                          GLuint to, GLuint from );
>
>  void brw_clip_init_clipmask( struct brw_clip_compile *c );
>
> diff --git a/src/mesa/drivers/dri/i965/brw_clip_line.c
> b/src/mesa/drivers/dri/i965/brw_clip_line.c
> index 6cf2bd2..729d8c0 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip_line.c
> +++ b/src/mesa/drivers/dri/i965/brw_clip_line.c
> @@ -271,11 +271,11 @@ void brw_emit_line_clip( struct brw_clip_compile *c )
>     brw_clip_line_alloc_regs(c);
>     brw_clip_init_ff_sync(c);
>
> -   if (c->key.do_flat_shading) {
> +   if (c->key.has_flat_shading) {
>        if (c->key.pv_first)
> -         brw_clip_copy_colors(c, 1, 0);
> +         brw_clip_copy_flatshaded_attributes(c, 1, 0);
>        else
> -         brw_clip_copy_colors(c, 0, 1);
> +         brw_clip_copy_flatshaded_attributes(c, 0, 1);
>     }
>
>     clip_and_emit_line(c);
> diff --git a/src/mesa/drivers/dri/i965/brw_clip_tri.c
> b/src/mesa/drivers/dri/i965/brw_clip_tri.c
> index a29f8e0..71225f5 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip_tri.c
> +++ b/src/mesa/drivers/dri/i965/brw_clip_tri.c
> @@ -187,8 +187,8 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile
> *c )
>
>     brw_IF(p, BRW_EXECUTE_1);
>     {
> -      brw_clip_copy_colors(c, 1, 0);
> -      brw_clip_copy_colors(c, 2, 0);
> +      brw_clip_copy_flatshaded_attributes(c, 1, 0);
> +      brw_clip_copy_flatshaded_attributes(c, 2, 0);
>     }
>     brw_ELSE(p);
>     {
> @@ -200,19 +200,19 @@ void brw_clip_tri_flat_shade( struct
> brw_clip_compile *c )
>                  brw_imm_ud(_3DPRIM_TRIFAN));
>          brw_IF(p, BRW_EXECUTE_1);
>          {
> -           brw_clip_copy_colors(c, 0, 1);
> -           brw_clip_copy_colors(c, 2, 1);
> +           brw_clip_copy_flatshaded_attributes(c, 0, 1);
> +           brw_clip_copy_flatshaded_attributes(c, 2, 1);
>          }
>          brw_ELSE(p);
>          {
> -           brw_clip_copy_colors(c, 1, 0);
> -           brw_clip_copy_colors(c, 2, 0);
> +           brw_clip_copy_flatshaded_attributes(c, 1, 0);
> +           brw_clip_copy_flatshaded_attributes(c, 2, 0);
>          }
>          brw_ENDIF(p);
>        }
>        else {
> -         brw_clip_copy_colors(c, 0, 2);
> -         brw_clip_copy_colors(c, 1, 2);
> +         brw_clip_copy_flatshaded_attributes(c, 0, 2);
> +         brw_clip_copy_flatshaded_attributes(c, 1, 2);
>        }
>     }
>     brw_ENDIF(p);
> @@ -606,8 +606,8 @@ void brw_emit_tri_clip( struct brw_clip_compile *c )
>      * flatshading, need to apply the flatshade here because we don't
>      * respect the PV when converting to trifan for emit:
>      */
> -   if (c->key.do_flat_shading)
> -      brw_clip_tri_flat_shade(c);
> +   if (c->key.has_flat_shading)
> +      brw_clip_tri_flat_shade(c);
>
>     if ((c->key.clip_mode == BRW_CLIPMODE_NORMAL) ||
>         (c->key.clip_mode == BRW_CLIPMODE_KERNEL_CLIP))
> diff --git a/src/mesa/drivers/dri/i965/brw_clip_unfilled.c
> b/src/mesa/drivers/dri/i965/brw_clip_unfilled.c
> index 03c7d42..96f9a84 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip_unfilled.c
> +++ b/src/mesa/drivers/dri/i965/brw_clip_unfilled.c
> @@ -502,7 +502,7 @@ void brw_emit_unfilled_clip( struct brw_clip_compile
> *c )
>
>     /* Need to do this whether we clip or not:
>      */
> -   if (c->key.do_flat_shading)
> +   if (c->key.has_flat_shading)
>        brw_clip_tri_flat_shade(c);
>
>     brw_clip_init_clipmask(c);
> diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c
> b/src/mesa/drivers/dri/i965/brw_clip_util.c
> index bf8cc3a..7b0205a 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip_util.c
> +++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
> @@ -165,7 +165,7 @@ void brw_clip_interp_vertex( struct brw_clip_compile
> *c,
>                   vert_result == VERT_RESULT_CLIP_DIST1) {
>          /* PSIZ doesn't need interpolation because it isn't used by the
>            * fragment shader.  CLIP_DIST0 and CLIP_DIST1 don't need
> -          * intepolation because on pre-GEN6, these are just placeholder
> VUE
> +          * interpolation because on pre-GEN6, these are just placeholder
> VUE
>

Heh.  Thanks for fixing my typo :)


>            * slots that don't perform any action.
>            */
>        } else if (vert_result < VERT_RESULT_MAX) {
> @@ -291,49 +291,25 @@ struct brw_reg brw_clip_plane_stride( struct
> brw_clip_compile *c )
>  }
>
>
> -/* If flatshading, distribute color from provoking vertex prior to
> +/* Distribute flatshaded attributes from provoking vertex prior to
>   * clipping.
>   */
> -void brw_clip_copy_colors( struct brw_clip_compile *c,
> -                          GLuint to, GLuint from )
> +void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c,
> +                                          GLuint to, GLuint from )
>  {
>     struct brw_compile *p = &c->func;
> -
> -   if (brw_clip_have_vert_result(c, VERT_RESULT_COL0))
> -      brw_MOV(p,
> -             byte_offset(c->reg.vertex[to],
> -                          brw_vert_result_to_offset(&c->vue_map,
> -                                                    VERT_RESULT_COL0)),
> -             byte_offset(c->reg.vertex[from],
> -                          brw_vert_result_to_offset(&c->vue_map,
> -                                                    VERT_RESULT_COL0)));
> -
> -   if (brw_clip_have_vert_result(c, VERT_RESULT_COL1))
> -      brw_MOV(p,
> -             byte_offset(c->reg.vertex[to],
> -                          brw_vert_result_to_offset(&c->vue_map,
> -                                                    VERT_RESULT_COL1)),
> -             byte_offset(c->reg.vertex[from],
> -                          brw_vert_result_to_offset(&c->vue_map,
> -                                                    VERT_RESULT_COL1)));
> -
> -   if (brw_clip_have_vert_result(c, VERT_RESULT_BFC0))
> -      brw_MOV(p,
> -             byte_offset(c->reg.vertex[to],
> -                          brw_vert_result_to_offset(&c->vue_map,
> -                                                    VERT_RESULT_BFC0)),
> -             byte_offset(c->reg.vertex[from],
> -                          brw_vert_result_to_offset(&c->vue_map,
> -                                                    VERT_RESULT_BFC0)));
> -
> -   if (brw_clip_have_vert_result(c, VERT_RESULT_BFC1))
> -      brw_MOV(p,
> -             byte_offset(c->reg.vertex[to],
> -                          brw_vert_result_to_offset(&c->vue_map,
> -                                                    VERT_RESULT_BFC1)),
> -             byte_offset(c->reg.vertex[from],
> -                          brw_vert_result_to_offset(&c->vue_map,
> -                                                    VERT_RESULT_BFC1)));
> +   struct brw_context *brw = p->brw;
> +   GLuint i;
> +
> +   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
> +      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
> +        brw_MOV(p,
> +                 byte_offset(c->reg.vertex[to],
> +                             brw_vue_slot_to_offset(i)),
> +                 byte_offset(c->reg.vertex[from],
> +                             brw_vue_slot_to_offset(i)));
> +      }
> +   }
>  }
>

Ah, a loop!  This is so much better.  Thanks.


>
>
> --
> 1.7.10.rc3.1.gb306
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

This patch is:

Reviewed-by: Paul Berry <stereotype441@gmail.com>

[-- Attachment #1.2: Type: text/html, Size: 12060 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.
  2012-07-17 12:50   ` [Mesa-dev] " Paul Berry
@ 2012-07-17 13:07     ` Paul Berry
  0 siblings, 0 replies; 24+ messages in thread
From: Paul Berry @ 2012-07-17 13:07 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 17569 bytes --]

On 17 July 2012 05:50, Paul Berry <stereotype441@gmail.com> wrote:

> On 30 June 2012 11:50, Olivier Galibert <galibert@pobox.com> wrote:
>
>> This patch also correct a couple of problems with noperspective
>> interpolation.
>>
>> At that point all the glsl 1.1/1.3 interpolation tests that do not
>> clip pass (the -none ones).
>>
>> The fs code does not use the pre-resolved interpolation modes in order
>> not to mess with gen6+.  Sharing the resolution would require putting
>> brw_wm_prog before brw_clip_prog and brw_sf_prog.  This may be a good
>> thing, but it could have unexpected consequences, so it's better be
>> done independently in any case.
>>
>> Signed-off-by: Olivier Galibert <galibert@pobox.com>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp         |    2 +-
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |    5 +
>>  src/mesa/drivers/dri/i965/brw_sf.c           |    9 +-
>>  src/mesa/drivers/dri/i965/brw_sf.h           |    2 +-
>>  src/mesa/drivers/dri/i965/brw_sf_emit.c      |  164
>> +++++++++++++-------------
>>  5 files changed, 95 insertions(+), 87 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 710f2ff..b142f2b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -506,7 +506,7 @@ fs_visitor::emit_general_interpolation(ir_variable
>> *ir)
>>                   struct brw_reg interp = interp_reg(location, k);
>>                    emit_linterp(attr, fs_reg(interp), interpolation_mode,
>>                                 ir->centroid);
>> -                 if (intel->gen < 6) {
>> +                 if (intel->gen < 6 && interpolation_mode ==
>> INTERP_QUALIFIER_SMOOTH) {
>>                      emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w);
>>                   }
>>                }
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> index 9bd1e67..ab83a95 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> @@ -1924,6 +1924,11 @@ fs_visitor::emit_interpolation_setup_gen4()
>>     emit(BRW_OPCODE_ADD,
>> this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
>>         this->pixel_y, fs_reg(negate(brw_vec1_grf(1, 1))));
>>
>> +   this->delta_x[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
>> +     this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
>> +   this->delta_y[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
>> +     this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
>> +
>>
>
> Can we add a comment explaining why this is correct?  Something like this
> maybe:
>
> "On Gen4-5, we accomplish perspective-correct interpolation by dividing
> the attribute values by w in the vertex shader, interpolating the result
> linearly in screen space, and then multiplying by w in the fragment
> shader.  So the interpolation step is always linear in screen space,
> regardless of whether the attribute is perspective or non-perspective.
> Accordingly, we use the same delta_x and delta_y values for both kinds of
> interpolation."
>

Whoops, my memory was faulty.  This should say "...in the sf thread...",
not "...in the vertex shader...".


>
>
>>     this->current_annotation = "compute pos.w and 1/pos.w";
>>     /* Compute wpos.w.  It's always in our setup, since it's needed to
>>      * interpolate the other attributes.
>> diff --git a/src/mesa/drivers/dri/i965/brw_sf.c
>> b/src/mesa/drivers/dri/i965/brw_sf.c
>> index 0cc4fc7..85f5f51 100644
>> --- a/src/mesa/drivers/dri/i965/brw_sf.c
>> +++ b/src/mesa/drivers/dri/i965/brw_sf.c
>> @@ -139,6 +139,7 @@ brw_upload_sf_prog(struct brw_context *brw)
>>     struct brw_sf_prog_key key;
>>     /* _NEW_BUFFERS */
>>     bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
>> +   int i;
>>
>>     memset(&key, 0, sizeof(key));
>>
>> @@ -191,7 +192,13 @@ brw_upload_sf_prog(struct brw_context *brw)
>>        key.sprite_origin_lower_left = true;
>>
>>     /* _NEW_LIGHT */
>> -   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
>> +   key.has_flat_shading = 0;
>> +   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
>> +      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
>> +         key.has_flat_shading = 1;
>> +         break;
>> +      }
>> +   }
>>     key.do_twoside_color = (ctx->Light.Enabled &&
>> ctx->Light.Model.TwoSide) ||
>>       ctx->VertexProgram._TwoSideEnabled;
>>     brw_copy_interpolation_modes(brw, key.interpolation_mode);
>> diff --git a/src/mesa/drivers/dri/i965/brw_sf.h
>> b/src/mesa/drivers/dri/i965/brw_sf.h
>> index 0a8135c..c718072 100644
>> --- a/src/mesa/drivers/dri/i965/brw_sf.h
>> +++ b/src/mesa/drivers/dri/i965/brw_sf.h
>> @@ -50,7 +50,7 @@ struct brw_sf_prog_key {
>>     uint8_t point_sprite_coord_replace;
>>     GLuint primitive:2;
>>     GLuint do_twoside_color:1;
>> -   GLuint do_flat_shading:1;
>> +   GLuint has_flat_shading:1;
>>     GLuint frontface_ccw:1;
>>     GLuint do_point_sprite:1;
>>     GLuint do_point_coord:1;
>> diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c
>> b/src/mesa/drivers/dri/i965/brw_sf_emit.c
>> index 9d8aa38..387685a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
>> +++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
>> @@ -44,6 +44,17 @@
>>
>>
>>  /**
>> + * Determine the vue slot corresponding to the given half of the given
>> + * register.  half=0 means the first half of a register, half=1 means the
>> + * second half.
>> + */
>> +static inline int vert_reg_to_vue_slot(struct brw_sf_compile *c, GLuint
>> reg,
>> +                                       int half)
>> +{
>> +   return (reg + c->urb_entry_read_offset) * 2 + half;
>> +}
>> +
>> +/**
>>   * Determine the vert_result corresponding to the given half of the given
>>   * register.  half=0 means the first half of a register, half=1 means the
>>   * second half.
>> @@ -51,11 +62,24 @@
>>  static inline int vert_reg_to_vert_result(struct brw_sf_compile *c,
>> GLuint reg,
>>                                            int half)
>>  {
>> -   int vue_slot = (reg + c->urb_entry_read_offset) * 2 + half;
>> +   int vue_slot = vert_reg_to_vue_slot(c, reg, half);
>>     return c->vue_map.slot_to_vert_result[vue_slot];
>>  }
>>
>>  /**
>> + * Determine the register corresponding to the given vue slot.
>> + */
>> +static struct brw_reg get_vue_slot(struct brw_sf_compile *c,
>> +                                   struct brw_reg vert,
>> +                                   int vue_slot)
>> +{
>> +   GLuint off = vue_slot / 2 - c->urb_entry_read_offset;
>> +   GLuint sub = vue_slot % 2;
>> +
>> +   return brw_vec4_grf(vert.nr + off, sub * 4);
>>
> +}
>> +
>> +/**
>>   * Determine the register corresponding to the given vert_result.
>>   */
>>  static struct brw_reg get_vert_result(struct brw_sf_compile *c,
>> @@ -64,10 +88,7 @@ static struct brw_reg get_vert_result(struct
>> brw_sf_compile *c,
>>  {
>>     int vue_slot = c->vue_map.vert_result_to_slot[vert_result];
>>     assert (vue_slot >= c->urb_entry_read_offset);
>> -   GLuint off = vue_slot / 2 - c->urb_entry_read_offset;
>> -   GLuint sub = vue_slot % 2;
>> -
>> -   return brw_vec4_grf(vert.nr + off, sub * 4);
>> +   return get_vue_slot(c, vert, vue_slot);
>>  }
>>
>>  static bool
>> @@ -128,31 +149,37 @@ static void do_twoside_color( struct brw_sf_compile
>> *c )
>>   * Flat shading
>>   */
>>
>> -#define VERT_RESULT_COLOR_BITS (BITFIELD64_BIT(VERT_RESULT_COL0) | \
>> -                                BITFIELD64_BIT(VERT_RESULT_COL1))
>> -
>> -static void copy_colors( struct brw_sf_compile *c,
>> -                    struct brw_reg dst,
>> -                     struct brw_reg src,
>> -                     int allow_twoside)
>> +static void copy_flatshaded_attributes( struct brw_sf_compile *c,
>> +                                        struct brw_reg dst,
>> +                                        struct brw_reg src)
>>  {
>>     struct brw_compile *p = &c->func;
>> +   struct brw_context *brw = p->brw;
>>     GLuint i;
>>
>> -   for (i = VERT_RESULT_COL0; i <= VERT_RESULT_COL1; i++) {
>> -      if (have_attr(c,i)) {
>> +   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
>> +      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
>>          brw_MOV(p,
>> -                get_vert_result(c, dst, i),
>> -                get_vert_result(c, src, i));
>> +                get_vue_slot(c, dst, i),
>> +                get_vue_slot(c, src, i));
>>
>> -      } else if(allow_twoside && have_attr(c, i - VERT_RESULT_COL0 +
>> VERT_RESULT_BFC0)) {
>> -        brw_MOV(p,
>> -                get_vert_result(c, dst, i - VERT_RESULT_COL0 +
>> VERT_RESULT_BFC0),
>> -                get_vert_result(c, src, i - VERT_RESULT_COL0 +
>> VERT_RESULT_BFC0));
>>        }
>>     }
>>  }
>>
>> +static GLuint count_flatshaded_attributes(struct brw_sf_compile *c )
>> +{
>> +   struct brw_compile *p = &c->func;
>> +   struct brw_context *brw = p->brw;
>> +   GLuint count = 0;
>> +   GLuint i;
>> +   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
>> +      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT)
>> +         count++;
>> +   }
>> +   return count;
>> +}
>> +
>>
>>
>>  /* Need to use a computed jump to copy flatshaded attributes as the
>> @@ -168,18 +195,6 @@ static void do_flatshade_triangle( struct
>> brw_sf_compile *c )
>>
>>     GLuint nr;
>>
>> -   if (c->key.do_twoside_color) {
>> -      nr = ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL0) |
>> BITFIELD64_BIT(VERT_RESULT_BFC0))) != 0) +
>> -         ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL1) |
>> BITFIELD64_BIT(VERT_RESULT_BFC1))) != 0);
>> -
>> -   } else {
>> -      nr = ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL0)) != 0) +
>> -         ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL1)) != 0);
>> -   }
>> -
>> -   if (!nr)
>> -      return;
>> -
>>     /* Already done in clip program:
>>      */
>>     if (c->key.primitive == SF_UNFILLED_TRIS)
>> @@ -188,21 +203,23 @@ static void do_flatshade_triangle( struct
>> brw_sf_compile *c )
>>     if (intel->gen == 5)
>>         jmpi = 2;
>>
>> +   nr = count_flatshaded_attributes(c);
>> +
>>
>
> We used to have this optimization:
>
> if (!nr)
>    return;
>
> I understand why you removed it: because now this code should only be
> called if c->key.has_flat_shading is true.  However, for the sake of safety
> (and documentation), can we add something like this:
>
>    /* This code should only be called if c->key.has_flat_shading is true,
>     * meaning there is at least one flatshaded attribute.
>     */
>    assert(nr != 0);
>
> If you would prefer to put this assertion inside
> count_flatshaded_attributes(), I would be happy with that too.
>
>
>>     brw_push_insn_state(p);
>>
>>     brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr*2+1)));
>>     brw_JMPI(p, ip, ip, c->pv);
>>
>> -   copy_colors(c, c->vert[1], c->vert[0], c->key.do_twoside_color);
>> -   copy_colors(c, c->vert[2], c->vert[0], c->key.do_twoside_color);
>> +   copy_flatshaded_attributes(c, c->vert[1], c->vert[0]);
>> +   copy_flatshaded_attributes(c, c->vert[2], c->vert[0]);
>>     brw_JMPI(p, ip, ip, brw_imm_d(jmpi*(nr*4+1)));
>>
>> -   copy_colors(c, c->vert[0], c->vert[1], c->key.do_twoside_color);
>> -   copy_colors(c, c->vert[2], c->vert[1], c->key.do_twoside_color);
>> +   copy_flatshaded_attributes(c, c->vert[0], c->vert[1]);
>> +   copy_flatshaded_attributes(c, c->vert[2], c->vert[1]);
>>     brw_JMPI(p, ip, ip, brw_imm_d(jmpi*nr*2));
>>
>> -   copy_colors(c, c->vert[0], c->vert[2], c->key.do_twoside_color);
>> -   copy_colors(c, c->vert[1], c->vert[2], c->key.do_twoside_color);
>> +   copy_flatshaded_attributes(c, c->vert[0], c->vert[2]);
>> +   copy_flatshaded_attributes(c, c->vert[1], c->vert[2]);
>>
>>     brw_pop_insn_state(p);
>>  }
>> @@ -213,12 +230,9 @@ static void do_flatshade_line( struct brw_sf_compile
>> *c )
>>     struct brw_compile *p = &c->func;
>>     struct intel_context *intel = &p->brw->intel;
>>     struct brw_reg ip = brw_ip_reg();
>> -   GLuint nr = _mesa_bitcount_64(c->key.attrs & VERT_RESULT_COLOR_BITS);
>> +   GLuint nr;
>>     GLuint jmpi = 1;
>>
>> -   if (!nr)
>> -      return;
>> -
>>     /* Already done in clip program:
>>      */
>>     if (c->key.primitive == SF_UNFILLED_TRIS)
>> @@ -227,14 +241,16 @@ static void do_flatshade_line( struct
>> brw_sf_compile *c )
>>     if (intel->gen == 5)
>>         jmpi = 2;
>>
>> +   nr = count_flatshaded_attributes(c);
>> +
>>
>
> A similar assertion should go here.
>
>
>>     brw_push_insn_state(p);
>>
>>     brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr+1)));
>>     brw_JMPI(p, ip, ip, c->pv);
>> -   copy_colors(c, c->vert[1], c->vert[0], 0);
>> +   copy_flatshaded_attributes(c, c->vert[1], c->vert[0]);
>>
>>     brw_JMPI(p, ip, ip, brw_imm_ud(jmpi*nr));
>> -   copy_colors(c, c->vert[0], c->vert[1], 0);
>> +   copy_flatshaded_attributes(c, c->vert[0], c->vert[1]);
>>
>>     brw_pop_insn_state(p);
>>  }
>> @@ -332,40 +348,25 @@ static void invert_det( struct brw_sf_compile *c)
>>
>>  static bool
>>  calculate_masks(struct brw_sf_compile *c,
>> -               GLuint reg,
>> -               GLushort *pc,
>> -               GLushort *pc_persp,
>> -               GLushort *pc_linear)
>> +                GLuint reg,
>> +                GLushort *pc,
>> +                GLushort *pc_persp,
>> +                GLushort *pc_linear)
>>
>
> I like the way you've rewritten this function.  Nicely done.
>
>
>>  {
>> +   struct brw_compile *p = &c->func;
>> +   struct brw_context *brw = p->brw;
>> +   enum glsl_interp_qualifier interp;
>>     bool is_last_attr = (reg == c->nr_setup_regs - 1);
>> -   GLbitfield64 persp_mask;
>> -   GLbitfield64 linear_mask;
>> -
>> -   if (c->key.do_flat_shading)
>> -      persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS) |
>> -                                    BITFIELD64_BIT(VERT_RESULT_COL0) |
>> -                                    BITFIELD64_BIT(VERT_RESULT_COL1) |
>> -                                    BITFIELD64_BIT(VERT_RESULT_BFC0) |
>> -                                    BITFIELD64_BIT(VERT_RESULT_BFC1));
>> -   else
>> -      persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS));
>> -
>> -   if (c->key.do_flat_shading)
>> -      linear_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_COL0) |
>> -                                     BITFIELD64_BIT(VERT_RESULT_COL1) |
>> -                                     BITFIELD64_BIT(VERT_RESULT_BFC0) |
>> -                                     BITFIELD64_BIT(VERT_RESULT_BFC1));
>> -   else
>> -      linear_mask = c->key.attrs;
>>
>>     *pc_persp = 0;
>>     *pc_linear = 0;
>>     *pc = 0xf;
>> -
>> -   if (persp_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg, 0)))
>> -      *pc_persp = 0xf;
>>
>> -   if (linear_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg, 0)))
>> +   interp = brw_get_interpolation_mode(brw, vert_reg_to_vue_slot(c, reg,
>> 0));
>> +   if (interp == INTERP_QUALIFIER_SMOOTH) {
>> +      *pc_linear = 0xf;
>> +      *pc_persp = 0xf;
>> +   } else if(interp == INTERP_QUALIFIER_NOPERSPECTIVE)
>>        *pc_linear = 0xf;
>>
>>     /* Maybe only processs one attribute on the final round:
>> @@ -373,11 +374,12 @@ calculate_masks(struct brw_sf_compile *c,
>>     if (vert_reg_to_vert_result(c, reg, 1) != BRW_VERT_RESULT_MAX) {
>>        *pc |= 0xf0;
>>
>> -      if (persp_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg,
>> 1)))
>> -        *pc_persp |= 0xf0;
>> -
>> -      if (linear_mask & BITFIELD64_BIT(vert_reg_to_vert_result(c, reg,
>> 1)))
>> -        *pc_linear |= 0xf0;
>> +      interp = brw_get_interpolation_mode(brw, vert_reg_to_vue_slot(c,
>> reg, 1));
>> +      if (interp == INTERP_QUALIFIER_SMOOTH) {
>> +         *pc_linear |= 0xf0;
>> +         *pc_persp |= 0xf0;
>> +      } else if(interp == INTERP_QUALIFIER_NOPERSPECTIVE)
>> +         *pc_linear |= 0xf0;
>>     }
>>
>>     return is_last_attr;
>> @@ -430,7 +432,7 @@ void brw_emit_tri_setup(struct brw_sf_compile *c,
>> bool allocate)
>>     if (c->key.do_twoside_color)
>>        do_twoside_color(c);
>>
>> -   if (c->key.do_flat_shading)
>> +   if (c->key.has_flat_shading)
>>        do_flatshade_triangle(c);
>>
>>
>> @@ -443,7 +445,6 @@ void brw_emit_tri_setup(struct brw_sf_compile *c,
>> bool allocate)
>>        struct brw_reg a2 = offset(c->vert[2], i);
>>        GLushort pc, pc_persp, pc_linear;
>>        bool last = calculate_masks(c, i, &pc, &pc_persp, &pc_linear);
>> -
>>        if (pc_persp)
>>        {
>>          brw_set_predicate_control_flag_value(p, pc_persp);
>> @@ -507,7 +508,6 @@ void brw_emit_line_setup(struct brw_sf_compile *c,
>> bool allocate)
>>     struct brw_compile *p = &c->func;
>>     GLuint i;
>>
>> -
>>     c->nr_verts = 2;
>>
>>     if (allocate)
>> @@ -516,7 +516,7 @@ void brw_emit_line_setup(struct brw_sf_compile *c,
>> bool allocate)
>>     invert_det(c);
>>     copy_z_inv_w(c);
>>
>> -   if (c->key.do_flat_shading)
>> +   if (c->key.has_flat_shading)
>>        do_flatshade_line(c);
>>
>>     for (i = 0; i < c->nr_setup_regs; i++)
>> @@ -799,7 +799,3 @@ void brw_emit_anyprim_setup( struct brw_sf_compile *c
>> )
>>
>>     brw_emit_point_setup( c, false );
>>  }
>> -
>> -
>> -
>> -
>> --
>> 1.7.10.rc3.1.gb306
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>
> With the minor changes noted above, this patch is:
>
> Reviewed-by: Paul Berry <stereotype441@gmail.com>
>

[-- Attachment #1.2: Type: text/html, Size: 20848 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 5/5] intel gen4-5: Make noperspective clipping work.
  2012-06-30 18:50 ` [PATCH 5/5] intel gen4-5: Make noperspective clipping work Olivier Galibert
@ 2012-07-17 14:07   ` Paul Berry
  0 siblings, 0 replies; 24+ messages in thread
From: Paul Berry @ 2012-07-17 14:07 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 11140 bytes --]

On 30 June 2012 11:50, Olivier Galibert <galibert@pobox.com> wrote:

> At this point all interpolation tests with fixed clipping work.
>
> Signed-off-by: Olivier Galibert <galibert@pobox.com>
> ---
>  src/mesa/drivers/dri/i965/brw_clip.c      |    9 ++
>  src/mesa/drivers/dri/i965/brw_clip.h      |    1 +
>  src/mesa/drivers/dri/i965/brw_clip_util.c |  133
> ++++++++++++++++++++++++++---
>  3 files changed, 132 insertions(+), 11 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_clip.c
> b/src/mesa/drivers/dri/i965/brw_clip.c
> index 952eb4a..6bfdf24 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip.c
> +++ b/src/mesa/drivers/dri/i965/brw_clip.c
> @@ -234,6 +234,15 @@ brw_upload_clip_prog(struct brw_context *brw)
>           break;
>        }
>     }
> +   key.has_noperspective_shading = 0;
> +   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
> +      if (brw_get_interpolation_mode(brw, i) ==
> INTERP_QUALIFIER_NOPERSPECTIVE &&
> +          brw->vs.prog_data->vue_map.slot_to_vert_result[i] !=
> VERT_RESULT_HPOS) {
> +         key.has_noperspective_shading = 1;
> +         break;
> +      }
> +   }
> +
>     key.pv_first = (ctx->Light.ProvokingVertex ==
> GL_FIRST_VERTEX_CONVENTION);
>     brw_copy_interpolation_modes(brw, key.interpolation_mode);
>     /* _NEW_TRANSFORM (also part of VUE map)*/
> diff --git a/src/mesa/drivers/dri/i965/brw_clip.h
> b/src/mesa/drivers/dri/i965/brw_clip.h
> index 0ea0394..2a7245a 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip.h
> +++ b/src/mesa/drivers/dri/i965/brw_clip.h
> @@ -47,6 +47,7 @@ struct brw_clip_prog_key {
>     GLuint primitive:4;
>     GLuint nr_userclip:4;
>     GLuint has_flat_shading:1;
> +   GLuint has_noperspective_shading:1;
>     GLuint pv_first:1;
>     GLuint do_unfilled:1;
>     GLuint fill_cw:2;           /* includes cull information */
> diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c
> b/src/mesa/drivers/dri/i965/brw_clip_util.c
> index 7b0205a..5bdcef8 100644
> --- a/src/mesa/drivers/dri/i965/brw_clip_util.c
> +++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
> @@ -129,6 +129,8 @@ static void brw_clip_project_vertex( struct
> brw_clip_compile *c,
>
>  /* Interpolate between two vertices and put the result into a0.0.
>   * Increment a0.0 accordingly.
> + *
> + * Beware that dest_ptr can be equal to v0_ptr.
>   */
>  void brw_clip_interp_vertex( struct brw_clip_compile *c,
>                              struct brw_indirect dest_ptr,
> @@ -138,8 +140,9 @@ void brw_clip_interp_vertex( struct brw_clip_compile
> *c,
>                              bool force_edgeflag)
>  {
>     struct brw_compile *p = &c->func;
> -   struct brw_reg tmp = get_tmp(c);
> -   GLuint slot;
> +   struct brw_context *brw = p->brw;
> +   struct brw_reg tmp, t_nopersp, v0_ndc_copy;
> +   GLuint slot, delta;
>
>     /* Just copy the vertex header:
>      */
> @@ -148,13 +151,119 @@ void brw_clip_interp_vertex( struct
> brw_clip_compile *c,
>      * back on Ironlake, so needn't change it
>      */
>     brw_copy_indirect_to_indirect(p, dest_ptr, v0_ptr, 1);
> -
> -   /* Iterate over each attribute (could be done in pairs?)
> +
> +   /*
> +    * First handle the 3D and NDC positioning, in case we need
> +    * noperspective interpolation.  Doing it early has no performance
> +    * impact in any case.
> +    */
> +
> +   /* Start by picking up the v0 NDC coordinates, because that vertex
> +    * may be shared with the destination.
> +    */
> +   if (c->key.has_noperspective_shading) {
> +      v0_ndc_copy = get_tmp(c);
> +      brw_MOV(p, v0_ndc_copy, deref_4f(v0_ptr,
> +
> brw_vert_result_to_offset(&c->vue_map,
> +
> BRW_VERT_RESULT_NDC)));
> +   }
>

Style nit-pick: this is a lot of indentation.  How about this instead:

   if (c->key.has_noperspective_shading) {
      unsigned offset =
         brw_vert_result_to_offset(&c->vue_map, BRW_VERT_RESULT_NDC);
      v0_ndc_copy = get_tmp(c);
      brw_MOV(p, v0_ndc_copy, deref_4f(v0_ptr, offset));
   }



> +
> +   /*
> +    * Compute the new 3D position
> +    */
> +
> +   delta = brw_vert_result_to_offset(&c->vue_map, VERT_RESULT_HPOS);
> +   tmp = get_tmp(c);
> +   brw_MUL(p,
> +           vec4(brw_null_reg()),
> +           deref_4f(v1_ptr, delta),
> +           t0);
> +
> +   brw_MAC(p,
> +           tmp,
> +           negate(deref_4f(v0_ptr, delta)),
> +           t0);
> +
> +   brw_ADD(p,
> +           deref_4f(dest_ptr, delta),
> +           deref_4f(v0_ptr, delta),
> +           tmp);
> +   release_tmp(c, tmp);
>

Since delta and tmp are used elsewhere in this function for other purposes,
I would feel more comfortable if we created a local scope for them by
enclosing this small chunk of code in curly braces, e.g.:

   {
      /*
       * Compute the new 3D position
       */

      GLuint delta = brw_vert_result_to_offset(&c->vue_map,
VERT_RESULT_HPOS);
      struct brw_reg tmp = get_tmp(c);
      brw_MUL(p,
              vec4(brw_null_reg()),
              deref_4f(v1_ptr, delta),
              t0);

      brw_MAC(p,
              tmp,
              negate(deref_4f(v0_ptr, delta)),
              t0);

      brw_ADD(p,
              deref_4f(dest_ptr, delta),
              deref_4f(v0_ptr, delta),
              tmp);
      release_tmp(c, tmp);
   }

Also, it would be nice to have a comment above each small group of
instructions showing what they compute as a formula.  For example, above
these three instructions we could say:

/* dest_hpos = v0_hpos * (1 - t0) + v1_hpos * t0 */


> +
> +   /* Then recreate the projected (NDC) coordinate in the new vertex
> +    * header
>      */
> +   brw_clip_project_vertex(c, dest_ptr);
> +
> +   /*
> +    * If we have noperspective attributes, we now need to compute the
> +    * screen-space t.
> +    */
> +   if (c->key.has_noperspective_shading) {
> +      delta = brw_vert_result_to_offset(&c->vue_map, BRW_VERT_RESULT_NDC);
> +      t_nopersp = get_tmp(c);
> +      tmp = get_tmp(c);
>

Along the same lines as my previous comment, I'd prefer to make these three
variables local to this scope, e.g.:

   if (c->key.has_noperspective_shading) {
      GLuint delta = brw_vert_result_to_offset(&c->vue_map,
BRW_VERT_RESULT_NDC);
      struct brw_reg t_nopersp = get_tmp(c);
      struct brw_reg tmp = get_tmp(c);



> +
> +      /* Build a register with coordinates from the second and new
> vertices */
>

In the code below, I could really use some comments to clarify the
computation you're doing.  I'll insert some suggestions inline:

      /* t_nopersp = vec4(v1_ndc.xy, dest_ndc.xy) */

> +      brw_MOV(p, t_nopersp, deref_4f(v1_ptr, delta));
> +      brw_MOV(p, tmp, deref_4f(dest_ptr, delta));
> +      brw_set_access_mode(p, BRW_ALIGN_16);
> +      brw_MOV(p,
> +              brw_writemask(t_nopersp, WRITEMASK_ZW),
> +              brw_swizzle(tmp, 0,1,0,1));
> +
> +      /* Subtract the coordinates of the first vertex */
>
      /* t_nopersp -= v0_ndc_copy.xyxy
       * Therefore t_nopersp = vec4(v1_ndc.xy - v0_ndc.xy,
       *                            dest_ndc.xy - v0_ndc.xy)
       */

> +      brw_ADD(p, t_nopersp, t_nopersp, negate(brw_swizzle(v0_ndc_copy,
> 0,1,0,1)));
> +
> +      /* Add the absolute value of the X and Y deltas so that if the
> +       * points aren't in the same place on the screen we get non-zero
> +       * values to divide.
> +       *
> +       * After that we have vert1-vert0 in t_nopersp.x and vertnew-vert0
> in t_nopersp.y.
> +       */
>
      /* t_nopersp.xy = |t_nopersp.xz| + |t_nopersp.yw|
       * Therefore:
       * t_nopersp = vec4(|v1_ndc.x - v0_ndc.x| + |v1_ndc.y - v0_ndc.y|,
       *                  |dest_ndc.x - v0_ndc.x| + |dest_ndc.y - v0_ndc.y|,
       *                  <don't care>, <don't care>)
       */

> +      brw_ADD(p,
> +              brw_writemask(t_nopersp, WRITEMASK_XY),
> +              brw_abs(brw_swizzle(t_nopersp, 0,2,0,0)),
> +              brw_abs(brw_swizzle(t_nopersp, 1,3,0,0)));
> +      brw_set_access_mode(p, BRW_ALIGN_1);
> +
> +      /* If the points are in the same place (vert1-vert0 == 0), just
> +       * substitute a value that will ensure that we don't divide by
> +       * 0.
> +       */
> +      brw_CMP(p, vec1(brw_null_reg()), BRW_CONDITIONAL_EQ,
> +              vec1(t_nopersp),
> +              brw_imm_f(1));
>

Shouldn't this be brw_imm_f(0)?


> +      brw_IF(p, BRW_EXECUTE_1);
> +      brw_MOV(p, t_nopersp, brw_imm_vf4(VF_ONE, VF_ZERO, VF_ZERO,
> VF_ZERO));
> +      brw_ENDIF(p);
> +
> +      /* Now compute t_nopersp = t_nopersp.y/t_nopersp.x and broadcast it
> */
> +      brw_math_invert(p, get_element(t_nopersp, 0),
> get_element(t_nopersp, 0));
> +      brw_MUL(p,
> +              vec1(t_nopersp),
> +              vec1(t_nopersp),
> +              vec1(suboffset(t_nopersp, 1)));
> +      brw_set_access_mode(p, BRW_ALIGN_16);
> +      brw_MOV(p, t_nopersp, brw_swizzle(t_nopersp, 0,0,0,0));
> +      brw_set_access_mode(p, BRW_ALIGN_1);
>

That was a very clever computation.  Well done.


> +
> +      release_tmp(c, tmp);
> +      release_tmp(c, v0_ndc_copy);
> +   }
> +
> +   /* Now we can iterate over each attribute
> +    * (could be done in pairs?)
> +    */
> +   tmp = get_tmp(c);
>     for (slot = 0; slot < c->vue_map.num_slots; slot++) {
>        int vert_result = c->vue_map.slot_to_vert_result[slot];
>        GLuint delta = brw_vue_slot_to_offset(slot);
>
> +      /* HPOS is already handled */
> +      if(vert_result == VERT_RESULT_HPOS)
> +         continue;
> +
>        if (vert_result == VERT_RESULT_EDGE) {
>          if (force_edgeflag)
>             brw_MOV(p, deref_4f(dest_ptr, delta), brw_imm_f(1));
> @@ -174,15 +283,20 @@ void brw_clip_interp_vertex( struct brw_clip_compile
> *c,
>           *
>           *        New = attr0 + t*attr1 - t*attr0
>           */
> +
> +         struct brw_reg t =
> +            brw_get_interpolation_mode(brw, slot) ==
> INTERP_QUALIFIER_NOPERSPECTIVE ?
> +            t_nopersp : t0;
> +
>          brw_MUL(p,
>                  vec4(brw_null_reg()),
>                  deref_4f(v1_ptr, delta),
> -                t0);
> +                t);
>
>          brw_MAC(p,
>                  tmp,
>                  negate(deref_4f(v0_ptr, delta)),
> -                t0);
> +                t);
>
>          brw_ADD(p,
>                  deref_4f(dest_ptr, delta),
> @@ -198,11 +312,8 @@ void brw_clip_interp_vertex( struct brw_clip_compile
> *c,
>     }
>
>     release_tmp(c, tmp);
> -
> -   /* Recreate the projected (NDC) coordinate in the new vertex
> -    * header:
> -    */
> -   brw_clip_project_vertex(c, dest_ptr );
> +   if (c->key.has_noperspective_shading)
> +      release_tmp(c, t_nopersp);
>  }
>
>  void brw_clip_emit_vue(struct brw_clip_compile *c,
> --
> 1.7.10.rc3.1.gb306
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

With the exception of my question about brw_imm_f(0), all of my comments on
this patch are stylistic suggestions.  So assuming we get the brw_imm_f(0)
thing figured out, this patch is:

Reviewed-by: Paul Berry <stereotype441@gmail.com>

[-- Attachment #1.2: Type: text/html, Size: 13964 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
  2012-07-17  8:57     ` [Mesa-dev] " Olivier Galibert
@ 2012-07-17 14:37       ` Paul Berry
  2012-07-29 17:00         ` [Mesa-dev] " Olivier Galibert
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Berry @ 2012-07-17 14:37 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 3102 bytes --]

On 17 July 2012 01:57, Olivier Galibert <galibert@pobox.com> wrote:

> On Mon, Jul 16, 2012 at 08:43:17PM -0700, Paul Berry wrote:
> > Also, I'm not convinced that #3 is necessary.  Is there something in the
> > spec that dictates this behaviour?  My reading of the spec is that if the
> > vertex shader writes to gl_BackColor but not glFrontColor, then the
> > contents of gl_Color in the fragment shader is undefined.
>
> Oh, I remember why I did that in the first place.  All the front/back
> piglit tests only write the appropriate color slot and not the other
> one.
>
> The language is annoying:
>   The following built-in vertex output variables are available, but
> deprecated. A particular one should be
>   written to if any functionality in a corresponding fragment shader or
> fixed pipeline uses it or state derived
>   from it. Otherwise, behavior is undefined.
>   out vec4 gl_FrontColor;
>   out vec4 gl_BackColor;
>   out vec4 gl_FrontSecondaryColor;
>   out vec4 gl_BackSecondaryColor;
>   [...]
>
> One could argue that you don't "use" gl_FrontColor if all your
> polygons are back-facing.  Dunno.  Do you consider all of the twoside
> piglit tests buggy?  We can fix *that*.
>
>   OG.
>
>
You have a good point about the piglit tests.  It's weird that they don't
set gl_FrontColor, but I can't bring myself to consider it a bug.  Sigh, I
was really hoping we could consider it undefined to write to gl_BackColor
and not gl_FrontColor, but I guess it's reasonable for someone to expect
that if they only write to gl_BackColor, and they only render backfacing
polygons, they should get a sensible value in the fragment shader.

If possible, I would still like to think of a way to address this situation
that (a) doesn't require modifying both fragment shader back-ends and the
SF program, and (b) helps all Mesa drivers, not just Intel Gen4-5.
Especially because I suspect we may have bugs in Gen6-7 related to this
situation.  Would you be happy with one of the following two alternatives?

1. In the GLSL front-end, if we detect that a vertex shader writes to
gl_BackColor but not gl_FrontColor, then automatically insert
"gl_FrontColor = 0;" into the shader.  This will guarantee that whenever
gl_BackColor is written, gl_FrontColor is too.

2. In the function brw_compute_vue_map(), assign a VUE slot for
VERT_RESULT_COL0 whenever *either* VERT_RESULT_COL0 or VERT_RESULT_BFC0 is
used.  This will guarantee that we always have a VUE slot available for
front color, so we don't have to be as tricky in the FS and SF code.

Note that alternative 2 leaves the contents of gl_FrontColor undefined if
the vertex shader doesn't write to it.  Although I appreciate what you're
saying about security leaks due to reads out of place, I think that in this
case the security risk is negligible, since the garbage value that winds up
in gl_FrontColor will just be a value out of some random register in the
vertex shader, not a random value out of memory and not a value belonging
to some other process.

This morning I'll try to ask some other Intel folks for their opinion on
the subject.

[-- Attachment #1.2: Type: text/html, Size: 3641 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Intel-gfx] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes
  2012-07-17  2:33       ` Paul Berry
@ 2012-07-17 14:42         ` Paul Berry
  0 siblings, 0 replies; 24+ messages in thread
From: Paul Berry @ 2012-07-17 14:42 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 1179 bytes --]

On 16 July 2012 19:33, Paul Berry <stereotype441@gmail.com> wrote:

> On 14 July 2012 02:21, Olivier Galibert <galibert@pobox.com> wrote:
>
>> On Fri, Jul 13, 2012 at 02:45:10PM -0700, Kenneth Graunke wrote:
>> > Sorry...been really busy, and most of us haven't actually spent much if
>> > any time in the clipper shaders.  I'll try and review it within a week.
>>
>> Ok cool, lack of time is something I completely understand :-)
>>
>>
>> > Despite the lack of response, I am really excited to see that you're
>> > working on this---this is a huge step toward bringing GL 3.x back to
>> > Gen4/5, and we're all really glad to see it happen!
>>
>> Excellent.  I was starting to wonder if gen4/5 was abandoned (by lack
>> of resources if anything), nice to see it isn't.
>>
>
> I'm starting to look at these patches too, since I'm the one who wrote the
> VUE code, and I have some familiarity with the Gen4/5 clipper.  I'll try to
> get some feedback to you within the next 24 hours.
>

By the way I want to commend you for digging into a really complex corner
of the code and making what look like very solid improvements to it.  I'm
excited to see what else you have planned :)

[-- Attachment #1.2: Type: text/html, Size: 1810 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
  2012-07-17 14:37       ` Paul Berry
@ 2012-07-29 17:00         ` Olivier Galibert
  2012-07-30 17:30           ` [Intel-gfx] " Eric Anholt
  0 siblings, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-07-29 17:00 UTC (permalink / raw)
  To: Paul Berry; +Cc: mesa-dev, intel-gfx

On Tue, Jul 17, 2012 at 07:37:43AM -0700, Paul Berry wrote:
> If possible, I would still like to think of a way to address this situation
> that (a) doesn't require modifying both fragment shader back-ends and the
> SF program, and (b) helps all Mesa drivers, not just Intel Gen4-5.
> Especially because I suspect we may have bugs in Gen6-7 related to this
> situation. 

You don't :-) It's correctly handled in
gen6_sf_state.c::get_attr_override with similar semantics too.

> Would you be happy with one of the following two alternatives?
> 
> 1. In the GLSL front-end, if we detect that a vertex shader writes to
> gl_BackColor but not gl_FrontColor, then automatically insert
> "gl_FrontColor = 0;" into the shader.  This will guarantee that whenever
> gl_BackColor is written, gl_FrontColor is too.
> 
> 2. In the function brw_compute_vue_map(), assign a VUE slot for
> VERT_RESULT_COL0 whenever *either* VERT_RESULT_COL0 or VERT_RESULT_BFC0 is
> used.  This will guarantee that we always have a VUE slot available for
> front color, so we don't have to be as tricky in the FS and SF code.

With both methods the SF code is not really simplified.  Doing the mov
without testing would require writing to/reserving a slot for
gl_BackColor if gl_FrontColor is written to, which wouldn't be
acceptable.  And to write to/reserve a slot for the two of them if
gl_Color is read in any case.  Probably unacceptable.  So the need_*
stuff is going to stay in any case :/

So the only simplification would be in the fs/wm and I'm somewhat
afraid of having a vue slot that's not in outputs_written of the
previous stage.  They seem to be expected equivalent.

> This morning I'll try to ask some other Intel folks for their opinion on
> the subject.

Did they have an opinion?

Best,

  OG.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Intel-gfx] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
  2012-07-29 17:00         ` [Mesa-dev] " Olivier Galibert
@ 2012-07-30 17:30           ` Eric Anholt
  2012-07-30 19:55             ` [Mesa-dev] " Olivier Galibert
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Anholt @ 2012-07-30 17:30 UTC (permalink / raw)
  To: Olivier Galibert, Paul Berry; +Cc: mesa-dev, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 1729 bytes --]

Olivier Galibert <galibert@pobox.com> writes:

> On Tue, Jul 17, 2012 at 07:37:43AM -0700, Paul Berry wrote:
>> If possible, I would still like to think of a way to address this situation
>> that (a) doesn't require modifying both fragment shader back-ends and the
>> SF program, and (b) helps all Mesa drivers, not just Intel Gen4-5.
>> Especially because I suspect we may have bugs in Gen6-7 related to this
>> situation. 
>
> You don't :-) It's correctly handled in
> gen6_sf_state.c::get_attr_override with similar semantics too.
>
>> Would you be happy with one of the following two alternatives?
>> 
>> 1. In the GLSL front-end, if we detect that a vertex shader writes to
>> gl_BackColor but not gl_FrontColor, then automatically insert
>> "gl_FrontColor = 0;" into the shader.  This will guarantee that whenever
>> gl_BackColor is written, gl_FrontColor is too.
>> 
>> 2. In the function brw_compute_vue_map(), assign a VUE slot for
>> VERT_RESULT_COL0 whenever *either* VERT_RESULT_COL0 or VERT_RESULT_BFC0 is
>> used.  This will guarantee that we always have a VUE slot available for
>> front color, so we don't have to be as tricky in the FS and SF code.
>
> With both methods the SF code is not really simplified.  Doing the mov
> without testing would require writing to/reserving a slot for
> gl_BackColor if gl_FrontColor is written to, which wouldn't be
> acceptable.  And to write to/reserve a slot for the two of them if
> gl_Color is read in any case.  Probably unacceptable.  So the need_*
> stuff is going to stay in any case :/

I'm perfectly fine with the VUE containing slots for both when the app
has gone out of its way to ask for deprecated two-sided color
rendering.

[-- Attachment #1.2: Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
  2012-07-30 17:30           ` [Intel-gfx] " Eric Anholt
@ 2012-07-30 19:55             ` Olivier Galibert
  2012-07-31 15:28               ` [Intel-gfx] " Eric Anholt
  0 siblings, 1 reply; 24+ messages in thread
From: Olivier Galibert @ 2012-07-30 19:55 UTC (permalink / raw)
  To: Eric Anholt; +Cc: intel-gfx, mesa-dev

On Mon, Jul 30, 2012 at 10:30:57AM -0700, Eric Anholt wrote:
> I'm perfectly fine with the VUE containing slots for both when the app
> has gone out of its way to ask for deprecated two-sided color
> rendering.

Are you also ok with recompiler the shaders when that enable is
switched?

  OG.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Intel-gfx] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
  2012-07-30 19:55             ` [Mesa-dev] " Olivier Galibert
@ 2012-07-31 15:28               ` Eric Anholt
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Anholt @ 2012-07-31 15:28 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: intel-gfx, mesa-dev


[-- Attachment #1.1: Type: text/plain, Size: 513 bytes --]

Olivier Galibert <galibert@pobox.com> writes:

> On Mon, Jul 30, 2012 at 10:30:57AM -0700, Eric Anholt wrote:
>> I'm perfectly fine with the VUE containing slots for both when the app
>> has gone out of its way to ask for deprecated two-sided color
>> rendering.
>
> Are you also ok with recompiler the shaders when that enable is
> switched?

Yes, you have to include it in the program key and recompile.  But
people will consistently use the same values for things that land in
program key contents, generally.

[-- Attachment #1.2: Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2012-07-31 15:28 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-30 18:50 [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
2012-06-30 18:50 ` [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE Olivier Galibert
2012-07-17  3:43   ` [Mesa-dev] " Paul Berry
2012-07-17  8:33     ` [Intel-gfx] " Olivier Galibert
2012-07-17  8:57     ` [Mesa-dev] " Olivier Galibert
2012-07-17 14:37       ` Paul Berry
2012-07-29 17:00         ` [Mesa-dev] " Olivier Galibert
2012-07-30 17:30           ` [Intel-gfx] " Eric Anholt
2012-07-30 19:55             ` [Mesa-dev] " Olivier Galibert
2012-07-31 15:28               ` [Intel-gfx] " Eric Anholt
2012-06-30 18:50 ` [PATCH 2/5] intel gen4-5: Compute the interpolation status for every variable in one place Olivier Galibert
2012-07-17  4:24   ` Paul Berry
2012-06-30 18:50 ` [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf Olivier Galibert
2012-07-17 12:50   ` [Mesa-dev] " Paul Berry
2012-07-17 13:07     ` Paul Berry
2012-06-30 18:50 ` [PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the clipper Olivier Galibert
2012-07-17 12:55   ` [Mesa-dev] " Paul Berry
2012-06-30 18:50 ` [PATCH 5/5] intel gen4-5: Make noperspective clipping work Olivier Galibert
2012-07-17 14:07   ` [Mesa-dev] " Paul Berry
2012-07-13  7:20 ` [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes Olivier Galibert
2012-07-13 21:45   ` Kenneth Graunke
2012-07-14  9:21     ` Olivier Galibert
2012-07-17  2:33       ` Paul Berry
2012-07-17 14:42         ` [Intel-gfx] " Paul Berry

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.