* [PATCH v2 0/3] nv50/ir: Prepare for running Opts inside a loop
@ 2017-04-03 15:58 Karol Herbst
       [not found] ` <20170403155822.1810-1-karolherbst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-04-03 15:58 ` [PATCH v2 3/3] nv50/ir: run some passes multiple times Karol Herbst
  0 siblings, 2 replies; 6+ messages in thread
From: Karol Herbst @ 2017-04-03 15:58 UTC (permalink / raw)
  To: mesa-dev-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

We are slowly getting to the point where we miss a significant number of
optimization opportunities that are created by our own passes.

To address this, AlgebraicOpt first needs to handle mods on sources
without creating new issues.

The last patch enables running the opts in a loop.

v2: update commit author

Karol Herbst (3):
  nv50/ir: fix AlgebraicOpt for slcts with mods
  nv50/ir: handle logops with NOT in AlgebraicOpt
  nv50/ir: run some passes multiple times

 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 29 +++++++++++++++-------
 1 file changed, 20 insertions(+), 9 deletions(-)

-- 
2.12.2

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau

* [PATCH v2 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods
       [not found] ` <20170403155822.1810-1-karolherbst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-04-03 15:58   ` Karol Herbst
       [not found]     ` <20170403155822.1810-2-karolherbst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-04-03 15:58   ` [PATCH v2 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt Karol Herbst
  1 sibling, 1 reply; 6+ messages in thread
From: Karol Herbst @ 2017-04-03 15:58 UTC (permalink / raw)
  To: mesa-dev-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 4c92a1efb5..bd60a84998 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1797,10 +1797,10 @@ AlgebraicOpt::handleSLCT(Instruction *slct)
       if (slct->getSrc(2)->asImm()->compare(slct->asCmp()->setCond, 0.0f))
          slct->setSrc(0, slct->getSrc(1));
    } else
-   if (slct->getSrc(0) != slct->getSrc(1)) {
+   if (slct->getSrc(0) != slct->getSrc(1) || slct->src(0).mod != slct->src(1).mod)
       return;
-   }
-   slct->op = OP_MOV;
+   slct->op = slct->src(0).mod.getOp();
+   slct->src(0).mod = slct->src(0).mod ^ Modifier(slct->op);
    slct->setSrc(1, NULL);
    slct->setSrc(2, NULL);
 }
-- 
2.12.2

* [PATCH v2 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt
       [not found] ` <20170403155822.1810-1-karolherbst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-04-03 15:58   ` [PATCH v2 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods Karol Herbst
@ 2017-04-03 15:58   ` Karol Herbst
  2017-04-09 17:36     ` Ilia Mirkin
  1 sibling, 1 reply; 6+ messages in thread
From: Karol Herbst @ 2017-04-03 15:58 UTC (permalink / raw)
  To: mesa-dev-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index bd60a84998..0de84fe9fc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1856,6 +1856,12 @@ AlgebraicOpt::handleLOGOP(Instruction *logop)
 
       set0 = cloneForward(func, set0);
       set1 = cloneShallow(func, set1);
+
+      if (logop->src(0).mod == Modifier(NV50_IR_MOD_NOT))
+         set0->asCmp()->setCond = inverseCondCode(set0->asCmp()->setCond);
+      if (logop->src(1).mod == Modifier(NV50_IR_MOD_NOT))
+         set1->asCmp()->setCond = inverseCondCode(set1->asCmp()->setCond);
+
       logop->bb->insertAfter(logop, set1);
       logop->bb->insertAfter(logop, set0);
 
-- 
2.12.2

* [PATCH v2 3/3] nv50/ir: run some passes multiple times
  2017-04-03 15:58 [PATCH v2 0/3] nv50/ir: Prepare for running Opts inside a loop Karol Herbst
       [not found] ` <20170403155822.1810-1-karolherbst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-04-03 15:58 ` Karol Herbst
  1 sibling, 0 replies; 6+ messages in thread
From: Karol Herbst @ 2017-04-03 15:58 UTC (permalink / raw)
  To: mesa-dev; +Cc: nouveau

With the shader cache, compilation time matters less, so we can afford to
run some of the passes more than once.

As a side effect, we can write more optimizations and produce better
optimized code.

total instructions in shared programs : 3931743 -> 3917512 (-0.36%)
total gprs used in shared programs    : 481460 -> 481680 (0.05%)
total local used in shared programs   : 27481 -> 26761 (-2.62%)
total bytes used in shared programs   : 36032672 -> 35902648 (-0.36%)

                local        gpr       inst      bytes
    helped         48        133       3843       3843
      hurt          1        295         75         75

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp        | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 0de84fe9fc..505de08573 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -3729,12 +3729,17 @@ Program::optimizeSSA(int level)
    RUN_PASS(1, CopyPropagation, run);
    RUN_PASS(1, MergeSplits, run);
    RUN_PASS(2, GlobalCSE, run);
-   RUN_PASS(1, LocalCSE, run);
-   RUN_PASS(2, AlgebraicOpt, run);
-   RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
-   RUN_PASS(1, ConstantFolding, foldAll);
-   RUN_PASS(2, LateAlgebraicOpt, run);
-   RUN_PASS(1, Split64BitOpPreRA, run);
+   for (int i = 0; i < 2; ++i) {
+      RUN_PASS(1, LocalCSE, run);
+      RUN_PASS(2, AlgebraicOpt, run);
+      RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
+      RUN_PASS(1, ConstantFolding, foldAll);
+      RUN_PASS(2, LateAlgebraicOpt, run);
+      // only once
+      if (i == 0)
+         RUN_PASS(1, Split64BitOpPreRA, run);
+      RUN_PASS(1, DeadCodeElim, buryAll);
+   }
    RUN_PASS(1, LoadPropagation, run);
    RUN_PASS(1, IndirectPropagation, run);
    RUN_PASS(2, MemoryOpt, run);
-- 
2.12.2

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
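
The shape of the loop in the patch above (a list of passes run twice, with one pass restricted to the first iteration) can be modelled outside of Mesa. This is only a sketch under stated assumptions: `Pass`, `onlyFirstIteration` and `runLoop` are hypothetical names for illustration, not part of nv50_ir, and the real code uses the `RUN_PASS` macro rather than a vector of callbacks.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of one optimization pass; not an nv50_ir type.
struct Pass {
   std::string name;
   bool onlyFirstIteration;     // e.g. a lowering step that must not repeat
   std::function<bool()> run;   // returns false if the pass failed
};

// Run every pass 'iterations' times, skipping run-once passes after the
// first iteration. The names of executed passes are appended to 'log'.
inline bool runLoop(const std::vector<Pass> &passes, int iterations,
                    std::vector<std::string> &log)
{
   for (int i = 0; i < iterations; ++i) {
      for (const Pass &p : passes) {
         if (p.onlyFirstIteration && i != 0)
            continue;
         log.push_back(p.name);
         if (!p.run())
            return false; // stop at the first failing pass
      }
   }
   return true;
}
```

With three passes A, B and C, where only B is marked run-once, two iterations execute A, B, C, A, C. A fixed iteration count keeps compile time bounded; iterating until no pass reports a change would be an alternative, but that is not what the patch does.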

* Re: [PATCH v2 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods
       [not found]     ` <20170403155822.1810-2-karolherbst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-04-09 17:34       ` Ilia Mirkin
  0 siblings, 0 replies; 6+ messages in thread
From: Ilia Mirkin @ 2017-04-09 17:34 UTC (permalink / raw)
  To: Karol Herbst
  Cc: mesa-dev-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Mon, Apr 3, 2017 at 11:58 AM, Karol Herbst <karolherbst@gmail.com> wrote:
> Signed-off-by: Karol Herbst <karolherbst@gmail.com>
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 4c92a1efb5..bd60a84998 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -1797,10 +1797,10 @@ AlgebraicOpt::handleSLCT(Instruction *slct)
>        if (slct->getSrc(2)->asImm()->compare(slct->asCmp()->setCond, 0.0f))
>           slct->setSrc(0, slct->getSrc(1));
>     } else
> -   if (slct->getSrc(0) != slct->getSrc(1)) {
> +   if (slct->getSrc(0) != slct->getSrc(1) || slct->src(0).mod != slct->src(1).mod)

SLCT can't have mods on src0/src1, only on src2. I'd be just as happy
to assert that they're both == 0 here. You could also add a helper to
ValueRef that checks whether it's == to another ValueRef, comparing
the Value ptr as well as any modifiers, indirects, etc. But again, it
ultimately doesn't need to be used here.

>        return;
> -   }
> -   slct->op = OP_MOV;
> +   slct->op = slct->src(0).mod.getOp();
> +   slct->src(0).mod = slct->src(0).mod ^ Modifier(slct->op);
>     slct->setSrc(1, NULL);
>     slct->setSrc(2, NULL);
>  }
> --
> 2.12.2
>
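
The equality helper suggested in the review could look roughly like the following. This is a minimal sketch on stand-in types, assuming a source reference carries a value pointer, modifier bits and up to two indirect indices; `SrcRef`, its fields and `sameRef` are hypothetical and do not mirror the real `ValueRef` layout.

```cpp
#include <cstdint>

// Hypothetical stand-ins; not the real nv50_ir classes.
struct Value { int id; };

struct SrcRef {
   const Value *value = nullptr;   // the value being read
   uint8_t mod = 0;                // modifier bits (neg, abs, not, ...)
   int8_t indirect[2] = {-1, -1};  // indirect source indices, -1 if unused
};

// True only when both references read the same value in the same way:
// same Value pointer, same modifiers, same indirect addressing.
inline bool sameRef(const SrcRef &a, const SrcRef &b)
{
   return a.value == b.value &&
          a.mod == b.mod &&
          a.indirect[0] == b.indirect[0] &&
          a.indirect[1] == b.indirect[1];
}
```

Comparing only the value pointers, as `getSrc(0) != getSrc(1)` does, treats `x` and `-x` as equal, which is why a helper like this has to include the modifiers.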

* Re: [PATCH v2 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt
  2017-04-03 15:58   ` [PATCH v2 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt Karol Herbst
@ 2017-04-09 17:36     ` Ilia Mirkin
  0 siblings, 0 replies; 6+ messages in thread
From: Ilia Mirkin @ 2017-04-09 17:36 UTC (permalink / raw)
  To: Karol Herbst; +Cc: mesa-dev, nouveau

On Mon, Apr 3, 2017 at 11:58 AM, Karol Herbst <karolherbst@gmail.com> wrote:
> Signed-off-by: Karol Herbst <karolherbst@gmail.com>
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index bd60a84998..0de84fe9fc 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -1856,6 +1856,12 @@ AlgebraicOpt::handleLOGOP(Instruction *logop)
>
>        set0 = cloneForward(func, set0);
>        set1 = cloneShallow(func, set1);
> +
> +      if (logop->src(0).mod == Modifier(NV50_IR_MOD_NOT))
> +         set0->asCmp()->setCond = inverseCondCode(set0->asCmp()->setCond);
> +      if (logop->src(1).mod == Modifier(NV50_IR_MOD_NOT))
> +         set1->asCmp()->setCond = inverseCondCode(set1->asCmp()->setCond);

set0/set1 may have been swapped further up, so you need to keep track of that.

Also, I don't think this will work if one of the sets is a SET_AND --
the condcode applies to the set bit, not to the AND bit. I think you'd
also have to flip AND <-> OR and flip the neg.

  -ilia
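
The SET_AND point above follows from De Morgan's law: NOT (p AND c) equals (NOT p) OR (NOT c), so inverting only the condition code is not enough. The sketch below checks this on a small model. Everything in it is an assumption made for illustration: the `Cond` encoding, `SetOp`, `run` and `negate` are hypothetical and are not the nv50_ir definitions, and integers are used so that no unordered (NaN) case arises.

```cpp
// Hypothetical encoding: bit 0 = less, bit 1 = equal, bit 2 = greater.
enum Cond : unsigned { LT = 1, EQ = 2, LE = 3, GT = 4, NE = 5, GE = 6 };

// Evaluate a comparison on integers (every pair is ordered).
inline bool eval(Cond cc, int a, int b)
{
   unsigned rel = a < b ? 1u : (a == b ? 2u : 4u);
   return (cc & rel) != 0;
}

// Inverting a condition flips which of the three relations it accepts.
inline Cond inverse(Cond cc) { return static_cast<Cond>(cc ^ 7u); }

enum Combine { COMB_AND, COMB_OR };

// Model of a fused "compare, then combine with a previous boolean" op.
struct SetOp {
   Combine op;     // how the comparison is combined with 'prev'
   Cond cc;        // the comparison itself
   bool negPrev;   // whether 'prev' is negated before combining
};

inline bool run(const SetOp &s, bool prev, int a, int b)
{
   bool p = s.negPrev ? !prev : prev;
   bool c = eval(s.cc, a, b);
   return s.op == COMB_AND ? (p && c) : (p || c);
}

// Logical negation of the whole op: flip AND <-> OR, invert the
// condition code, and flip the negation on the previous boolean.
inline SetOp negate(const SetOp &s)
{
   return { s.op == COMB_AND ? COMB_OR : COMB_AND, inverse(s.cc), !s.negPrev };
}
```

For example, with `prev == false`, `a == 1`, `b == 2`, the op `{COMB_AND, LT, false}` yields false. Inverting only the condition code gives `{COMB_AND, GE, false}`, which also yields false, whereas `negate` produces `{COMB_OR, GE, true}`, which yields true as required.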
