From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ilia Mirkin Subject: Re: [PATCH v2] nv50/ir: optimize shl(a, 0) to a Date: Sun, 30 Apr 2017 10:09:04 -0400 Message-ID: References: <20170429220934.3355-1-karolherbst@gmail.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0836588652==" Return-path: In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: mesa-dev-bounces@lists.freedesktop.org Sender: "mesa-dev" To: Karol Herbst Cc: nouveau@lists.freedesktop.org, mesa-dev@lists.freedesktop.org List-Id: nouveau.vger.kernel.org --===============0836588652== Content-Type: multipart/alternative; boundary=001a113e4784731109054e62d94b --001a113e4784731109054e62d94b Content-Type: text/plain; charset=UTF-8 On Apr 30, 2017 8:14 AM, "Karol Herbst" wrote: 2017-04-30 2:28 GMT+02:00 Ilia Mirkin : > Maybe in a separate change. I'd want to double check on all gens. I think > the thing I suggested is sufficient. > well, if I just fixup the op, I kind of have to fix the mod as well. And if I use getOp, it could also return a OP_CVT, so I have to do the check. I don't see how I can only use getOp, but not fixing the mod? Perhaps this all doesn't work the way I remember. Will try to look during the week. > On Apr 29, 2017 8:09 PM, "Karol Herbst" wrote: > > 2017-04-30 0:28 GMT+02:00 Ilia Mirkin : >> On Sat, Apr 29, 2017 at 6:09 PM, Karol Herbst >> wrote: >>> helps two alien isolation shaders >>> >>> shader-db: >>> total instructions in shared programs : 4251497 -> 4251494 (-0.00%) >>> total gprs used in shared programs : 513962 -> 513962 (0.00%) >>> total local used in shared programs : 29797 -> 29797 (0.00%) >>> total bytes used in shared programs : 38960264 -> 38960232 (-0.00%) >>> >>> local gpr inst bytes >>> helped 0 0 2 2 >>> hurt 0 0 0 0 >>> >>> v2: handle potential mods on src0 >>> >>> Signed-off-by: Karol Herbst >>> Reviewed-by: Samuel Pitoiset >>> Reviewed-by: Ilia Mirkin >>> --- >>> src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 7 +++++++ >>> 1 file changed, 7 insertions(+) >>> >>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> index 015def0391..82da0d3e48 100644 >>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> @@ -1284,6 +1284,13 @@ ConstantFolding::opnd(Instruction *i, >>> ImmediateValue &imm0, int s) >>> >>> case OP_SHL: >>> { >>> + if (s == 1 && imm0.isInteger(0)) { >>> + i->op = i->src(0).mod.getOp(); >>> + if (i->op != OP_CVT) >>> + i->src(0).mod = 0; >> >> Is this necessary? Presumably if the op != 0, then op == OP_CVT... >> > > yeah, no idea. I just thought I do it right when I actually depend on > the getOp magic. But we can't emit any mods to begin with, so maybe we > should just drop the mod handling and be done with it? > >>> + i->setSrc(1, NULL); >>> + break; >>> + } >>> if (s != 1 || i->src(0).mod != Modifier(0)) >>> break; >>> // try to concatenate shifts >>> -- >>> 2.12.2 >>> > > --001a113e4784731109054e62d94b Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable


On Apr 30, 2017 8:14 AM, "Karol Herbst" <karolherbst@gmail.com> wrote:
2017-0= 4-30 2:28 GMT+02:00 Ilia Mirkin <imirkin@alum.mit.edu>:
> Maybe in a separate change. I'd want to double check on all gens. = I think
> the thing I suggested is sufficient.
>

well, if I just fixup the op, I kind of have to fix the mod as well.<= br> And if I use getOp, it could also return a OP_CVT, so I have to do the
check.

I don't see how I can only use getOp, but not fixing the mod?

Perh= aps this all doesn't work the way I remember. Will try to look during t= he week.


> On Apr 29, 2017 8:09 PM, "Karol Herbst" <karolherbst@gmail.com> wrote:
>
> 2017-04-30 0:28 GMT+02:00 Ilia Mirkin <imirkin@alum.mit.edu>:
>> On Sat, Apr 29, 2017 at 6:09 PM, Karol Herbst <karolherbst@gmail.com>
>> wrote:
>>> helps two alien isolation shaders
>>>
>>> shader-db:
>>> total instructions in shared programs : 4251497 -> 4251494 = (-0.00%)
>>> total gprs used in shared programs=C2=A0 =C2=A0 : 513962 ->= 513962 (0.00%)
>>> total local used in shared programs=C2=A0 =C2=A0: 29797 -> = 29797 (0.00%)
>>> total bytes used in shared programs=C2=A0 =C2=A0: 38960264 -&g= t; 38960232 (-0.00%)
>>>
>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0l= ocal=C2=A0 =C2=A0 =C2=A0 =C2=A0 gpr=C2=A0 =C2=A0 =C2=A0 =C2=A0inst=C2=A0 = =C2=A0 =C2=A0 bytes
>>>=C2=A0 =C2=A0 =C2=A0helped=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A00=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A00=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A02=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02
>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0hurt=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A00=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A00=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A00=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A00
>>>
>>> v2: handle potential mods on src0
>>>
>>> Signed-off-by: Karol Herbst <karolherbst@gmail.com>
>>> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
>>> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
>>> ---
>>>=C2=A0 src/gallium/drivers/nouveau/codegen/nv50_ir_peephol= e.cpp | 7 +++++++
>>>=C2=A0 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_= peephole.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cp= p
>>> index 015def0391..82da0d3e48 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephol= e.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephol= e.cpp
>>> @@ -1284,6 +1284,13 @@ ConstantFolding::opnd(Instruction = *i,
>>> ImmediateValue &imm0, int s)
>>>
>>>=C2=A0 =C2=A0 =C2=A0case OP_SHL:
>>>=C2=A0 =C2=A0 =C2=A0{
>>> +=C2=A0 =C2=A0 =C2=A0 if (s =3D=3D 1 && imm0.isInteger= (0)) {
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0i->op =3D i->src(0).m= od.getOp();
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (i->op !=3D OP_CVT) >>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i->src(0).mod = =3D 0;
>>
>> Is this necessary? Presumably if the op !=3D 0, then op =3D=3D OP_= CVT...
>>
>
> yeah, no idea. I just thought I do it right when I actually depend on<= br> > the getOp magic. But we can't emit any mods to begin with, so mayb= e we
> should just drop the mod handling and be done with it?
>
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0i->setSrc(1, NULL);
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0break;
>>> +=C2=A0 =C2=A0 =C2=A0 }
>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 if (s !=3D 1 || i->src(0).mod != =3D Modifier(0))
>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0break;
>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 // try to concatenate shifts
>>> --
>>> 2.12.2
>>>
>
>

--001a113e4784731109054e62d94b-- --===============0836588652== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KbWVzYS1kZXYg bWFpbGluZyBsaXN0Cm1lc2EtZGV2QGxpc3RzLmZyZWVkZXNrdG9wLm9yZwpodHRwczovL2xpc3Rz LmZyZWVkZXNrdG9wLm9yZy9tYWlsbWFuL2xpc3RpbmZvL21lc2EtZGV2Cg== --===============0836588652==--