* Tessellation shaders get MEM_OUT_OF_BOUNDS errors / missing triangles
@ 2015-05-18 20:48 Ilia Mirkin
From: Ilia Mirkin @ 2015-05-18 20:48 UTC (permalink / raw)
  To: gpu-public-documentation; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hello,

I've been debugging a few different tessellation shader issues with
nouveau, but let's start small. I see this issue on my GK208 with high
frequency, and I *think* I've seen it once or twice on my GF108, but
it's exceedingly rare, if it does happen. I don't have a GK10x to test
on, unfortunately, but I assume it'll have the same issue as the
GK208.

The issue is this -- a bunch of triangles that should come out of the
tessellator end up black. I also see a GPC0/TPC1/MP trap:
MEM_OUT_OF_BOUNDS error produced by nouveau -- this is output in
response to an interrupt and MP trap generated by the hardware, read
out with nv_rd32(priv, TPC_UNIT(gpc, tpc, 0x648)); (see
gf100_gr_trap_mp). I assume some of the tessellation evaluation
invocations get killed, but I have no proof of this.
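
For readers unfamiliar with the register read above, here is a minimal sketch of the address arithmetic behind TPC_UNIT(gpc, tpc, 0x648). The base and stride constants are recalled from nouveau's gf100 graphics header and are an assumption here, so check them against the kernel tree before relying on them. The name tpc_unit is made up for illustration.

```python
# Assumed constants, recalled from nouveau's gf100 gr header; verify before use.
TPC_BASE = 0x504000     # start of the per-TPC register window
GPC_STRIDE = 0x8000     # distance between consecutive GPCs
TPC_STRIDE = 0x800      # distance between consecutive TPCs within a GPC


def tpc_unit(gpc: int, tpc: int, reg: int) -> int:
    """Return the MMIO offset of register `reg` in the given GPC/TPC."""
    return TPC_BASE + gpc * GPC_STRIDE + tpc * TPC_STRIDE + reg


if __name__ == "__main__":
    # GPC0/TPC1 is the unit named in the trap message above.
    print(hex(tpc_unit(0, 1, 0x648)))
```

Under those assumed constants, the trap status for GPC0/TPC1 would be read from a single 32-bit MMIO offset computed this way.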

I also see this: TRAP ch 5 [0x003facf000 shader_runner[19044]]

I would imagine that's some floating point number ending up in the
register instead of an address, but the fp32 value of it
(1.35107421875) does not seem familiar.
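
The fp32 reading quoted above can be reproduced with a short sketch that reinterprets the 32-bit word as an IEEE 754 single-precision value. The helper name bits_to_fp32 is hypothetical; only the standard struct module is used.

```python
import struct


def bits_to_fp32(word: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", word))[0]


if __name__ == "__main__":
    print(bits_to_fp32(0x3FACF000))  # 1.35107421875
```

This only checks the arithmetic; it says nothing about whether the word really is a float.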

Even when all the triangles show up, I still see the error on the
GK208, so I'm not sure if they're the same issue or not.

Now, here's the fun part -- this is completely non-deterministic.
Sometimes everything shows up on the GK208, other times I see holes,
in varying locations. I'm fairly sure that the actual shader code is
correct... so I must be doing something subtly wrong. (And yeah, tons of
missed optimization opportunities in this code, but let's not dwell on
that.)

This is the piglit test:

http://cgit.freedesktop.org/piglit/tree/tests/spec/arb_tessellation_shader/execution/quads.shader_test

It should be noted that other piglit tests don't exhibit this error,
however they also tend to be simpler. One key difference is that they
don't change the patch size in TCS. I'm including a link to a text
file with the tessellation control and evaluation shaders (decoded
with nvdisasm which you're hopefully more familiar with), along with
the shader headers that we generate.

FTR, this is how I feed the raw shader opcode bytes into nvdisasm:

perl -ane 'foreach (@F) { print pack "I", hex($_) }' > tt; nvdisasm -b SM35 tt

(for some reason it doesn't want to read from a pipe or even a fd).
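
For anyone who would rather not use perl, the same conversion can be sketched in Python. This assumes a little-endian host (perl's pack "I" uses the native byte order) and that the input is whitespace-separated 32-bit hex words; the function name and the script name hex2bin.py are hypothetical. Unlike perl's hex(), int(tok, 16) raises an error on malformed input instead of silently continuing.

```python
import struct
import sys


def hex_words_to_bytes(text: str) -> bytes:
    """Pack whitespace-separated hex words into 32-bit little-endian values."""
    return b"".join(struct.pack("<I", int(tok, 16)) for tok in text.split())


if __name__ == "__main__":
    # Hypothetical usage: python3 hex2bin.py < words.txt > tt; nvdisasm -b SM35 tt
    sys.stdout.buffer.write(hex_words_to_bytes(sys.stdin.read()))
```

Writing to a regular file first keeps the workaround for nvdisasm not reading from a pipe.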

http://people.freedesktop.org/~imirkin/tess_shaders_quads.txt

My suspicion is that we're doing something wrong with the sched codes.
We have an elaborate calculator, but... perhaps not elaborate enough?
You can see it here:

http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp#n2574

The reason I think it's an error in sched codes is due to the TRAP
memory location that I see -- could well be some "stale" value in the
register and the value from S2R or VILD doesn't make it in there in
time before the ALD reads it.

If you should like to try this yourself, you can use
https://github.com/imirkin/mesa/commits/gl4-integration-2 . This
branch is good enough to run Unigine Heaven, but still has a lot of
known shortcomings. (Both at the core and the nouveau levels.)

Any advice or suggestions for debugging this would be greatly
appreciated. And let me know if you'd like me to generate additional
info on this. For example I can supply a full command trace that can
be piped to demmt, if that's helpful.

Thanks in advance,

  -ilia
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau


* Re: Tessellation shaders get MEM_OUT_OF_BOUNDS errors / missing triangles
@ 2015-05-22 21:10   ` Ilia Mirkin
From: Ilia Mirkin @ 2015-05-22 21:10 UTC (permalink / raw)
  To: gpu-public-documentation; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Mon, May 18, 2015 at 4:48 PM, Ilia Mirkin <imirkin@alum.mit.edu> wrote:
> I also see this: TRAP ch 5 [0x003facf000 shader_runner[19044]]
>
> I would imagine that's some floating point number ending up in the
> register instead of an address, but the fp32 value of it
> (1.35107421875) does not seem familiar.

Ben pointed out that the 0x3facf000 is a channel address, not a value
from the shader. Oops. So that theory completely doesn't hold water.
Perhaps some buffer isn't big enough? This ends up using 9 output
vertices per patch, with 2 vec4's each. I've tried playing with the
per-warp stack size to no avail, although I didn't *entirely* know what
I was doing either.


* Re: Tessellation shaders get MEM_OUT_OF_BOUNDS errors / missing triangles
@ 2015-05-26 23:34       ` Ilia Mirkin
From: Ilia Mirkin @ 2015-05-26 23:34 UTC (permalink / raw)
  To: gpu-public-documentation; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

One additional observation that I just made is that on GK208, the blob
apparently doesn't use the result of S2R Rx, SR_INVOCATION_ID
wholesale in TCS. It either passes it through a I2I.S32.S32 Rx, |Rx|
(i.e. absolute value), or even more paradoxically, shl 2; shr 2; which
removes the top *2* bits, rather than just the top 1. However I see no
such behaviour on GF108.
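
The effect of the shl 2; shr 2 pair can be checked with a small sketch. It assumes 32-bit registers and a logical (unsigned) right shift, which is what the "removes the top 2 bits" observation implies; the helper name is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def shl2_shr2(x: int) -> int:
    """Model a 32-bit shl 2 followed by a logical shr 2."""
    return ((x << 2) & MASK32) >> 2


if __name__ == "__main__":
    # The two top bits are dropped, the low 30 bits pass through unchanged.
    print(hex(shl2_shr2(0xC0000005)))  # 0x5
```

Under that assumption the pair is equivalent to masking with 0x3fffffff.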

I'm going to test out tomorrow whether this is the cause of my GK208 woes.


* Re: Tessellation shaders get MEM_OUT_OF_BOUNDS errors / missing triangles
@ 2015-07-23  6:36           ` Ilia Mirkin
From: Ilia Mirkin @ 2015-07-23  6:36 UTC (permalink / raw)
  To: gpu-public-documentation; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

I think I figured out what was going on. Will re-check on the GK208,
but on a GF108 the random blue splotches in Unigine Heaven are gone
now. Turns out that with an instruction like

        /*00d0*/  ALD.128 R0, a[0x70], R0;  /* 0x7ecc0000381ffc02 */

The hardware will internally split it up into roughly

ALD R0, a[0x70], R0
ALD R1, a[0x74], R0
ALD R2, a[0x78], R0
ALD R3, a[0x7c], R0

Of course the first one of those overwrites R0, so the subsequent
loads read a clobbered indirect offset. Adding a hazard in our RA
(register allocation) for the indirect argument, so that it does not
overlap the destination registers, resolves the issue.
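
To make the failure mode concrete, here is a toy model of the split described above. It is only a sketch: registers and attribute space are plain dicts, an out-of-range attribute read returns None as a stand-in for the out-of-bounds trap, and all function names are made up for illustration. It is not a description of the real hardware beyond what this message states.

```python
def ald128_atomic(regs, attrs, dst, base, ind):
    """Intended semantics: read the indirect offset once, then load 4 words."""
    offset = regs[ind]
    out = dict(regs)
    for i in range(4):
        out[dst + i] = attrs.get(base + 4 * i + offset)
    return out


def ald128_split(regs, attrs, dst, base, ind):
    """Split semantics: four scalar loads, each re-reading the index register."""
    out = dict(regs)
    for i in range(4):
        out[dst + i] = attrs.get(base + 4 * i + out[ind])
    return out


def has_indirect_hazard(dst, ind, width=4):
    """True if the index register lies inside the destination register range."""
    return dst <= ind < dst + width


if __name__ == "__main__":
    attrs = {addr: 1000 + addr for addr in range(0, 0x100, 4)}
    regs = {0: 0x10, 1: 0, 2: 0, 3: 0, 4: 0x10}
    # Index register R0 overlaps destination R0..R3: results diverge.
    print(ald128_atomic(regs, attrs, 0, 0x70, 0))
    print(ald128_split(regs, attrs, 0, 0x70, 0))
    # Index register R4 is outside the destination range: results agree.
    print(ald128_split(regs, attrs, 0, 0x70, 4) == ald128_atomic(regs, attrs, 0, 0x70, 4))
```

A register allocator could use a check like has_indirect_hazard to refuse assignments where the index register falls inside the destination range.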

  -ilia



* Re: Tessellation shaders get MEM_OUT_OF_BOUNDS errors / missing triangles
@ 2015-07-24 16:34               ` Ilia Mirkin
From: Ilia Mirkin @ 2015-07-24 16:34 UTC (permalink / raw)
  To: gpu-public-documentation; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Indeed, this fixed the original issue on the GK208. Additionally, it
seems that starting with GK104 the mechanism for indirect offsets for
ALD/AST changed, and an AL2P instruction must now be used to determine
the "indirect" or "physical" offset. Once nouveau was adjusted to do
this, all MEM_OUT_OF_BOUNDS errors with tessellation shaders went away.
