On 28.05.20 at 12:06, Michel Dänzer wrote:
> On 2020-05-28 11:11 a.m., Christian König wrote:
>> Well we still need implicit sync [...]
> Yeah, this isn't about "we don't want implicit sync", it's about "amdgpu
> doesn't ensure later jobs fully see the effects of previous implicitly
> synced jobs", requiring userspace to do pessimistic flushing.

Yes, exactly that.

For the background: We also do this flushing for explicit syncs. And
when this was implemented 2-3 years ago we first did the flushing for
implicit sync as well.

That was immediately reverted and then implemented differently because
it caused severe performance problems in some use cases.

I'm not sure of the root cause of these performance problems. My
assumption was always that we then insert too many pipeline syncs, but
Marek doesn't seem to think it could be that.

On the one hand I'm rather keen to remove the extra handling and just
always use the explicit handling for everything because it simplifies
the kernel code quite a bit. On the other hand I don't want to run into
this performance problem again.

In addition to that, what the kernel does is a "full" pipeline sync, i.e.
we busy wait for the full hardware pipeline to drain. That might be
overkill if you just want to do some flushing so that the next shader
sees the stuff written, but I'm not an expert on that.

Do we busy-wait on the CPU or in WAIT_REG_MEM?
WAIT_REG_MEM is what UMDs do and should be faster.