On Thu, May 28, 2020 at 10:40 AM Christian König <christian.koenig@amd.com>
wrote:

> On 28.05.20 at 12:06, Michel Dänzer wrote:
> > On 2020-05-28 11:11 a.m., Christian König wrote:
> >> Well we still need implicit sync [...]
> > Yeah, this isn't about "we don't want implicit sync", it's about "amdgpu
> > doesn't ensure later jobs fully see the effects of previous implicitly
> > synced jobs", requiring userspace to do pessimistic flushing.
>
> Yes, exactly that.
>
> For the background: We also do this flushing for explicit syncs. And
> when this was implemented 2-3 years ago we first did the flushing for
> implicit sync as well.
>
> That was immediately reverted and then implemented differently because
> it caused severe performance problems in some use cases.
>
> I'm not sure of the root cause of these performance problems. My
> assumption was always that we then insert too many pipeline syncs, but
> Marek doesn't seem to think it could be that.
>
> On the one hand I'm rather keen to remove the extra handling and just
> always use the explicit handling for everything because it simplifies
> the kernel code quite a bit. On the other hand I don't want to run into
> this performance problem again.
>
> In addition to that, what the kernel does is a "full" pipeline sync, i.e.
> we busy wait for the full hardware pipeline to drain. That might be
> overkill if you just want to do some flushing so that the next shader
> sees the data written, but I'm not an expert on that.
>

Do we busy-wait on the CPU or in WAIT_REG_MEM?

WAIT_REG_MEM is what UMDs do and should be faster.

Marek