Created attachment 118824
test kernel patch

(In reply to Michel Dänzer from comment #29)
> That is interesting, though; the radeonsi driver seems to think there should
> be something mapped at the faulting address. This indicates that either the
> kernel driver fails to handle the mapping properly, or maybe there's a
> problem with communicating the buffer mapping information from userspace to
> the kernel driver.

Judging by the symptoms it feels like some caching/buffering problem somewhere.

If I understand the code right, most things are mapped write-combine, which
means the CPU is allowed to write the data in any order it likes. Looking at
the amdgpu/radeon code, there is a surprising lack of barriers; basically it's
just amdgpu_ring_commit()/radeon_ring_commit() and that's it. But mb() doesn't
guarantee that the writes will arrive in program order, it just ensures that
all the writes are finished after that mb() statement.

So the question is: is it OK for the hardware if, in something like
amdgpu_ib_schedule(), the writes to the ring arrive before the writes to the
IB? I do admit I don't understand how the hardware works, like what triggers
the hardware to start processing the ring contents; perhaps the write to the
last word in the ring? If so, you clearly need a wmb() before the write which
triggers the hardware, so that everything is ready before the GPU kicks in.
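
To illustrate the ordering I have in mind, here is a minimal userspace sketch
in C11. It is not kernel code and not how amdgpu/radeon actually do it:
struct fake_gpu, submit() and consume() are hypothetical names, the "doorbell"
is just an atomic write pointer, and atomic_thread_fence(memory_order_release)
stands in for the wmb() that a driver would place before the triggering write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the amdgpu/radeon structures. */
#define RING_WORDS 16u
#define IB_WORDS   4u

struct fake_gpu {
    uint32_t ib[IB_WORDS];      /* indirect buffer contents */
    uint32_t ring[RING_WORDS];  /* ring entries (here: just the IB length) */
    _Atomic uint32_t wptr;      /* assumed trigger: the consumer polls this */
};

/* Producer: fill the IB, add a ring entry, then publish the new wptr. */
static void submit(struct fake_gpu *g, const uint32_t *cmds, size_t n)
{
    uint32_t w = atomic_load_explicit(&g->wptr, memory_order_relaxed);

    assert(n <= IB_WORDS);
    for (size_t i = 0; i < n; i++)
        g->ib[i] = cmds[i];                  /* 1. payload */
    g->ring[w % RING_WORDS] = (uint32_t)n;   /* 2. ring entry */

    /* 3. Userspace analogue of wmb(): all stores above become visible
     *    before the store below that "kicks" the consumer. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&g->wptr, w + 1, memory_order_relaxed);
}

/* Consumer: returns the sum of the IB words for the entry at rptr,
 * or 0 if nothing new has been published. */
static uint32_t consume(struct fake_gpu *g, uint32_t rptr)
{
    uint32_t w = atomic_load_explicit(&g->wptr, memory_order_relaxed);
    uint32_t sum = 0;

    if (w == rptr)
        return 0;

    /* Pairs with the release fence in submit(). */
    atomic_thread_fence(memory_order_acquire);

    uint32_t n = g->ring[rptr % RING_WORDS];
    for (uint32_t i = 0; i < n && i < IB_WORDS; i++)
        sum += g->ib[i];
    return sum;
}
```

Without the fence in step 3, nothing stops the wptr store from becoming
visible before the IB and ring stores, which is the kind of reordering I
suspect here. Whether the real hardware is triggered this way is exactly the
part I am unsure about.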

Attached is a debug kernel patch to test if my guess is correct. It's way
overkill and will trash performance, but it should show if this is a problem
related to CPU caching/buffering. I don't have the hardware to test this
myself.