From mboxrd@z Thu Jan 1 00:00:00 1970 From: bugzilla-daemon@freedesktop.org Subject: [Bug 91278] Tonga GPU lock/reset fail with Unigine Valley Date: Sun, 11 Oct 2015 16:14:11 +0000 Message-ID: References: Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1713792135==" Return-path: Received: from culpepper.freedesktop.org (unknown [131.252.210.165]) by gabe.freedesktop.org (Postfix) with ESMTP id 0FF8C6E1F5 for ; Sun, 11 Oct 2015 09:14:12 -0700 (PDT) In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org --===============1713792135== Content-Type: multipart/alternative; boundary="1444580051.dbcbFb1a1.4150"; charset="UTF-8" --1444580051.dbcbFb1a1.4150 Date: Sun, 11 Oct 2015 16:14:11 +0000 MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable https://bugs.freedesktop.org/show_bug.cgi?id=91278 Grazvydas Ignotas changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |notasas@gmail.com --- Comment #34 from Grazvydas Ignotas --- Created attachment 118824 --> https://bugs.freedesktop.org/attachment.cgi?id=118824&action=edit test kernel patch (In reply to Michel Dänzer from comment #29) > That is interesting, though; the radeonsi driver seems to think there should > be something mapped at the faulting address. This indicates that either the > kernel driver fails to handle the mapping properly, or maybe there's a > problem with communicating the buffer mapping information from userspace to > the kernel driver. Judging by the symptoms it feels like some caching/buffering problem somewhere. If I understand the code right, most things are mapped write-combine, which means the CPU is allowed to write data in any order it likes.
Looking at amdgpu/radeon code, there is a surprising lack of barriers, basically it's just the mb() in amdgpu_ring_commit()/radeon_ring_commit() and that's it. But mb() doesn't guarantee that the writes will arrive in program order, it only ensures that all the writes are finished after that mb() statement. So the question is, is it ok for the hardware if in something like amdgpu_ib_schedule() the writes to the ring arrive before the writes to the IB? I do admit I don't understand how the hardware works, like what triggers the hardware to start processing the ring contents, perhaps the write to the last word in the ring? If so you clearly need a wmb() before the write which triggers the hardware so that everything is ready before the GPU kicks in. Attached is a debug kernel patch to test if my guess is correct. It's way overkill and will trash performance, but it should show if this is a problem related to CPU caching/buffering. I don't have the hardware to test this myself. -- You are receiving this mail because: You are the assignee for the bug. --1444580051.dbcbFb1a1.4150 Date: Sun, 11 Oct 2015 16:14:11 +0000 MIME-Version: 1.0 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Grazvydas Ignotas changed bug 91278
What |Removed |Added
-----------------------------
CC   |        |notasas@gmail.com

Comment #34 on bug 91278 from Grazvydas Ignotas
Created attachment 118824
test kernel patch

(In reply to Michel Dänzer from comment #29)
> That is interesting, though; the radeonsi driver seems to think there should
> be something mapped at the faulting address. This indicates that either the
> kernel driver fails to handle the mapping properly, or maybe there's a
> problem with communicating the buffer mapping information from userspace to
> the kernel driver.

Judging by the symptoms it feels like some caching/buffering problem somewhere.

If I understand the code right, most things are mapped write-combine, which
means the CPU is allowed to write data in any order it likes. Looking at
amdgpu/radeon code, there is a surprising lack of barriers, basically it's just
the mb() in amdgpu_ring_commit()/radeon_ring_commit() and that's it. But mb()
doesn't guarantee that the writes will arrive in program order, it only ensures
that all the writes are finished after that mb() statement.

So the question is, is it ok for the hardware if in something like
amdgpu_ib_schedule() the writes to the ring arrive before the writes to the
IB? I do admit I don't understand how the hardware works, like what triggers
the hardware to start processing the ring contents, perhaps the write to the
last word in the ring? If so you clearly need a wmb() before the write which
triggers the hardware so that everything is ready before the GPU kicks in.
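
To make the ordering concern concrete, here is a toy userspace model in C. It is only a sketch under loud assumptions: nothing in it is amdgpu or radeon code, and wc_write(), toy_wmb(), submit(), the packet header value and the buffer layout are all made up for illustration. Stores are queued the way a write-combining buffer might hold them, and the "device" looks at memory at the moment it is kicked.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PENDING 16

/* One buffered store that has not yet reached device-visible memory. */
struct pending_write {
    uint32_t *dst;
    uint32_t val;
};

static struct pending_write wc_queue[MAX_PENDING];
static int wc_count;

static uint32_t ib[2];   /* toy indirect buffer */
static uint32_t ring[2]; /* toy ring: [0] = packet header, [1] = IB length */

/* Queue a store instead of performing it (hypothetical WC buffer). */
static void wc_write(uint32_t *dst, uint32_t val)
{
    assert(wc_count < MAX_PENDING);
    wc_queue[wc_count].dst = dst;
    wc_queue[wc_count].val = val;
    wc_count++;
}

/* Toy stand-in for wmb(): drain every queued store. Draining newest-first
 * shows that the order among the earlier stores is not preserved; only
 * "everything queued so far has landed before we continue" is. */
static void toy_wmb(void)
{
    while (wc_count > 0) {
        wc_count--;
        *wc_queue[wc_count].dst = wc_queue[wc_count].val;
    }
}

/* What the toy device finds in memory at the moment it is kicked. */
static int device_sees_complete_submission(void)
{
    return ring[0] == 0x1B000000u && ring[1] == 2u &&
           ib[0] == 0xC0DE0001u && ib[1] == 0xC0DE0002u;
}

/* Returns 1 if the device saw the full IB and ring packet, 0 otherwise. */
static int submit(int use_barrier)
{
    int ok;

    memset(ib, 0, sizeof(ib));
    memset(ring, 0, sizeof(ring));
    wc_count = 0;

    wc_write(&ib[0], 0xC0DE0001u);
    wc_write(&ib[1], 0xC0DE0002u);
    wc_write(&ring[0], 0x1B000000u); /* made-up "run the IB" header */
    wc_write(&ring[1], 2u);          /* IB length in words */

    if (use_barrier)
        toy_wmb();                   /* everything lands before the kick */

    ok = device_sees_complete_submission(); /* the "kick" */
    toy_wmb();                       /* stores land eventually, too late */
    return ok;
}
```

In this model submit(0) returns 0 because the device is kicked while the stores are still queued, and submit(1) returns 1 because the barrier drains them first. Whether the real hardware and the real trigger write behave like this is exactly the open question above.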

Attached is a debug kernel patch to test if my guess is correct. It's way
overkill and will trash performance, but it should show if this is a problem
related to CPU caching/buffering. I don't have the hardware to test this
myself.


You are receiving this mail because:
  • You are the assignee for the bug.
--1444580051.dbcbFb1a1.4150-- --===============1713792135== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KZHJpLWRldmVs IG1haWxpbmcgbGlzdApkcmktZGV2ZWxAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHA6Ly9saXN0 cy5mcmVlZGVza3RvcC5vcmcvbWFpbG1hbi9saXN0aW5mby9kcmktZGV2ZWwK --===============1713792135==--