* [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
@ 2018-10-08  8:26 bugzilla-daemon
  2018-10-08  8:27 ` bugzilla-daemon
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-08  8:26 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2885 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

            Bug ID: 108272
           Summary: opencl-mesa: Anything using OpenCL segfaults, XFX
                    Radeon RX 580
           Product: Mesa
           Version: 18.2
          Hardware: x86-64 (AMD64)
                OS: All
            Status: NEW
          Severity: critical
          Priority: medium
         Component: Drivers/Gallium/radeonsi
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: jamespharvey20@gmail.com
        QA Contact: dri-devel@lists.freedesktop.org

Created attachment 141931
  --> https://bugs.freedesktop.org/attachment.cgi?id=141931&action=edit
clinfo seems happy

Up to date Arch Linux, including: linux 4.18.12.arch1-1, xf86-video-amdgpu
18.1.0-1, mesa 18.2.2-1, opencl-mesa 18.2.2-1, xorg-server 1.20.1-1, and
plasma-desktop 5.13.5-1.

(Recently installed system that STARTED with: linux 4.18.9.arch1-1, mesa
18.2.1-1, and opencl-mesa 18.2.1-1.)

I have not tried AMDGPU-PRO.  (Not really interested anyway, and the Arch AUR
package for it is 17.40, which requires downgrading to linux 4.9 and Xorg
1.18.)

$ lspci -k | grep VGA
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev e7)
        Subsystem: XFX Pine Group Inc. Ellesmere [Radeon RX 470/480/570/570X/580/580X]
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu

It's the: XFX AMD Radeon RX 580 GTS Black Edition 8GB GDDR5 PCI Express 3.0

clinfo seems happy, see full output attached or here: http://termbin.com/iiow

See glxinfo output attached or here: http://termbin.com/lgqd

Anything using OpenCL immediately segfaults.

Everything that segfaults runs just fine if I uninstall opencl-mesa and instead
use opencl-amd (an Arch Linux package that extracts the OpenCL library from
AMDGPU-PRO **18.30** and lets it run with the open source AMDGPU driver).
Everything also runs fine if I instead use intel-opencl-runtime to run on the
CPUs.

-----

luxmark 3.1 crashes in pipe_radeonsi.so, called by libMesaOpenCL.so.1, called
by libOpenCL.so.1.

A gdb backtrace with opencl-mesa 18.2.2 that I compiled in debug mode with
symbols is attached or here: http://termbin.com/to7v

A gdb backtrace with opencl-mesa 18.2.2 (Arch binary, so without buildtype
specified and no symbols) is here: http://termbin.com/m1yx

Setting environment variable MESA_DEBUG causes no additional output.

-----

Granted, IndigoBenchmark_x64_v4.0.64 crashes in a call to
/usr/lib/libOpenCL.so::clGetPlatformIDs(), which belongs to ocl-icd, but with
opencl-amd installed instead of opencl-mesa, it runs through that call just fine.

A gdb backtrace is attached or here: http://termbin.com/junz6

-- 
You are receiving this mail because:
You are the assignee for the bug.

[-- Attachment #1.2: Type: text/html, Size: 4571 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
@ 2018-10-08  8:27 ` bugzilla-daemon
  2018-10-08  8:28 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-08  8:27 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 299 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #1 from jamespharvey20@gmail.com ---
Created attachment 141932
  --> https://bugs.freedesktop.org/attachment.cgi?id=141932&action=edit
glxinfo output

* [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
  2018-10-08  8:27 ` bugzilla-daemon
@ 2018-10-08  8:28 ` bugzilla-daemon
  2018-10-08  8:28 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-08  8:28 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 347 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #2 from jamespharvey20@gmail.com ---
Created attachment 141933
  --> https://bugs.freedesktop.org/attachment.cgi?id=141933&action=edit
gdb backtrace of luxmark with opencl-mesa 18.2.2 in debug mode

* [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
  2018-10-08  8:27 ` bugzilla-daemon
  2018-10-08  8:28 ` bugzilla-daemon
@ 2018-10-08  8:28 ` bugzilla-daemon
  2018-10-08  8:29 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-08  8:28 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 370 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #3 from jamespharvey20@gmail.com ---
Created attachment 141934
  --> https://bugs.freedesktop.org/attachment.cgi?id=141934&action=edit
gdb backtrace of luxmark with opencl-mesa 18.2.2 Arch binary (not debug, no
symbols)

* [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (2 preceding siblings ...)
  2018-10-08  8:28 ` bugzilla-daemon
@ 2018-10-08  8:29 ` bugzilla-daemon
  2018-10-08 21:08 ` [Bug 108272] [polaris10] " bugzilla-daemon
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-08  8:29 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 317 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #4 from jamespharvey20@gmail.com ---
Created attachment 141935
  --> https://bugs.freedesktop.org/attachment.cgi?id=141935&action=edit
gdb backtrace of IndigoBenchmark

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (3 preceding siblings ...)
  2018-10-08  8:29 ` bugzilla-daemon
@ 2018-10-08 21:08 ` bugzilla-daemon
  2018-10-08 21:10 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-08 21:08 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 794 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

Jan Vesely <jan.vesely@rutgers.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |99510
            Summary|opencl-mesa: Anything using |[polaris10] opencl-mesa:
                   |OpenCL segfaults, XFX       |Anything using OpenCL
                   |Radeon RX 580               |segfaults, XFX Radeon RX
                   |                            |580


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99510
[Bug 99510] cl_khr_fp64 is reported as supported, but is not, on CAYMAN

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (4 preceding siblings ...)
  2018-10-08 21:08 ` [Bug 108272] [polaris10] " bugzilla-daemon
@ 2018-10-08 21:10 ` bugzilla-daemon
  2018-10-08 21:25 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-08 21:10 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 638 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

Jan Vesely <jan.vesely@rutgers.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|99510                       |99553


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99510
[Bug 99510] cl_khr_fp64 is reported as supported, but is not, on CAYMAN
https://bugs.freedesktop.org/show_bug.cgi?id=99553
[Bug 99553] Tracker bug for runnning OpenCL applications on Clover

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (5 preceding siblings ...)
  2018-10-08 21:10 ` bugzilla-daemon
@ 2018-10-08 21:25 ` bugzilla-daemon
  2018-10-09  2:54 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-08 21:25 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 564 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

Jan Vesely <jan.vesely@rutgers.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |99553
         Depends on|99553                       |


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99553
[Bug 99553] Tracker bug for runnning OpenCL applications on Clover

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (6 preceding siblings ...)
  2018-10-08 21:25 ` bugzilla-daemon
@ 2018-10-09  2:54 ` bugzilla-daemon
  2018-10-09  8:32 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-09  2:54 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 873 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #5 from Jan Vesely <jan.vesely@rutgers.edu> ---
These look like two separate problems. The luxmark failure is known: luxmark
requires more than the 22 global buffers currently supported by radeonsi. Without
asserts (src/gallium/drivers/radeonsi/si_compute.c:298) it accesses the global
buffer array out of bounds.
Just bumping MAX_GLOBAL_BUFFERS to 32 allows luxmark to run, albeit still with
many incorrect pixels -- libclc rounding conversions are incorrect.
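
The failure mode described above can be sketched in isolation. This is a minimal illustration, not the radeonsi code: the struct, the function name set_global_buffers, and the return convention are hypothetical, and only the constant name MAX_GLOBAL_BUFFERS and the value 32 come from this thread. It shows an explicit runtime bound check that stays in place even when assert() is compiled out with NDEBUG.

```c
#include <assert.h>
#include <stddef.h>

/* Constant name and value taken from the thread (bumped limit). */
#define MAX_GLOBAL_BUFFERS 32

/* Hypothetical stand-in for a driver's compute state. */
struct compute_state {
    const void *global_buffers[MAX_GLOBAL_BUFFERS];
};

/*
 * Bind `count` buffers starting at slot `first`.
 * Returns 0 on success, -1 if the range would run past the array.
 * Unlike an assert-only bound, this check survives release builds.
 */
int set_global_buffers(struct compute_state *st, size_t first,
                       size_t count, const void *const *bufs)
{
    assert(st != NULL); /* debug-only sanity check, gone under NDEBUG */

    if (first > MAX_GLOBAL_BUFFERS || count > MAX_GLOBAL_BUFFERS - first)
        return -1; /* would write past the end of global_buffers[] */

    for (size_t i = 0; i < count; i++)
        st->global_buffers[first + i] = bufs[i];
    return 0;
}
```

With this shape, a request for 33 buffers is rejected with -1 instead of silently writing beyond the array.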



The second problem is harder to assess, since platform evaluation works OK with
clinfo. The failure seems to be in LLVM initialization code. Is IndigoBenchmark
linking to LLVM (directly or via OpenGL)? If yes, is it linked to the same
version as clover?

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (7 preceding siblings ...)
  2018-10-09  2:54 ` bugzilla-daemon
@ 2018-10-09  8:32 ` bugzilla-daemon
  2018-10-09  8:34 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-09  8:32 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2544 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #6 from jamespharvey20@gmail.com ---
Understood about Luxmark.

Doesn't look like it links against LLVM directly, but it links against libGL,
libGLX, and libGLdispatch.

Arch is on LLVM 7.0.0-1, and I wouldn't be surprised if that is newer than
IndigoBenchmark had on whatever distribution they compiled on.  I'm reaching
out to them to see if I can get their attention to come here, or answer what
they linked against.

IndigoBenchmark does work with opencl-amd, but I understand maybe that doesn't
link against or at least use the llvm initialization code in the same way.

I made a post on their support forum (indigorenderer.com/forum) which is
pending moderator approval.  Hopefully I can get them to share more
information.

$ ldd ./indigo_benchmark
linux-vdso.so.1 (0x00007fffb2f65000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f761adda000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f761adb9000)
libz.so.1 => /usr/lib/libz.so.1 (0x00007f761aba2000)
libpng12.so.0 => /usr/lib/libpng12.so.0 (0x00007f761a979000)
libQt5Gui.so.5 => /home/jamespharvey20/Downloads/IndigoBenchmark_x64_v4.0.64/./libQt5Gui.so.5 (0x00007f761a404000)
libQt5Core.so.5 => /home/jamespharvey20/Downloads/IndigoBenchmark_x64_v4.0.64/./libQt5Core.so.5 (0x00007f7619e5d000)
libQt5Widgets.so.5 => /home/jamespharvey20/Downloads/IndigoBenchmark_x64_v4.0.64/./libQt5Widgets.so.5 (0x00007f76197fd000)
libQt5Network.so.5 => /home/jamespharvey20/Downloads/IndigoBenchmark_x64_v4.0.64/./libQt5Network.so.5 (0x00007f76196f1000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f7619562000)
libm.so.6 => /usr/lib/libm.so.6 (0x00007f76193dd000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f76193c3000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007f76191ff000)
/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f761ae0c000)
libGL.so.1 => /usr/lib/libGL.so.1 (0x00007f761916a000)
librt.so.1 => /usr/lib/librt.so.1 (0x00007f7619160000)
libGLX.so.0 => /usr/lib/libGLX.so.0 (0x00007f761912d000)
libX11.so.6 => /usr/lib/libX11.so.6 (0x00007f7618fef000)
libXext.so.6 => /usr/lib/libXext.so.6 (0x00007f7618ddd000)
libGLdispatch.so.0 => /usr/lib/libGLdispatch.so.0 (0x00007f7618d1f000)
libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00007f7618cf5000)
libXau.so.6 => /usr/lib/libXau.so.6 (0x00007f7618af1000)
libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x00007f76188eb000)

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (8 preceding siblings ...)
  2018-10-09  8:32 ` bugzilla-daemon
@ 2018-10-09  8:34 ` bugzilla-daemon
  2018-10-09 17:42 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-09  8:34 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 388 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #7 from jamespharvey20@gmail.com ---
Regarding Luxmark, do you know with MAX_GLOBAL_BUFFERS set to 32, is it merely
that many pixels will be shown wrong?  Are the benchmark results valid, or with
the wrong pixels, are the results garbage?

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (9 preceding siblings ...)
  2018-10-09  8:34 ` bugzilla-daemon
@ 2018-10-09 17:42 ` bugzilla-daemon
  2018-10-19 18:00 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-09 17:42 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 572 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #8 from Jan Vesely <jan.vesely@rutgers.edu> ---
(In reply to jamespharvey20 from comment #7)
> Regarding Luxmark, do you know with MAX_GLOBAL_BUFFERS set to 32, is it
> merely that many pixels will be shown wrong?  Are the benchmark results
> valid, or with the wrong pixels, are the results garbage?

The image looks slightly garbled to me; the results say that ~30% of pixels are
incorrect on my Raven GPU. I only checked luxball.

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (10 preceding siblings ...)
  2018-10-09 17:42 ` bugzilla-daemon
@ 2018-10-19 18:00 ` bugzilla-daemon
  2018-10-31 18:59 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-19 18:00 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 866 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #9 from Jan Vesely <jan.vesely@rutgers.edu> ---
Hm, I thought I sent this out yesterday...

The luxball issue should be fixed by 06bf56725db1827dfcb86b1d0bcd71d195fda1d2
("radeonsi: Bump number of allowed global buffers to 32").

The indigo benchmark crash might just be a manifestation of earlier memory
corruption. It has been the case before that OpenCL apps don't allocate a large
enough buffer for the device name [0,1]. Clover uses rather lengthy device names
(~80 chars in your case).
You can try modifying the name string in
src/gallium/drivers/radeonsi/si_get.c:964 to see if it helps hide the issue.


[0] https://github.com/JPaulMora/Pyrit/pull/572/files
[1] https://github.com/Theano/libgpuarray/pull/531/files
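
The scenario described above, an application copying a long device name into a buffer that is too small, can be illustrated without any OpenCL dependency. This is a sketch under assumptions: query_device_name is a hypothetical stand-in that mimics a "report the required size, then copy" style of query, and the sample name string is made up for illustration. The point is that the caller sizes its buffer from the reported length rather than guessing a fixed size.

```c
#include <stdlib.h>
#include <string.h>

/* Made-up example of a long device name; not taken from a real driver. */
static const char *const EXAMPLE_NAME =
    "Example GPU Device (EXAMPLE CHIP, DRM 0.0.0, kernel 0.0.0, LLVM 0.0.0)";

/*
 * Hypothetical query function.
 * Stores the required size (including the NUL) in *needed when non-NULL.
 * With buf == NULL it only reports the size and returns 0.
 * Returns -1 without writing anything if buf is too small.
 */
int query_device_name(char *buf, size_t buf_size, size_t *needed)
{
    size_t len = strlen(EXAMPLE_NAME) + 1;

    if (needed != NULL)
        *needed = len;
    if (buf == NULL)
        return 0;      /* size-only query */
    if (buf_size < len)
        return -1;     /* refuse rather than overflow the caller's buffer */
    memcpy(buf, EXAMPLE_NAME, len);
    return 0;
}

/* Caller side: ask for the size first, then allocate exactly that much. */
char *get_device_name(void)
{
    size_t needed = 0;
    char *name;

    if (query_device_name(NULL, 0, &needed) != 0)
        return NULL;
    name = malloc(needed);
    if (name != NULL && query_device_name(name, needed, NULL) != 0) {
        free(name);
        name = NULL;
    }
    return name; /* caller frees */
}
```

An application that instead passes a fixed 16-byte buffer gets -1 back here; code that ignores the size and copies anyway is the kind of bug the linked pull requests appear to address.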

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (11 preceding siblings ...)
  2018-10-19 18:00 ` bugzilla-daemon
@ 2018-10-31 18:59 ` bugzilla-daemon
  2018-11-01  4:33 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-10-31 18:59 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 302 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #10 from Juan A. Suarez <jasuarez@igalia.com> ---
Mesa 18.2.4 has been released. Could you check if that version fixes this bug?
If so, please, close it.

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (12 preceding siblings ...)
  2018-10-31 18:59 ` bugzilla-daemon
@ 2018-11-01  4:33 ` bugzilla-daemon
  2018-12-17 17:46 ` bugzilla-daemon
  2019-09-25 18:10 ` bugzilla-daemon
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-11-01  4:33 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 4448 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #11 from jamespharvey20@gmail.com ---
When I originally filed this, I assumed it was 1 bug since I tried 2 things
with OpenCL, and both failed with opencl-mesa but worked with opencl-amd.

Jan Vesely was correct that there were two separate problems.

I'm hoping Jan Vesely can give guidance on whether to leave this bug open for
any of the reasons below, or if I should close it and potentially open up 1-2
new bugs.

The original luxmark bug (segfault) is solved, but that exposes 2 new
opencl-mesa bugs when running luxmark.

The original IndigoBenchmark bug (segfault) isn't solved, but as explained
below, I understand if we have to consider that unsolvable for now.

I don't think this affects any of these bugs, but I'll mention a few weeks ago,
I switched back to my Asus Radeon R9 390.  The same behaviors discussed in this
entire bug report occur.  (i.e. 18.2.3 and before crash luxmark.)  If someone
really wants me to do so, I can switch back to the RX 580 to test 18.2.4, but
I'm betting since it works properly with the R9 390 that the problem is fixed.

ORIGINAL LUXMARK BUG #1
-----------------------------------------

Using mesa 18.2.4, the luxmark segfault is solved.

NEW - LUXMARK BUG #2
------------------------------------

Jan Vesely's comment on 2018-10-09 mentions: "bumping MAX_GLOBAL_BUFFERS to 32
allows luxmark to run, albeit still with many incorrect pixels -- libclc
rounding conversions are incorrect."

That's what I'm seeing out of 18.2.4.  Using LuxBall HDR (Simple Benchmark):

MESA 18.2.4: 40626 (Image validation OK (65739 different pixels, 10.27%)

AMDGPU-PRO: 15739 (Image validation OK (5736 different pixels, 0.90%)

There's no typos there.  opencl-mesa scores almost unbelievably higher than
opencl-amd, but the different pixels percentage increases by a factor of 11.4.
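
The factor quoted above can be checked directly from the two result lines. The helper below is only a worked-arithmetic sketch; the function names are hypothetical and the inputs are the figures reported in this comment.

```c
#include <stdio.h>

/* Plain ratio helper used for the comparisons below. */
double ratio(double a, double b)
{
    return a / b;
}

/* Prints how the opencl-mesa run compares with the opencl-amd run. */
void print_luxmark_comparison(void)
{
    /* Figures reported in the comment above. */
    const double mesa_score = 40626.0, amd_score = 15739.0;
    const double mesa_pct = 10.27, amd_pct = 0.90;
    const double mesa_px = 65739.0, amd_px = 5736.0;

    printf("score ratio:       %.2f\n", ratio(mesa_score, amd_score));
    printf("percentage ratio:  %.2f\n", ratio(mesa_pct, amd_pct));
    printf("pixel-count ratio: %.2f\n", ratio(mesa_px, amd_px));
}
```

The percentage ratio comes out at roughly 11.4, matching the factor above. The raw pixel counts give a slightly higher ratio of about 11.5, and the score ratio is about 2.6.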

As Jan's other comment on 2018-10-09 mentions, the image looks garbled and the
results are incorrect.

Not sure if this bug should be left open for this issue, or if I should create
a new bug.  (Or, if there is a bug already open for it.)  Or, if mesa will say
it's purely libclc's problem, and to go to them about it.

NEW - LUXMARK BUG #3
------------------------------------

Although luxmark can now benchmark, when doing so, all input becomes unusably
awful.  It reminds me of when Windows has too many things open, suddenly
decides it can't cope, and you're waiting to see if it's going to recover or
crash.  Keystrokes take too long to be printed, and the mouse becomes slow and
jumpy.  My first thought was CPU or memory, but top shows both are fine.
BTW, running xf86-video-amdgpu 18.1.0, and when I upgraded mesa, it was both
mesa and opencl-mesa.

In comparison, if I use opencl-amd, input is not affected.  I wouldn't even
know the GPU is being slammed.

Using the program radeontop, I can see when using mesa, "Graphics pipe",
"Texture Addresser", and "Shader Interpolator" are between 95-100%, usually
98-100%.

When using opencl-amd, radeontop shows the same.  (Granted, Vertex Grouper +
Tesselator / Shader Export/Scan Converter/Depth Block/Color Block bounce
between 5-20% vs on opencl-mesa, they bounce between 1-5%.)

INDIGO BUG
------------------

I edited 18.2.4's si_get.c to make the name string very short:

    snprintf(sscreen->renderer_string, sizeof(sscreen->renderer_string),
       "%s",
       chip_name);

And compiled/installed it, but it didn't affect the crash.
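
For context on what this edit changes: snprintf never writes more than the given size and returns the length the full string would have had, so the same call reports how long the resulting name is. The sketch below is illustrative only; struct screen, the 128-byte field size, and the long format string are assumptions, not Mesa's actual definitions.

```c
#include <stdio.h>

/* Hypothetical stand-in for the driver screen object. */
struct screen {
    char renderer_string[128]; /* assumed size, for illustration only */
};

/* Short form, as in the edit above: just the chip name. */
int format_short(struct screen *s, const char *chip_name)
{
    return snprintf(s->renderer_string, sizeof(s->renderer_string),
                    "%s", chip_name);
}

/* Assumed long form with extra version details appended. */
int format_long(struct screen *s, const char *chip_name,
                const char *details)
{
    return snprintf(s->renderer_string, sizeof(s->renderer_string),
                    "%s (%s)", chip_name, details);
}
```

Comparing the two return values for the same chip name shows how many characters the shortened string saves, which is the quantity that matters if an application's name buffer is undersized.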

IndigoBenchmark said they're statically linking with LLVM 3.4, which is quite
old.  But, it runs fine with opencl-amd, and only crashes on opencl-mesa.  I
just posted a followup "where do we go from here"-ish comment there which has
to be moderator approved so isn't showing yet. 
 https://www.indigorenderer.com/forum/viewtopic.php?f=37&t=14986

Part of me thinks it needs to be given up on, being a closed-source precompiled
binary statically linked against LLVM 3.4.

Part of me thinks since it only crashes with opencl-mesa, and runs perfectly
fine with opencl-amd, there's probably (but not definitely) a bug in
opencl-mesa.

But, I understand since they don't seem to be paying this any attention, we may
have to give up on the Indigo Bug as being unable to be realistically
investigated further.

* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (13 preceding siblings ...)
  2018-11-01  4:33 ` bugzilla-daemon
@ 2018-12-17 17:46 ` bugzilla-daemon
  2019-09-25 18:10 ` bugzilla-daemon
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2018-12-17 17:46 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 5652 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

--- Comment #12 from Jan Vesely <jan.vesely@rutgers.edu> ---
Hi,

Sorry for the delay; somehow I missed the notifications.
(In reply to jamespharvey20 from comment #11)
> When I originally filed this, I assumed it was 1 bug since I tried 2 things
> with OpenCL, and both failed with opencl-mesa but worked with opencl-amd.
> 
> Jan Vesely was correct that there were two separate problems.
> 
> I'm hoping Jan Vesely can give guidance on whether to leave this bug open
> for any of the reasons below, or if I should close it and potentially open
> up 1-2 new bugs.
> 
> The original luxmark bug (segfault) is solved, but that exposes 2 new
> opencl-mesa bugs when running luxmark.
> 
> The original IndigoBenchmark bug (segfault) isn't solved, but as explained
> below, I understand if we have to consider that unsolvable for now.
> 
> I don't think this affects any of these bugs, but I'll mention a few weeks
> ago, I switched back to my Asus Radeon R9 390.  The same behaviors discussed
> in this entire bug report occur.  (i.e. 18.2.3 and before crash luxmark.) 
> If someone really wants me to do so, I can switch back to the RX 580 to test
> 18.2.4, but I'm betting since it works properly with the R9 390 that the
> problem is fixed.
> 
> ORIGINAL LUXMARK BUG #1
> -----------------------------------------
> 
> Using mesa 18.2.4, the luxmark segfault is solved.

As this was the first bug, I'd close this one and open new bugs for both indigo
and incorrect rendering in luxmark.

> 
> NEW - LUXMARK BUG #2
> ------------------------------------
> 
> Jan Vesely's comment on 2018-10-09 mentions: "bumping MAX_GLOBAL_BUFFERS to
> 32 allows luxmark to run, albeit still with many incorrect pixels -- libclc
> rounding conversions are incorrect."
> 
> That's what I'm seeing out of 18.2.4.  Using LuxBall HDR (Simple Benchmark):
> 
> MESA 18.2.4: 40626 (Image validation OK (65739 different pixels, 10.27%)
> 
> AMDGPU-PRO: 15739 (Image validation OK (5736 different pixels, 0.90%)
> 
> There's no typos there.  opencl-mesa scores almost unbelievably higher than
> opencl-amd, but the different pixels percentage increases by a factor of
> 11.4.
> 
> As Jan's other comment on 2018-10-09 mentions, the image looks garbled and
> the results are incorrect.
> 
> Not sure if this bug should be left open for this issue, or if I should
> create a new bug.  (Or, if there is a bug already open for it.)  Or, if mesa
> will say it's purely libclc's problem, and to go to them about it.

I'd say this is probably a purely libclc problem, but feel free to open the bug
against clover on freedesktop. 10% is rather good; I usually saw ~30% wrong
pixels on my machines.
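
The "factor of 11.4" quoted above can be cross-checked from the reported
figures. A minimal sketch using awk; the helper name `ratio` is made up for
illustration and is not part of luxmark or mesa:

```shell
#!/bin/sh
# Hypothetical helper: print a / b rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Ratio of the reported different-pixel percentages (mesa vs. AMDGPU-PRO).
ratio 10.27 0.90    # prints 11.4

# Same comparison using the raw different-pixel counts.
ratio 65739 5736    # prints 11.5
```

The two results differ slightly because the percentages in the luxmark output
are already rounded.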

> 
> NEW - LUXMARK BUG #3
> ------------------------------------
> 
> Although luxmark can now benchmark, when doing so, all input becomes
> unusably awful.  It reminds me of when Windows has too many things open,
> suddenly decided it can't cope, and you're waiting to see if it's going to
> recover or crash.  Keystrokes take too long to be printed, and the mouse
> becomes slow and jumpy.  Top shows cpu and memory usage are fine, which was
> my first thought.  BTW, running xf86-video-amdgpu 18.1.0, and when I
> upgraded mesa, it was both mesa and opencl-mesa.
> 
> In comparison, if I use opencl-amd, input is not affected.  I wouldn't even
> know the GPU is being slammed.
> 
> Using the program radeontop, I can see when using mesa, "Graphics pipe",
> "Texture Addresser", and "Shader Interpolator" are between 95-100%, usually
> 98-100%.
> 
> When using opencl-amd, radeontop shows the same.  (Granted, Vertex Grouper +
> Tesselator / Shader Export/Scan Converter/Depth Block/Color Block bounce
> between 5-20% vs on opencl-mesa, they bounce between 1-5%.)

This sounds like a GPU priority/scheduling problem. I haven't looked into
whether it can be solved by opening a lower-priority pipe for compute, or
whether we need to enable advanced features like CWSR. Please open a separate
bug. Hogging a large portion of the GPU might explain some of that high score.

> 
> INDIGO BUG
> ------------------
> 
> I edited 18.2.4's si_get.c to be very short:
> 
>     snprintf(sscreen->renderer_string, sizeof(sscreen->renderer_string),
>        "%s",
>        chip_name);
> 
> And compiled/installed it, but it didn't affect the crash.
> 
> IndigoBenchmark said they're statically linking with LLVM 3.4, which is
> quite old.  But, it runs fine with opencl-amd, and only crashes on
> opencl-mesa.  I just posted a followup "where do we go from here"-ish
> comment there which has to be moderator approved so isn't showing yet. 
>  https://www.indigorenderer.com/forum/viewtopic.php?f=37&t=14986
> 
> Part of me thinks it needs to be given up on, being a closed-source
> precompiled binary statically linked against LLVM 3.4.
> 
> Part of me thinks since it only crashes with opencl-mesa, and runs perfectly
> fine with opencl-amd, there's probably (but not definitely) a bug in
> opencl-mesa.
> 
> But, I understand since they don't seem to be paying this any attention, we
> may have to give up on the Indigo Bug as being unable to be realistically
> investigated further.

Can you check if indigo exports any LLVM symbols? It might be that we end up
using those instead of the new ones from libLLVM.*
If that's the case, one solution would be to link mesa/clover against a static
LLVM. Enabling symbol versioning for LLVM should work as well.
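
One possible way to run that check, as a minimal sketch: list the dynamic
symbols the binary defines with `nm` from binutils and keep the ones whose
names mention LLVM. The function names and the `./indigo_benchmark` path are
placeholders, not anything confirmed in this report.

```shell
#!/bin/sh
# Filter `nm` output lines ("address type name") down to the names that look
# like LLVM symbols, e.g. C API names (LLVM...) or mangled C++ ones (_ZN4llvm...).
filter_llvm_symbols() {
    awk 'NF >= 3 && tolower($3) ~ /llvm/ { print $3 }'
}

# Print the LLVM-looking symbols that the file given as $1 defines and exports
# in its dynamic symbol table.
exported_llvm_symbols() {
    nm -D --defined-only "$1" | filter_llvm_symbols
}

# Usage (the path is an assumption; substitute the real executable):
#   exported_llvm_symbols ./indigo_benchmark | head
```

An empty result would suggest the statically linked LLVM is not visible to the
dynamic linker; any output would be consistent with the symbol-clash theory
above, though it would not prove it.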

-- 
You are receiving this mail because:
You are the assignee for the bug.

[-- Attachment #1.2: Type: text/html, Size: 7582 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 108272] [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580
  2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
                   ` (14 preceding siblings ...)
  2018-12-17 17:46 ` bugzilla-daemon
@ 2019-09-25 18:10 ` bugzilla-daemon
  15 siblings, 0 replies; 17+ messages in thread
From: bugzilla-daemon @ 2019-09-25 18:10 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 843 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108272

GitLab Migration User <gitlab-migration@fdo.invalid> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #13 from GitLab Migration User <gitlab-migration@fdo.invalid> ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1333.


end of thread, other threads:[~2019-09-25 18:10 UTC | newest]

Thread overview: 17+ messages
2018-10-08  8:26 [Bug 108272] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580 bugzilla-daemon
2018-10-08  8:27 ` bugzilla-daemon
2018-10-08  8:28 ` bugzilla-daemon
2018-10-08  8:28 ` bugzilla-daemon
2018-10-08  8:29 ` bugzilla-daemon
2018-10-08 21:08 ` [Bug 108272] [polaris10] " bugzilla-daemon
2018-10-08 21:10 ` bugzilla-daemon
2018-10-08 21:25 ` bugzilla-daemon
2018-10-09  2:54 ` bugzilla-daemon
2018-10-09  8:32 ` bugzilla-daemon
2018-10-09  8:34 ` bugzilla-daemon
2018-10-09 17:42 ` bugzilla-daemon
2018-10-19 18:00 ` bugzilla-daemon
2018-10-31 18:59 ` bugzilla-daemon
2018-11-01  4:33 ` bugzilla-daemon
2018-12-17 17:46 ` bugzilla-daemon
2019-09-25 18:10 ` bugzilla-daemon
