Comment # 11 on bug 108272 from
When I originally filed this, I assumed it was 1 bug since I tried 2 things
with OpenCL, and both failed with opencl-mesa but worked with opencl-amd.

Jan Vesely was correct that there were two separate problems.

I'm hoping Jan Vesely can give guidance on whether to leave this bug open for
any of the reasons below, or if I should close it and potentially open up 1-2
new bugs.

The original luxmark bug (segfault) is solved, but that exposes 2 new
opencl-mesa bugs when running luxmark.

The original IndigoBenchmark bug (segfault) isn't solved, but as explained
below, I understand if we have to consider that unsolvable for now.

I don't think this affects any of these bugs, but I'll mention that a few weeks
ago I switched back to my Asus Radeon R9 390.  The same behaviors discussed
throughout this bug report occur.  (i.e. 18.2.3 and earlier crash luxmark.)  If
someone really wants me to, I can switch back to the RX 580 to test 18.2.4, but
I'm betting that since it works properly with the R9 390, the problem is fixed.

ORIGINAL LUXMARK BUG #1
-----------------------------------------

Using mesa 18.2.4, the luxmark segfault is solved.

NEW - LUXMARK BUG #2
------------------------------------

Jan Vesely's comment on 2018-10-09 mentions: "bumping MAX_GLOBAL_BUFFERS to 32
allows luxmark to run, albeit still with many incorrect pixels -- libclc
rounding conversions are incorrect."

That's what I'm seeing out of 18.2.4.  Using LuxBall HDR (Simple Benchmark):

MESA 18.2.4: 40626 (Image validation OK (65739 different pixels, 10.27%))

AMDGPU-PRO: 15739 (Image validation OK (5736 different pixels, 0.90%))

There are no typos there.  opencl-mesa scores about 2.6 times higher than
opencl-amd, which is almost unbelievable, but the percentage of different
pixels increases by a factor of 11.4.
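
To make the comparison concrete, here is a small C sketch that recomputes the
two ratios from the figures reported by the two runs.  The helper names
(ratio, print_comparison) are mine, for illustration only; they are not part
of luxmark or mesa.

```c
#include <stdio.h>

/* Hypothetical helper (not from luxmark or mesa): how many times
 * larger `a` is than `b`. */
double ratio(double a, double b)
{
    return a / b;
}

/* Prints both comparisons using the numbers reported by the two runs. */
void print_comparison(void)
{
    /* Benchmark score: opencl-mesa vs opencl-amd. */
    printf("score ratio:           %.2f\n", ratio(40626.0, 15739.0));
    /* Percentage of different pixels: opencl-mesa vs opencl-amd. */
    printf("different-pixel ratio: %.1f\n", ratio(10.27, 0.90));
}
```

Calling print_comparison() from a main() gives a score ratio of about 2.58 and
a different-pixel ratio of about 11.4.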

As Jan's other comment on 2018-10-09 mentions, the image looks garbled and the
results are incorrect.

I'm not sure whether this bug should be left open for this issue, or whether I
should create a new bug (or whether there is already a bug open for it).  Or
mesa may say it's purely libclc's problem, and to go to them about it.

NEW - LUXMARK BUG #3
------------------------------------

Although luxmark can now benchmark, while it does so, all input becomes unusably
awful.  It reminds me of when Windows has too many things open, suddenly
decides it can't cope, and you're waiting to see if it's going to recover or
crash.  Keystrokes take too long to be printed, and the mouse becomes slow and
jumpy.  My first thought was CPU or memory exhaustion, but top shows both are
fine.  BTW, I'm running xf86-video-amdgpu 18.1.0, and when I upgraded mesa, I
upgraded both mesa and opencl-mesa.

In comparison, if I use opencl-amd, input is not affected.  I wouldn't even
know the GPU is being slammed.

Using the program radeontop, I can see when using mesa, "Graphics pipe",
"Texture Addresser", and "Shader Interpolator" are between 95-100%, usually
98-100%.

When using opencl-amd, radeontop shows the same.  (Granted, with opencl-amd,
Vertex Grouper + Tesselator, Shader Export, Scan Converter, Depth Block, and
Color Block bounce between 5-20%, whereas with opencl-mesa they bounce between
1-5%.)

INDIGO BUG
------------------

I edited the renderer string snprintf call in 18.2.4's si_get.c to be very
short:

    snprintf(sscreen->renderer_string, sizeof(sscreen->renderer_string),
       "%s",
       chip_name);

I compiled and installed it, but it didn't affect the crash.
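
For anyone who wants to see what that shortened call produces in isolation,
here is a minimal standalone sketch.  The struct fake_screen, its 64-byte
buffer, and the function set_renderer_string are hypothetical stand-ins I made
up for illustration; the real driver struct has many more fields and its
buffer size may differ.

```c
#include <stdio.h>

/* Hypothetical stand-in for the driver's screen struct; only the one
 * field used here is modeled, and the buffer size is an assumption. */
struct fake_screen {
    char renderer_string[64];
};

/* Writes only the chip name into the renderer string, mirroring the
 * shortened snprintf call.  Returns snprintf's result: the number of
 * characters that would be written, excluding the terminating NUL. */
int set_renderer_string(struct fake_screen *sscreen, const char *chip_name)
{
    return snprintf(sscreen->renderer_string,
                    sizeof(sscreen->renderer_string),
                    "%s",
                    chip_name);
}
```

With this sketch, passing a chip name such as "HAWAII" leaves exactly that
string in renderer_string, with nothing appended.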

IndigoBenchmark said they're statically linking with LLVM 3.4, which is quite
old.  But it runs fine with opencl-amd, and only crashes on opencl-mesa.  I
just posted a follow-up "where do we go from here"-ish comment there; it has
to be approved by a moderator, so it isn't showing yet.
https://www.indigorenderer.com/forum/viewtopic.php?f=37&t=14986

Part of me thinks it needs to be given up on, being a closed-source precompiled
binary statically linked against LLVM 3.4.

Part of me thinks since it only crashes with opencl-mesa, and runs perfectly
fine with opencl-amd, there's probably (but not definitely) a bug in
opencl-mesa.

But I understand that since they don't seem to be paying this any attention,
we may have to give up on the Indigo bug as something that can't realistically
be investigated further.

