* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
@ 2018-11-01 15:59 bugzilla-daemon
  2018-11-01 17:55 ` bugzilla-daemon
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-01 15:59 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 5096 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

            Bug ID: 108625
           Summary: AMDGPU - Can't even get Xorg to start - Kernel driver
                    hangs  with ring buffer timeout on ARM64
           Product: DRI
           Version: unspecified
          Hardware: ARM
                OS: Linux (All)
            Status: NEW
          Severity: blocker
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: raster@rasterman.com

So we're going to have fun with this one...

Start Xorg. It hangs in screen setup:

  #0  ioctl () at ../sysdeps/unix/sysv/linux/aarch64/ioctl.S:25
  #1  0x0000ffffbb149334 in drmIoctl () from /lib/aarch64-linux-gnu/libdrm.so.2
  #2  0x0000ffffba5166b4 in amdgpu_cs_query_fence_status () from /lib/aarch64-linux-gnu/libdrm_amdgpu.so.1
  #3  0x0000ffffb9ef37f8 in ?? () from /usr/lib/aarch64-linux-gnu/dri/radeonsi_dri.so
  #4  0x0000ffffb9dd148c in ?? () from /usr/lib/aarch64-linux-gnu/dri/radeonsi_dri.so
  #5  0x0000ffffb993d448 in ?? () from /usr/lib/aarch64-linux-gnu/dri/radeonsi_dri.so
  #6  0x0000ffffb993d4ac in ?? () from /usr/lib/aarch64-linux-gnu/dri/radeonsi_dri.so
  #7  0x0000ffffba54425c in ?? () from /usr/lib/xorg/modules/drivers/amdgpu_drv.so
  #8  0x0000ffffba537ca8 in ?? () from /usr/lib/xorg/modules/drivers/amdgpu_drv.so
  #9  0x0000aaaae7133348 in MapWindow ()
  #10 0x0000aaaae710c820 in ?? ()
  #11 0x0000ffffbad52720 in __libc_start_main (main=0x0, argc=0, argv=0x0, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:310

And that ioctl hangs because of:

  [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=10, last emitted seq=11
  [drm] GPU recovery disabled.

The amdgpu kernel driver reports:

  [drm] amdgpu kernel modesetting enabled.
  amdgpu 0000:89:00.0: enabling device (0100 -> 0102)
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_mc.bin
  amdgpu 0000:89:00.0: BAR 2: releasing [mem 0x14010000000-0x140101fffff 64bit pref]
  amdgpu 0000:89:00.0: BAR 0: releasing [mem 0x14000000000-0x1400fffffff 64bit pref]
  amdgpu 0000:89:00.0: BAR 0: assigned [mem 0x14000000000-0x140ffffffff 64bit pref]
  amdgpu 0000:89:00.0: BAR 2: assigned [mem 0x14100000000-0x141001fffff 64bit pref]
  amdgpu 0000:89:00.0: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
  amdgpu 0000:89:00.0: GTT: 256M 0x0000000000000000 - 0x000000000FFFFFFF
  [drm] amdgpu: 4096M of VRAM memory ready
  [drm] amdgpu: 4096M of GTT memory ready.
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_pfp_2.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_me_2.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_ce_2.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_rlc.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_mec_2.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_mec2_2.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_sdma.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_sdma1.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_uvd.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_vce.bin
  amdgpu 0000:89:00.0: firmware: direct-loading firmware amdgpu/polaris11_k_smc.bin
  [drm] Initialized amdgpu 3.26.0 20150101 for 0000:89:00.0 on minor 1
  amdgpu 0000:89:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none

So here is where the fun begins. Kernel is:

  Linux noisy 4.18.0-2-arm64 #1 SMP Debian 4.18.10-2 (2018-10-07) aarch64 GNU/Linux

It's Debian unstable on a Cavium ThunderX2 64-bit ARM system (2 CPUs with 32
cores each, 256 hardware threads total with 4-way SMT enabled) with a bunch of
PCIe slots. There is an Nvidia card that works... to a decent degree, and an
on-board PCIe dumb framebuffer display device (ASPEED), but I'd rather have a
more open stack. I've fiddled with xorg configs to get it to ignore every
device other than the AMD one, like this:

  Section "ServerFlags"
         Option "AutoAddGPU" "false"
  EndSection

  Section "Device"
         Identifier "amdgpu"
         Driver "amdgpu"
         BusID "PCI:137:0:0"
         Option "DRI" "2"
         Option "TearFree" "on"
  EndSection
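
A note on the BusID above: xorg.conf takes the bus, device and function
numbers in decimal, while the kernel log prints the PCI address in hex
(0000:89:00.0), so bus 0x89 becomes 137. Below is a minimal sketch of that
conversion. The helper name pci_addr_to_xorg_busid is made up for
illustration and is not part of any Xorg or kernel API, and the sketch
assumes PCI domain 0.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not an Xorg or kernel API): convert a kernel-style
 * PCI address "dddd:bb:dd.f" (hex fields) into an xorg.conf BusID string
 * "PCI:bus:device:function" (decimal fields).
 * Returns 0 on success, -1 on a malformed address or too-small buffer.
 */
static int pci_addr_to_xorg_busid(const char *addr, char *out, size_t outlen)
{
	unsigned int domain, bus, dev, fn;
	int n;

	assert(addr != NULL && out != NULL);

	if (sscanf(addr, "%x:%x:%x.%x", &domain, &bus, &dev, &fn) != 4)
		return -1;

	/* The PCI domain is parsed but dropped; this sketch assumes domain 0. */
	n = snprintf(out, outlen, "PCI:%u:%u:%u", bus, dev, fn);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Feeding it the address from the log, "0000:89:00.0", yields the
"PCI:137:0:0" string used in the Device section.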

I've even put the AMD card in the same slot as the Nvidia one with the same
results, so it doesn't seem to be a slot-specific issue. So where should I
start poking to see where this very early-stage ring gfx timeout originates,
specifically? I'm willing to start the fun of compiling kernels etc. to dig
through this. So how can I help solve this and make AMD cards portable and
usable? :)

-- 
You are receiving this mail because:
You are the assignee for the bug.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
@ 2018-11-01 17:55 ` bugzilla-daemon
  2018-11-02 12:14 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-01 17:55 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 258 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #1 from Alex Deucher <alexdeucher@gmail.com> ---
Please attach your full dmesg output and xorg log if using X.


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
  2018-11-01 17:55 ` bugzilla-daemon
@ 2018-11-02 12:14 ` bugzilla-daemon
  2018-11-02 12:15 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-02 12:14 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 311 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #2 from Carsten Haitzler <raster@rasterman.com> ---
Created attachment 142337
  --> https://bugs.freedesktop.org/attachment.cgi?id=142337&action=edit
log - dmesg


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
  2018-11-01 17:55 ` bugzilla-daemon
  2018-11-02 12:14 ` bugzilla-daemon
@ 2018-11-02 12:15 ` bugzilla-daemon
  2018-11-02 12:15 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-02 12:15 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 310 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #3 from Carsten Haitzler <raster@rasterman.com> ---
Created attachment 142338
  --> https://bugs.freedesktop.org/attachment.cgi?id=142338&action=edit
log - xorg


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (2 preceding siblings ...)
  2018-11-02 12:15 ` bugzilla-daemon
@ 2018-11-02 12:15 ` bugzilla-daemon
  2018-11-02 12:16 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-02 12:15 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 328 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #4 from Carsten Haitzler <raster@rasterman.com> ---
Created attachment 142339
  --> https://bugs.freedesktop.org/attachment.cgi?id=142339&action=edit
log - xorg - gdb attach + bt


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (3 preceding siblings ...)
  2018-11-02 12:15 ` bugzilla-daemon
@ 2018-11-02 12:16 ` bugzilla-daemon
  2018-11-02 18:41 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-02 12:16 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 250 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #5 from Carsten Haitzler <raster@rasterman.com> ---
Attached them (too big to put inline as comments).


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (4 preceding siblings ...)
  2018-11-02 12:16 ` bugzilla-daemon
@ 2018-11-02 18:41 ` bugzilla-daemon
  2018-11-02 18:41 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-02 18:41 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 345 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #6 from Alex Deucher <alexdeucher@gmail.com> ---
It looks like something submitted by mesa caused a GPU hang.  You might try
starting a bare X server and trying some simple OGL apps to start with.


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (5 preceding siblings ...)
  2018-11-02 18:41 ` bugzilla-daemon
@ 2018-11-02 18:41 ` bugzilla-daemon
  2018-11-04 13:17 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-02 18:41 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 237 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #7 from Alex Deucher <alexdeucher@gmail.com> ---
Or try a newer or older version of mesa.


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (6 preceding siblings ...)
  2018-11-02 18:41 ` bugzilla-daemon
@ 2018-11-04 13:17 ` bugzilla-daemon
  2018-11-04 16:15 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-04 13:17 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 482 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #8 from Carsten Haitzler <raster@rasterman.com> ---
Actually no OGL client has even started. this is just the xserver being started
by slim (login manager) and that doesn't use OGL. it's really basic xlib stuff.
so it is basically a raw xserver... perhaps it's the glamor accel stuff...
but... no OGL clients. :) never got that far.


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (7 preceding siblings ...)
  2018-11-04 13:17 ` bugzilla-daemon
@ 2018-11-04 16:15 ` bugzilla-daemon
  2018-11-05  9:08 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-04 16:15 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 584 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #9 from Alex Deucher <alexdeucher@gmail.com> ---
(In reply to Carsten Haitzler from comment #8)
> Actually no OGL client has even started. this is just the xserver being
> started by slim (login manager) and that doesn't use OGL. it's really basic
> xlib stuff. so it is basically a raw xserver... perhaps it's the glamor accel
> stuff... but... no OGL clients. :) never got that far.

Yeah, it would be GL via glamor in that case.


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (8 preceding siblings ...)
  2018-11-04 16:15 ` bugzilla-daemon
@ 2018-11-05  9:08 ` bugzilla-daemon
  2018-11-05 15:20 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-05  9:08 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1001 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #10 from Carsten Haitzler <raster@rasterman.com> ---
so wouldn't that make it a necessity then if it's even glamor needing it? i
guess i can turn off glamor accel but realistically gl is a necessity so the
problem needs to be addressed sooner or later.

the ring gfx timeout smells to me of "not a mesa bug" in that an ioctl going to
the drm driver never returns when doing a simple query. it hangs, thus
something lower down is having a bad day, if something as simple as
querying a fence causes a hang... :)

what is this ring gfx thing exactly (seems to be some command queue) and why
would it be timing out? all the way back at seq 10/11 ... like right at the
start of its use? it's almost like some interrupt or in-memory semaphore thing
mapped from the card is messing up? i'm looking for something to look into more
specifically.


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (9 preceding siblings ...)
  2018-11-05  9:08 ` bugzilla-daemon
@ 2018-11-05 15:20 ` bugzilla-daemon
  2018-11-05 15:32 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-05 15:20 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 476 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #11 from Alex Deucher <alexdeucher@gmail.com> ---
Does this patch help?
https://patchwork.freedesktop.org/patch/259364/

Does ARM support write combining?  The driver uses it pretty extensively.  You
might try disabling GTT_USWC (uncached write combined) support in the kernel
driver and just falling back to cached memory.
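
To make the suggested fallback concrete, here is a rough sketch of a
buffer-creation path that drops a write-combine request when the
architecture cannot be trusted with it. Everything in it is a stand-in
invented for illustration: the BO_FLAG_* names, their values, the
arch_wc_ok switch and sanitize_bo_flags() are not taken from the amdgpu
driver, and the real code may be organised quite differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag bits for this sketch; names and values are arbitrary. */
#define BO_FLAG_CPU_ACCESS   (1u << 0)
#define BO_FLAG_CPU_GTT_USWC (1u << 2)

/* Stand-in for the architecture query; toggled by the caller here. */
static bool arch_wc_ok = true;

static bool arch_can_wc_memory(void)
{
	return arch_wc_ok;
}

/*
 * Return the flags a buffer would actually be created with: when write
 * combining is not usable, clear the USWC request so the buffer falls back
 * to an ordinary cached mapping. All other flags pass through unchanged.
 */
static uint32_t sanitize_bo_flags(uint32_t requested)
{
	if (!arch_can_wc_memory())
		requested &= ~BO_FLAG_CPU_GTT_USWC;
	return requested;
}
```

The point of doing it in one place is that callers can keep asking for
USWC unconditionally and the request is quietly downgraded where it is
not supported.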


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (10 preceding siblings ...)
  2018-11-05 15:20 ` bugzilla-daemon
@ 2018-11-05 15:32 ` bugzilla-daemon
  2018-11-09 20:32 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-05 15:32 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2105 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #12 from Alex Deucher <alexdeucher@gmail.com> ---
(In reply to Carsten Haitzler from comment #10)
> so wouldn't that make it a necessity then if it's even glamor needing it? i
> guess i can turn off glamor accel but realistically gl is a necessity so the
> problem needs to be addressed sooner or later.
> 

If you were starting a bare x server, you usually don't hit the glamor paths
too extensively compared to a full desktop environment.

> the ring gfx timeout smells to me of "not a mesa bug" in that an ioctl going
> to the drm driver never returns when doing a simple query. it hangs, thus
> something lower down is having a bad day, if something as simple as
> querying a fence causes a hang... :)
> 
> what is this ring gfx thing exactly (seems to be some command queue) and why
> would it be timing out? all the way back at seq 10/11 ... like right at the
> start of its use? it's almost like some interrupt or in-memory semaphore
> thing mapped from the card is messing up? i'm looking for something to look
> into more specifically.

Each engine on the GPU (gfx, compute, video decode, encode, dma, etc.) has a
ring buffer used to feed it.  The work sent to the engines is managed by a sw
scheduler in the kernel. The kernel driver tests the rings as part of the
driver init sequence.  The driver won't come up if the ring tests fail so they
are working at least until you start X.  Presumably X submits (via glamor) some
work to the GPU which causes the GPU to hang.  The fence never signals because
the GPU never finished processing the job due to the hang.
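
The fence bookkeeping described above can be modelled in a few lines. This
is a toy model rather than driver code: struct toy_ring and its helpers are
invented names, sequence-number wraparound is ignored, and the numbers
mirror the "last signaled seq=10, last emitted seq=11" line from the
original report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one engine's ring: only the fence counters are kept. */
struct toy_ring {
	uint32_t last_emitted;  /* seq of the newest job handed to the GPU */
	uint32_t last_signaled; /* seq of the newest job the GPU completed */
};

/* Submitting a job assigns it the next sequence number. */
static uint32_t toy_ring_emit(struct toy_ring *r)
{
	return ++r->last_emitted;
}

/* Called when the GPU reports completion of everything up to seq. */
static void toy_ring_signal(struct toy_ring *r, uint32_t seq)
{
	if (seq > r->last_signaled && seq <= r->last_emitted)
		r->last_signaled = seq;
}

/* A fence query succeeds only once the GPU has reached that seq. */
static bool toy_fence_signaled(const struct toy_ring *r, uint32_t seq)
{
	return r->last_signaled >= seq;
}

/* Jobs still outstanding when the deadline expires means a timeout. */
static bool toy_ring_hung(const struct toy_ring *r)
{
	return r->last_signaled < r->last_emitted;
}
```

In this model the reported state is a ring where job 11 was emitted but
toy_ring_signal() was never called for it, so a query on fence 11 can
never succeed and the ring counts as hung once the timeout fires.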

Another simpler test would be to boot up to a console (no X) and then try
running some of the libdrm amdgpu tests.  They are really simple (copying data
around and verifying it using different engines, allocating and freeing
memory, etc.).
https://cgit.freedesktop.org/mesa/drm/tree/tests/amdgpu
See if some of the simple copy or write tests work.


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (11 preceding siblings ...)
  2018-11-05 15:32 ` bugzilla-daemon
@ 2018-11-09 20:32 ` bugzilla-daemon
  2018-11-09 20:33 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-09 20:32 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 269 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #13 from Alex Deucher <alexdeucher@gmail.com> ---
Does returning false in drm_arch_can_wc_memory() for ARM fix the issue?


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (12 preceding siblings ...)
  2018-11-09 20:32 ` bugzilla-daemon
@ 2018-11-09 20:33 ` bugzilla-daemon
  2018-11-19 13:00 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-09 20:33 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 372 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #14 from Alex Deucher <alexdeucher@gmail.com> ---
(In reply to Alex Deucher from comment #13)
> Does returning false in drm_arch_can_wc_memory() for ARM fix the issue?

This has enabled a working driver for others on ARM.


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (13 preceding siblings ...)
  2018-11-09 20:33 ` bugzilla-daemon
@ 2018-11-19 13:00 ` bugzilla-daemon
  2018-11-19 13:39 ` bugzilla-daemon
  2019-03-21 20:55 ` bugzilla-daemon
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-19 13:00 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 890 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #15 from Carsten Haitzler <raster@rasterman.com> ---
And lo and behold:

--- ./include/drm/drm_cache.h~  2018-08-12 21:41:04.000000000 +0100
+++ ./include/drm/drm_cache.h   2018-11-16 11:06:16.976842816 +0000
@@ -48,7 +48,7 @@
 #elif defined(CONFIG_MIPS) && defined(CONFIG_CPU_LOONGSON3)
        return false;
 #else
-       return true;
+       return false;
 #endif
 }

Makes it work. Of course this isn't a brilliant patch, but indeed there is
something up with the way write combined memory is handled on ARM here. But
disabling WC for all ARM DRM devices might be too much of a sledgehammer... I'm
going to look into a less sledgehammer-like solution that might make this work
more universally. I'll get back to you on that.
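
One narrower variant of the change above would be an ARM-specific branch
instead of flipping the generic default, so that other architectures keep
write combining. The sketch below is a userspace mock of that idea, not the
kernel header: the function is renamed can_wc_memory_sketch, only the
branch visible in the diff context is reproduced, and the
CONFIG_ARM / CONFIG_ARM64 test is an assumption about what such a guard
would check.

```c
#include <stdbool.h>

/*
 * Userspace mock of an architecture check shaped like the one in the diff
 * above. CONFIG_* macros are normally set by the kernel build; none are
 * defined here unless passed with -D on the compiler command line.
 */
static inline bool can_wc_memory_sketch(void)
{
#if defined(CONFIG_MIPS) && defined(CONFIG_CPU_LOONGSON3)
	return false;	/* existing exception visible in the diff context */
#elif defined(CONFIG_ARM) || defined(CONFIG_ARM64)
	return false;	/* assumed new branch: no write combining on ARM */
#else
	return true;	/* generic default left untouched */
#endif
}
```

Built as-is it returns true; building with -DCONFIG_ARM64 takes the
assumed ARM branch and returns false, which is the behaviour the
one-line patch forces everywhere.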


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (14 preceding siblings ...)
  2018-11-19 13:00 ` bugzilla-daemon
@ 2018-11-19 13:39 ` bugzilla-daemon
  2019-03-21 20:55 ` bugzilla-daemon
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2018-11-19 13:39 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 621 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #16 from Christian König <ckoenig.leichtzumerken@gmail.com> ---
(In reply to Carsten Haitzler from comment #15)
> Makes it work. Of course this isn't a brilliant patch, but indeed there is
> something up with the way write combined memory is handled on ARM here.

Well disabling WC is also a good way of reducing the performance in general.

For example, it could be that disabling WC reduced performance and that the
resulting change in timing, rather than WC itself, is what makes the
difference here....


* [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs  with ring buffer timeout on ARM64
  2018-11-01 15:59 [Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64 bugzilla-daemon
                   ` (15 preceding siblings ...)
  2018-11-19 13:39 ` bugzilla-daemon
@ 2019-03-21 20:55 ` bugzilla-daemon
  16 siblings, 0 replies; 18+ messages in thread
From: bugzilla-daemon @ 2019-03-21 20:55 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 652 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108625

Alex Deucher <alexdeucher@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #17 from Alex Deucher <alexdeucher@gmail.com> ---
Fixed with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/include/drm/drm_cache.h?id=e02f5c1bb2283cfcee68f2f0feddcc06150f13aa
