2017-05-09 9:50 GMT+02:00 Vincent Vanackere:

> Some additional data:
> - putting LIBGL_ALWAYS_SOFTWARE=1 in /etc/environment does indeed make the
> system work (for my current usage, the slowness is acceptable in exchange
> for stability)
>

Unfortunately I just got a freeze (using wayland with LIBGL_ALWAYS_SOFTWARE=1):

[179221.647861] nouveau 0000:01:00.0: Xwayland[27856]: nv50cal_space: -16
[179245.768920] traps: gnome-shell[3175] trap int3 ip:7f14cd988de1 sp:7ffe10e66110 error:0 in libglib-2.0.so.0.5200.0[7f14cd939000+111000]
[179256.854109] [drm:drm_atomic_helper_swap_state [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179267.094392] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179277.334749] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] flip_done timed out
[179279.385856] nouveau 0000:01:00.0: DRM: base-1: timeout
[179289.623162] [drm:drm_atomic_helper_swap_state [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179299.863479] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179310.103838] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] flip_done timed out
[179319.064210] INFO: task kworker/u8:1:30061 blocked for more than 120 seconds.
[179319.064211] Not tainted 4.11.0-999-generic #201705062201
[179319.064211] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[179319.064212] kworker/u8:1 D 0 30061 2 0x00000000
[179319.064238] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
[179319.064239] Call Trace:
[179319.064242] __schedule+0x3c3/0x840
[179319.064261] ? nouveau_display_scanoutpos+0xe9/0x180 [nouveau]
[179319.064262] schedule+0x36/0x80
[179319.064264] schedule_timeout+0x23e/0x310
[179319.064265] ? __slab_free+0xa9/0x300
[179319.064283] ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
[179319.064300] ? nv84_fence_read+0x2e/0x30 [nouveau]
[179319.064301] dma_fence_default_wait+0x1af/0x250
[179319.064302] ? dma_fence_default_wait+0x1af/0x250
[179319.064304] ? dma_fence_free+0x20/0x20
[179319.064305] dma_fence_wait_timeout+0x39/0xe0
[179319.064310] drm_atomic_helper_wait_for_fences+0x4c/0xf0 [drm_kms_helper]
[179319.064326] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
[179319.064342] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
[179319.064343] process_one_work+0x1e9/0x410
[179319.064344] worker_thread+0x4b/0x410
[179319.064345] kthread+0x109/0x140
[179319.064346] ? process_one_work+0x410/0x410
[179319.064347] ? kthread_create_on_node+0x70/0x70
[179319.064348] ret_from_fork+0x2c/0x40
[179320.344194] [drm:drm_atomic_helper_swap_state [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179330.584461] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179340.824777] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] flip_done timed out

My current kernel version is 4.11.0-999.201705062201 from
http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/2017-05-07/

To Ben Skeggs: is there anything I could do to help fix this? If there is
no hope of stability improvements I will have to switch to another graphics
card, so please let me know!

Best regards,

Vincent

> - I still get lock-ups using mesa from git (17.2~git1705081930.25d2 from
> this repository https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers )
>
> I have another question (perhaps Ben Skeggs could also give some advice?):
> I see there are a lot more mesa variables that can be set (
> https://www.mesa3d.org/envvars.html). Are there some other variables that
> I could set in order to either partially enable hardware acceleration or
> (better) to get a diagnostic of what the driver is doing that is causing
> the graphics card to hang?
>
> Thanks for your help!
>
> Vincent
>
> 2017-05-08 13:50 GMT+02:00 Vincent Vanackere:
>
>> On 07/05/2017 23:50, Ilia Mirkin wrote:
>> > You have two issues:
>> >
>> > (a) nouveau's GL driver messed something up, causing a read fault error
>> > (b) nouveau's kernel driver tried to recover. It failed.
>> >
>> > Solution to #1: None, really. You can try updating mesa, and hope it
>> > helps. Not sure what version you're on.
>>
>> Here are my package versions:
>>
>> ii  libegl1-mesa:amd64          17.0.3-1ubuntu1  amd64  free implementation of the EGL API -- runtime
>> ii  libegl1-mesa-dev:amd64      17.0.3-1ubuntu1  amd64  free implementation of the EGL API -- development files
>> ii  libgl1-mesa-dev:amd64       17.0.3-1ubuntu1  amd64  free implementation of the OpenGL API -- GLX development files
>> ii  libgl1-mesa-dri:amd64       17.0.3-1ubuntu1  amd64  free implementation of the OpenGL API -- DRI modules
>> ii  libgl1-mesa-glx:amd64       17.0.3-1ubuntu1  amd64  free implementation of the OpenGL API -- GLX runtime
>> ii  libglapi-mesa:amd64         17.0.3-1ubuntu1  amd64  free implementation of the GL API -- shared library
>> ii  libgles2-mesa:amd64         17.0.3-1ubuntu1  amd64  free implementation of the OpenGL|ES 2.x API -- runtime
>> ii  libglu1-mesa:amd64          9.0.0-2.1build1  amd64  Mesa OpenGL utility library (GLU)
>> ii  libglu1-mesa-dev:amd64      9.0.0-2.1build1  amd64  Mesa OpenGL utility library -- development files
>> ii  libwayland-egl1-mesa:amd64  17.0.3-1ubuntu1  amd64  implementation of the Wayland EGL platform -- runtime
>> ii  mesa-common-dev:amd64       17.0.3-1ubuntu1  amd64  Developer documentation for Mesa
>> ii  mesa-utils                  8.3.0-4          amd64  Miscellaneous Mesa GL utilities
>> ii  mesa-vdpau-drivers:amd64    17.0.3-1ubuntu1  amd64  Mesa VDPAU video acceleration drivers
>>
>>
>> I'll try compiling a newer version from git to see if it helps...
>>
>> > Solution to #2: Ben Skeggs will hopefully have something clever to
>> > say.
>> > The recovery logic was recently beefed up considerably, so the
>> > fact that you even got that far is already a good start.
>> >
>> > If you're looking for a stable experience with Xorg, I recommend using
>> > xf86-video-nouveau -- it's been extensively battle-tested, and is
>> > quite simple logic; I also recommend against anything that uses GL on
>> > an ongoing basis (which, sadly, everyone thinks is the coolest thing
>> > to do these days). If you're looking for a stable experience with a
>> > GL-based Wayland compositor, you'll have to wait until either the
>> > nouveau GL driver is perfect or nouveau kernel module can properly
>> > recover from any screwups the GL driver makes.
>>
>> I'm not expecting the GL driver to be perfect ;-)
>> However it would be nice if the kernel module could recover at least a
>> bit better from bad commands from the GL driver (indeed I've had some hard
>> lockups too where I could not even connect through ssh).
>>
>> > You can also remove nouveau_dri.so entirely, which is a big hammer
>> > against these types of issues (removes all GL-based acceleration), or
>> > you can run certain key pieces of software with
>> > LIBGL_ALWAYS_SOFTWARE=1, which will force a CPU-based GL
>> > implementation.
>>
>> Thanks for the hint, I'll try this workaround too!
>>
>> Please let me know if I can do anything to improve the driver's
>> stability (like dumping the card's registers or enabling some traces?).
>> Alternatively if you know of a fanless graphics card model that would be
>> able to drive 2 monitors at 2560x1440 with proper Linux support, I'm
>> interested ;-)
>>
>> Regards
>>
>> > Cheers,
>> >
>> > -ilia
>> >
>> >
>> > 2017-05-07 16:03 GMT-04:00 Vincent Vanackere <vincent.vanackere-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
>> >> Hi,
>> >>
>> >> I own an Asus GT730-SL-2GD3-BRK, trying to drive two monitors at 2560x1440
>> >> resolution. Using gnome-shell with either Xorg or wayland I get screen
>> >> freezes very frequently.
>> >> Those freezes usually require a reboot to get
>> >> working graphics (a sample trace that I got yesterday is below).
>> >> I am running Ubuntu 17.04 with the latest kernels available; I also tested
>> >> various more recent kernels including the latest drm tree at
>> >> https://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next but the problem
>> >> always occurs.
>> >> When a freeze occurs, the computer is still reachable through ssh but the
>> >> only action I found so far to get graphics back is to restart the computer.
>> >> I am willing to run diagnostic programs or test any patch if it would
>> >> help. I'm also not excluding the possibility that I may have some faulty
>> >> hardware, so any hardware-health-test advice would be welcome...
>> >>
>> >> Regards,
>> >>
>> >> Vincent Vanackère
>> >>
>> >> [ 1.199135] nouveau 0000:01:00.0: NVIDIA GK208B (b06070b1)
>> >> [ 1.319930] nouveau 0000:01:00.0: bios: version 80.28.92.00.10
>> >> [ 1.322095] nouveau 0000:01:00.0: fb: 2048 MiB DDR3
>> >> [ 2.620362] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
>> >> [ 2.620362] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
>> >> [ 2.620364] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
>> >> [ 2.620378] nouveau 0000:01:00.0: DRM: DCB version 4.0
>> >> [ 2.620379] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030
>> >> [ 2.620380] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011f62 00020010
>> >> [ 2.620380] nouveau 0000:01:00.0: DRM: DCB outp 02: 02022f10 00000000
>> >> [ 2.620381] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001031
>> >> [ 2.620381] nouveau 0000:01:00.0: DRM: DCB conn 01: 00002161
>> >> [ 2.620382] nouveau 0000:01:00.0: DRM: DCB conn 02: 00000200
>> >> [ 2.666199] nouveau 0000:01:00.0: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
>> >> [ 2.717519] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
>> >> [ 2.992994] nouveau 0000:01:00.0: DRM: allocated 2560x1440 fb: 0x60000, bo ffff8cd1499f8000
>> >> [ 3.025200] fbcon: nouveaufb (fb0) is primary device
>> >> [ 3.253561] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
>> >> [ 3.268163] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
>> >> [ 2150.225651] nouveau 0000:01:00.0: fifo: read fault at 0006710000 engine 00 [GR] client 02 [GPC0/PE_0] reason 02 [PTE] on channel 31 [007e8cb000 Xwayland[3019]]
>> >> [ 2150.225662] nouveau 0000:01:00.0: fifo: channel 31: killed
>> >> [ 2150.225663] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
>> >> [ 2150.225666] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
>> >> [ 2150.225669] nouveau 0000:01:00.0: Xwayland[3019]: channel 31 killed!
>> >> [ 2296.863975] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> >> [ 2296.863990] ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
>> >> [ 2296.864032] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> >> [ 2296.864047] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> >> [ 2296.864118] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> >> [ 2296.864138] ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
>> >> [ 2296.864153] ? nv84_fence_read+0x2e/0x30 [nouveau]
>> >> [ 2296.864175] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> >> [ 2296.864189] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> >> [ 2417.699641] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> >> [ 2417.699656] ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
>> >> [ 2417.699688] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> >> [ 2417.699705] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> >> [ 2417.699785] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> >> [ 2417.699808] ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
>> >> [ 2417.699825] ? nv84_fence_read+0x2e/0x30 [nouveau]
>> >> [ 2417.699851] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> >> [ 2417.699867] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> >> [ 2538.535424] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> >> [ 2538.535439] ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
>> >> [ 2538.535469] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> >> [ 2538.535485] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> >> [ 2538.535555] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> >> [ 2538.535576] ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
>> >> [ 2538.535591] ? nv84_fence_read+0x2e/0x30 [nouveau]
>> >> [ 2538.535614] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> >> [ 2538.535628] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> >>
>> >>
>> >> _______________________________________________
>> >> Nouveau mailing list
>> >> Nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
>> >> https://lists.freedesktop.org/mailman/listinfo/nouveau
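
For anyone collecting similar reports, here is a minimal sketch of how the "fifo: read fault" line from the original message could be pulled apart when scanning a saved dmesg log. Nothing in it comes from the nouveau sources: the regular expression is modelled only on the single fault line quoted in this thread, the function name parse_fault and the field names are hypothetical, and accepting "write" alongside "read" is an assumption. Other kernel versions may word the message differently.

```python
import re
import sys
from typing import Optional

# Modelled on the one fault line quoted in this thread (assumption: other
# kernels may format it differently; "write" is accepted as a guess).
FAULT_RE = re.compile(
    r"fifo: (?P<access>read|write) fault at (?P<addr>[0-9a-f]+) "
    r"engine (?P<engine_id>[0-9a-f]+) \[(?P<engine>[^\]]*)\] "
    r"client (?P<client_id>[0-9a-f]+) \[(?P<client>[^\]]*)\] "
    r"reason (?P<reason_id>[0-9a-f]+) \[(?P<reason>[^\]]*)\] "
    r"on channel (?P<channel>\d+) "
    r"\[(?P<instance>[0-9a-f]+) (?P<process>[^\[\]]+)\[(?P<pid>\d+)\]\]"
)


def parse_fault(line: str) -> Optional[dict]:
    """Return the fields of a nouveau fifo fault line, or None if no match."""
    match = FAULT_RE.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Read a kernel log on stdin and print one summary line per fault found.
    for log_line in sys.stdin:
        fault = parse_fault(log_line)
        if fault:
            print(
                f"{fault['process']}[{fault['pid']}] channel {fault['channel']}: "
                f"{fault['access']} fault at {fault['addr']} "
                f"({fault['engine']}, {fault['client']}, {fault['reason']})"
            )
```

Saved under a name of your choosing (say parse_fault.py, also hypothetical), it can be fed a log with "dmesg | python3 parse_fault.py". Grouping the output by process and reason may show whether the faults always come from the same client.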