From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jamie Heilman
Subject: nouveau regression post v5.8, still present in v5.10
Date: Fri, 25 Dec 2020 07:34:08 +0000
Message-ID:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
To: Ben Skeggs , nouveau@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
List-Id: nouveau.vger.kernel.org

Something between v5.8 and v5.9 has resulted in periodically losing
video. Unfortunately, I can't reliably reproduce it; it seems to
happen only once in a long while. I can go weeks without an
occurrence, but it always seems to happen after my workstation has
been idle long enough to blank the screen and put the monitor to
sleep. I'm using a single display (Dell 2405FPW) connected via DVI,
running X (Xorg 1.20.x from Debian sid). I don't really do anything
fancy: xterms, a browser or two, the occasional video. But like I
said, I can't reliably reproduce this. I've had it happen about 11
times since August.

lspci -vv output is:

01:00.0 VGA compatible controller: NVIDIA Corporation G86 [Quadro NVS 290] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: NVIDIA Corporation G86 [Quadro NVS 290]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
	Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024
	Kernel driver in use: nouveau

The last time this happened, this is what got logged:

nouveau 0000:01:00.0: disp: ERROR 5 [INVALID_STATE] 06 [] chid 1 mthd 0080 data 00000001
nouveau 0000:01:00.0: disp: Base 1:
nouveau 0000:01:00.0: disp: 0084: 00000000
nouveau 0000:01:00.0: disp: 0088: 00000000
nouveau 0000:01:00.0: disp: 008c: 00000000
nouveau 0000:01:00.0: disp: 0090: 00000000
nouveau 0000:01:00.0: disp: 0094: 00000000
nouveau 0000:01:00.0: disp: 00a0: 00000060 -> 00000070
nouveau 0000:01:00.0: disp: 00a4: 00000000 -> f0000000
nouveau 0000:01:00.0: disp: 00c0: 00000000
nouveau 0000:01:00.0: disp: 00c4: 00000000
nouveau 0000:01:00.0: disp: 00c8: 00000000
nouveau 0000:01:00.0: disp: 00cc: 00000000
nouveau 0000:01:00.0: disp: 00e0: 40000000
nouveau 0000:01:00.0: disp: 00e4: 00000000
nouveau 0000:01:00.0: disp: 00e8: 00000000
nouveau 0000:01:00.0: disp: 00ec: 00000000
nouveau 0000:01:00.0: disp: 00fc: 00000000
nouveau 0000:01:00.0: disp: 0100: fffe0000
nouveau 0000:01:00.0: disp: 0104: 00000000
nouveau 0000:01:00.0: disp: 0110: 00000000
nouveau 0000:01:00.0: disp: 0114: 00000000
nouveau 0000:01:00.0: disp: Base 1 - Image 0:
nouveau 0000:01:00.0: disp: 0800: 00009500
nouveau 0000:01:00.0: disp: 0804: 00000000
nouveau 0000:01:00.0: disp: 0808: 04b00780
nouveau 0000:01:00.0: disp: 080c: 00007804
nouveau 0000:01:00.0: disp: 0810: 0000cf00
nouveau 0000:01:00.0: disp: Base 1 - Image 1:
nouveau 0000:01:00.0: disp: 0c00: 00009500
nouveau 0000:01:00.0: disp: 0c04: 00000000
nouveau 0000:01:00.0: disp: 0c08: 04b00780
nouveau 0000:01:00.0: disp: 0c0c: 00007804
nouveau 0000:01:00.0: disp: 0c10: 0000cf00
nouveau 0000:01:00.0: disp: ERROR 5 [INVALID_STATE] 06 [] chid 1 mthd 0080 data 00000001
nouveau 0000:01:00.0: disp: Base 1:
nouveau 0000:01:00.0: disp: 0084: 00000000
nouveau 0000:01:00.0: disp: 0088: 00000000
nouveau 0000:01:00.0: disp: 008c: 00000000
nouveau 0000:01:00.0: disp: 0090: 00000000
nouveau 0000:01:00.0: disp: 0094: 00000000
nouveau 0000:01:00.0: disp: 00a0: 00000060 -> 00000070
nouveau 0000:01:00.0: disp: 00a4: 00000000 -> f0000000
nouveau 0000:01:00.0: disp: 00c0: 00000000
nouveau 0000:01:00.0: disp: 00c4: 00000000
nouveau 0000:01:00.0: disp: 00c8: 00000000
nouveau 0000:01:00.0: disp: 00cc: 00000000
nouveau 0000:01:00.0: disp: 00e0: 40000000
nouveau 0000:01:00.0: disp: 00e4: 00000000
nouveau 0000:01:00.0: disp: 00e8: 00000000
nouveau 0000:01:00.0: disp: 00ec: 00000000
nouveau 0000:01:00.0: disp: 00fc: 00000000
nouveau 0000:01:00.0: disp: 0100: fffe0000
nouveau 0000:01:00.0: disp: 0104: 00000000
nouveau 0000:01:00.0: disp: 0110: 00000000
nouveau 0000:01:00.0: disp: 0114: 00000000
nouveau 0000:01:00.0: disp: Base 1 - Image 0:
nouveau 0000:01:00.0: disp: 0800: 00009500
nouveau 0000:01:00.0: disp: 0804: 00000000
nouveau 0000:01:00.0: disp: 0808: 04b00780
nouveau 0000:01:00.0: disp: 080c: 00007804
nouveau 0000:01:00.0: disp: 0810: 0000cf00
nouveau 0000:01:00.0: disp: Base 1 - Image 1:
nouveau 0000:01:00.0: disp: 0c00: 00009500
nouveau 0000:01:00.0: disp: 0c04: 00000000
nouveau 0000:01:00.0: disp: 0c08: 04b00780
nouveau 0000:01:00.0: disp: 0c0c: 00007804
nouveau 0000:01:00.0: disp: 0c10: 0000cf00
nouveau 0000:01:00.0: DRM: core notifier timeout
nouveau 0000:01:00.0: DRM: base-0: timeout

I've got logs of all of this; if they help, I can collect them. The
timeout messages are consistent, the error messages a little less so.
If there's more debugging I can do when this happens, I'd love to
know what it is.

kernel config: http://audible.transient.net/~jamie/k/nouveau.config-5.10.0
dmesg at boot: http://audible.transient.net/~jamie/k/nouveau.dmesg

-- 
Jamie Heilman http://audible.transient.net/~jamie/