From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon-CC+yJ3UmIYqDUpFQwHEjaQ@public.gmane.org
Subject: [Bug 100567] Nouveau system freeze fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Date: Thu, 05 Sep 2019 14:48:03 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0294196047=="
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: nouveau-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Sender: "Nouveau"
To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
List-Id: nouveau.vger.kernel.org
--===============0294196047==
Content-Type: multipart/alternative; boundary="15676948840.F9AE.12436"
Content-Transfer-Encoding: 7bit
--15676948840.F9AE.12436
Date: Thu, 5 Sep 2019 14:48:04 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=100567
John Walicki changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |walicki-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org
--- Comment #38 from John Walicki ---
The nouveau driver on my ThinkPad P50 (running RHEL 7.7 with a
5.2.11-1.el7.elrepo.x86_64 kernel) just hung with this same error:
Sep 5 09:04:48 jaw-p50rhel7 kernel: nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Sep 5 09:04:48 jaw-p50rhel7 kernel: nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Sep 5 09:04:48 jaw-p50rhel7 kernel: nouveau 0000:01:00.0: fifo: channel 2: killed
Sep 5 09:04:48 jaw-p50rhel7 kernel: nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Sep 5 09:04:48 jaw-p50rhel7 kernel: nouveau 0000:01:00.0: X[5775]: channel 2 killed!
Possibly interesting: /var/log/Xorg.0.log shows a mouse input event whose
timing matches the nouveau errors exactly.
[150241.393] AUDIT: Thu Sep 5 09:04:19 2019: 5775: client 43 disconnected
[150241.396] AUDIT: Thu Sep 5 09:04:19 2019: 5775: client 44 disconnected
[150270.473] (II) event8 - Logitech USB Receiver: SYN_DROPPED event - some input events have been lost.
The bracketed value is a timestamp giving the time since the system last
booted, in seconds. So 150270 is 29 seconds after the line stamped 09:04:19,
and 09:04:19 + 29 seconds is 09:04:48.
The nouveau driver hung at that time (see the /var/log/messages timestamps
above). I am not certain whether the lost input events were a cause or an
effect of the video driver hang.
I was able to ssh into my system to reboot.
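As a rough cross-check of the timestamp arithmetic above, here is a minimal Python sketch that maps a bracketed Xorg.0.log timestamp to wall-clock time using one AUDIT line as the reference. The constant and function names are made up for illustration; the numbers are copied from the log excerpt.

```python
from datetime import datetime, timedelta

# Values copied from the Xorg.0.log excerpt above.
AUDIT_UPTIME = 150241.396                     # bracketed timestamp of the last AUDIT line
AUDIT_WALL = datetime(2019, 9, 5, 9, 4, 19)   # wall-clock time printed on that line
EVENT_UPTIME = 150270.473                     # bracketed timestamp of the SYN_DROPPED line


def wall_clock_for(uptime, ref_uptime, ref_wall):
    """Map a bracketed log timestamp to wall-clock time via one reference line."""
    return ref_wall + timedelta(seconds=uptime - ref_uptime)


event_wall = wall_clock_for(EVENT_UPTIME, AUDIT_UPTIME, AUDIT_WALL)
print(event_wall.strftime("%H:%M:%S"))  # prints 09:04:48
```

The AUDIT line only carries one-second resolution, so the result is accurate to about a second, which is enough to line it up with the 09:04:48 kernel messages.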
$ lspci -vv -s 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M1000M] (rev a2) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 2230
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 129
        Region 0: Memory at b2000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at a0000000 (64-bit, prefetchable) [size=256M]
        Region 3: Memory at b0000000 (64-bit, prefetchable) [size=32M]
        Region 5: I/O ports at 4000 [size=128]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: nouveau
        Kernel modules: nouveau
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15676948840.F9AE.12436--
--===============0294196047==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
--===============0294196047==--