Date: Mon, 6 Sep 2021 11:36:11 +0200
From: wim
To: Greg KH
Cc: stable@vger.kernel.org, linux-kernel@vger.kernel.org, wim
Subject: Re: kernel-4.9.270 crash
Message-ID: <20210906093611.GA20123@djo.tudelft.nl>
Reply-To: wim@djo.tudelft.nl
References: <20210904235231.GA31607@djo.tudelft.nl> <20210905190045.GA10991@djo.tudelft.nl>

On Mon, Sep 06, 2021 at 06:59:22AM +0200, Greg KH wrote:
> On Sun, Sep 05, 2021 at 09:00:45PM +0200, wim wrote:
> > On Sun, Sep 05, 2021 at 01:52:31AM +0200, wim wrote:
> > >
> > > Hello Greg,
> > >
> > > from kernel-4.9.270 up until now (4.9.282) I experience kernel crashes upon
> > > loading a GPU module.
> > > It happens on two out of at least six different machines.
> > > I can't believe that I'm the only one where that happens, but since the bug
> > > is still there twelve versions later, I need to report this.
> > > ...
>
> Do you have any kernel log messages when these crashes happen?

On the AMD machine:

Aug 1 20:51:24 djo kernel: [drm] Initialized
Aug 1 20:51:24 djo kernel: checking generic (a0000 10000) vs hw (e0000000 8000000)
Aug 1 20:51:24 djo kernel: checking generic (a0000 10000) vs hw (ea000000 1000000)
Aug 1 20:51:24 djo kernel: fb: switching to nouveaufb from VGA16 VGA
Aug 1 20:51:24 djo kernel: divide error: 0000 [#1] SMP
Aug 1 20:51:24 djo kernel: Modules linked in: nouveau(+) video drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm agpgart i2c_algo_bit tun lirc_serial(C) lirc_dev arc4 binfmt_misc snd_pcm_oss snd_mixer_oss fbcon bitblit softcursor font tileblit ath9k_htc ath9k_common ath9k_hw ath mac80211 cfg80211 uvcvideo rfkill firmware_class snd_usb_audio sr9700 videobuf2_vmalloc videobuf2_memops snd_usbmidi_lib videobuf2_v4l2 dm9601 videobuf2_core usbnet snd_rawmidi mii usb_storage snd_hda_codec_generic kvm snd_hda_intel irqbypass snd_hda_codec gpio_ich ppdev snd_hwdep pcspkr snd_hda_core snd_pcm uhci_hcd ohci_pci snd_timer ohci_hcd lpc_ich ehci_pci snd ehci_hcd wmi mfd_core usbcore soundcore parport_pc floppy usb_common parport acpi_cpufreq button processor
Aug 1 20:51:24 djo kernel: CPU: 0 PID: 2791 Comm: modprobe Tainted: G C 4.9.277 #1
Aug 1 20:51:24 djo kernel: Hardware name: Hewlett-Packard HP xw4300 Workstation/0A00h, BIOS 786D3 v01.08 03/10/2006
Aug 1 20:51:24 djo kernel: task: f6317080 task.stack: f4058000
Aug 1 20:51:24 djo kernel: EIP: 0060:[] EFLAGS: 00010206 CPU: 0
Aug 1 20:51:24 djo kernel: EAX: 00000190 EBX: ffffffea ECX: 00000019 EDX: 00000000
Aug 1 20:51:24 djo kernel: ESI: f52db800 EDI: 00000050 EBP: c02f7838 ESP: f4059c10
Aug 1 20:51:24 djo kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Aug 1 20:51:24 djo kernel: CR0: 80050033 CR2: 080a1a54 CR3: 35234000 CR4: 00000690
Aug 1 20:51:24 djo kernel: Stack:
Aug 1 20:51:24 djo kernel: 00000050 f52db800 00000019 c0340732 00000000 000000a0 000000a0 00000fa0
Aug 1 20:51:24 djo kernel: f62f4000 0000001e 00000000 00000000 f5a63800 00000000 00000000 00000000
Aug 1 20:51:24 djo kernel: 00000000 00000000 f6024000 00000000 f52db800 00000001 00000000 00000000
Aug 1 20:51:24 djo kernel: Call Trace:
Aug 1 20:51:24 djo kernel: [] ? 0xc0340732
Aug 1 20:51:24 djo kernel: [] ? 0xc0340988
Aug 1 20:51:24 djo kernel: [] ? 0xc02f734a
Aug 1 20:51:24 djo kernel: [] ? 0xc033f780
Aug 1 20:51:24 djo kernel: [] ? 0xc0340b32
Aug 1 20:51:24 djo kernel: [] ? 0xc0340d20
Aug 1 20:51:24 djo kernel: [] ? 0xf8bc4ef7
Aug 1 20:51:24 djo kernel: [] ? 0xc0163715
Aug 1 20:51:24 djo kernel: [] ? 0xf8bc4c82
Aug 1 20:51:24 djo kernel: [] ? 0xc014aac4
Aug 1 20:51:24 djo kernel: [] ? 0xc014ad8a
Aug 1 20:51:24 djo kernel: [] ? 0xc014ada6
Aug 1 20:51:24 djo kernel: [] ? 0xc02f9aa4
Aug 1 20:51:24 djo kernel: [] ? 0xc0168c32
Aug 1 20:51:24 djo kernel: [] ? 0xc02fa294
Aug 1 20:51:24 djo kernel: [] ? 0xc02fa47e
Aug 1 20:51:24 djo kernel: [] ? 0xc02fa4f5
Aug 1 20:51:24 djo kernel: [] ? 0xf90a5c94
Aug 1 20:51:24 djo kernel: [] ? 0xf90a5b88
Aug 1 20:51:24 djo kernel: [] ? 0xc02e82de
Aug 1 20:51:24 djo kernel: [] ? 0xc03545f8
Aug 1 20:51:24 djo kernel: [] ? 0xc035475d
Aug 1 20:51:24 djo kernel: [] ? 0xc03533a9
Aug 1 20:51:24 djo kernel: [] ? 0xc035424a
Aug 1 20:51:24 djo kernel: [] ? 0xc0354705
Aug 1 20:51:24 djo kernel: [] ? 0xc0353f3d
Aug 1 20:51:24 djo kernel: [] ? 0xc0354e44
Aug 1 20:51:24 djo kernel: [] ? 0xf9124000
Aug 1 20:51:24 djo kernel: [] ? 0xc01003df
Aug 1 20:51:24 djo kernel: [] ? 0xc01dbb22
Aug 1 20:51:24 djo kernel: [] ? 0xc04ba42d
Aug 1 20:51:24 djo kernel: [] ? 0xc04ba45c
Aug 1 20:51:24 djo kernel: [] ? 0xc01889d5
Aug 1 20:51:24 djo kernel: [] ? 0xc01e45e4
Aug 1 20:51:24 djo kernel: [] ? 0xc0188c2b
Aug 1 20:51:24 djo kernel: [] ? 0xc0101211
Aug 1 20:51:24 djo kernel: [] ? 0xc04c0579
Aug 1 20:51:24 djo kernel: Code: 63 c0 eb 53 f6 04 24 01 bb ea ff ff ff 75 4a 0f b6 05 07 c5 6c c0 3b 04 24 72 3e 0f b6 05 0e c5 6c c0 31 d2 0f af 05 08 cc 63 c0 b6 ec 00 00 00 39 c8 72 24 8b 86 24 02 00 00 31 db 3b 30 75
Aug 1 20:51:24 djo kernel: EIP: []
Aug 1 20:51:24 djo kernel: SS:ESP 0068:f4059c10
Aug 1 20:51:24 djo kernel: ---[ end trace 307fdb439b21cfc0 ]---

On the Intel machine:

Sep 5 00:20:26 asusUX410U kernel: Adding 2097148k swap on /dev/sda2. Priority:-1 extents:1 across:2097148k FS
Sep 5 00:20:38 asusUX410U kernel: [drm] Memory usable by graphics device = 4096M
Sep 5 00:20:38 asusUX410U kernel: fb: switching to inteldrmfb from VGA16 VGA
Sep 5 00:20:38 asusUX410U kernel: divide error: 0000 [#1] SMP
Sep 5 00:20:38 asusUX410U kernel: Modules linked in: i915(+) intel_gtt cmac uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core arc4 iwlmvm mac80211 nouveau drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm agpgart btusb btrtl btbcm btintel bluetooth hid_multitouch iwlwifi i2c_designware_platform mxm_wmi i2c_designware_core cfg80211 x86_pkg_temp_thermal intel_powerclamp pcspkr nvidiafb i2c_algo_bit fb_ddc rfkill firmware_class thermal i2c_hid xhci_pci xhci_hcd usbcore battery int3403_thermal wmi video ac int3400_thermal acpi_thermal_rel acpi_pad asus_wireless intel_lpss_pci intel_lpss button processor_thermal_device i2c_i801 intel_soc_dts_iosf i2c_smbus intel_pch_thermal usb_common mfd_core int340x_thermal_zone binfmt_misc snd_hda_codec_generic snd_pcm_oss snd_mixer_oss snd_hda_intel
Sep 5 00:20:38 asusUX410U kernel: snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore fbcon bitblit softcursor font tileblit
Sep 5 00:20:38 asusUX410U kernel: CPU: 2 PID: 2601 Comm: modprobe Not tainted 4.9.282 #1
Sep 5 00:20:38 asusUX410U kernel: Hardware name: ASUSTeK COMPUTER INC. UX410UQK/UX410UQK, BIOS UX410UQK.301 12/12/2016
Sep 5 00:20:38 asusUX410U kernel: task: ffff880264ac8000 task.stack: ffffc90003ee0000
Sep 5 00:20:38 asusUX410U kernel: RIP: 0010:[] [] 0xffffffff8044b341
Sep 5 00:20:38 asusUX410U kernel: RSP: 0018:ffffc90003ee38e8 EFLAGS: 00010246
Sep 5 00:20:38 asusUX410U kernel: RAX: 0000000000000190 RBX: 00000000000000a0 RCX: 0000000000000000
Sep 5 00:20:38 asusUX410U kernel: RDX: 0000000000000000 RSI: 0000000000000050 RDI: ffff880256b9b800
Sep 5 00:20:38 asusUX410U kernel: RBP: 0000000000000019 R08: 0000000000000019 R09: 00000000000000a0
Sep 5 00:20:38 asusUX410U kernel: R10: 000000000000001e R11: 0000000000000001 R12: 00000000ffffffea
Sep 5 00:20:38 asusUX410U kernel: R13: ffff880256b9b800 R14: 0000000000000fa0 R15: 0000000000000000
Sep 5 00:20:38 asusUX410U kernel: FS: 00007fb959a4cc00(0000) GS:ffff88026ed00000(0000) knlGS:0000000000000000
Sep 5 00:20:38 asusUX410U kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 5 00:20:38 asusUX410U kernel: CR2: 000056515c106000 CR3: 0000000259500000 CR4: 0000000000360670
Sep 5 00:20:38 asusUX410U kernel: Stack:
Sep 5 00:20:38 asusUX410U kernel: 0000000000000050 ffffffff804a8d05 ffff880259667000 0000000000000000
Sep 5 00:20:38 asusUX410U kernel: ffff88020000001e 000000a000000fa0 00000000000000a0 00000000000000a0
Sep 5 00:20:38 asusUX410U kernel: 0190000000500019 ffff880256b9b800 ffff880256b9b800 0000000000000000
Sep 5 00:20:38 asusUX410U kernel: Call Trace:
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804a8d05
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff8044adee
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804a79dd
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804a9160
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804a9395
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffffa000c549
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff80257d40
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff80258077
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff8044d551
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff8044e213
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff8044e457
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff8044e4de
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffffa05cb585
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff80439f8a
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804c0d1e
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804c0eda
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804c0e72
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804bf59b
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804c04c9
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffffa06ab000
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff804c1738
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff80200341
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff802962fc
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff80297a24
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff80297e0e
Sep 5 00:20:38 asusUX410U last message buffered 1 times
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff802014fd
Sep 5 00:20:38 asusUX410U kernel: [] ? 0xffffffff80645a3e
Sep 5 00:20:38 asusUX410U kernel: Code: 65 00 eb 57 41 bc ea ff ff ff 40 f6 c6 01 75 4e 0f b6 05 da 22 75 00 39 f0 72 43 0f b6 05 d6 22 75 00 0f af 05 e9 6d 65 00 31 d2 b7 7c 01 00 00 44 39 c0 72 28 48 8b 87 00 03 00 00 45 31 e4
Sep 5 00:20:38 asusUX410U kernel: RSP
Sep 5 00:20:38 asusUX410U kernel: ---[ end trace a46f8400460cdde1 ]---

> Can you use 'git bisect' to track down the offending commit?

If I would know how to do that

> And why are you stuck on 4.9.y for these machines? Why not use 5.10 or
> newer?

Because in 4.10 they dropped lirc-serial and I need that. The new ir-serial is
no replacement. (The last working version of LIRC is 0.9.6. After that they
destroyed transmitter support.)
(I believe irda support got dropped too, which I need for my old nokia.)

Wim.
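
Note on the 'git bisect' question above: bisecting is a binary search over the commits between a known-good and a known-bad kernel, where each step means building that kernel, booting it and loading the GPU module. The Python sketch below only illustrates that search and is not git itself. The commit names, the culprit, and the helpers first_bad and is_bad are all made up for the example.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, that the commit just
    before commits[0] is known good, and that commits[-1] is known bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the first bad commit is mid or earlier
        else:
            lo = mid + 1    # the first bad commit comes after mid
    return commits[lo]


# Hypothetical stand-in for the commits between two stable releases.
commits = [f"commit-{i:03d}" for i in range(1, 65)]
culprit = "commit-037"  # invented for the demonstration
tested = []


def is_bad(commit: str) -> bool:
    # In reality: build this kernel, boot it, modprobe the GPU driver.
    tested.append(commit)
    return commit >= culprit


found = first_bad(commits, is_bad)
print(found, "found after", len(tested), "test boots")
```

With git the same loop is driven by `git bisect start`, `git bisect good <rev>` and `git bisect bad <rev>`, then marking each test build good or bad until git names the first bad commit, and finishing with `git bisect reset`. Since the report says the crashes start with 4.9.270, the range would presumably be v4.9.269 to v4.9.270, but that 4.9.269 is actually good is an assumption here.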