From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1030285AbdAIM7e (ORCPT ); Mon, 9 Jan 2017 07:59:34 -0500
Received: from mga11.intel.com ([192.55.52.93]:36269 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1030225AbdAIM7c (ORCPT ); Mon, 9 Jan 2017 07:59:32 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.33,339,1477983600"; d="scan'208";a="1109982967"
Subject: Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code
To: Daniel Vetter , Jani Nikula , David Airlie ,
	intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
References: <7fd16549-1349-a9e5-ceff-9aa6f748caae@intel.com>
	<20170109101516.y3acaev5ujbjugwl@phenom.ffwll.local>
From: Dave Hansen
Message-ID: <7766844a-bbd6-8742-42a8-08e3ac7c4edc@intel.com>
Date: Mon, 9 Jan 2017 04:59:30 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1
MIME-Version: 1.0
In-Reply-To: <20170109101516.y3acaev5ujbjugwl@phenom.ffwll.local>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/09/2017 02:15 AM, Daniel Vetter wrote:
...
> Can you pls do some printk tracing to make sure that without your patch
> we're indeed releasing the same connector twice from this loop? I suspect
> you're just ever-so-slightly shifting the timing and things blow up
> somewhere else. But no idea where :(

Your suspicions appear correct. I still get an oops even with my patch
applied.

I did add some printk's but they're weird. I'm seeing a 'kdev' that looks
to be -2, although it should never get set to that unless
device_create_with_groups() and the error checking failed somehow.

I'll take out the locking portion of the patch and add some more printk's
and see if anything else shows up.
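
For anyone who wants to double-check the "-2" reading, here is a minimal Python sketch that interprets the raw 'kdev' value from the drm_sysfs_connector_remove() printk in the log in this message. It models the kernel's errno-in-pointer convention (the top 4095 values of the address space, as tested by IS_ERR_VALUE()). The helper names are made up for illustration, and whether this value really came from an ERR_PTR() return is an assumption, not something the log proves.

```python
import errno

MAX_ERRNO = 4095            # bound used by the kernel's IS_ERR_VALUE()
MASK64 = (1 << 64) - 1      # x86_64 pointers are 64 bits wide


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed one."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def is_err_ptr(value):
    """True if the value falls in the range reserved for encoded errnos."""
    return (value & MASK64) >= ((-MAX_ERRNO) & MASK64)


def describe_pointer(value):
    """Hypothetical helper: render a pointer the way a reader would think of it."""
    if is_err_ptr(value):
        code = -as_signed64(value)
        return f"ERR_PTR(-{errno.errorcode.get(code, code)})"
    return f"pointer {value & MASK64:#018x}"


kdev = 0xfffffffffffffffe        # 'kdev' as printed in the log
connector = 0xffff88040c75a000   # 'connector' as printed in the log

print(as_signed64(kdev))            # -2
print(describe_pointer(kdev))       # ERR_PTR(-ENOENT)
print(describe_pointer(connector))  # pointer 0xffff88040c75a000
```

So the value is consistent with an encoded -ENOENT rather than a stale heap pointer, which would fit an unchecked error return, but it does not rule out other ways of ending up with that bit pattern.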

Here's a copy of the printk's and associated oops, with my locking
modification patch applied, if this helps at all:

> [ 7927.049763] drm_dp_destroy_port() kfree(ffff88040c759000)
> [ 7927.050039] drm_dp_destroy_connector_work() port: ffff88040c759800 connector: ffff88040c75a000
> [ 7927.050061] drm_sysfs_connector_remove() connector: ffff88040c75a000 kdev: fffffffffffffffe
> [ 7927.050106] BUG: unable to handle kernel NULL pointer dereference at 000000000000009e
> [ 7927.050242] IP: device_del+0x19/0x330
> [ 7927.050300] PGD 0
> [ 7927.050302]
> [ 7927.050364] Oops: 0000 [#1] SMP
> [ 7927.050414] Modules linked in: netconsole ctr ccm ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc iptable_filter ip_tables ebtable_nat ebtables x_tables cmac rfcomm bnep dm_crypt arc4 iwlmvm mac80211 snd_hda_codec_hdmi iwlwifi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_seq_midi snd_seq_midi_event snd_hda_codec cfg80211 snd_hwdep snd_rawmidi intel_rapl snd_hda_core iosf_mbi snd_pcm snd_seq btusb x86_pkg_temp_thermal hid_logitech_hidpp btrtl btbcm btintel coretemp shpchp snd_timer ghash_clmulni_intel bluetooth joydev thinkpad_acpi snd_seq_device nvram wmi snd soundcore mac_hid aesni_intel aes_x86_64 crypto_simd cryptd glue_helper
> [ 7927.051492] kvm_intel kvm irqbypass hid_generic hid_logitech_dj usbhid hid [last unloaded: netconsole]
> [ 7927.051642] CPU: 2 PID: 123 Comm: kworker/2:2 Tainted: G W 4.10.0-rc2-dirty #53
> [ 7927.051763] Hardware name: LENOVO 20F5S7V800/20F5S7V800, BIOS R02ET50W (1.23 ) 09/20/2016
> [ 7927.051887] Workqueue: events drm_dp_destroy_connector_work
> [ 7927.051971] task: ffff88040c21bc00 task.stack: ffffc90002484000
> [ 7927.052065] RIP: 0010:device_del+0x19/0x330
> [ 7927.052129] RSP: 0018:ffffc90002487d58 EFLAGS: 00010282
> [ 7927.052209] RAX: 0000000000000000 RBX: fffffffffffffffe RCX: ffff88040c72d1f8
> [ 7927.052310] RDX: ffffffff81cb69b2 RSI: 0000000000000001 RDI: fffffffffffffffe
> [ 7927.052412] RBP: ffffc90002487d90 R08: 0000000000000000 R09: 0000000000000392
> [ 7927.052517] R10: 00000003e05de802 R11: 0000000000000000 R12: fffffffffffffffe
> [ 7927.052677] R13: ffff88040c516c18 R14: 0000000000000000 R15: ffff88040c516bd8
> [ 7927.052781] FS: 0000000000000000(0000) GS:ffff880421500000(0000) knlGS:0000000000000000
> [ 7927.052897] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 7927.052981] CR2: 000000000000009e CR3: 0000000001e0b000 CR4: 00000000003406e0
> [ 7927.053082] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 7927.053186] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 7927.053288] Call Trace:
> [ 7927.053338] ? printk+0x4d/0x4f
> [ 7927.053396] device_unregister+0x12/0x30
> [ 7927.053464] drm_sysfs_connector_remove+0x57/0x70
> [ 7927.053538] drm_connector_unregister.part.8+0x27/0x40
> [ 7927.053616] drm_connector_unregister+0x14/0x20
> [ 7927.053690] intel_dp_destroy_mst_connector+0x1a/0x80
> [ 7927.053771] drm_dp_destroy_connector_work+0xce/0x150
> [ 7927.053850] process_one_work+0x14b/0x430
> [ 7927.053914] worker_thread+0x12b/0x4a0
> [ 7927.053982] kthread+0x10c/0x140
> [ 7927.054037] ? process_one_work+0x430/0x430
> [ 7927.054104] ? kthread_create_on_node+0x40/0x40
> [ 7927.054179] ret_from_fork+0x27/0x40
> [ 7927.054237] Code: 00 00 00 00 00 00 00 5b 41 5c 41 5d 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 10 <48> 8b 87 a0 00 00 00 4c 8b 2f 48 85 c0 74 1b 48 8b b8 90 00 00
> [ 7927.054462] RIP: device_del+0x19/0x330 RSP: ffffc90002487d58
> [ 7927.054486] CR2: 000000000000009e
> [ 7927.065278] ---[ end trace 1a2a5ca2cf6c3a4f ]---
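
As a sanity check on the oops itself, the small Python sketch below ties the reported fault address to the bogus pointer. It decodes only the seven bytes marked <48> 8b 87 a0 00 00 00 in the Code: line (a 64-bit load from [rdi + disp32]) and adds the displacement to the RDI value from the register dump, wrapping at 64 bits. The function names are illustrative, the decoder handles just this one instruction form, and it makes no claim about which field of the device structure sits at that offset.

```python
MASK64 = (1 << 64) - 1

FAULTING_INSN = bytes.fromhex("48 8b 87 a0 00 00 00")  # bytes at <48> in "Code:"
RDI = 0xfffffffffffffffe                               # from the register dump
CR2 = 0x000000000000009e                               # reported fault address


def decode_mov_rdi_disp32(code):
    """Decode REX.W 8B /r with a [rdi + disp32] operand; reject anything else."""
    if len(code) != 7 or code[0] != 0x48 or code[1] != 0x8B:
        raise ValueError("not a 7-byte REX.W MOV r64, r/m64")
    modrm = code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 0x7, modrm & 0x7
    if mod != 0b10 or rm != 0b111:
        raise ValueError("operand is not [rdi + disp32]")
    disp = int.from_bytes(code[3:7], "little", signed=True)
    return reg, disp  # reg 0 is rax


def fault_address(base, disp):
    """Effective address with 64-bit wraparound."""
    return (base + disp) & MASK64


reg, disp = decode_mov_rdi_disp32(FAULTING_INSN)
addr = fault_address(RDI, disp)
print(hex(disp))    # 0xa0
print(hex(addr))    # 0x9e
print(addr == CR2)  # True
```

In other words, -2 plus an offset of 0xa0 wraps around to 0x9e, so the "NULL pointer dereference" wording in the BUG line is consistent with device_del() being handed the -2 value rather than a real NULL.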