Subject: Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code
To: Daniel Vetter
References: <7fd16549-1349-a9e5-ceff-9aa6f748caae@intel.com>
 <20170109101516.y3acaev5ujbjugwl@phenom.ffwll.local>
 <16a1e734-667c-5d9a-c418-555b1f13e446@intel.com>
Cc: Daniel Vetter, Jani Nikula, David Airlie, intel-gfx, dri-devel,
 Linux Kernel Mailing List
From: Dave Hansen
Message-ID: <4893e899-eb31-0f73-b2dc-81d13e26cf76@intel.com>
Date: Mon, 9 Jan 2017 09:22:52 -0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/09/2017 08:59 AM, Daniel Vetter wrote:
> On Mon, Jan 9, 2017 at 5:50 PM, Dave Hansen wrote:
>> On 01/09/2017 08:41 AM, Daniel Vetter wrote:
>>> On Mon, Jan 9, 2017 at 2:40 PM, Dave Hansen wrote:
>>>> Well, now I found where the -2 comes from.
>>>> intel_dp_register_mst_connector() calls drm_connector_register(), which
>>>> fails to add the kobject (warning below). But, it does zero error
>>>> checking on the drm_connector_register() call and leaves the
>>>> partially-constructed connector in place.
>>>>
>>>> The next time some poor, hapless code goes and tries to do anything with
>>>> that kdev, they oops. I'm perplexed by this, though. The
>>>> drm_dp_mst_topology_cbs->register_connector just returns void. It seems
>>>> a bit goofy that it can't even _return_ failure.
>>>>
>>>> Is there some stable code to go back to here? Or, is there something
>>>> about my configuration that's unique? I really wonder why nobody else
>>>> is running into this.
>>>>
>>>> There's probably some other race going on here. This warning doesn't
>>>> happen on every boot.
>>> This smells more like the root-cause: Something goes wrong on boot
>>> that prevents connectors from properly registering, then we fall over
>>> later on. And the register callback is intentionally void, assuming
>>> that any prep work has been done earlier and that therefore the
>>> register step can't fail. Can you pls check whether the oops later on
>>> only happens together with this warning at boot, or whether they're
>>> not correlated?
>>
>> Looking through my logs, I can't find any instance of the oops without
>> the warning at boot. So I do think the later oops is entirely caused by
>> the issue warned about in early boot.
>
> Hm, I guess then we'd need to fix that boot-up warning. Can you try to
> figure out why it's unhappy? On a hunch it could be that we call
> drm_connector_register from the mst probe worker before the main
> driver load thread has reached the drm_dev_register call. A few printk
> to decide whether that's the case (plus a few boot-up tests to gather
> the statistics, sorry about that) would be real great.
>
> If that's inconclusive I'm again a bit low on ideas ...

I'll do that shortly. But, for now I can confirm that the failure is
precipitated by the !parent check in sysfs_create_dir_ns(). I also
can't reproduce this if I build i915 as a module. It only happens when
built in.

> Jan 9 09:07:34 ray kernel: [ 1.400547] sysfs_create_dir_ns()::53 error: -2
> Jan 9 09:07:34 ray kernel: [ 1.400554] create_dir()::75 error: -2
> Jan 9 09:07:34 ray kernel: [ 1.400558] ------------[ cut here ]------------
> Jan 9 09:07:34 ray kernel: [ 1.400565] WARNING: CPU: 1 PID: 90 at lib/kobject.c:249 kobject_add_internal+0x273/0x320
> Jan 9 09:07:34 ray kernel: [ 1.400569] kobject_add_internal failed for card0-DP-3 (error: -2 parent: card0)
> Jan 9 09:07:34 ray kernel: [ 1.400572] Modules linked in:
> Jan 9 09:07:34 ray kernel: [ 1.400577] CPU: 1 PID: 90 Comm: kworker/1:2 Not tainted 4.10.0-rc3-dirty #61
> Jan 9 09:07:34 ray kernel: [ 1.400579] Hardware name: LENOVO 20F5S7V800/20F5S7V800, BIOS R02ET50W (1.23 ) 09/20/2016
> Jan 9 09:07:34 ray kernel: [ 1.400585] Workqueue: events_long drm_dp_mst_link_probe_work
> Jan 9 09:07:34 ray kernel: [ 1.400588] Call Trace:
> Jan 9 09:07:34 ray kernel: [ 1.400593] dump_stack+0x67/0x99
> Jan 9 09:07:34 ray kernel: [ 1.400598] __warn+0xd1/0xf0
> Jan 9 09:07:34 ray kernel: [ 1.400601] warn_slowpath_fmt+0x4f/0x60
> Jan 9 09:07:34 ray kernel: [ 1.400604] kobject_add_internal+0x273/0x320
> Jan 9 09:07:34 ray kernel: [ 1.400607] kobject_add+0x65/0xb0
> Jan 9 09:07:34 ray kernel: [ 1.400611] ? klist_init+0x31/0x40
> Jan 9 09:07:34 ray kernel: [ 1.400615] device_add+0x102/0x5d0
> Jan 9 09:07:34 ray kernel: [ 1.400619] ? kfree_const+0x22/0x30
> Jan 9 09:07:34 ray kernel: [ 1.400623] device_create_groups_vargs+0xd8/0x100
> Jan 9 09:07:34 ray kernel: [ 1.400626] device_create_with_groups+0x36/0x40
> Jan 9 09:07:34 ray kernel: [ 1.400631] ? drm_fb_helper_add_one_connector+0x57/0xd0
> Jan 9 09:07:34 ray kernel: [ 1.400636] ? kmem_cache_alloc_trace+0x1d2/0x1f0
> Jan 9 09:07:34 ray kernel: [ 1.400641] drm_sysfs_connector_add+0x60/0xe0
> Jan 9 09:07:34 ray kernel: [ 1.400645] drm_connector_register+0x21/0xc0
> Jan 9 09:07:34 ray kernel: [ 1.400649] intel_dp_register_mst_connector+0x41/0x50
> Jan 9 09:07:34 ray kernel: [ 1.400653] drm_dp_add_port+0x350/0x450
> Jan 9 09:07:34 ray kernel: [ 1.400657] ? rcu_early_boot_tests+0x1/0x10
> Jan 9 09:07:34 ray kernel: [ 1.400660] ? schedule_timeout+0x1cd/0x390
> Jan 9 09:07:34 ray kernel: [ 1.400664] ? __might_sleep+0x4a/0x90
> Jan 9 09:07:34 ray kernel: [ 1.400667] ? mutex_lock+0x25/0x50
> Jan 9 09:07:34 ray kernel: [ 1.400670] ? drm_dp_mst_wait_tx_reply+0x118/0x1e0
> Jan 9 09:07:34 ray kernel: [ 1.400673] ? prepare_to_wait_event+0x120/0x120
> Jan 9 09:07:34 ray kernel: [ 1.400675] drm_sysfs_connector_add() connector: ffff88040c778000 kdev: ffff88040ef15000
> Jan 9 09:07:34 ray kernel: [ 1.400681] ? drm_dp_check_mstb_guid+0x3d/0x120
> Jan 9 09:07:34 ray kernel: [ 1.400684] drm_dp_send_link_address+0x185/0x1f0
> Jan 9 09:07:34 ray kernel: [ 1.400688] drm_dp_check_and_send_link_address+0xad/0xc0
> Jan 9 09:07:34 ray kernel: [ 1.400691] drm_dp_mst_link_probe_work+0x57/0xa0
> Jan 9 09:07:34 ray kernel: [ 1.400694] process_one_work+0x14b/0x430
> Jan 9 09:07:34 ray kernel: [ 1.400697] worker_thread+0x12b/0x4a0
> Jan 9 09:07:34 ray kernel: [ 1.400700] kthread+0x10c/0x140
> Jan 9 09:07:34 ray kernel: [ 1.400703] ? process_one_work+0x430/0x430
> Jan 9 09:07:34 ray kernel: [ 1.400706] ? kthread_create_on_node+0x40/0x40
> Jan 9 09:07:34 ray kernel: [ 1.400709] ret_from_fork+0x27/0x40
> Jan 9 09:07:34 ray kernel: [ 1.400714] ---[ end trace 0009c9dc7b253d9c ]---