From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757096AbcIPCYS (ORCPT ); Thu, 15 Sep 2016 22:24:18 -0400 Received: from bh-25.webhostbox.net ([208.91.199.152]:58452 "EHLO bh-25.webhostbox.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755084AbcIPCYK (ORCPT ); Thu, 15 Sep 2016 22:24:10 -0400 Subject: Re: Crashing 'kzm' target in next-20160913 due to 'gpio: mxc: shift gpio_mxc_init() to subsys_initcall level' To: =?UTF-8?Q?Uwe_Kleine-K=c3=b6nig?= References: <9befb559-499a-dd70-cd44-60d5fce2e5d6@roeck-us.net> <99993063-6692-0870-12fa-64cf13d507f6@mentor.com> <20160915142326.lktxsck3idykemds@pengutronix.de> <14a0ca18-f568-3365-800a-7dca858f113b@roeck-us.net> <20160915144653.u4zqohlioeb5pqyx@pengutronix.de> Cc: Vladimir Zapolskiy , Shawn Guo , "linux-kernel@vger.kernel.org" , Greg Kroah-Hartman , Linus Walleij From: Guenter Roeck Message-ID: <2bbc441c-4ab5-d0c1-61b3-da20acde3cfb@roeck-us.net> Date: Thu, 15 Sep 2016 19:24:01 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <20160915144653.u4zqohlioeb5pqyx@pengutronix.de> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 8bit X-Authenticated_sender: linux@roeck-us.net X-OutGoing-Spam-Status: No, score=-1.0 X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - bh-25.webhostbox.net X-AntiAbuse: Original Domain - vger.kernel.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - roeck-us.net X-Get-Message-Sender-Via: bh-25.webhostbox.net: authenticated_id: linux@roeck-us.net X-Authenticated-Sender: bh-25.webhostbox.net: linux@roeck-us.net X-Source: X-Source-Args: X-Source-Dir: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 09/15/2016 07:46 AM, Uwe Kleine-König wrote: > On Thu, Sep 15, 2016 at 07:35:16AM -0700, Guenter Roeck wrote: >> On 09/15/2016 07:23 AM, Uwe Kleine-König wrote: >>> On Thu, Sep 15, 2016 at 04:35:04PM +0300, Vladimir Zapolskiy wrote: >>>> Hi Guenter, >>>> >>>> On 09/14/2016 06:20 AM, Guenter Roeck wrote: >>>>> Hi Vladimir, >>>>> >>>>> your commit e188cbf7564f ("gpio: mxc: shift gpio_mxc_init() to subsys_initcall level") >>>>> in -next causes the following crash when running the 'kzm' target (and most likely >>>>> the real thing) with qemu. >>>>> >>>>> [ 1.211426] Unable to handle kernel NULL pointer dereference at virtual address 0000000c >>>>> [ 1.211600] pgd = c0004000 >>>>> [ 1.211680] [0000000c] *pgd=00000000 >>>>> [ 1.212067] Internal error: Oops: 5 [#1] SMP ARM >>>>> [ 1.212245] Modules linked in: >>>>> [ 1.212542] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-next-20160913 #1 >>>>> [ 1.212671] Hardware name: Kyoto Microcomputer Co., Ltd. KZM-ARM11-01 >>>>> [ 1.212825] task: c6848000 task.stack: c683e000 >>>>> [ 1.213231] PC is at platform_get_irq+0xc0/0xe8 >>>>> >>>>> See http://kerneltests.org/builders/qemu-arm-next/builds/525/steps/qemubuildcommand/logs/stdio >>>>> for a complete log. >>>>> >>>>> Problem is quite subtle. The change causes the gpio driver to be installed later. >>>>> As a result, kzm_init_smsc9118() fails to initialize the gpio pins correctly. >>>>> gpio_request() in that function returns -EPROBE_DEFER, which is ignored, >>>>> gpio_to_irq() then returns -22 which is unconditionally assigned as interrupt number. >>>>> platform_get_irq(), as called from the smsc driver, gets this negative interrupt >>>>> number, and passes it unconditionally to irq_get_irq_data(), which returns NULL. >>>>> The NULL pointer is then passed to irqd_set_trigger_type() which, not entirely >>>>> surprisingly, crashes. >>>>> >>>>> So, in other words, lots of bugs here. Nevertheless, I would suggest to keep using >>>>> postcore_initcall(), at least until it is sure that all gpio clients handle -EPROBE_DEFER >>>>> correctly. >>>> >>>> I'm inviting Shawn and Uwe to the discussion. >>>> >>>> The proper fix in this particular case should be like this one: >>>> >>>> diff --git a/arch/arm/mach-imx/mach-kzm_arm11_01.c b/arch/arm/mach-imx/mach-kzm_arm11_01.c >>>> index 31df4361996f..8288acfe7221 100644 >>>> --- a/arch/arm/mach-imx/mach-kzm_arm11_01.c >>>> +++ b/arch/arm/mach-imx/mach-kzm_arm11_01.c >>>> @@ -245,13 +245,17 @@ static void __init kzm_board_init(void) >>>> mxc_iomux_setup_multiple_pins(kzm_pins, >>>> ARRAY_SIZE(kzm_pins), "kzm"); >>>> - kzm_init_ext_uart(); >>>> - kzm_init_smsc9118(); >>>> kzm_init_imx_uart(); >>>> pr_info("Clock input source is 26MHz\n"); >>>> } >>>> +static void __init kzm_late_init(void) >>>> +{ >>>> + kzm_init_ext_uart(); >>>> + kzm_init_smsc9118(); >>>> +} >>>> + >>>> /* >>>> * This structure defines static mappings for the kzm-arm11-01 board. >>>> */ >>>> @@ -291,5 +295,6 @@ MACHINE_START(KZM_ARM11_01, "Kyoto Microcomputer Co., Ltd. KZM-ARM11-01") >>>> .init_irq = mx31_init_irq, >>>> .init_time = kzm_timer_init, >>>> .init_machine = kzm_board_init, >>>> + .init_late = kzm_late_init, >>>> .restart = mxc_restart, >>>> MACHINE_END >>> >>> That + checking the return code of gpio_request and the other calls. >>> Or better, convert the machine to dt. >>> >>>> But I agree that there might be more legacy boards (i.MX31 only IMHO), >>>> which may attempt to manipulate GPIO lines before subsys_initcall() >>>> level. >>> >>> I wouldn't revert anything for legacy boards. That's the chance to say >>> in the near future: They stopped working in September 2016, obviously >>> nobody cares, let's rip them. :-) >>> >> New kernel development philosophy ? Regressions are acceptable as long as >> they affect a board older than X years ? What is your cut-off date for accepting >> regressions like that ? > > I would soften your statement about regressions to: Regressions like > these are an incentive to pre-dt platforms to modernize. Of course they > should ideally be prevented, and fixed when they happen, but I wouldn't > start reverting gpio-changes because of a regression on a machine that > is hardly used with modern kernel (ok, that's an assumption, but it > didn't receive enough love to be converted to dt), that fails to even > check return values of gpio_request and that can be easily fixed. > All 890 of them ? Really ? Guenter