From: Rajendra Nayak <rnayak@ti.com>
To: Tony Lindgren <tony@atomide.com>
Cc: "Bedia, Vaibhav" <vaibhav.bedia@ti.com>,
	"linux-omap@vger.kernel.org" <linux-omap@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	Mark Jackson <mpfj-list@newflow.co.uk>,
	Sourav Poddar <sourav.poddar@ti.com>,
	Paul Walmsley <paul@pwsan.com>
Subject: Re: Boot hang regression 3.10.0-rc4 -> 3.10.0
Date: Mon, 8 Jul 2013 18:50:01 +0530
Message-ID: <51DABC81.3080409@ti.com>
In-Reply-To: <20130708131033.GA5523@atomide.com>

On Monday 08 July 2013 06:40 PM, Tony Lindgren wrote:
> * Rajendra Nayak <rnayak@ti.com> [130708 05:48]:
>> On Monday 08 July 2013 04:55 PM, Tony Lindgren wrote:
>>> * Bedia, Vaibhav <vaibhav.bedia@ti.com> [130705 06:37]:
>>>> On Fri, Jul 05, 2013 at 18:50:10, Bedia, Vaibhav wrote:
>>>>> Hi Tony,
>>>>>
>>>>> On Fri, Jul 05, 2013 at 17:29:59, Tony Lindgren wrote:
>>>>>> * Bedia, Vaibhav <vaibhav.bedia@ti.com> [130705 01:17]:
>>>>>>>
>>>>>>> I just checked the behavior on my AM335x-EVM. Current mainline boots fine
>>>>>>> provided I don't use earlyprintk.  The offending patch [1] in this case is the one
>>>>>>> that tries to get rid of omap_serial_early_init() for DT boot. This change inadvertently
>>>>>>> also results in the console UART getting reset and idled during bootup and that's where
>>>>>>> the boot stops for you. I think if you skip earlyprintk from the bootargs you should see
>>>>>>> the system booting fine. 
>>>>>>>
>>>>>>> I guess we need to retain the NO_IDLE and NO_RESET aspect for the console UART in
>>>>>>> omap_serial_early_init() to get earlyprintk working again.
>>>>>>
>>>>>> Hmm nothing should get idled while earlyprintk is running, and then when the
>>>>>> serial driver kicks in it should not idle anything by default. And for DT based
>>>>>> booting we should not have mach-omap2/serial.c initialize anything.
>>>>>>
>>>>>
>>>>> If I add in the HWMOD flags without any reverts I get to the point where the serial driver
>>>>> comes up but the boot eventually stops [1]. Without the flags the boot stops much earlier [2]
>>>>> just like Mark reported.
>>>>>
>>>>
>>>> Err.. the log with HWMOD flags added is [2] and without flags is [1]. Sorry for the confusion.
>>>
>>> It sounds like something needs to be fixed for am33xx as omap3 and omap4
>>> won't hang with earlyprintk. Almost certainly mach-omap2/serial.c should not
>>> be needed at all for am33xx, and the bug is somewhere else.
>>
>> Tony, I spent some time on this today and there seem to be 2 issues.
>>
>> Issue 1: Causing boot to stop much earlier as reported (this is during hwmod setup)
>>
>> The commit 'e97f03cb36e9ec8a2ccaa3e4bee5297fe48156fd'
>> "ARM: OMAP2+: Fix serial init for device tree based booting" stubbed out omap_serial_early_init()
>> for the DT case, thinking it's doing the port inits. But that does not seem to be true; the port inits happen
>> as part of omap_serial_init_port(). What omap_serial_early_init() was doing instead was adding the
>> HWMOD_INIT_NO_IDLE and HWMOD_INIT_NO_RESET flags, which tell hwmod not to reset and then idle the
>> console UART. With this not happening now for the DT case, it causes an issue.
>>
>> The issue was seen on am33xx and not on some other platforms because some platforms still have these
>> statically defined in the hwmod data files. I could see these set for uart3 in case of omap4 and omap5.
>> So I feel the above commit should be reverted and these static flags should be removed from the data
>> files.
> 
> Oh OK. That's starting to make a bit more sense then. 
>   
>>>>>> I wonder if this is because the timeouts get now initialized to 0 instead
>>>>>> of -1 for the serial driver?
>>>>>>
>>>>>
>>>>> You meant initialized to -1, right? There's an additional check for timeout being 0. Unless I
>>>>> am missing something, DT-boot will start off with timeout set to 0 and then get forced to -1.
>>>
>>> OK
>>
>> Issue 2: Causing boot to stop when serial driver is initialized. (After Issue 1 is fixed)
>>
>> I could narrow this down to the change done to return -EINVAL instead of 0 in serial_omap_get_context_loss_count()
>> as part of commit 'a630fbfbb1beeffc5bbe542a7986bf2068874633' "serial: omap: Fix device tree based PM runtime"
>>
>> What this change in turn seems to do is cause a serial_omap_restore_context() to get called as part of
>> serial_omap_runtime_resume() which was not the case when serial_omap_get_context_loss_count() returned 0
>>
>> from serial_omap_runtime_resume():
>> -----
>>         int loss_cnt = serial_omap_get_context_loss_count(up);
>>
>>         if (loss_cnt < 0) {
>>                 dev_dbg(dev, "serial_omap_get_context_loss_count failed : %d\n",
>>                         loss_cnt);
>>                 serial_omap_restore_context(up);
>>         } else if (up->context_loss_cnt != loss_cnt) {
>>                 serial_omap_restore_context(up);
>>         }
>> -----
>>
>> I am still working on why a serial_omap_restore_context() could have caused the console to die. I will work with
>> Sourav on this and post the fixes for both issue 1 and issue 2 once it's clear what's really causing issue 2.
> 
> That's because we don't have the omap specific pdata callbacks for
> context loss any longer. We may be able to detect when the context
> was really lost in the serial driver, and only then call the
> serial_omap_restore_context().

Right, but calling serial_omap_restore_context() when the context has not actually been lost
should ideally be harmless, so that alone does not explain the hang.
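
To make the behaviour change concrete, here is a minimal user-space sketch of
the branch quoted above. It is not driver code: mock_port, mock_restore_context()
and mock_runtime_resume() are hypothetical stand-ins, and the restore is only
counted, not performed. It just shows that a negative loss count (such as
-EINVAL) always takes the restore path, while a return of 0 with an unchanged
saved count does not.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's port struct; only what the sketch needs. */
struct mock_port {
	int context_loss_cnt;	/* count saved at the last runtime suspend */
	int restore_calls;	/* how many times the restore path ran */
};

/* Stand-in for serial_omap_restore_context(): only counts invocations. */
static void mock_restore_context(struct mock_port *up)
{
	up->restore_calls++;
}

/*
 * Mirrors the branch structure quoted from serial_omap_runtime_resume().
 * loss_cnt plays the role of serial_omap_get_context_loss_count()'s result.
 * Returns true if the restore path was taken on this call.
 */
static bool mock_runtime_resume(struct mock_port *up, int loss_cnt)
{
	int before = up->restore_calls;

	if (loss_cnt < 0)
		mock_restore_context(up);	/* error value: restore unconditionally */
	else if (up->context_loss_cnt != loss_cnt)
		mock_restore_context(up);	/* count changed: context was lost */

	return up->restore_calls != before;
}
```

With a zero-initialised port, mock_runtime_resume(&up, 0) skips the restore
(the old behaviour), whereas mock_runtime_resume(&up, -EINVAL) takes it on
every resume (the behaviour after the commit).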

>  
>> Let me know if the fix I listed for Issue 1 makes sense.
> 
> Yes makes sense as a fix, but IMHO we should not need any workarounds
> like that. Is the hwmod code idling the uarts early? If so, then
> it should only do that in a late_initcall if no drivers are registered.

As part of its (early) setup, hwmod enables, resets and then idles all modules.
These flags tell hwmod to skip the reset and the idle and leave the module
enabled (in this case, the console UART).
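
A rough sketch of that sequence, under stated assumptions: the SKETCH_* names,
struct sketch_module and sketch_setup() are made up for illustration, and the
bit values are arbitrary (the real flag definitions live in omap_hwmod.h). It
only models the enable, optional reset, optional idle ordering described above,
not the actual hwmod code.

```c
#include <stdbool.h>

/* Illustrative bit values only; not the real HWMOD_INIT_* definitions. */
#define SKETCH_INIT_NO_RESET	(1u << 0)
#define SKETCH_INIT_NO_IDLE	(1u << 1)

enum sketch_state { SKETCH_ENABLED, SKETCH_IDLE };

/* Hypothetical, stripped-down view of a module as seen during early setup. */
struct sketch_module {
	unsigned int flags;
	bool was_reset;
	enum sketch_state state;
};

/* Models the early setup pass: enable, then reset and idle unless told not to. */
static void sketch_setup(struct sketch_module *m)
{
	m->state = SKETCH_ENABLED;		/* every module is enabled first */
	m->was_reset = false;

	if (!(m->flags & SKETCH_INIT_NO_RESET))
		m->was_reset = true;		/* default: reset the module */

	if (!(m->flags & SKETCH_INIT_NO_IDLE))
		m->state = SKETCH_IDLE;		/* default: leave it idled */
}
```

A module with no flags ends up reset and idle, which is what now happens to
the console UART in the DT case; with both flags set it stays enabled and
untouched, so earlyprintk output keeps working.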

regards
Rajendra

> 
> Regards,
> 
> Tony
> 

