* Re: Enable arm_global_timer for Zynq breaks boot
       [not found]           ` <20130722201348.GI453@xsjandreislx>
@ 2013-07-22 22:41               ` Sören Brinkmann
  0 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-22 22:41 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: John Stultz, Thomas Gleixner, Daniel Lezcano, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

Hi,

adding LKML and LAKML (which I forgot on the original email, sorry)

On Mon, Jul 22, 2013 at 01:13:48PM -0700, Sören Brinkmann wrote:
> On Mon, Jul 22, 2013 at 12:54:26PM -0700, Stephen Boyd wrote:
> > On 07/22/13 11:25, Sören Brinkmann wrote:
> > > On Wed, Jul 17, 2013 at 06:22:35PM -0700, Stephen Boyd wrote:
> > >> On 07/17/13 17:59, Sören Brinkmann wrote:
> > >>> Hi Stephen,
> > >>>
> > >>> On Wed, Jul 17, 2013 at 04:50:34PM -0700, Stephen Boyd wrote:
> > >>>> On 07/17/13 14:04, Sören Brinkmann wrote:
> > >>>>> Hi all,
> > >>>>>
> > >>>>> I'm trying to enable the arm_global_timer on Zynq platforms with the
> > >>>>> attached patch. Unfortunately that patch breaks booting up. It hangs
> > >>>>> when handing over to init/early userspace (see attached boot.log).
> > >>>>>
> > >>>>> The funny thing is, if I remove either the global timer or the
> > >>>>> arm,cortex-a9-twd-timer node from my dts, it works. So, it looks like
> > >>>>> the two timers (drivers) interfere somehow. Does anybody have an idea of
> > >>>>> what is going on and maybe even how to resolve it?
> > >>>>>
> > >>>>> The patch is based on commit c0d15cc in Linus' tree.
> > >>>> If you boot with one CPU does it hang? It looks like secondary CPUs
> > >>>> aren't getting interrupts but I'm not sure why. Maybe you can try this
> > >>>> patch and/or put some prints in the timer interrupt handler to see if
> > >>>> interrupts are firing on secondary CPUs.
> > >>> Your proposed patch does not seem to make a difference, but adding
> > >>> 'maxcpus=1' to the kernel command line makes the system boot.
> > >> Hmm I guess that doesn't really confirm much because TWD doesn't
> > >> register itself on maxcpus=1 systems, so it's the same as removing the
> > >> node from DT. Probably easier to put a printk in the interrupt handler
> > >> and confirm that you're receiving interrupts on secondary CPUs.
> > > Turns out it does work when I disable Zynq's cpuidle driver. I think I
> > > can blame that driver.
> > >
> > 
> > Hmm.. Perhaps the arm_global_timer driver also needs FEAT_C3_STOP added
> > to it. Do you know if that timer is reset during low power modes?
> 
> Our cpuidle driver is not powering off anything, AFAIK. I think it just
> executes 'wfi' on the CPU. I don't know how the timer core handles it,
> but I'd expect the CPU to find the timer just as it was left before
> entering idle (well, the timer continues to run I assume, but other than
> that).
> I'll do some debugging and see if I can find out what exactly causes the
> hang.

So, what I found:
The Zynq cpuidle driver provides two idle states, which are both
basically just ARM wfi states. But the second one sets these flags:
	.flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TIMER_STOP,

I don't know in detail what these flags do. But the
CPUIDLE_FLAG_TIMER_STOP seemed suspicious, since wfi does not have any
effect on the timer. So, I removed that one and things are working.
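
As a rough sketch (not the actual driver source; the state names,
latencies and the enter helper below are assumptions), the two states
described above look something like this:

	/* hypothetical rendering of the Zynq idle state table; only the
	 * .flags values reflect what is described above */
	static struct cpuidle_state zynq_idle_states[] = {
		{	/* state 1: plain WFI */
			.name			= "WFI",
			.enter			= zynq_enter_wfi,
			.exit_latency		= 1,
			.target_residency	= 1,
			.flags			= CPUIDLE_FLAG_TIME_VALID,
		},
		{	/* state 2: also just WFI, but flagged as if the
			 * local timer stopped -- the suspicious part */
			.name			= "RAM-SR",
			.enter			= zynq_enter_wfi,
			.exit_latency		= 100,
			.target_residency	= 10000,
			.flags			= CPUIDLE_FLAG_TIME_VALID |
						  CPUIDLE_FLAG_TIMER_STOP,
		},
	};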

I also tried the other approach: Leaving cpuidle as is and adding the
C3STOP flag to the global timer. That solves it too.
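
The clockevent side is a one-line change (a sketch only; the variable
name and surrounding init code are assumed, the feature flag is the
standard clockevents one):

	/* in the arm_global_timer clockevent setup: also advertise that
	 * the timer may stop in deep C-states */
	clk->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
			CLOCK_EVT_FEAT_C3STOP;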

Does anybody know what the correct solution is?
In case the C3STOP flag is considered to be correct for the timer, I
could prepare a patch for that and bundle it with the one to enable the
timer for Zynq?

@Michal: What do we do about the cpuidle driver? If you ask me, we
should rip out the second state completely. We have it as a disabled
placeholder in our vendor tree, and having it enabled in mainline seems
to break things.


	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq breaks boot
  2013-07-22 22:41               ` Sören Brinkmann
@ 2013-07-29 12:51                 ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-07-29 12:51 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On 07/23/2013 12:41 AM, Sören Brinkmann wrote:
> Hi,
> 
> adding LKML and LAKML (which I forgot on the original email, sorry)
> 
> On Mon, Jul 22, 2013 at 01:13:48PM -0700, Sören Brinkmann wrote:
>> On Mon, Jul 22, 2013 at 12:54:26PM -0700, Stephen Boyd wrote:
>>> On 07/22/13 11:25, Sören Brinkmann wrote:
>>>> On Wed, Jul 17, 2013 at 06:22:35PM -0700, Stephen Boyd wrote:
>>>>> On 07/17/13 17:59, Sören Brinkmann wrote:
>>>>>> Hi Stephen,
>>>>>>
>>>>>> On Wed, Jul 17, 2013 at 04:50:34PM -0700, Stephen Boyd wrote:
>>>>>>> On 07/17/13 14:04, Sören Brinkmann wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I'm trying to enable the arm_global_timer on Zynq platforms with the
>>>>>>>> attached patch. Unfortunately that patch breaks booting up. It hangs
>>>>>>>> when handing over to init/early userspace (see attached boot.log).
>>>>>>>>
>>>>>>>> The funny thing is, if I remove either the global timer or the
>>>>>>>> arm,cortex-a9-twd-timer node from my dts, it works. So, it looks like
>>>>>>>> the two timers (drivers) interfere somehow. Does anybody have an idea of
>>>>>>>> what is going on and maybe even how to resolve it?
>>>>>>>>
>>>>>>>> The patch is based on commit c0d15cc in Linus' tree.
>>>>>>> If you boot with one CPU does it hang? It looks like secondary CPUs
>>>>>>> aren't getting interrupts but I'm not sure why. Maybe you can try this
>>>>>>> patch and/or put some prints in the timer interrupt handler to see if
>>>>>>> interrupts are firing on secondary CPUs.
>>>>>> Your proposed patch does not seem to make a difference, but adding
>>>>>> 'maxcpus=1' to the kernel command line makes the system boot.
>>>>> Hmm I guess that doesn't really confirm much because TWD doesn't
>>>>> register itself on maxcpus=1 systems, so it's the same as removing the
>>>>> node from DT. Probably easier to put a printk in the interrupt handler
>>>>> and confirm that you're receiving interrupts on secondary CPUs.
>>>> Turns out it does work when I disable Zynq's cpuidle driver. I think I
>>>> can blame that driver.
>>>>
>>>
>>> Hmm.. Perhaps the arm_global_timer driver also needs FEAT_C3_STOP added
>>> to it. Do you know if that timer is reset during low power modes?
>>
>> Our cpuidle driver is not powering off anything, AFAIK. I think it just
>> executes 'wfi' on the CPU. I don't know how the timer core handles it,
>> but I'd expect the CPU to find the timer just as it was left before
>> entering idle (well, the timer continues to run I assume, but other than
>> that).
>> I'll do some debugging and see if I can find out what exactly causes the
>> hang.
> 
> So, what I found:
> The Zynq cpuidle driver provides two idle states, which are both
> basically just ARM wfi states. But the second one sets these flags:
> 	.flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TIMER_STOP,
> 
> I don't know in detail what these flags do. But the
> CPUIDLE_FLAG_TIMER_STOP seemed suspicious, since wfi does not have any
> effect on the timer. So, I removed that one and things are working.
> 
> I also tried the other approach: Leaving cpuidle as is and adding the
> C3STOP flag to the global timer. That solves it too.
> 
> Does anybody know what the correct solution is?
> In case the C3STOP flag is considered to be correct for the timer, I
> could prepare a patch for that and bundle it with the one to enable the
> timer for Zynq?

Hi Soren,

the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the
local timer will be stopped when entering the idle state. In this case,
the cpuidle framework calls clockevents_notify(ENTER) and switches to a
broadcast timer, and calls clockevents_notify(EXIT) when exiting the
idle state, switching the local timer back into use.
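
In code terms, the sequence is roughly the following (a simplified
sketch of what the cpuidle core does, not the exact code):

	int cpu = smp_processor_id();

	/* the idle state has CPUIDLE_FLAG_TIMER_STOP set: hand the tick
	 * over to the broadcast device before entering */
	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
	/* ... the state's ->enter() callback runs, e.g. executing wfi ... */
	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);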

The C3STOP flag has similar semantics to CPUIDLE_FLAG_TIMER_STOP, that
is, the timer can be shut down in a specific idle state. This flag is
used by the tick broadcast code.

If the C3STOP flag is not set for a local timer, the
CPUIDLE_FLAG_TIMER_STOP does not make sense because it will be ignored
by the tick-broadcast code.
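
(Roughly, the tick-broadcast oneshot ENTER/EXIT path bails out early for
such timers; a simplified paraphrase, not a verbatim quote:)

	/* kernel/time/tick-broadcast.c, simplified: notifications are
	 * ignored for local timers that do not stop in C3 */
	struct clock_event_device *dev = td->evtdev; /* this CPU's tick device */

	if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
		return;	/* nothing to do: the timer keeps ticking */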

If the local timer can be shut down at idle time, you *must* specify
this flag.

If the idle state shuts down the CPU along with its local timer, you
*must* specify the CPUIDLE_FLAG_TIMER_STOP flag for this specific state.

At first glance, idle state #2 is meant to put the DDR into self-refresh
and switch to WFI, so there is no power gating and the local timer never
stops. CPUIDLE_FLAG_TIMER_STOP shouldn't be used here.

IIUC, the global timer does not belong to the CPU or cluster power
domains, so it can't be shut down: C3STOP shouldn't be used.

I hope that helps.

  -- Daniel


> @Michal: What do we do about the cpuidle driver? If you ask me, we
> should rip out the second state completely. We have it as a disabled
> placeholder in our vendor tree, and having it enabled in mainline seems
> to break things.
> 
> 
> 	Sören
> 
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq breaks boot
  2013-07-29 12:51                 ` Daniel Lezcano
@ 2013-07-29 15:58                   ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-29 15:58 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

Hi Daniel,

On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> On 07/23/2013 12:41 AM, Sören Brinkmann wrote:
> > Hi,
> > 
> > adding LKML and LAKML (which I forgot on the original email, sorry)
> > 
> > On Mon, Jul 22, 2013 at 01:13:48PM -0700, Sören Brinkmann wrote:
> >> On Mon, Jul 22, 2013 at 12:54:26PM -0700, Stephen Boyd wrote:
> >>> On 07/22/13 11:25, Sören Brinkmann wrote:
> >>>> On Wed, Jul 17, 2013 at 06:22:35PM -0700, Stephen Boyd wrote:
> >>>>> On 07/17/13 17:59, Sören Brinkmann wrote:
> >>>>>> Hi Stephen,
> >>>>>>
> >>>>>> On Wed, Jul 17, 2013 at 04:50:34PM -0700, Stephen Boyd wrote:
> >>>>>>> On 07/17/13 14:04, Sören Brinkmann wrote:
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> I'm trying to enable the arm_global_timer on Zynq platforms with the
> >>>>>>>> attached patch. Unfortunately that patch breaks booting up. It hangs
> >>>>>>>> when handing over to init/early userspace (see attached boot.log).
> >>>>>>>>
> >>>>>>>> The funny thing is, if I remove either the global timer or the
> >>>>>>>> arm,cortex-a9-twd-timer node from my dts, it works. So, it looks like
> >>>>>>>> the two timers (drivers) interfere somehow. Does anybody have an idea of
> >>>>>>>> what is going on and maybe even how to resolve it?
> >>>>>>>>
> >>>>>>>> The patch is based on commit c0d15cc in Linus' tree.
> >>>>>>> If you boot with one CPU does it hang? It looks like secondary CPUs
> >>>>>>> aren't getting interrupts but I'm not sure why. Maybe you can try this
> >>>>>>> patch and/or put some prints in the timer interrupt handler to see if
> >>>>>>> interrupts are firing on secondary CPUs.
> >>>>>> Your proposed patch does not seem to make a difference, but adding
> >>>>>> 'maxcpus=1' to the kernel command line makes the system boot.
> >>>>> Hmm I guess that doesn't really confirm much because TWD doesn't
> >>>>> register itself on maxcpus=1 systems, so it's the same as removing the
> >>>>> node from DT. Probably easier to put a printk in the interrupt handler
> >>>>> and confirm that you're receiving interrupts on secondary CPUs.
> >>>> Turns out it does work when I disable Zynq's cpuidle driver. I think I
> >>>> can blame that driver.
> >>>>
> >>>
> >>> Hmm.. Perhaps the arm_global_timer driver also needs FEAT_C3_STOP added
> >>> to it. Do you know if that timer is reset during low power modes?
> >>
> >> Our cpuidle driver is not powering off anything, AFAIK. I think it just
> >> executes 'wfi' on the CPU. I don't know how the timer core handles it,
> >> but I'd expect the CPU to find the timer just as it was left before
> >> entering idle (well, the timer continues to run I assume, but other than
> >> that).
> >> I'll do some debugging and see if I can find out what exactly causes the
> >> hang.
> > 
> > So, what I found:
> > The Zynq cpuidle driver provides two idle states, which are both
> > basically just ARM wfi states. But the second one sets these flags:
> > 	.flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TIMER_STOP,
> > 
> > I don't know in detail what these flags do. But the
> > CPUIDLE_FLAG_TIMER_STOP seemed suspicious, since wfi does not have any
> > effect on the timer. So, I removed that one and things are working.
> > 
> > I also tried the other approach: Leaving cpuidle as is and adding the
> > C3STOP flag to the global timer. That solves it too.
> > 
> > Does anybody know what the correct solution is?
> > In case the C3STOP flag is considered to be correct for the timer, I
> > could prepare a patch for that and bundle it with the one to enable the
> > timer for Zynq?
> 
> Hi Soren,
> 
> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the
> local timer will be stopped when entering the idle state. In this case,
> the cpuidle framework calls clockevents_notify(ENTER) and switches to a
> broadcast timer, and calls clockevents_notify(EXIT) when exiting the
> idle state, switching the local timer back into use.
> 
> The C3STOP flag has similar semantics to CPUIDLE_FLAG_TIMER_STOP, that
> is, the timer can be shut down in a specific idle state. This flag is
> used by the tick broadcast code.
> 
> If the C3STOP flag is not set for a local timer, the
> CPUIDLE_FLAG_TIMER_STOP does not make sense because it will be ignored
> by the tick-broadcast code.
> 
> If the local timer can be shut down at idle time, you *must* specify
> this flag.
> 
> If the idle state shuts down the CPU along with its local timer, you
> *must* specify the CPUIDLE_FLAG_TIMER_STOP flag for this specific state.
> 
> At first glance, idle state #2 is meant to put the DDR into self-refresh
> and switch to WFI, so there is no power gating and the local timer never
> stops. CPUIDLE_FLAG_TIMER_STOP shouldn't be used here.
> 
> IIUC, the global timer does not belong to the CPU or cluster power
> domains, so it can't be shut down: C3STOP shouldn't be used.
> 
> I hope that helps.

Thanks for the explanation. I have to discuss with Michal what to do about the
cpuidle driver and then we should be good to use the global timer.

	Thanks,
	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq breaks boot
  2013-07-29 12:51                 ` Daniel Lezcano
@ 2013-07-30  0:03                   ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-30  0:03 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

Hi Daniel,

On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
(snip)
> 
> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the
> local timer will be stopped when entering the idle state. In this case,
> the cpuidle framework calls clockevents_notify(ENTER) and switches to a
> broadcast timer, and calls clockevents_notify(EXIT) when exiting the
> idle state, switching the local timer back into use.

I've been thinking about this, trying to understand how this makes my
boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
would make the timer core switch to a broadcast device even though it
wouldn't be necessary. But shouldn't it still work? It sounds like we do
something useless, but nothing wrong in the sense that it should result
in breakage. I guess I'm missing something obvious. This timer system
will always remain a mystery to me.

Actually this more or less leads to the question: What is this
'broadcast timer'? I guess that is some clockevent device which is
common to all cores (that would be the cadence_ttc for Zynq)? Is the
hang pointing to some issue with that driver?
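
(Sketching that guess in code, with the device name and rate assumed: a
broadcast candidate is an ordinary clockevent device that is not per-CPU
and does not carry CLOCK_EVT_FEAT_C3STOP, registered like any other:)

	/* hypothetical registration of a system-wide clockevent such as
	 * the cadence_ttc; without C3STOP it keeps running while the
	 * CPUs idle, so the tick code can use it for broadcasting */
	ttc_ce.features	= CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT;
	ttc_ce.cpumask	= cpu_possible_mask;
	clockevents_config_and_register(&ttc_ce, rate, 1, 0xfffe);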

	Thanks,
	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq breaks boot
  2013-07-30  0:03                   ` Sören Brinkmann
@ 2013-07-30  8:47                     ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-07-30  8:47 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> Hi Daniel,
> 
> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> (snip)
>>
>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the
>> local timer will be stopped when entering the idle state. In this case,
>> the cpuidle framework calls clockevents_notify(ENTER) and switches to a
>> broadcast timer, and calls clockevents_notify(EXIT) when exiting the
>> idle state, switching the local timer back into use.
> 
> I've been thinking about this, trying to understand how this makes my
> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> would make the timer core switch to a broadcast device even though it
> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> something useless, but nothing wrong in the sense that it should result
> in breakage. I guess I'm missing something obvious. This timer system
> will always remain a mystery to me.
> 
> Actually this more or less leads to the question: What is this
> 'broadcast timer'? I guess that is some clockevent device which is
> common to all cores (that would be the cadence_ttc for Zynq)? Is the
> hang pointing to some issue with that driver?

If you look at /proc/timer_list, which timer is used for broadcasting?
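
(A quick self-contained way to check, assuming the /proc/timer_list
layout shown later in this thread:)

	#include <stdio.h>
	#include <string.h>

	/* print the "Broadcast device" section of /proc/timer_list */
	int main(void)
	{
		FILE *f = fopen("/proc/timer_list", "r");
		char line[256];
		int in_bcast = 0;

		if (!f)
			return 1;
		while (fgets(line, sizeof(line), f)) {
			if (strstr(line, "Broadcast device"))
				in_bcast = 1;
			if (in_bcast) {
				fputs(line, stdout);
				/* the event_handler line ends the part
				 * we care about */
				if (strstr(line, "event_handler"))
					break;
			}
		}
		fclose(f);
		return 0;
	}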

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq breaks boot
  2013-07-30  8:47                     ` Daniel Lezcano
@ 2013-07-30 22:14                       ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-30 22:14 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 2010 bytes --]

On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> > Hi Daniel,
> > 
> > On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> > (snip)
> >>
> >> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the
> >> local timer will be stopped when entering the idle state. In this case,
> >> the cpuidle framework calls clockevents_notify(ENTER) and switches to a
> >> broadcast timer, and calls clockevents_notify(EXIT) when exiting the
> >> idle state, switching the local timer back into use.
> > 
> > I've been thinking about this, trying to understand how this makes my
> > boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> > would make the timer core switch to a broadcast device even though it
> > wouldn't be necessary. But shouldn't it still work? It sounds like we do
> > something useless, but nothing wrong in the sense that it should result
> > in breakage. I guess I'm missing something obvious. This timer system
> > will always remain a mystery to me.
> > 
> > Actually this more or less leads to the question: What is this
> > 'broadcast timer'? I guess that is some clockevent device which is
> > common to all cores (that would be the cadence_ttc for Zynq)? Is the
> > hang pointing to some issue with that driver?
> 
> If you look at /proc/timer_list, which timer is used for broadcasting?

For both a vanilla kernel and a kernel with my patches for enabling the
global timer (I removed the wrongly set flag from the C2 state and added
the DT fragment to use the GT), this is what I see (full output from
timer_list attached):
	Tick Device: mode:     1
	Broadcast device
	Clock Event Device: xttcps_clockevent

And the local timer seems to be the arm twd timer (also in both cases).

I don't think I can gather this information for the actual broken case,
but AFAIK, there shouldn't be any other timer capable of this for
Zynq.

	Sören


[-- Attachment #2: timer_list --]
[-- Type: text/plain, Size: 4104 bytes --]

Timer List Version: v0.7
HRTIMER_MAX_CLOCK_BASES: 3
now at 76779317814 nsecs

cpu: 0
 clock 0:
  .base:       c14a0c90
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <c14a2b50>, menu_hrtimer_notify, S:01, hrtimer_start, swapper/0/0
 # expires at 76779328789-76779328789 nsecs [in 10975 to 10975 nsecs]
 #1: <c14a0dd8>, tick_sched_timer, S:01, hrtimer_start, swapper/0/0
 # expires at 76800000000-76800000000 nsecs [in 20682186 to 20682186 nsecs]
 #2: <ed956e40>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
 # expires at 93249172026-93249172026 nsecs [in 16469854212 to 16469854212 nsecs]
 #3: <ed956c00>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
 # expires at 900000055806-900000055806 nsecs [in 823220737992 to 823220737992 nsecs]
 clock 1:
  .base:       c14a0cc8
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     975789465870609368 nsecs
active timers:
 clock 2:
  .base:       c14a0d00
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
  .expires_next   : 76779492678 nsecs
  .hres_active    : 1
  .nr_events      : 2449
  .nr_retries     : 66
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 2
  .last_tick      : 76780000000 nsecs
  .tick_stopped   : 1
  .idle_jiffies   : 4294944973
  .idle_calls     : 3114
  .idle_sleeps    : 981
  .idle_entrytime : 76779354678 nsecs
  .idle_waketime  : 76779336246 nsecs
  .idle_exittime  : 76769862116 nsecs
  .idle_sleeptime : 70735903990 nsecs
  .iowait_sleeptime: 18432 nsecs
  .last_jiffies   : 4294944973
  .next_jiffies   : 4294944976
  .idle_expires   : 76800000000 nsecs
jiffies: 4294944973

cpu: 1
 clock 0:
  .base:       c14a9c90
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <c14a9dd8>, tick_sched_timer, S:01, hrtimer_start_range_ns, swapper/1/0
 # expires at 76780000000-76780000000 nsecs [in 682186 to 682186 nsecs]
 clock 1:
  .base:       c14a9cc8
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     975789465870609368 nsecs
active timers:
 clock 2:
  .base:       c14a9d00
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
  .expires_next   : 76780000000 nsecs
  .hres_active    : 1
  .nr_events      : 2257
  .nr_retries     : 61
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 2
  .last_tick      : 76700000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4294944965
  .idle_calls     : 2998
  .idle_sleeps    : 444
  .idle_entrytime : 76771944950 nsecs
  .idle_waketime  : 76768498136 nsecs
  .idle_exittime  : 76768498136 nsecs
  .idle_sleeptime : 72124544054 nsecs
  .iowait_sleeptime: 4276262 nsecs
  .last_jiffies   : 4294944973
  .next_jiffies   : 4294944974
  .idle_expires   : 78880000000 nsecs
jiffies: 4294944973


Tick Device: mode:     1
Broadcast device
Clock Event Device: xttcps_clockevent
 max_delta_ns:   1207932479
 min_delta_ns:   18432
 mult:           233015
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: xttcps_set_next_event
 set_mode:       xttcps_set_mode
 event_handler:  tick_handle_oneshot_broadcast
 retries:        0
tick_broadcast_mask: 00000000
tick_broadcast_oneshot_mask: 00000000


Tick Device: mode:     1
Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     76800000000 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0

Tick Device: mode:     1
Per CPU device: 1
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     76780000000 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq breaks boot
  2013-07-30 22:14                       ` Sören Brinkmann
@ 2013-07-30 22:23                         ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-30 22:23 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

Forget this. It's trash. I had to migrate some stuff around due to some
quota issues and forgot to update a few scripts. Sorry, I'll rerun this
test.

	Sören

On Tue, Jul 30, 2013 at 03:14:43PM -0700, Sören Brinkmann wrote:
> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> > On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> > > Hi Daniel,
> > > 
> > > On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> > > (snip)
> > >>
> > >> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the
> > >> local timer will be stopped when entering the idle state. In this case,
> > >> the cpuidle framework calls clockevents_notify(ENTER) and switches to a
> > >> broadcast timer, and calls clockevents_notify(EXIT) when exiting the
> > >> idle state, switching the local timer back into use.
> > > 
> > > I've been thinking about this, trying to understand how this makes my
> > > boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> > > would make the timer core switch to a broadcast device even though it
> > > wouldn't be necessary. But shouldn't it still work? It sounds like we do
> > > something useless, but nothing wrong in the sense that it should result
> > > in breakage. I guess I'm missing something obvious. This timer system
> > > will always remain a mystery to me.
> > > 
> > > Actually this more or less leads to the question: What is this
> > > 'broadcast timer'? I guess that is some clockevent device which is
> > > common to all cores (that would be the cadence_ttc for Zynq)? Is the
> > > hang pointing to some issue with that driver?
> > 
> > If you look at /proc/timer_list, which timer is used for broadcasting?
> 
> For both a vanilla kernel and a kernel with my patches for enabling the
> global timer (I removed the wrongly set flag from the C2 state and added
> the DT fragment to use the GT), this is what I see (full output from
> timer_list attached):
> 	Tick Device: mode:     1
> 	Broadcast device
> 	Clock Event Device: xttcps_clockevent
> 
> And the local timer seems to be the arm twd timer (also in both cases).
> 
> I don't think I can gather this information for the actual broken case,
> but AFAIK, there shouldn't be any other timer capable of this for
> Zynq.
> 
> 	Sören
> 

> Timer List Version: v0.7
> HRTIMER_MAX_CLOCK_BASES: 3
> now at 76779317814 nsecs
> 
> cpu: 0
>  clock 0:
>   .base:       c14a0c90
>   .index:      0
>   .resolution: 1 nsecs
>   .get_time:   ktime_get
>   .offset:     0 nsecs
> active timers:
>  #0: <c14a2b50>, menu_hrtimer_notify, S:01, hrtimer_start, swapper/0/0
>  # expires at 76779328789-76779328789 nsecs [in 10975 to 10975 nsecs]
>  #1: <c14a0dd8>, tick_sched_timer, S:01, hrtimer_start, swapper/0/0
>  # expires at 76800000000-76800000000 nsecs [in 20682186 to 20682186 nsecs]
>  #2: <ed956e40>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
>  # expires at 93249172026-93249172026 nsecs [in 16469854212 to 16469854212 nsecs]
>  #3: <ed956c00>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
>  # expires at 900000055806-900000055806 nsecs [in 823220737992 to 823220737992 nsecs]
>  clock 1:
>   .base:       c14a0cc8
>   .index:      1
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_real
>   .offset:     975789465870609368 nsecs
> active timers:
>  clock 2:
>   .base:       c14a0d00
>   .index:      2
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_boottime
>   .offset:     0 nsecs
> active timers:
>   .expires_next   : 76779492678 nsecs
>   .hres_active    : 1
>   .nr_events      : 2449
>   .nr_retries     : 66
>   .nr_hangs       : 0
>   .max_hang_time  : 0 nsecs
>   .nohz_mode      : 2
>   .last_tick      : 76780000000 nsecs
>   .tick_stopped   : 1
>   .idle_jiffies   : 4294944973
>   .idle_calls     : 3114
>   .idle_sleeps    : 981
>   .idle_entrytime : 76779354678 nsecs
>   .idle_waketime  : 76779336246 nsecs
>   .idle_exittime  : 76769862116 nsecs
>   .idle_sleeptime : 70735903990 nsecs
>   .iowait_sleeptime: 18432 nsecs
>   .last_jiffies   : 4294944973
>   .next_jiffies   : 4294944976
>   .idle_expires   : 76800000000 nsecs
> jiffies: 4294944973
> 
> cpu: 1
>  clock 0:
>   .base:       c14a9c90
>   .index:      0
>   .resolution: 1 nsecs
>   .get_time:   ktime_get
>   .offset:     0 nsecs
> active timers:
>  #0: <c14a9dd8>, tick_sched_timer, S:01, hrtimer_start_range_ns, swapper/1/0
>  # expires at 76780000000-76780000000 nsecs [in 682186 to 682186 nsecs]
>  clock 1:
>   .base:       c14a9cc8
>   .index:      1
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_real
>   .offset:     975789465870609368 nsecs
> active timers:
>  clock 2:
>   .base:       c14a9d00
>   .index:      2
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_boottime
>   .offset:     0 nsecs
> active timers:
>   .expires_next   : 76780000000 nsecs
>   .hres_active    : 1
>   .nr_events      : 2257
>   .nr_retries     : 61
>   .nr_hangs       : 0
>   .max_hang_time  : 0 nsecs
>   .nohz_mode      : 2
>   .last_tick      : 76700000000 nsecs
>   .tick_stopped   : 0
>   .idle_jiffies   : 4294944965
>   .idle_calls     : 2998
>   .idle_sleeps    : 444
>   .idle_entrytime : 76771944950 nsecs
>   .idle_waketime  : 76768498136 nsecs
>   .idle_exittime  : 76768498136 nsecs
>   .idle_sleeptime : 72124544054 nsecs
>   .iowait_sleeptime: 4276262 nsecs
>   .last_jiffies   : 4294944973
>   .next_jiffies   : 4294944974
>   .idle_expires   : 78880000000 nsecs
> jiffies: 4294944973
> 
> 
> Tick Device: mode:     1
> Broadcast device
> Clock Event Device: xttcps_clockevent
>  max_delta_ns:   1207932479
>  min_delta_ns:   18432
>  mult:           233015
>  shift:          32
>  mode:           1
>  next_event:     9223372036854775807 nsecs
>  set_next_event: xttcps_set_next_event
>  set_mode:       xttcps_set_mode
>  event_handler:  tick_handle_oneshot_broadcast
>  retries:        0
> tick_broadcast_mask: 00000000
> tick_broadcast_oneshot_mask: 00000000
> 
> 
> Tick Device: mode:     1
> Per CPU device: 0
> Clock Event Device: local_timer
>  max_delta_ns:   12884902005
>  min_delta_ns:   1000
>  mult:           715827876
>  shift:          31
>  mode:           3
>  next_event:     76800000000 nsecs
>  set_next_event: twd_set_next_event
>  set_mode:       twd_set_mode
>  event_handler:  hrtimer_interrupt
>  retries:        0
> 
> Tick Device: mode:     1
> Per CPU device: 1
> Clock Event Device: local_timer
>  max_delta_ns:   12884902005
>  min_delta_ns:   1000
>  mult:           715827876
>  shift:          31
>  mode:           3
>  next_event:     76780000000 nsecs
>  set_next_event: twd_set_next_event
>  set_mode:       twd_set_mode
>  event_handler:  hrtimer_interrupt
>  retries:        0
> 



* Enable arm_global_timer for Zynq brakes boot
@ 2013-07-30 22:23                         ` Sören Brinkmann
  0 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-30 22:23 UTC (permalink / raw)
  To: linux-arm-kernel

Forget this. It's trash. I had to migrate some stuff around due to some
quota issues and forgot to update a few scripts. Sorry, I'll rerun this
test.

	Sören

On Tue, Jul 30, 2013 at 03:14:43PM -0700, Sören Brinkmann wrote:
> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> > On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> > > Hi Daniel,
> > > 
> > > On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> > > (snip)
> > >>
> > >> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the local
> > >> timer will be stopped when entering the idle state. In this case, the
> > >> cpuidle framework will call clockevents_notify(ENTER) and switch to a
> > >> broadcast timer, and will call clockevents_notify(EXIT) when exiting the
> > >> idle state, switching the local timer back into use.
> > >
> > > I've been thinking about this, trying to understand how this makes my
> > > boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> > > would make the timer core switch to a broadcast device even though it
> > > wouldn't be necessary. But shouldn't it still work? It sounds like we do
> > > something useless, but nothing wrong in the sense that it should result in
> > > breakage. I guess I'm missing something obvious. This timer system will
> > > always remain a mystery to me.
> > >
> > > Actually this more or less leads to the question: What is this
> > > 'broadcast timer'? I guess it is some clockevent device which is
> > > common to all cores? (that would be the cadence_ttc for Zynq). Is the
> > > hang pointing to some issue with that driver?
> > 
> > If you look at the /proc/timer_list, which timer is used for broadcasting ?
> 
> In case of a vanilla kernel and with my patches for enabling the global
> timer (I removed the wrongly set flag from the C2 state and added the DT
> fragment to use the GT), this is what I see (full output from timer_list
> attached):
> 	Tick Device: mode:     1
> 	Broadcast device
> 	Clock Event Device: xttcps_clockevent
> 
> And the local timer seems to be the arm twd timer (also in both cases).
> 
> I don't think I can gather this information for the actual broken case,
> but AFAIK, there shouldn't be any other timer capable of this for
> Zynq.
> 
> 	Sören
> 

> Timer List Version: v0.7
> HRTIMER_MAX_CLOCK_BASES: 3
> now at 76779317814 nsecs
> 
> cpu: 0
>  clock 0:
>   .base:       c14a0c90
>   .index:      0
>   .resolution: 1 nsecs
>   .get_time:   ktime_get
>   .offset:     0 nsecs
> active timers:
>  #0: <c14a2b50>, menu_hrtimer_notify, S:01, hrtimer_start, swapper/0/0
>  # expires at 76779328789-76779328789 nsecs [in 10975 to 10975 nsecs]
>  #1: <c14a0dd8>, tick_sched_timer, S:01, hrtimer_start, swapper/0/0
>  # expires at 76800000000-76800000000 nsecs [in 20682186 to 20682186 nsecs]
>  #2: <ed956e40>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
>  # expires at 93249172026-93249172026 nsecs [in 16469854212 to 16469854212 nsecs]
>  #3: <ed956c00>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
>  # expires at 900000055806-900000055806 nsecs [in 823220737992 to 823220737992 nsecs]
>  clock 1:
>   .base:       c14a0cc8
>   .index:      1
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_real
>   .offset:     975789465870609368 nsecs
> active timers:
>  clock 2:
>   .base:       c14a0d00
>   .index:      2
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_boottime
>   .offset:     0 nsecs
> active timers:
>   .expires_next   : 76779492678 nsecs
>   .hres_active    : 1
>   .nr_events      : 2449
>   .nr_retries     : 66
>   .nr_hangs       : 0
>   .max_hang_time  : 0 nsecs
>   .nohz_mode      : 2
>   .last_tick      : 76780000000 nsecs
>   .tick_stopped   : 1
>   .idle_jiffies   : 4294944973
>   .idle_calls     : 3114
>   .idle_sleeps    : 981
>   .idle_entrytime : 76779354678 nsecs
>   .idle_waketime  : 76779336246 nsecs
>   .idle_exittime  : 76769862116 nsecs
>   .idle_sleeptime : 70735903990 nsecs
>   .iowait_sleeptime: 18432 nsecs
>   .last_jiffies   : 4294944973
>   .next_jiffies   : 4294944976
>   .idle_expires   : 76800000000 nsecs
> jiffies: 4294944973
> 
> cpu: 1
>  clock 0:
>   .base:       c14a9c90
>   .index:      0
>   .resolution: 1 nsecs
>   .get_time:   ktime_get
>   .offset:     0 nsecs
> active timers:
>  #0: <c14a9dd8>, tick_sched_timer, S:01, hrtimer_start_range_ns, swapper/1/0
>  # expires at 76780000000-76780000000 nsecs [in 682186 to 682186 nsecs]
>  clock 1:
>   .base:       c14a9cc8
>   .index:      1
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_real
>   .offset:     975789465870609368 nsecs
> active timers:
>  clock 2:
>   .base:       c14a9d00
>   .index:      2
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_boottime
>   .offset:     0 nsecs
> active timers:
>   .expires_next   : 76780000000 nsecs
>   .hres_active    : 1
>   .nr_events      : 2257
>   .nr_retries     : 61
>   .nr_hangs       : 0
>   .max_hang_time  : 0 nsecs
>   .nohz_mode      : 2
>   .last_tick      : 76700000000 nsecs
>   .tick_stopped   : 0
>   .idle_jiffies   : 4294944965
>   .idle_calls     : 2998
>   .idle_sleeps    : 444
>   .idle_entrytime : 76771944950 nsecs
>   .idle_waketime  : 76768498136 nsecs
>   .idle_exittime  : 76768498136 nsecs
>   .idle_sleeptime : 72124544054 nsecs
>   .iowait_sleeptime: 4276262 nsecs
>   .last_jiffies   : 4294944973
>   .next_jiffies   : 4294944974
>   .idle_expires   : 78880000000 nsecs
> jiffies: 4294944973
> 
> 
> Tick Device: mode:     1
> Broadcast device
> Clock Event Device: xttcps_clockevent
>  max_delta_ns:   1207932479
>  min_delta_ns:   18432
>  mult:           233015
>  shift:          32
>  mode:           1
>  next_event:     9223372036854775807 nsecs
>  set_next_event: xttcps_set_next_event
>  set_mode:       xttcps_set_mode
>  event_handler:  tick_handle_oneshot_broadcast
>  retries:        0
> tick_broadcast_mask: 00000000
> tick_broadcast_oneshot_mask: 00000000
> 
> 
> Tick Device: mode:     1
> Per CPU device: 0
> Clock Event Device: local_timer
>  max_delta_ns:   12884902005
>  min_delta_ns:   1000
>  mult:           715827876
>  shift:          31
>  mode:           3
>  next_event:     76800000000 nsecs
>  set_next_event: twd_set_next_event
>  set_mode:       twd_set_mode
>  event_handler:  hrtimer_interrupt
>  retries:        0
> 
> Tick Device: mode:     1
> Per CPU device: 1
> Clock Event Device: local_timer
>  max_delta_ns:   12884902005
>  min_delta_ns:   1000
>  mult:           715827876
>  shift:          31
>  mode:           3
>  next_event:     76780000000 nsecs
>  set_next_event: twd_set_next_event
>  set_mode:       twd_set_mode
>  event_handler:  hrtimer_interrupt
>  retries:        0
> 


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-30  8:47                     ` Daniel Lezcano
@ 2013-07-30 22:34                       ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-30 22:34 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 2274 bytes --]

On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> > Hi Daniel,
> > 
> > On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> > (snip)
> >>
> >> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the local
> >> timer will be stopped when entering the idle state. In this case, the
> >> cpuidle framework will call clockevents_notify(ENTER) and switch to a
> >> broadcast timer, and will call clockevents_notify(EXIT) when exiting the
> >> idle state, switching the local timer back into use.
> >
> > I've been thinking about this, trying to understand how this makes my
> > boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> > would make the timer core switch to a broadcast device even though it
> > wouldn't be necessary. But shouldn't it still work? It sounds like we do
> > something useless, but nothing wrong in the sense that it should result in
> > breakage. I guess I'm missing something obvious. This timer system will
> > always remain a mystery to me.
> >
> > Actually this more or less leads to the question: What is this
> > 'broadcast timer'? I guess it is some clockevent device which is
> > common to all cores? (that would be the cadence_ttc for Zynq). Is the
> > hang pointing to some issue with that driver?
> 
> If you look at the /proc/timer_list, which timer is used for broadcasting ?

So, here are the results from the correct runs (full output attached).

The vanilla kernel uses the twd timers as local timers and the TTC as
broadcast device:
	Tick Device: mode:     1                                                         
	Broadcast device  
	Clock Event Device: ttc_clockevent

When I remove the offending CPUIDLE flag and add the DT fragment to
enable the global timer, the twd timers are still used as local timers
and the broadcast device is the global timer:
	Tick Device: mode:     1                                                         
	Broadcast device                                                                 
	Clock Event Device: arm_global_timer

Again, since boot hangs in the actually broken case, I don't see a way to
obtain this information for that case.

	Thanks,
	Sören
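
(For readers following along: this is roughly the shape of the idle-state
table being discussed. A minimal sketch only, assuming a 3.11-era cpuidle
API; the state names, latencies and the stub enter function are
illustrative, not copied from arch/arm/mach-zynq/cpuidle.c.)

#include <linux/cpuidle.h>
#include <linux/module.h>
#include <asm/proc-fns.h>

/* Both states end up as a plain wfi on this SoC. */
static int zynq_idle_enter(struct cpuidle_device *dev,
			   struct cpuidle_driver *drv, int index)
{
	cpu_do_idle();			/* executes wfi */
	return index;
}

static struct cpuidle_driver zynq_idle_driver = {
	.name = "zynq_idle",
	.owner = THIS_MODULE,
	.states = {
		{	/* state 0: plain wfi */
			.enter            = zynq_idle_enter,
			.exit_latency     = 1,
			.target_residency = 1,
			.flags            = CPUIDLE_FLAG_TIME_VALID,
			.name             = "C1",
			.desc             = "wfi",
		},
		{	/* state 1: also just wfi, but flagged */
			.enter            = zynq_idle_enter,
			.exit_latency     = 10,
			.target_residency = 10000,
			/*
			 * CPUIDLE_FLAG_TIMER_STOP makes the core call
			 * clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER)
			 * before entering this state and
			 * clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT)
			 * after waking, i.e. it parks the per-cpu tick device
			 * and relies on the broadcast device for wakeups.
			 * Since wfi does not stop the TWD, the flag is
			 * unnecessary here; this is the line the fix removes.
			 */
			.flags            = CPUIDLE_FLAG_TIME_VALID |
					    CPUIDLE_FLAG_TIMER_STOP,
			.name             = "C2",
			.desc             = "wfi (flagged TIMER_STOP)",
		},
	},
	.state_count = 2,
};

With the flag present, every entry into state 1 hands the sleeping CPU's
tick duties to whichever device the core picked for broadcasting, which is
why the choice of broadcast device matters in the dumps below.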


[-- Attachment #2: timer_list_vanilla --]
[-- Type: text/plain, Size: 4230 bytes --]

# cat /proc/timer_list 
Timer List Version: v0.7
HRTIMER_MAX_CLOCK_BASES: 4
now at 55591467929 nsecs

cpu: 0
 clock 0:
  .base:       c148e718
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <c1490670>, menu_hrtimer_notify, S:01, hrtimer_start, swapper/0/0
 # expires at 55591646361-55591646361 nsecs [in 178432 to 178432 nsecs]
 clock 1:
  .base:       c148e750
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       c148e788
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       c148e7c0
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 55591705658 nsecs
  .hres_active    : 1
  .nr_events      : 917
  .nr_retries     : 34
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 2
  .last_tick      : 55590000000 nsecs
  .tick_stopped   : 1
  .idle_jiffies   : 4294942854
  .idle_calls     : 1977
  .idle_sleeps    : 250
  .idle_entrytime : 55591578522 nsecs
  .idle_waketime  : 55591578522 nsecs
  .idle_exittime  : 55582307145 nsecs
  .idle_sleeptime : 52623799592 nsecs
  .iowait_sleeptime: 0 nsecs
  .last_jiffies   : 4294942855
  .next_jiffies   : 4294949450
  .idle_expires   : 121540000000 nsecs
jiffies: 4294942855

cpu: 1
 clock 0:
  .base:       c1497718
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <c1497ac8>, tick_sched_timer, S:01, hrtimer_start_range_ns, swapper/1/0
 # expires at 55600000000-55600000000 nsecs [in 8532071 to 8532071 nsecs]
 #1: <ed851c80>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
 # expires at 102214419661-102214419661 nsecs [in 46622951732 to 46622951732 nsecs]
 #2: <ed92fc80>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
 # expires at 900000055748-900000055748 nsecs [in 844408587819 to 844408587819 nsecs]
 clock 1:
  .base:       c1497750
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       c1497788
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       c14977c0
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 55600000000 nsecs
  .hres_active    : 1
  .nr_events      : 921
  .nr_retries     : 20
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 2
  .last_tick      : 55590000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4294942854
  .idle_calls     : 1482
  .idle_sleeps    : 234
  .idle_entrytime : 55584131929 nsecs
  .idle_waketime  : 55584058200 nsecs
  .idle_exittime  : 55584131929 nsecs
  .idle_sleeptime : 53182424229 nsecs
  .iowait_sleeptime: 0 nsecs
  .last_jiffies   : 4294942854
  .next_jiffies   : 4294949632
  .idle_expires   : 56520692335 nsecs
jiffies: 4294942855

Tick Device: mode:     1
Broadcast device
Clock Event Device: ttc_clockevent
 max_delta_ns:   1207932479
 min_delta_ns:   18432
 mult:           233015
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: ttc_set_next_event
 set_mode:       ttc_set_mode
 event_handler:  tick_handle_oneshot_broadcast
 retries:        5

tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000000

Tick Device: mode:     1
Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     55592004140 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0

Tick Device: mode:     1
Per CPU device: 1
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     55600000000 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0

#


[-- Attachment #3: timer_list_gt --]
[-- Type: text/plain, Size: 4276 bytes --]

# cat /proc/timer_list 
Timer List Version: v0.7
HRTIMER_MAX_CLOCK_BASES: 4
now at 246603578154 nsecs

cpu: 0
 clock 0:
  .base:       c148e718
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <c1490670>, menu_hrtimer_notify, S:01, hrtimer_start, swapper/0/0
 # expires at 246603698006-246603698006 nsecs [in 120539 to 120539 nsecs]
 #1: <c148eac8>, tick_sched_timer, S:01, hrtimer_start, swapper/0/0
 # expires at 252080000000-252080000000 nsecs [in 5476422533 to 5476422533 nsecs]
 #2: <ed9faa80>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
 # expires at 900000053569-900000053569 nsecs [in 653396476102 to 653396476102 nsecs]
 clock 1:
  .base:       c148e750
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       c148e788
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       c148e7c0
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 246603871523 nsecs
  .hres_active    : 1
  .nr_events      : 935
  .nr_retries     : 0
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 2
  .last_tick      : 246600000000 nsecs
  .tick_stopped   : 1
  .idle_jiffies   : 4294961955
  .idle_calls     : 1974
  .idle_sleeps    : 259
  .idle_entrytime : 246603709947 nsecs
  .idle_waketime  : 246603700671 nsecs
  .idle_exittime  : 246594690344 nsecs
  .idle_sleeptime : 243604071460 nsecs
  .iowait_sleeptime: 0 nsecs
  .last_jiffies   : 4294961956
  .next_jiffies   : 4294962504
  .idle_expires   : 252080000000 nsecs
jiffies: 4294961956

cpu: 1
 clock 0:
  .base:       c1497718
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <c1497ac8>, tick_sched_timer, S:01, hrtimer_start_range_ns, swapper/1/0
 # expires at 246610000000-246610000000 nsecs [in 6422533 to 6422533 nsecs]
 clock 1:
  .base:       c1497750
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       c1497788
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       c14977c0
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 246610000000 nsecs
  .hres_active    : 1
  .nr_events      : 882
  .nr_retries     : 0
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 2
  .last_tick      : 245680000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4294961863
  .idle_calls     : 1405
  .idle_sleeps    : 272
  .idle_entrytime : 246596515784 nsecs
  .idle_waketime  : 244320007598 nsecs
  .idle_exittime  : 246593639546 nsecs
  .idle_sleeptime : 243941063708 nsecs
  .iowait_sleeptime: 0 nsecs
  .last_jiffies   : 4294961955
  .next_jiffies   : 4294961956
  .idle_expires   : 361550000000 nsecs
jiffies: 4294961956

Tick Device: mode:     1
Broadcast device
Clock Event Device: arm_global_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: gt_clockevent_set_next_event
 set_mode:       gt_clockevent_set_mode
 event_handler:  tick_handle_oneshot_broadcast
 retries:        0

tick_broadcast_mask: 00000000
tick_broadcast_oneshot_mask: 00000000

Tick Device: mode:     1
Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     246604040852 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0

Tick Device: mode:     1
Per CPU device: 1
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     246610000000 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0

## 



* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-30 22:34                       ` Sören Brinkmann
@ 2013-07-31  7:27                         ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-07-31  7:27 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>> Hi Daniel,
>>>
>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>> (snip)
>>>>
>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the local
>>>> timer will be stopped when entering the idle state. In this case, the
>>>> cpuidle framework will call clockevents_notify(ENTER) and switch to a
>>>> broadcast timer, and will call clockevents_notify(EXIT) when exiting the
>>>> idle state, switching the local timer back into use.
>>>
>>> I've been thinking about this, trying to understand how this makes my
>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>> would make the timer core switch to a broadcast device even though it
>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>> something useless, but nothing wrong in the sense that it should result in
>>> breakage. I guess I'm missing something obvious. This timer system will
>>> always remain a mystery to me.
>>>
>>> Actually this more or less leads to the question: What is this
>>> 'broadcast timer'? I guess it is some clockevent device which is
>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>> hang pointing to some issue with that driver?
>>
>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> 
> So, the correct run results (full output attached).
> 
> The vanilla kernel uses the twd timers as local timers and the TTC as
> broadcast device:
> 	Tick Device: mode:     1                                                         
> 	Broadcast device  
> 	Clock Event Device: ttc_clockevent
> 
> When I remove the offending CPUIDLE flag and add the DT fragment to
> enable the global timer, the twd timers are still used as local timers
> and the broadcast device is the global timer:
> 	Tick Device: mode:     1                                                         
> 	Broadcast device                                                                 
> 	Clock Event Device: arm_global_timer
> 
> Again, since boot hangs in the actually broken case, I don't see a way to
> obtain this information for that case.

Hmm, interesting. Can you also give the output of /proc/interrupts with
the global timer?


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog



* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-30 22:34                       ` Sören Brinkmann
@ 2013-07-31  9:34                         ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-07-31  9:34 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>> Hi Daniel,
>>>
>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>> (snip)
>>>>
>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the local
>>>> timer will be stopped when entering the idle state. In this case, the
>>>> cpuidle framework will call clockevents_notify(ENTER) and switch to a
>>>> broadcast timer, and will call clockevents_notify(EXIT) when exiting the
>>>> idle state, switching the local timer back into use.
>>>
>>> I've been thinking about this, trying to understand how this makes my
>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>> would make the timer core switch to a broadcast device even though it
>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>> something useless, but nothing wrong in the sense that it should result in
>>> breakage. I guess I'm missing something obvious. This timer system will
>>> always remain a mystery to me.
>>>
>>> Actually this more or less leads to the question: What is this
>>> 'broadcast timer'? I guess it is some clockevent device which is
>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>> hang pointing to some issue with that driver?
>>
>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> 
> So, the correct run results (full output attached).
> 
> The vanilla kernel uses the twd timers as local timers and the TTC as
> broadcast device:
> 	Tick Device: mode:     1                                                         
> 	Broadcast device  
> 	Clock Event Device: ttc_clockevent
> 
> When I remove the offending CPUIDLE flag and add the DT fragment to
> enable the global timer, the twd timers are still used as local timers
> and the broadcast device is the global timer:
> 	Tick Device: mode:     1                                                         
> 	Broadcast device                                                                 
> 	Clock Event Device: arm_global_timer
> 
> > Again, since boot hangs in the actually broken case, I don't see a way to
> obtain this information for that case.

Is it possible that the global timer driver simply does not work? Then, when
the cpuidle driver switches to it, the system stays stuck with no interrupt.

Removing the CPUIDLE_FLAG_TIMER_STOP flag prevents the use of the broadcast
timer (aka the arm global timer), so the problem does not appear.

And when the C3STOP feature flag is added to the global timer, it can't be
used as a broadcast timer, so another clockevent device is selected for that
(I guess the cadence_ttc). So again the problem does not appear.

I am more and more convinced the problem is not coming from the cpuidle
driver. The cpuidle flag has merely exposed a problem somewhere else, and I
suspect the arm_global_timer is not working on Zynq.
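
(To make the C3STOP mechanics concrete, a hedged sketch; the stub callbacks
and field values are illustrative and not taken from the real
drivers/clocksource/arm_global_timer.c.)

#include <linux/clockchips.h>

static void gt_set_mode(enum clock_event_mode mode,
			struct clock_event_device *evt)
{
	/* start/stop the hardware according to 'mode' */
}

static int gt_set_next_event(unsigned long delta,
			     struct clock_event_device *evt)
{
	/* program comparator = counter + delta */
	return 0;
}

static struct clock_event_device gt_evt = {
	.name		= "arm_global_timer",
	/*
	 * CLOCK_EVT_FEAT_C3STOP declares "this device stops in deep
	 * C-states". The broadcast-device selection in
	 * kernel/time/tick-broadcast.c skips any device carrying the
	 * flag, so the core falls back to another device (on Zynq
	 * presumably the TTC) for broadcasting. Without the flag, the
	 * device is eligible and can end up as the broadcast timer,
	 * as seen in the timer_list dumps earlier.
	 */
	.features	= CLOCK_EVT_FEAT_PERIODIC |
			  CLOCK_EVT_FEAT_ONESHOT |
			  CLOCK_EVT_FEAT_C3STOP,
	.rating		= 300,
	.set_mode	= gt_set_mode,
	.set_next_event	= gt_set_next_event,
};

That would match both observations above: with C3STOP the global timer drops
out of the broadcast role, and without it it can win that role.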


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog



* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-31  7:27                         ` Daniel Lezcano
@ 2013-07-31 16:26                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-31 16:26 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On Wed, Jul 31, 2013 at 09:27:25AM +0200, Daniel Lezcano wrote:
> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> > On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>> Hi Daniel,
> >>>
> >>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>> (snip)
> >>>>
> >>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the local
> >>>> timer will be stopped when entering the idle state. In this case, the
> >>>> cpuidle framework will call clockevents_notify(ENTER) and switch to a
> >>>> broadcast timer, and will call clockevents_notify(EXIT) when exiting the
> >>>> idle state, switching the local timer back into use.
> >>>
> >>> I've been thinking about this, trying to understand how this makes my
> >>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>> would make the timer core switch to a broadcast device even though it
> >>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>> something useless, but nothing wrong in the sense that it should result in
> >>> breakage. I guess I'm missing something obvious. This timer system will
> >>> always remain a mystery to me.
> >>>
> >>> Actually this more or less leads to the question: What is this
> >>> 'broadcast timer'? I guess it is some clockevent device which is
> >>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>> hang pointing to some issue with that driver?
> >>
> >> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> > 
> > So, the correct run results (full output attached).
> > 
> > The vanilla kernel uses the twd timers as local timers and the TTC as
> > broadcast device:
> > 	Tick Device: mode:     1                                                         
> > 	Broadcast device  
> > 	Clock Event Device: ttc_clockevent
> > 
> > When I remove the offending CPUIDLE flag and add the DT fragment to
> > enable the global timer, the twd timers are still used as local timers
> > and the broadcast device is the global timer:
> > 	Tick Device: mode:     1                                                         
> > 	Broadcast device                                                                 
> > 	Clock Event Device: arm_global_timer
> > 
> > Again, since boot hangs in the actually broken case, I don't see a way to
> > obtain this information for that case.
> 
> Hmm, interesting. Can you also give the output of /proc/interrupts with
> the global timer?
Sure:

	# cat /proc/interrupts 
	           CPU0       CPU1       
	 27:         14          1       GIC  27  gt
	 29:        841        843       GIC  29  twd
	 43:          0          0       GIC  43  ttc_clockevent
	 82:        563          0       GIC  82  xuartps
	IPI0:          0          0  CPU wakeup interrupts
	IPI1:          0          0  Timer broadcast interrupts
	IPI2:       1266       1330  Rescheduling interrupts
	IPI3:          0          0  Function call interrupts
	IPI4:         34         59  Single function call interrupts
	IPI5:          0          0  CPU stop interrupts
	Err:          0

	Sören




* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-31  9:34                         ` Daniel Lezcano
@ 2013-07-31 16:43                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-31 16:43 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On Wed, Jul 31, 2013 at 11:34:48AM +0200, Daniel Lezcano wrote:
> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> > On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>> Hi Daniel,
> >>>
> >>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>> (snip)
> >>>>
> >>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework that the local
> >>>> timer will be stopped when entering the idle state. In this case, the
> >>>> cpuidle framework will call clockevents_notify(ENTER) and switch to a
> >>>> broadcast timer, and will call clockevents_notify(EXIT) when exiting the
> >>>> idle state, switching the local timer back into use.
> >>>
> >>> I've been thinking about this, trying to understand how this makes my
> >>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>> would make the timer core switch to a broadcast device even though it
> >>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>> something useless, but nothing wrong in the sense that it should result in
> >>> breakage. I guess I'm missing something obvious. This timer system will
> >>> always remain a mystery to me.
> >>>
> >>> Actually this more or less leads to the question: What is this
> >>> 'broadcast timer'? I guess it is some clockevent device which is
> >>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>> hang pointing to some issue with that driver?
> >>
> >> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> > 
> > So, the correct run results (full output attached).
> > 
> > The vanilla kernel uses the twd timers as local timers and the TTC as
> > broadcast device:
> > 	Tick Device: mode:     1                                                         
> > 	Broadcast device  
> > 	Clock Event Device: ttc_clockevent
> > 
> > When I remove the offending CPUIDLE flag and add the DT fragment to
> > enable the global timer, the twd timers are still used as local timers
> > and the broadcast device is the global timer:
> > 	Tick Device: mode:     1                                                         
> > 	Broadcast device                                                                 
> > 	Clock Event Device: arm_global_timer
> > 
> > Again, since boot hangs in the actually broken case, I don't see a way to
> > obtain this information for that case.
> 
> Is it possible that the global timer driver simply does not work? Then, when
> the cpuidle driver switches to it, the system stays stuck with no interrupt.
> 
> Removing the CPUIDLE_FLAG_TIMER_STOP flag prevents the use of the broadcast
> timer (aka the arm global timer), so the problem does not appear.
> 
> And when the C3STOP feature flag is added to the global timer, it can't be
> used as a broadcast timer, so another clockevent device is selected for that
> (I guess the cadence_ttc). So again the problem does not appear.
> 
> I am more and more convinced the problem is not coming from the cpuidle
> driver. The cpuidle flag has merely exposed a problem somewhere else, and I
> suspect the arm_global_timer is not working on Zynq.

I made the following experiment:
I removed all TTC and twd nodes from my dts, leaving the gt as the only
timer source, and the system boots. Interestingly, in this case no
broadcast device is available.

/proc/timer_list (shortened):
	Tick Device: mode:     1
	Broadcast device
	Clock Event Device: <NULL>
	tick_broadcast_mask: 00000000
	tick_broadcast_oneshot_mask: 00000000
	
	Tick Device: mode:     1
	Per CPU device: 0
	Clock Event Device: arm_global_timer
	 max_delta_ns:   12884902005
	 min_delta_ns:   1000
	 mult:           715827876
	 shift:          31
	 mode:           3
	 next_event:     25654916518 nsecs
	 set_next_event: gt_clockevent_set_next_event
	 set_mode:       gt_clockevent_set_mode
	 event_handler:  hrtimer_interrupt
	 retries:        0
	
	Tick Device: mode:     1
	Per CPU device: 1
	Clock Event Device: arm_global_timer
	 max_delta_ns:   12884902005
	 min_delta_ns:   1000
	 mult:           715827876
	 shift:          31
	 mode:           3
	 next_event:     25660000000 nsecs
	 set_next_event: gt_clockevent_set_next_event
	 set_mode:       gt_clockevent_set_mode
	 event_handler:  hrtimer_interrupt
	 retries:        0

/proc/interrupts:
	# cat /proc/interrupts 
	           CPU0       CPU1       
	 27:       1483       1623       GIC  27  gt
	 82:        507          0       GIC  82  xuartps
	IPI0:          0          0  CPU wakeup interrupts
	IPI1:          0          0  Timer broadcast interrupts
	IPI2:       1211       1238  Rescheduling interrupts
	IPI3:          0          0  Function call interrupts
	IPI4:         42         51  Single function call interrupts
	IPI5:          0          0  CPU stop interrupts
	Err:          0


	Sören
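
(For context on why the broadcast slot stays <NULL> here: the global timer
driver registers one clock_event_device per CPU, all driven by the same
shared counter. A simplified sketch along the lines of
drivers/clocksource/arm_global_timer.c; details are trimmed, so treat it as
illustrative rather than a faithful copy.)

#include <linux/clockchips.h>
#include <linux/cpumask.h>
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/smp.h>

static unsigned long gt_clk_rate;	/* set up from the DT clock */
static int gt_ppi;			/* PPI 27 in the logs above */

static void gt_clockevent_set_mode(enum clock_event_mode mode,
				   struct clock_event_device *evt)
{
	/* enable/disable the comparator as requested */
}

static int gt_clockevent_set_next_event(unsigned long delta,
					struct clock_event_device *evt)
{
	/* program comparator = 64-bit counter + delta */
	return 0;
}

/*
 * Runs on each CPU: every core gets its own tick device driven by the
 * one shared counter, so with the GT as the only timer nothing is left
 * over to register as a broadcast device.
 */
static int gt_clockevents_init(struct clock_event_device *clk)
{
	int cpu = smp_processor_id();

	clk->name = "arm_global_timer";
	clk->features = CLOCK_EVT_FEAT_PERIODIC |
			CLOCK_EVT_FEAT_ONESHOT;
	clk->set_mode = gt_clockevent_set_mode;
	clk->set_next_event = gt_clockevent_set_next_event;
	clk->cpumask = cpumask_of(cpu);
	clk->irq = gt_ppi;
	clockevents_config_and_register(clk, gt_clk_rate,
					1, 0xffffffff);
	enable_percpu_irq(clk->irq, IRQ_TYPE_NONE);
	return 0;
}

Since the interrupt is a banked PPI, each CPU only sees its own comparator
fire, which fits the per-CPU counts on the gt line in /proc/interrupts
above.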




* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-30 22:34                       ` Sören Brinkmann
@ 2013-07-31 20:49                         ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-07-31 20:49 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>> Hi Daniel,
>>>
>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>> (snip)
>>>>
>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>> timer will be stopped when entering to the idle state. In this case, the
>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>> idle state, switching the local timer back in use.
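
(In code terms: for a state flagged with CPUIDLE_FLAG_TIMER_STOP, the
cpuidle core brackets the idle entry with the broadcast notifications.
A simplified sketch of that logic, not the verbatim source:)

	/* sketch: how the cpuidle core handles CPUIDLE_FLAG_TIMER_STOP */
	int broadcast = !!(drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP);

	if (broadcast)	/* local timer stops: wakeups come via the broadcast device */
		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);

	entered_state = cpuidle_enter_state(dev, drv, next_state);

	if (broadcast)	/* idle exit: switch back to the local per-cpu timer */
		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);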
>>>
>>> I've been thinking about this, trying to understand how this makes my
>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>> would make the timer core switch to a broadcast device even though it
>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>> something useless, but nothing wrong in a sense that it should result in
>>> breakage. I guess I'm missing something obvious. This timer system will
>>> always remain a mystery to me.
>>>
>>> Actually this more or less leads to the question: What is this
>>> 'broadcast timer'. I guess that is some clockevent device which is
>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>> hang pointing to some issue with that driver?
>>
>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> 
> So, the correct run results (full output attached).
> 
> The vanilla kernel uses the twd timers as local timers and the TTC as
> broadcast device:
> 	Tick Device: mode:     1                                                         
> 	Broadcast device  
> 	Clock Event Device: ttc_clockevent
> 
> When I remove the offending CPUIDLE flag and add the DT fragment to
> enable the global timer, the twd timers are still used as local timers
> and the broadcast device is the global timer:
> 	Tick Device: mode:     1                                                         
> 	Broadcast device                                                                 
> 	Clock Event Device: arm_global_timer
> 
> Again, since boot hangs in the actually broken case, I don't see way to
> obtain this information for that case.

Can't you use the maxcpus=1 option to ensure the system boots up?


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-31 20:49                         ` Daniel Lezcano
@ 2013-07-31 20:58                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-31 20:58 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> > On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>> Hi Daniel,
> >>>
> >>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>> (snip)
> >>>>
> >>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>> timer will be stopped when entering to the idle state. In this case, the
> >>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>> idle state, switching the local timer back in use.
> >>>
> >>> I've been thinking about this, trying to understand how this makes my
> >>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>> would make the timer core switch to a broadcast device even though it
> >>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>> something useless, but nothing wrong in a sense that it should result in
> >>> breakage. I guess I'm missing something obvious. This timer system will
> >>> always remain a mystery to me.
> >>>
> >>> Actually this more or less leads to the question: What is this
> >>> 'broadcast timer'. I guess that is some clockevent device which is
> >>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>> hang pointing to some issue with that driver?
> >>
> >> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> > 
> > So, the correct run results (full output attached).
> > 
> > The vanilla kernel uses the twd timers as local timers and the TTC as
> > broadcast device:
> > 	Tick Device: mode:     1                                                         
> > 	Broadcast device  
> > 	Clock Event Device: ttc_clockevent
> > 
> > When I remove the offending CPUIDLE flag and add the DT fragment to
> > enable the global timer, the twd timers are still used as local timers
> > and the broadcast device is the global timer:
> > 	Tick Device: mode:     1                                                         
> > 	Broadcast device                                                                 
> > 	Clock Event Device: arm_global_timer
> > 
> > Again, since boot hangs in the actually broken case, I don't see way to
> > obtain this information for that case.
> 
> Can't you use the maxcpus=1 option to ensure the system boots up?

Right, that works. I had forgotten about that option after you mentioned
that it is most likely not that useful.

Anyway, these are the relevant /proc files with an unmodified cpuidle
driver, the gt enabled, and maxcpus=1 set.

/proc/timer_list:
	Tick Device: mode:     1
	Broadcast device
	Clock Event Device: arm_global_timer
	 max_delta_ns:   12884902005
	 min_delta_ns:   1000
	 mult:           715827876
	 shift:          31
	 mode:           3
	 next_event:     108080000000 nsecs
	 set_next_event: gt_clockevent_set_next_event
	 set_mode:       gt_clockevent_set_mode
	 event_handler:  tick_handle_oneshot_broadcast
	 retries:        0
	
	tick_broadcast_mask: 00000001
	tick_broadcast_oneshot_mask: 00000000
	
	Tick Device: mode:     1
	Per CPU device: 0
	Clock Event Device: local_timer
	 max_delta_ns:   12884902005
	 min_delta_ns:   1000
	 mult:           715827876
	 shift:          31
	 mode:           3
	 next_event:     106900000000 nsecs
	 set_next_event: twd_set_next_event
	 set_mode:       twd_set_mode
	 event_handler:  hrtimer_interrupt
	 retries:        0

# cat /proc/interrupts 
	           CPU0       
	 27:        252       GIC  27  gt
	 29:        626       GIC  29  twd
	 43:          0       GIC  43  ttc_clockevent
	 82:        410       GIC  82  xuartps
	IPI0:          0  CPU wakeup interrupts
	IPI1:          0  Timer broadcast interrupts
	IPI2:          0  Rescheduling interrupts
	IPI3:          0  Function call interrupts
	IPI4:          0  Single function call interrupts
	IPI5:          0  CPU stop interrupts
	Err:          0


	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-31 20:58                           ` Sören Brinkmann
@ 2013-07-31 21:08                             ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-07-31 21:08 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>> Hi Daniel,
>>>>>
>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>> (snip)
>>>>>>
>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>> idle state, switching the local timer back in use.
>>>>>
>>>>> I've been thinking about this, trying to understand how this makes my
>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>> would make the timer core switch to a broadcast device even though it
>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>> always remain a mystery to me.
>>>>>
>>>>> Actually this more or less leads to the question: What is this
>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>> hang pointing to some issue with that driver?
>>>>
>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>
>>> So, the correct run results (full output attached).
>>>
>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>> broadcast device:
>>> 	Tick Device: mode:     1                                                         
>>> 	Broadcast device  
>>> 	Clock Event Device: ttc_clockevent
>>>
>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>> enable the global timer, the twd timers are still used as local timers
>>> and the broadcast device is the global timer:
>>> 	Tick Device: mode:     1                                                         
>>> 	Broadcast device                                                                 
>>> 	Clock Event Device: arm_global_timer
>>>
>>> Again, since boot hangs in the actually broken case, I don't see way to
>>> obtain this information for that case.
>>
>> Can't you use the maxcpus=1 option to ensure the system boots up?
> 
> Right, that works. I had forgotten about that option after you mentioned
> that it is most likely not that useful.
> 
> Anyway, these are the relevant /proc files with an unmodified cpuidle
> driver, the gt enabled, and maxcpus=1 set.
> 
> /proc/timer_list:
> 	Tick Device: mode:     1
> 	Broadcast device
> 	Clock Event Device: arm_global_timer
> 	 max_delta_ns:   12884902005
> 	 min_delta_ns:   1000
> 	 mult:           715827876
> 	 shift:          31
> 	 mode:           3

Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT).

In the previous timer_list output you gave me, with the offending
cpuidle flag removed, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
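
For reference, the numeric mode values in /proc/timer_list map to
enum clock_event_mode from include/linux/clockchips.h:

	enum clock_event_mode {
		CLOCK_EVT_MODE_UNUSED = 0,
		CLOCK_EVT_MODE_SHUTDOWN,	/* 1: device stopped */
		CLOCK_EVT_MODE_PERIODIC,	/* 2 */
		CLOCK_EVT_MODE_ONESHOT,		/* 3 */
		CLOCK_EVT_MODE_RESUME,		/* 4 */
	};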

Could you try to get this output again right after onlining cpu1,
in order to check whether the broadcast device switches to SHUTDOWN?

> 	 next_event:     108080000000 nsecs
> 	 set_next_event: gt_clockevent_set_next_event
> 	 set_mode:       gt_clockevent_set_mode
> 	 event_handler:  tick_handle_oneshot_broadcast
> 	 retries:        0
> 	
> 	tick_broadcast_mask: 00000001
> 	tick_broadcast_oneshot_mask: 00000000
> 	
> 	Tick Device: mode:     1
> 	Per CPU device: 0
> 	Clock Event Device: local_timer
> 	 max_delta_ns:   12884902005
> 	 min_delta_ns:   1000
> 	 mult:           715827876
> 	 shift:          31
> 	 mode:           3
> 	 next_event:     106900000000 nsecs
> 	 set_next_event: twd_set_next_event
> 	 set_mode:       twd_set_mode
> 	 event_handler:  hrtimer_interrupt
> 	 retries:        0
> 
> # cat /proc/interrupts 
> 	           CPU0       
> 	 27:        252       GIC  27  gt
> 	 29:        626       GIC  29  twd
> 	 43:          0       GIC  43  ttc_clockevent
> 	 82:        410       GIC  82  xuartps
> 	IPI0:          0  CPU wakeup interrupts
> 	IPI1:          0  Timer broadcast interrupts
> 	IPI2:          0  Rescheduling interrupts
> 	IPI3:          0  Function call interrupts
> 	IPI4:          0  Single function call interrupts
> 	IPI5:          0  CPU stop interrupts
> 	Err:          0
> 
> 
> 	Sören
> 
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-31 21:08                             ` Daniel Lezcano
@ 2013-07-31 22:18                               ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-31 22:18 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> > On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>> Hi Daniel,
> >>>>>
> >>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>> (snip)
> >>>>>>
> >>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>> idle state, switching the local timer back in use.
> >>>>>
> >>>>> I've been thinking about this, trying to understand how this makes my
> >>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>> would make the timer core switch to a broadcast device even though it
> >>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>> something useless, but nothing wrong in a sense that it should result in
> >>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>> always remain a mystery to me.
> >>>>>
> >>>>> Actually this more or less leads to the question: What is this
> >>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>> hang pointing to some issue with that driver?
> >>>>
> >>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>
> >>> So, the correct run results (full output attached).
> >>>
> >>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>> broadcast device:
> >>> 	Tick Device: mode:     1                                                         
> >>> 	Broadcast device  
> >>> 	Clock Event Device: ttc_clockevent
> >>>
> >>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>> enable the global timer, the twd timers are still used as local timers
> >>> and the broadcast device is the global timer:
> >>> 	Tick Device: mode:     1                                                         
> >>> 	Broadcast device                                                                 
> >>> 	Clock Event Device: arm_global_timer
> >>>
> >>> Again, since boot hangs in the actually broken case, I don't see way to
> >>> obtain this information for that case.
> >>
> >> Can't you use the maxcpus=1 option to ensure the system boots up?
> > 
> > Right, that works. I had forgotten about that option after you mentioned
> > that it is most likely not that useful.
> > 
> > Anyway, these are the relevant /proc files with an unmodified cpuidle
> > driver, the gt enabled, and maxcpus=1 set.
> > 
> > /proc/timer_list:
> > 	Tick Device: mode:     1
> > 	Broadcast device
> > 	Clock Event Device: arm_global_timer
> > 	 max_delta_ns:   12884902005
> > 	 min_delta_ns:   1000
> > 	 mult:           715827876
> > 	 shift:          31
> > 	 mode:           3
> 
> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT).
> 
> In the previous timer_list output you gave me, with the offending
> cpuidle flag removed, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> 
> Could you try to get this output again right after onlining cpu1,
> in order to check whether the broadcast device switches to SHUTDOWN?

How do I do that? I tried to online CPU1 after booting with maxcpus=1
and that didn't end well:
	# echo 1 > online && cat /proc/timer_list 
	[ 4689.992658] CPU1: Booted secondary processor
	[ 4690.986295] CPU1: failed to come online
	sh: write error: Input/output error
	# [ 4691.045945] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
	[ 4691.045986] 
	[ 4691.052972] ===============================
	[ 4691.057349] [ INFO: suspicious RCU usage. ]
	[ 4691.061413] 3.11.0-rc3-00001-gc14f576-dirty #139 Not tainted
	[ 4691.067026] -------------------------------
	[ 4691.071129] kernel/sched/fair.c:5477 suspicious rcu_dereference_check() usage!
	[ 4691.078292] 
	[ 4691.078292] other info that might help us debug this:
	[ 4691.078292] 
	[ 4691.086209] 
	[ 4691.086209] RCU used illegally from offline CPU!
	[ 4691.086209] rcu_scheduler_active = 1, debug_locks = 0
	[ 4691.097216] 1 lock held by swapper/1/0:
	[ 4691.100968]  #0:  (rcu_read_lock){.+.+..}, at: [<c00679b4>] set_cpu_sd_state_idle+0x0/0x1e4
	[ 4691.109250] 
	[ 4691.109250] stack backtrace:
	[ 4691.113531] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #139
	[ 4691.121755] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
	[ 4691.130263] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd50>] (dump_stack+0x80/0xc4)
	[ 4691.138264] [<c045bd50>] (dump_stack+0x80/0xc4) from [<c007ad78>] (lockdep_rcu_suspicious+0xdc/0x118)
	[ 4691.147371] [<c007ad78>] (lockdep_rcu_suspicious+0xdc/0x118) from [<c0067ac0>] (set_cpu_sd_state_idle+0x10c/0x1e4)
	[ 4691.157605] [<c0067ac0>] (set_cpu_sd_state_idle+0x10c/0x1e4) from [<c0078238>] (tick_nohz_idle_enter+0x48/0x80)
	[ 4691.167583] [<c0078238>] (tick_nohz_idle_enter+0x48/0x80) from [<c006dc5c>] (cpu_startup_entry+0x28/0x388)
	[ 4691.177127] [<c006dc5c>] (cpu_startup_entry+0x28/0x388) from [<c0014acc>] (secondary_start_kernel+0x12c/0x144)
	[ 4691.187013] [<c0014acc>] (secondary_start_kernel+0x12c/0x144) from [<000081ec>] (0x81ec)


	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-31 22:18                               ` Sören Brinkmann
@ 2013-07-31 23:01                                 ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-07-31 23:01 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>> Hi Daniel,
>>>>>>>
>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>> (snip)
>>>>>>>>
>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>
>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>> always remain a mystery to me.
>>>>>>>
>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>> hang pointing to some issue with that driver?
>>>>>>
>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>
>>>>> So, the correct run results (full output attached).
>>>>>
>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>> broadcast device:
>>>>> 	Tick Device: mode:     1                                                         
>>>>> 	Broadcast device  
>>>>> 	Clock Event Device: ttc_clockevent
>>>>>
>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>> enable the global timer, the twd timers are still used as local timers
>>>>> and the broadcast device is the global timer:
>>>>> 	Tick Device: mode:     1                                                         
>>>>> 	Broadcast device                                                                 
>>>>> 	Clock Event Device: arm_global_timer
>>>>>
>>>>> Again, since boot hangs in the actually broken case, I don't see way to
>>>>> obtain this information for that case.
>>>>
>>>> Can't you use the maxcpus=1 option to ensure the system boots up?
>>>
>>> Right, that works. I had forgotten about that option after you mentioned
>>> that it is most likely not that useful.
>>>
>>> Anyway, these are the relevant /proc files with an unmodified cpuidle
>>> driver, the gt enabled, and maxcpus=1 set.
>>>
>>> /proc/timer_list:
>>> 	Tick Device: mode:     1
>>> 	Broadcast device
>>> 	Clock Event Device: arm_global_timer
>>> 	 max_delta_ns:   12884902005
>>> 	 min_delta_ns:   1000
>>> 	 mult:           715827876
>>> 	 shift:          31
>>> 	 mode:           3
>>
>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT).
>>
>> In the previous timer_list output you gave me, with the offending
>> cpuidle flag removed, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>
>> Could you try to get this output again right after onlining cpu1,
>> in order to check whether the broadcast device switches to SHUTDOWN?
> 
> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> and that didn't end well:
> 	# echo 1 > online && cat /proc/timer_list 

Hmm, I was hoping there would be a small delay before the kernel hangs, but
apparently this is not the case... :(

I suspect the global timer is shut down at some point, but I don't
understand why or when.

Can you add a stack trace in the "clockevents_shutdown" function, along with
the clockevent device name? Perhaps we will see an interesting trace at boot
time when it hangs.
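
For context, the function in question is tiny (kernel/time/clockevents.c):

	void clockevents_shutdown(struct clock_event_device *dev)
	{
		clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
		dev->next_event.tv64 = KTIME_MAX;
	}

A printk of dev->name plus a dump_stack() at its top should tell us who
shuts the global timer down, and from where.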



> 	[ 4689.992658] CPU1: Booted secondary processor
> 	[ 4690.986295] CPU1: failed to come online
> 	sh: write error: Input/output error
> 	# [ 4691.045945] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
> 	[ 4691.045986] 
> 	[ 4691.052972] ===============================
> 	[ 4691.057349] [ INFO: suspicious RCU usage. ]
> 	[ 4691.061413] 3.11.0-rc3-00001-gc14f576-dirty #139 Not tainted
> 	[ 4691.067026] -------------------------------
> 	[ 4691.071129] kernel/sched/fair.c:5477 suspicious rcu_dereference_check() usage!
> 	[ 4691.078292] 
> 	[ 4691.078292] other info that might help us debug this:
> 	[ 4691.078292] 
> 	[ 4691.086209] 
> 	[ 4691.086209] RCU used illegally from offline CPU!
> 	[ 4691.086209] rcu_scheduler_active = 1, debug_locks = 0
> 	[ 4691.097216] 1 lock held by swapper/1/0:
> 	[ 4691.100968]  #0:  (rcu_read_lock){.+.+..}, at: [<c00679b4>] set_cpu_sd_state_idle+0x0/0x1e4
> 	[ 4691.109250] 
> 	[ 4691.109250] stack backtrace:
> 	[ 4691.113531] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #139
> 	[ 4691.121755] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
> 	[ 4691.130263] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd50>] (dump_stack+0x80/0xc4)
> 	[ 4691.138264] [<c045bd50>] (dump_stack+0x80/0xc4) from [<c007ad78>] (lockdep_rcu_suspicious+0xdc/0x118)
> 	[ 4691.147371] [<c007ad78>] (lockdep_rcu_suspicious+0xdc/0x118) from [<c0067ac0>] (set_cpu_sd_state_idle+0x10c/0x1e4)
> 	[ 4691.157605] [<c0067ac0>] (set_cpu_sd_state_idle+0x10c/0x1e4) from [<c0078238>] (tick_nohz_idle_enter+0x48/0x80)
> 	[ 4691.167583] [<c0078238>] (tick_nohz_idle_enter+0x48/0x80) from [<c006dc5c>] (cpu_startup_entry+0x28/0x388)
> 	[ 4691.177127] [<c006dc5c>] (cpu_startup_entry+0x28/0x388) from [<c0014acc>] (secondary_start_kernel+0x12c/0x144)
> 	[ 4691.187013] [<c0014acc>] (secondary_start_kernel+0x12c/0x144) from [<000081ec>] (0x81ec)
> 
> 
> 	Sören
> 
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-31 23:01                                 ` Daniel Lezcano
@ 2013-07-31 23:38                                   ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-31 23:38 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Stuart Menefy,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 5210 bytes --]

On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> > On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> >>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>>>> Hi Daniel,
> >>>>>>>
> >>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>> (snip)
> >>>>>>>>
> >>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>
> >>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>> something useless, but nothing wrong in a sense that it should result in
> >>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>> always remain a mystery to me.
> >>>>>>>
> >>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>> hang pointing to some issue with that driver?
> >>>>>>
> >>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>
> >>>>> So, the correct run results (full output attached).
> >>>>>
> >>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>> broadcast device:
> >>>>> 	Tick Device: mode:     1                                                         
> >>>>> 	Broadcast device  
> >>>>> 	Clock Event Device: ttc_clockevent
> >>>>>
> >>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>> enable the global timer, the twd timers are still used as local timers
> >>>>> and the broadcast device is the global timer:
> >>>>> 	Tick Device: mode:     1                                                         
> >>>>> 	Broadcast device                                                                 
> >>>>> 	Clock Event Device: arm_global_timer
> >>>>>
> >>>>> Again, since boot hangs in the actually broken case, I don't see a way to
> >>>>> obtain this information for that case.
> >>>>
> >>>> Can't you use the maxcpus=1 option to ensure the system boots up?
> >>>
> >>> Right, that works. I had forgotten about that option after you mentioned
> >>> that it is most likely not that useful.
> >>>
> >>> Anyway, these are the relevant proc files with an unmodified cpuidle
> >>> driver, the gt enabled, and maxcpus=1 set.
> >>>
> >>> /proc/timer_list:
> >>> 	Tick Device: mode:     1
> >>> 	Broadcast device
> >>> 	Clock Event Device: arm_global_timer
> >>> 	 max_delta_ns:   12884902005
> >>> 	 min_delta_ns:   1000
> >>> 	 mult:           715827876
> >>> 	 shift:          31
> >>> 	 mode:           3
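
(For reference -- a sketch of how the clockevents core uses this mult/shift
pair, following kernel/time/clockevents.c: with mult = 715827876 and
shift = 31 that is roughly 1/3 cycle per ns, i.e. the 333 MHz global timer,
and max_delta_ns of ~12.88 s is roughly the largest delta whose cycle count
still fits the timer's 32-bit comparator:

	#include <linux/types.h>

	/* cycles = (delta_ns * mult) >> shift */
	static u64 delta_ns_to_cycles(u64 delta_ns)
	{
		return (delta_ns * 715827876ULL) >> 31;
	}
)
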
> >>
> >> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>
> >> The previous timer_list output you gave me when removing the offending
> >> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
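
(The numeric values map to the 3.11-era enum in include/linux/clockchips.h:

	enum clock_event_mode {
		CLOCK_EVT_MODE_UNUSED = 0,
		CLOCK_EVT_MODE_SHUTDOWN,	/* 1: device stopped */
		CLOCK_EVT_MODE_PERIODIC,	/* 2 */
		CLOCK_EVT_MODE_ONESHOT,		/* 3: armed for a single event */
		CLOCK_EVT_MODE_RESUME,		/* 4 */
	};
)
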
> >>
> >> Could you try to get this output again right after onlining CPU1, in
> >> order to check whether the broadcast device switches to SHUTDOWN?
> > 
> > How do I do that? I tried to online CPU1 after booting with maxcpus=1
> > and that didn't end well:
> > 	# echo 1 > online && cat /proc/timer_list 
> 
> Hmm, I was hoping there would be a small delay before the kernel hangs,
> but apparently that is not the case... :(
> 
> I suspect the global timer is shut down at some point, but I don't
> understand why or when.
> 
> Can you add a stack trace in the "clockevents_shutdown" function, along
> with the clockevent device name? Perhaps we will see an interesting trace
> at boot time when it hangs.

I did this change:
	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
	index 38959c8..3ab11c1 100644
	--- a/kernel/time/clockevents.c
	+++ b/kernel/time/clockevents.c
	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
	  */
	 void clockevents_shutdown(struct clock_event_device *dev)
	 {
	+       pr_info("ce->name:%s\n", dev->name);
	+       dump_stack();
	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
	        dev->next_event.tv64 = KTIME_MAX;
	 }

It is hit a few times during boot, so I'm attaching a full boot log. I
don't know what to look for, but I hope you can spot something in it. I
really appreciate you taking the time.

	Thanks,
	Sören


[-- Attachment #2: boot.log --]
[-- Type: text/plain; charset="utf-8", Size: 25156 bytes --]



U-Boot 2013.04 (Jun 19 2013 - 08:17:40)

I2C:   ready
Memory: ECC disabled
DRAM:  1 GiB
WARNING: Caches not enabled
MMC:   zynq_sdhci: 0
SF: Detected S25FL129P_64K with page size 128 KiB, total 32 MiB
*** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Net:   Gem.e000b000
Hit any key to stop autoboot:  3 2 1 0
TFTPing Linux to RAM...
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'uImage'.
Load address: 0x3000000
Loading: #################################################################
	 #################################################################
	 ####################################
	 4.2 MiB/s
done
Bytes transferred = 2427120 (2508f0 hex)
Gem.e000b000:7 is connected to Gem.e000b000.  Reconnecting to Gem.e000b000
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'devicetree.dtb'.
Load address: 0x2a00000
Loading: #
	 672.9 KiB/s
done
Bytes transferred = 2759 (ac7 hex)
Gem.e000b000:7 is connected to Gem.e000b000.  Reconnecting to Gem.e000b000
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'uramdisk.image.gz'.
Load address: 0x2000000
Loading: #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 ###################################
	 4.4 MiB/s
done
Bytes transferred = 5284624 (50a310 hex)
## Booting kernel from Legacy Image at 03000000 ...
   Image Name:   Linux-3.11.0-rc3-00001-gc14f576-
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2427056 Bytes = 2.3 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 02000000 ...
   Image Name:   
   Image Type:   ARM Linux RAMDisk Image (gzip compressed)
   Data Size:    5284560 Bytes = 5 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 02a00000
   Booting using the fdt blob at 0x02a00000
   Loading Kernel Image ... OK
OK
   Loading Ramdisk to 1faf5000, end 1ffff2d0 ... OK
   Loading Device Tree to 1faf1000, end 1faf4ac6 ... OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 3.11.0-rc3-00001-gc14f576-dirty (sorenb@xsjandreislx) (gcc version 4.7.2 (Sourcery CodeBench Lite 2012.09-104) ) #144 SMP PREEMPT Wed Jul 31 16:31:16 PDT 2013
[    0.000000] CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] Machine: Xilinx Zynq Platform, model: Zynq ZC706 Development Board
[    0.000000] bootconsole [earlycon0] enabled
[    0.000000] cma: CMA: reserved 16 MiB at 2e800000
[    0.000000] Memory policy: ECC disabled, Data cache writealloc
[    0.000000] PERCPU: Embedded 9 pages/cpu @c148e000 s14720 r8192 d13952 u36864
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 260624
[    0.000000] Kernel command line: console=ttyPS0,115200 earlyprintk
[    0.000000] PID hash table entries: 4096 (order: 2, 16384 bytes)
[    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[    0.000000] Memory: 1005068K/1048576K available (4849K kernel code, 305K rwdata, 1556K rodata, 334K init, 5699K bss, 43508K reserved, 270336K highmem)
[    0.000000] Virtual kernel memory layout:
[    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
[    0.000000]     fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
[    0.000000]     vmalloc : 0xf0000000 - 0xff000000   ( 240 MB)
[    0.000000]     lowmem  : 0xc0000000 - 0xef800000   ( 760 MB)
[    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
[    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)
[    0.000000]       .text : 0xc0008000 - 0xc0649930   (6407 kB)
[    0.000000]       .init : 0xc064a000 - 0xc069d980   ( 335 kB)
[    0.000000]       .data : 0xc069e000 - 0xc06ea4d8   ( 306 kB)
[    0.000000]        .bss : 0xc06ea4d8 - 0xc0c7b2e4   (5700 kB)
[    0.000000] Preemptible hierarchical RCU implementation.
[    0.000000] 	RCU lockdep checking is enabled.
[    0.000000] 	Additional per-CPU info printed with stalls.
[    0.000000] 	RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
[    0.000000] NR_IRQS:16 nr_irqs:16 16
[    0.000000] slcr mapped to f0004000
[    0.000000] Zynq clock init
[    0.000000] sched_clock: 32 bits at 333MHz, resolution 3ns, wraps every 12884ms
[    0.000000] ce->name:arm_global_timer
[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    0.000000] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    0.000000] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    0.000000] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    0.000000] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)
[    0.000000] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fa0>] (tick_check_new_device+0xc8/0x11c)
[    0.000000] [<c0075fa0>] (tick_check_new_device+0xc8/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)
[    0.000000] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    0.000000] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c0370764>] (gt_clockevents_init+0x8c/0xb4)
[    0.000000] [<c0370764>] (gt_clockevents_init+0x8c/0xb4) from [<c0675d64>] (global_timer_of_register+0x1e4/0x294)
[    0.000000] [<c0675d64>] (global_timer_of_register+0x1e4/0x294) from [<c0675724>] (clocksource_of_init+0x2c/0x54)
[    0.000000] [<c0675724>] (clocksource_of_init+0x2c/0x54) from [<c0652b98>] (zynq_timer_init+0x14/0x18)
[    0.000000] [<c0652b98>] (zynq_timer_init+0x14/0x18) from [<c064dec8>] (time_init+0x24/0x34)
[    0.000000] [<c064dec8>] (time_init+0x24/0x34) from [<c064a774>] (start_kernel+0x1cc/0x32c)
[    0.000000] [<c064a774>] (start_kernel+0x1cc/0x32c) from [<00008074>] (0x8074)
[    0.000000] ce->name:ttc_clockevent
[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    0.000000] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    0.000000] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    0.000000] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    0.000000] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)
[    0.000000] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c00766c4>] (tick_install_broadcast_device+0x88/0xcc)
[    0.000000] [<c00766c4>] (tick_install_broadcast_device+0x88/0xcc) from [<c0075fe4>] (tick_check_new_device+0x10c/0x11c)
[    0.000000] [<c0075fe4>] (tick_check_new_device+0x10c/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)
[    0.000000] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    0.000000] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c0675acc>] (ttc_timer_init+0x2d4/0x388)
[    0.000000] [<c0675acc>] (ttc_timer_init+0x2d4/0x388) from [<c0675724>] (clocksource_of_init+0x2c/0x54)
[    0.000000] [<c0675724>] (clocksource_of_init+0x2c/0x54) from [<c0652b98>] (zynq_timer_init+0x14/0x18)
[    0.000000] [<c0652b98>] (zynq_timer_init+0x14/0x18) from [<c064dec8>] (time_init+0x24/0x34)
[    0.000000] [<c064dec8>] (time_init+0x24/0x34) from [<c064a774>] (start_kernel+0x1cc/0x32c)
[    0.000000] [<c064a774>] (start_kernel+0x1cc/0x32c) from [<00008074>] (0x8074)
[    0.000000] ttc0 #0 at f0006000, irq=43
[    0.000000] Console: colour dummy device 80x30
[    0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.000000] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.000000] ... MAX_LOCK_DEPTH:          48
[    0.000000] ... MAX_LOCKDEP_KEYS:        8191
[    0.000000] ... CLASSHASH_SIZE:          4096
[    0.000000] ... MAX_LOCKDEP_ENTRIES:     16384
[    0.000000] ... MAX_LOCKDEP_CHAINS:      32768
[    0.000000] ... CHAINHASH_SIZE:          16384
[    0.000000]  memory used by lock dependency info: 3695 kB
[    0.000000]  per task-struct memory footprint: 1152 bytes
[    0.347737] Calibrating delay loop... 1325.46 BogoMIPS (lpj=6627328)
[    0.385250] pid_max: default: 32768 minimum: 301
[    0.388286] Mount-cache hash table entries: 512
[    0.399390] CPU: Testing write buffer coherency: ok
[    0.399540] ftrace: allocating 16043 entries in 48 pages
[    0.439854] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.440069] ce->name:local_timer
[    0.443330] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    0.451622] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    0.460207] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    0.468285] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    0.477224] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)
[    0.487628] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fa0>] (tick_check_new_device+0xc8/0x11c)
[    0.498206] [<c0075fa0>] (tick_check_new_device+0xc8/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)
[    0.508870] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    0.520307] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)
[    0.530798] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)
[    0.540184] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c064e804>] (smp_prepare_cpus+0x58/0x84)
[    0.549464] [<c064e804>] (smp_prepare_cpus+0x58/0x84) from [<c064a934>] (kernel_init_freeable+0x60/0x1d0)
[    0.559008] [<c064a934>] (kernel_init_freeable+0x60/0x1d0) from [<c04562d4>] (kernel_init+0x1c/0xf4)
[    0.568124] [<c04562d4>] (kernel_init+0x1c/0xf4) from [<c000efc8>] (ret_from_fork+0x14/0x20)
[    0.576545] ce->name:ttc_clockevent
[    0.580068] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    0.588341] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    0.596937] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    0.605019] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    0.613964] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4)
[    0.624453] [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4) from [<c0075db8>] (tick_setup_device+0xb8/0x100)
[    0.634773] [<c0075db8>] (tick_setup_device+0xb8/0x100) from [<c0075fc8>] (tick_check_new_device+0xf0/0x11c)
[    0.644577] [<c0075fc8>] (tick_check_new_device+0xf0/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)
[    0.655241] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    0.666679] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)
[    0.677169] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)
[    0.686549] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c064e804>] (smp_prepare_cpus+0x58/0x84)
[    0.695833] [<c064e804>] (smp_prepare_cpus+0x58/0x84) from [<c064a934>] (kernel_init_freeable+0x60/0x1d0)
[    0.705377] [<c064a934>] (kernel_init_freeable+0x60/0x1d0) from [<c04562d4>] (kernel_init+0x1c/0xf4)
[    0.714495] [<c04562d4>] (kernel_init+0x1c/0xf4) from [<c000efc8>] (ret_from_fork+0x14/0x20)
[    0.722917] ce->name:arm_global_timer
[    0.726612] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    0.734881] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    0.743479] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    0.751563] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    0.760503] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)
[    0.770912] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c00766c4>] (tick_install_broadcast_device+0x88/0xcc)
[    0.782091] [<c00766c4>] (tick_install_broadcast_device+0x88/0xcc) from [<c0075fe4>] (tick_check_new_device+0x10c/0x11c)
[    0.792926] [<c0075fe4>] (tick_check_new_device+0x10c/0x11c) from [<c007514c>] (clockevents_register_device+0x100/0x134)
[    0.803762] [<c007514c>] (clockevents_register_device+0x100/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    0.815286] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)
[    0.825777] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)
[    0.835156] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c064e804>] (smp_prepare_cpus+0x58/0x84)
[    0.844441] [<c064e804>] (smp_prepare_cpus+0x58/0x84) from [<c064a934>] (kernel_init_freeable+0x60/0x1d0)
[    0.853986] [<c064a934>] (kernel_init_freeable+0x60/0x1d0) from [<c04562d4>] (kernel_init+0x1c/0xf4)
[    0.863102] [<c04562d4>] (kernel_init+0x1c/0xf4) from [<c000efc8>] (ret_from_fork+0x14/0x20)
[    0.871662] Setting up static identity map for 0xc0462780 - 0xc04627d8
[    0.935967] CPU1: Booted secondary processor
[    0.936001] ce->name:arm_global_timer
[    0.936015] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    0.936063] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    0.936093] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    0.936120] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    0.936140] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)
[    0.936160] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fa0>] (tick_check_new_device+0xc8/0x11c)
[    0.936180] [<c0075fa0>] (tick_check_new_device+0xc8/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)
[    0.936199] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    0.936227] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c0370764>] (gt_clockevents_init+0x8c/0xb4)
[    0.936247] [<c0370764>] (gt_clockevents_init+0x8c/0xb4) from [<c03708e8>] (gt_cpu_notify+0x3c/0x68)
[    0.936278] [<c03708e8>] (gt_cpu_notify+0x3c/0x68) from [<c00552e4>] (notifier_call_chain+0xd8/0x110)
[    0.936301] [<c00552e4>] (notifier_call_chain+0xd8/0x110) from [<c0055340>] (__raw_notifier_call_chain+0x24/0x2c)
[    0.936321] [<c0055340>] (__raw_notifier_call_chain+0x24/0x2c) from [<c002758c>] (__cpu_notify+0x3c/0x58)
[    0.936337] [<c002758c>] (__cpu_notify+0x3c/0x58) from [<c00275c8>] (cpu_notify+0x20/0x24)
[    0.936354] [<c00275c8>] (cpu_notify+0x20/0x24) from [<c0027dd0>] (notify_cpu_starting+0x58/0x60)
[    0.936373] [<c0027dd0>] (notify_cpu_starting+0x58/0x60) from [<c0014a94>] (secondary_start_kernel+0xf4/0x144)
[    0.936390] [<c0014a94>] (secondary_start_kernel+0xf4/0x144) from [<000081ec>] (0x81ec)
[    0.936404] ce->name:arm_global_timer
[    0.936415] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    0.936441] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    0.936464] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    0.936486] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    0.936507] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4)
[    0.936526] [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4) from [<c0075db8>] (tick_setup_device+0xb8/0x100)
[    0.936545] [<c0075db8>] (tick_setup_device+0xb8/0x100) from [<c0075fc8>] (tick_check_new_device+0xf0/0x11c)
[    0.936564] [<c0075fc8>] (tick_check_new_device+0xf0/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)
[    0.936584] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    0.936608] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c0370764>] (gt_clockevents_init+0x8c/0xb4)
[    0.936627] [<c0370764>] (gt_clockevents_init+0x8c/0xb4) from [<c03708e8>] (gt_cpu_notify+0x3c/0x68)
[    0.936652] [<c03708e8>] (gt_cpu_notify+0x3c/0x68) from [<c00552e4>] (notifier_call_chain+0xd8/0x110)
[    0.936673] [<c00552e4>] (notifier_call_chain+0xd8/0x110) from [<c0055340>] (__raw_notifier_call_chain+0x24/0x2c)
[    0.936692] [<c0055340>] (__raw_notifier_call_chain+0x24/0x2c) from [<c002758c>] (__cpu_notify+0x3c/0x58)
[    0.936708] [<c002758c>] (__cpu_notify+0x3c/0x58) from [<c00275c8>] (cpu_notify+0x20/0x24)
[    0.936725] [<c00275c8>] (cpu_notify+0x20/0x24) from [<c0027dd0>] (notify_cpu_starting+0x58/0x60)
[    0.936744] [<c0027dd0>] (notify_cpu_starting+0x58/0x60) from [<c0014a94>] (secondary_start_kernel+0xf4/0x144)
[    0.936760] [<c0014a94>] (secondary_start_kernel+0xf4/0x144) from [<000081ec>] (0x81ec)
[    1.022967] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    1.023071] ce->name:local_timer
[    1.024254] Brought up 2 CPUs
[    1.024267] SMP: Total of 2 processors activated (2657.48 BogoMIPS).
[    1.024272] CPU: All CPU(s) started in SVC mode.
[    1.383534] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    1.391814] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    1.400405] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    1.408485] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    1.417425] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)
[    1.427831] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fa0>] (tick_check_new_device+0xc8/0x11c)
[    1.438410] [<c0075fa0>] (tick_check_new_device+0xc8/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)
[    1.449073] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    1.460511] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)
[    1.471003] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)
[    1.480375] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c0014ab8>] (secondary_start_kernel+0x118/0x144)
[    1.490350] [<c0014ab8>] (secondary_start_kernel+0x118/0x144) from [<000081ec>] (0x81ec)
[    1.498430] ce->name:arm_global_timer
[    1.502125] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    1.510397] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    1.518993] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    1.527077] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    1.536020] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4)
[    1.546511] [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4) from [<c0075db8>] (tick_setup_device+0xb8/0x100)
[    1.556829] [<c0075db8>] (tick_setup_device+0xb8/0x100) from [<c0075fc8>] (tick_check_new_device+0xf0/0x11c)
[    1.566634] [<c0075fc8>] (tick_check_new_device+0xf0/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)
[    1.577299] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    1.588737] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)
[    1.599227] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)
[    1.608600] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c0014ab8>] (secondary_start_kernel+0x118/0x144)
[    1.618576] [<c0014ab8>] (secondary_start_kernel+0x118/0x144) from [<000081ec>] (0x81ec)
[    1.630233] devtmpfs: initialized
[    1.648505] NET: Registered protocol family 16
[    1.654412] DMA: preallocated 256 KiB pool for atomic coherent allocations
[    1.667327] L310 cache controller enabled
[    1.667364] l2x0: 8 ways, CACHE_ID 0x410000c8, AUX_CTRL 0x72360000, Cache size: 524288 B
[    1.681910] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
[    1.684424] hw-breakpoint: maximum watchpoint size is 4 bytes.
[    1.763691] bio: create slab <bio-0> at 0
[    1.768952] SCSI subsystem initialized
[    1.771737] usbcore: registered new interface driver usbfs
[    1.773326] usbcore: registered new interface driver hub
[    1.778916] usbcore: registered new device driver usb
[    1.793685] Switched to clocksource arm_global_timer
[    1.968760] NET: Registered protocol family 2
[    1.974238] TCP established hash table entries: 8192 (order: 4, 65536 bytes)
[    1.976164] TCP bind hash table entries: 8192 (order: 6, 294912 bytes)
[    1.986656] TCP: Hash tables configured (established 8192 bind 8192)
[    1.989016] TCP: reno registered
[    1.992002] UDP hash table entries: 512 (order: 3, 40960 bytes)
[    1.998476] UDP-Lite hash table entries: 512 (order: 3, 40960 bytes)
[    2.006030] NET: Registered protocol family 1
[    2.010481] RPC: Registered named UNIX socket transport module.
[    2.014689] RPC: Registered udp transport module.
[    2.019326] RPC: Registered tcp transport module.
[    2.024133] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    2.031730] Trying to unpack rootfs image as initramfs...
[    2.854186] Freeing initrd memory: 5160K (dfaf5000 - dffff000)
[    2.867362] bounce pool size: 64 pages
[    2.871434] jffs2: version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
[    2.873772] msgmni has been set to 1477
[    2.897430] io scheduler noop registered
[    2.897463] io scheduler deadline registered
[    2.900325] io scheduler cfq registered (default)
[    2.915035] e000ñ[    2.917021] console [ttyPS0] enabled, bootconsole disabled
[    2.917021] console [ttyPS0] enabled, bootconsole disabled
\r[    2.961088] brd: module loaded
[    2.985608] loop: module loaded
[    3.007560] usbcore: registered new interface driver usb-storage
[    3.015956] mousedev: PS/2 mouse device common for all mice
[    3.024029] i2c /dev entries driver
[    3.031288] cpuidle: using governor ladder
[    3.035357] cpuidle: using governor menu
[    3.039170] Xilinx Zynq CpuIdle Driver started
[    3.044929] sdhci: Secure Digital Host Controller Interface driver
[    3.050969] sdhci: Copyright(c) Pierre Ossman
[    3.055475] sdhci-pltfm: SDHCI platform and OF driver helper
[    3.063301] usbcore: registered new interface driver usbhid
[    3.068990] usbhid: USB HID core driver
[    3.073385] TCP: cubic registered
[    3.076686] NET: Registered protocol family 17
[    3.081129] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
[    3.088817] Registering SWP/SWPB emulation handler
[    3.098336] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[    3.138029] Freeing unused kernel memory: 332K (c064a000 - c069d000)

Welcome to Buildroot 2013.08-git!



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Enable arm_global_timer for Zynq brakes boot
@ 2013-07-31 23:38                                   ` Sören Brinkmann
  0 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-07-31 23:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> On 08/01/2013 12:18 AM, S??ren Brinkmann wrote:
> > On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >> On 07/31/2013 10:58 PM, S??ren Brinkmann wrote:
> >>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>> On 07/31/2013 12:34 AM, S??ren Brinkmann wrote:
> >>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>> On 07/30/2013 02:03 AM, S??ren Brinkmann wrote:
> >>>>>>> Hi Daniel,
> >>>>>>>
> >>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>> (snip)
> >>>>>>>>
> >>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>
> >>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>> something useless, but nothing wrong in a sense that it should result in
> >>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>> always remain a mystery to me.
> >>>>>>>
> >>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>> hang pointing to some issue with that driver?
> >>>>>>
> >>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>
> >>>>> So, the correct run results (full output attached).
> >>>>>
> >>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>> broadcast device:
> >>>>> 	Tick Device: mode:     1                                                         
> >>>>> 	Broadcast device  
> >>>>> 	Clock Event Device: ttc_clockevent
> >>>>>
> >>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>> enable the global timer, the twd timers are still used as local timers
> >>>>> and the broadcast device is the global timer:
> >>>>> 	Tick Device: mode:     1                                                         
> >>>>> 	Broadcast device                                                                 
> >>>>> 	Clock Event Device: arm_global_timer
> >>>>>
> >>>>> Again, since boot hangs in the actually broken case, I don't see way to
> >>>>> obtain this information for that case.
> >>>>
> >>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
> >>>
> >>> Right, that works. I forgot about that option after you mentioned, that
> >>> it is most likely not that useful.
> >>>
> >>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
> >>> the gt enabled and having maxcpus=1 set.
> >>>
> >>> /proc/timer_list:
> >>> 	Tick Device: mode:     1
> >>> 	Broadcast device
> >>> 	Clock Event Device: arm_global_timer
> >>> 	 max_delta_ns:   12884902005
> >>> 	 min_delta_ns:   1000
> >>> 	 mult:           715827876
> >>> 	 shift:          31
> >>> 	 mode:           3
> >>
> >> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>
> >> The previous timer_list output you gave me when removing the offending
> >> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> >>
> >> Is it possible you try to get this output again right after onlining the
> >> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> > 
> > How do I do that? I tried to online CPU1 after booting with maxcpus=1
> > and that didn't end well:
> > 	# echo 1 > online && cat /proc/timer_list 
> 
> Hmm, I was hoping to have a small delay before the kernel hangs but
> apparently this is not the case... :(
> 
> I suspect the global timer is shutdown at one moment but I don't
> understand why and when.
> 
> Can you add a stack trace in the "clockevents_shutdown" function with
> the clockevent device name ? Perhaps, we may see at boot time an
> interesting trace when it hangs.

I did this change:
	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
	index 38959c8..3ab11c1 100644
	--- a/kernel/time/clockevents.c
	+++ b/kernel/time/clockevents.c
	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
	  */
	 void clockevents_shutdown(struct clock_event_device *dev)
	 {
	+       pr_info("ce->name:%s\n", dev->name);
	+       dump_stack();
	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
	        dev->next_event.tv64 = KTIME_MAX;
	 }

It is hit a few times during boot, so I attach a full boot log. I really
don't know what to look for, but I hope you can spot something in it. I
really appreciate you taking the time.

	Thanks,
	S??ren

-------------- next part --------------


U-Boot 2013.04 (Jun 19 2013 - 08:17:40)

I2C:   ready
Memory: ECC disabled
DRAM:  1 GiB
WARNING: Caches not enabled
MMC:   zynq_sdhci: 0
SF: Detected S25FL129P_64K with page size 128 KiB, total 32 MiB
*** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Net:   Gem.e000b000
Hit any key to stop autoboot:  3 \b\b\b 2 \b\b\b 1 \b\b\b 0 
TFTPing Linux to RAM...
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'uImage'.
Load address: 0x3000000
Loading: *\b#################################################################
	 #################################################################
	 ####################################
	 4.2 MiB/s
done
Bytes transferred = 2427120 (2508f0 hex)
Gem.e000b000:7 is connected to Gem.e000b000.  Reconnecting to Gem.e000b000
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'devicetree.dtb'.
Load address: 0x2a00000
Loading: *\b#
	 672.9 KiB/s
done
Bytes transferred = 2759 (ac7 hex)
Gem.e000b000:7 is connected to Gem.e000b000.  Reconnecting to Gem.e000b000
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'uramdisk.image.gz'.
Load address: 0x2000000
Loading: *\b#################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 ###################################
	 4.4 MiB/s
done
Bytes transferred = 5284624 (50a310 hex)
## Booting kernel from Legacy Image at 03000000 ...
   Image Name:   Linux-3.11.0-rc3-00001-gc14f576-
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2427056 Bytes = 2.3 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 02000000 ...
   Image Name:   
   Image Type:   ARM Linux RAMDisk Image (gzip compressed)
   Data Size:    5284560 Bytes = 5 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 02a00000
   Booting using the fdt blob at 0x02a00000
   Loading Kernel Image ... OK
OK
   Loading Ramdisk to 1faf5000, end 1ffff2d0 ... OK
   Loading Device Tree to 1faf1000, end 1faf4ac6 ... OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[    0.000000] Booting Linux on physical CPU 0x0

[    0.000000] Linux version 3.11.0-rc3-00001-gc14f576-dirty (sorenb at xsjandreislx) (gcc version 4.7.2 (Sourcery CodeBench Lite 2012.09-104) ) #144 SMP PREEMPT Wed Jul 31 16:31:16 PDT 2013

[    0.000000] CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d

[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache

[    0.000000] Machine: Xilinx Zynq Platform, model: Zynq ZC706 Development Board

[    0.000000] bootconsole [earlycon0] enabled

[    0.000000] cma: CMA: reserved 16 MiB at 2e800000

[    0.000000] Memory policy: ECC disabled, Data cache writealloc

[    0.000000] PERCPU: Embedded 9 pages/cpu @c148e000 s14720 r8192 d13952 u36864

[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 260624

[    0.000000] Kernel command line: console=ttyPS0,115200 earlyprintk

[    0.000000] PID hash table entries: 4096 (order: 2, 16384 bytes)

[    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)

[    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)

[    0.000000] Memory: 1005068K/1048576K available (4849K kernel code, 305K rwdata, 1556K rodata, 334K init, 5699K bss, 43508K reserved, 270336K highmem)

[    0.000000] Virtual kernel memory layout:

[    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)

[    0.000000]     fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)

[    0.000000]     vmalloc : 0xf0000000 - 0xff000000   ( 240 MB)

[    0.000000]     lowmem  : 0xc0000000 - 0xef800000   ( 760 MB)

[    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)

[    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)

[    0.000000]       .text : 0xc0008000 - 0xc0649930   (6407 kB)

[    0.000000]       .init : 0xc064a000 - 0xc069d980   ( 335 kB)

[    0.000000]       .data : 0xc069e000 - 0xc06ea4d8   ( 306 kB)

[    0.000000]        .bss : 0xc06ea4d8 - 0xc0c7b2e4   (5700 kB)

[    0.000000] Preemptible hierarchical RCU implementation.

[    0.000000] 	RCU lockdep checking is enabled.

[    0.000000] 	Additional per-CPU info printed with stalls.

[    0.000000] 	RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.

[    0.000000] NR_IRQS:16 nr_irqs:16 16

[    0.000000] slcr mapped to f0004000

[    0.000000] Zynq clock init

[    0.000000] sched_clock: 32 bits at 333MHz, resolution 3ns, wraps every 12884ms

[    0.000000] ce->name:arm_global_timer

[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144

[    0.000000] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)

[    0.000000] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)

[    0.000000] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)

[    0.000000] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)

[    0.000000] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fa0>] (tick_check_new_device+0xc8/0x11c)

[    0.000000] [<c0075fa0>] (tick_check_new_device+0xc8/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)

[    0.000000] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)

[    0.000000] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c0370764>] (gt_clockevents_init+0x8c/0xb4)

[    0.000000] [<c0370764>] (gt_clockevents_init+0x8c/0xb4) from [<c0675d64>] (global_timer_of_register+0x1e4/0x294)

[    0.000000] [<c0675d64>] (global_timer_of_register+0x1e4/0x294) from [<c0675724>] (clocksource_of_init+0x2c/0x54)

[    0.000000] [<c0675724>] (clocksource_of_init+0x2c/0x54) from [<c0652b98>] (zynq_timer_init+0x14/0x18)

[    0.000000] [<c0652b98>] (zynq_timer_init+0x14/0x18) from [<c064dec8>] (time_init+0x24/0x34)

[    0.000000] [<c064dec8>] (time_init+0x24/0x34) from [<c064a774>] (start_kernel+0x1cc/0x32c)

[    0.000000] [<c064a774>] (start_kernel+0x1cc/0x32c) from [<00008074>] (0x8074)

[    0.000000] ce->name:ttc_clockevent

[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144

[    0.000000] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)

[    0.000000] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)

[    0.000000] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)

[    0.000000] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)

[    0.000000] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c00766c4>] (tick_install_broadcast_device+0x88/0xcc)

[    0.000000] [<c00766c4>] (tick_install_broadcast_device+0x88/0xcc) from [<c0075fe4>] (tick_check_new_device+0x10c/0x11c)

[    0.000000] [<c0075fe4>] (tick_check_new_device+0x10c/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)

[    0.000000] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)

[    0.000000] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c0675acc>] (ttc_timer_init+0x2d4/0x388)

[    0.000000] [<c0675acc>] (ttc_timer_init+0x2d4/0x388) from [<c0675724>] (clocksource_of_init+0x2c/0x54)

[    0.000000] [<c0675724>] (clocksource_of_init+0x2c/0x54) from [<c0652b98>] (zynq_timer_init+0x14/0x18)

[    0.000000] [<c0652b98>] (zynq_timer_init+0x14/0x18) from [<c064dec8>] (time_init+0x24/0x34)

[    0.000000] [<c064dec8>] (time_init+0x24/0x34) from [<c064a774>] (start_kernel+0x1cc/0x32c)

[    0.000000] [<c064a774>] (start_kernel+0x1cc/0x32c) from [<00008074>] (0x8074)

[    0.000000] ttc0 #0 at f0006000, irq=43

[    0.000000] Console: colour dummy device 80x30

[    0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar

[    0.000000] ... MAX_LOCKDEP_SUBCLASSES:  8

[    0.000000] ... MAX_LOCK_DEPTH:          48

[    0.000000] ... MAX_LOCKDEP_KEYS:        8191

[    0.000000] ... CLASSHASH_SIZE:          4096

[    0.000000] ... MAX_LOCKDEP_ENTRIES:     16384

[    0.000000] ... MAX_LOCKDEP_CHAINS:      32768

[    0.000000] ... CHAINHASH_SIZE:          16384

[    0.000000]  memory used by lock dependency info: 3695 kB

[    0.000000]  per task-struct memory footprint: 1152 bytes

[    0.347737] Calibrating delay loop... 1325.46 BogoMIPS (lpj=6627328)

[    0.385250] pid_max: default: 32768 minimum: 301

[    0.388286] Mount-cache hash table entries: 512

[    0.399390] CPU: Testing write buffer coherency: ok

[    0.399540] ftrace: allocating 16043 entries in 48 pages

[    0.439854] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000

[    0.440069] ce->name:local_timer

[    0.443330] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144

[    0.451622] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)

[    0.460207] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)

[    0.468285] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)

[    0.477224] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)

[    0.487628] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fa0>] (tick_check_new_device+0xc8/0x11c)

[    0.498206] [<c0075fa0>] (tick_check_new_device+0xc8/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)

[    0.508870] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)

[    0.520307] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)

[    0.530798] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)

[    0.540184] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c064e804>] (smp_prepare_cpus+0x58/0x84)

[    0.549464] [<c064e804>] (smp_prepare_cpus+0x58/0x84) from [<c064a934>] (kernel_init_freeable+0x60/0x1d0)

[    0.559008] [<c064a934>] (kernel_init_freeable+0x60/0x1d0) from [<c04562d4>] (kernel_init+0x1c/0xf4)

[    0.568124] [<c04562d4>] (kernel_init+0x1c/0xf4) from [<c000efc8>] (ret_from_fork+0x14/0x20)

[    0.576545] ce->name:ttc_clockevent

[    0.580068] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144

[    0.588341] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)

[    0.596937] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)

[    0.605019] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)

[    0.613964] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4)

[    0.624453] [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4) from [<c0075db8>] (tick_setup_device+0xb8/0x100)

[    0.634773] [<c0075db8>] (tick_setup_device+0xb8/0x100) from [<c0075fc8>] (tick_check_new_device+0xf0/0x11c)

[    0.644577] [<c0075fc8>] (tick_check_new_device+0xf0/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)

[    0.655241] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)

[    0.666679] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)

[    0.677169] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)

[    0.686549] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c064e804>] (smp_prepare_cpus+0x58/0x84)

[    0.695833] [<c064e804>] (smp_prepare_cpus+0x58/0x84) from [<c064a934>] (kernel_init_freeable+0x60/0x1d0)

[    0.705377] [<c064a934>] (kernel_init_freeable+0x60/0x1d0) from [<c04562d4>] (kernel_init+0x1c/0xf4)

[    0.714495] [<c04562d4>] (kernel_init+0x1c/0xf4) from [<c000efc8>] (ret_from_fork+0x14/0x20)

[    0.722917] ce->name:arm_global_timer

[    0.726612] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144

[    0.734881] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)

[    0.743479] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)

[    0.751563] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)

[    0.760503] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)

[    0.770912] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c00766c4>] (tick_install_broadcast_device+0x88/0xcc)

[    0.782091] [<c00766c4>] (tick_install_broadcast_device+0x88/0xcc) from [<c0075fe4>] (tick_check_new_device+0x10c/0x11c)

[    0.792926] [<c0075fe4>] (tick_check_new_device+0x10c/0x11c) from [<c007514c>] (clockevents_register_device+0x100/0x134)

[    0.803762] [<c007514c>] (clockevents_register_device+0x100/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)

[    0.815286] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)

[    0.825777] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)

[    0.835156] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c064e804>] (smp_prepare_cpus+0x58/0x84)

[    0.844441] [<c064e804>] (smp_prepare_cpus+0x58/0x84) from [<c064a934>] (kernel_init_freeable+0x60/0x1d0)

[    0.853986] [<c064a934>] (kernel_init_freeable+0x60/0x1d0) from [<c04562d4>] (kernel_init+0x1c/0xf4)

[    0.863102] [<c04562d4>] (kernel_init+0x1c/0xf4) from [<c000efc8>] (ret_from_fork+0x14/0x20)

[    0.871662] Setting up static identity map for 0xc0462780 - 0xc04627d8

[    0.935967] CPU1: Booted secondary processor

[    0.936001] ce->name:arm_global_timer

[    0.936015] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144

[    0.936063] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)

[    0.936093] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)

[    0.936120] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)

[    0.936140] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)

[    0.936160] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fa0>] (tick_check_new_device+0xc8/0x11c)

[    0.936180] [<c0075fa0>] (tick_check_new_device+0xc8/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)

[    0.936199] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)

[    0.936227] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c0370764>] (gt_clockevents_init+0x8c/0xb4)

[    0.936247] [<c0370764>] (gt_clockevents_init+0x8c/0xb4) from [<c03708e8>] (gt_cpu_notify+0x3c/0x68)

[    0.936278] [<c03708e8>] (gt_cpu_notify+0x3c/0x68) from [<c00552e4>] (notifier_call_chain+0xd8/0x110)

[    0.936301] [<c00552e4>] (notifier_call_chain+0xd8/0x110) from [<c0055340>] (__raw_notifier_call_chain+0x24/0x2c)

[    0.936321] [<c0055340>] (__raw_notifier_call_chain+0x24/0x2c) from [<c002758c>] (__cpu_notify+0x3c/0x58)

[    0.936337] [<c002758c>] (__cpu_notify+0x3c/0x58) from [<c00275c8>] (cpu_notify+0x20/0x24)

[    0.936354] [<c00275c8>] (cpu_notify+0x20/0x24) from [<c0027dd0>] (notify_cpu_starting+0x58/0x60)

[    0.936373] [<c0027dd0>] (notify_cpu_starting+0x58/0x60) from [<c0014a94>] (secondary_start_kernel+0xf4/0x144)

[    0.936390] [<c0014a94>] (secondary_start_kernel+0xf4/0x144) from [<000081ec>] (0x81ec)

[    0.936404] ce->name:arm_global_timer

[    0.936415] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144

[    0.936441] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)

[    0.936464] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)

[    0.936486] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)

[    0.936507] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4)

[    0.936526] [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4) from [<c0075db8>] (tick_setup_device+0xb8/0x100)

[    0.936545] [<c0075db8>] (tick_setup_device+0xb8/0x100) from [<c0075fc8>] (tick_check_new_device+0xf0/0x11c)

[    0.936564] [<c0075fc8>] (tick_check_new_device+0xf0/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)

[    0.936584] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)

[    0.936608] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c0370764>] (gt_clockevents_init+0x8c/0xb4)

[    0.936627] [<c0370764>] (gt_clockevents_init+0x8c/0xb4) from [<c03708e8>] (gt_cpu_notify+0x3c/0x68)

[    0.936652] [<c03708e8>] (gt_cpu_notify+0x3c/0x68) from [<c00552e4>] (notifier_call_chain+0xd8/0x110)

[    0.936673] [<c00552e4>] (notifier_call_chain+0xd8/0x110) from [<c0055340>] (__raw_notifier_call_chain+0x24/0x2c)

[    0.936692] [<c0055340>] (__raw_notifier_call_chain+0x24/0x2c) from [<c002758c>] (__cpu_notify+0x3c/0x58)

[    0.936708] [<c002758c>] (__cpu_notify+0x3c/0x58) from [<c00275c8>] (cpu_notify+0x20/0x24)

[    0.936725] [<c00275c8>] (cpu_notify+0x20/0x24) from [<c0027dd0>] (notify_cpu_starting+0x58/0x60)

[    0.936744] [<c0027dd0>] (notify_cpu_starting+0x58/0x60) from [<c0014a94>] (secondary_start_kernel+0xf4/0x144)

[    0.936760] [<c0014a94>] (secondary_start_kernel+0xf4/0x144) from [<000081ec>] (0x81ec)

[    1.022967] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001

[    1.023071] ce->name:local_timer

[    1.024254] Brought up 2 CPUs

[    1.024267] SMP: Total of 2 processors activated (2657.48 BogoMIPS).

[    1.024272] CPU: All CPU(s) started in SVC mode.

[    1.383534] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144

[    1.391814] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)

[    1.400405] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)

[    1.408485] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)

[    1.417425] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c0075840>] (clockevents_exchange_device+0x94/0xb8)

[    1.427831] [<c0075840>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fa0>] (tick_check_new_device+0xc8/0x11c)

[    1.438410] [<c0075fa0>] (tick_check_new_device+0xc8/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)

[    1.449073] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    1.460511] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)
[    1.471003] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)
[    1.480375] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c0014ab8>] (secondary_start_kernel+0x118/0x144)
[    1.490350] [<c0014ab8>] (secondary_start_kernel+0x118/0x144) from [<000081ec>] (0x81ec)
[    1.498430] ce->name:arm_global_timer
[    1.502125] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc3-00001-gc14f576-dirty #144
[    1.510397] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    1.518993] [<c0012d58>] (show_stack+0x20/0x24) from [<c045bd70>] (dump_stack+0x80/0xc4)
[    1.527077] [<c045bd70>] (dump_stack+0x80/0xc4) from [<c00754e4>] (clockevents_shutdown+0x28/0x48)
[    1.536020] [<c00754e4>] (clockevents_shutdown+0x28/0x48) from [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4)
[    1.546511] [<c00770dc>] (tick_device_uses_broadcast+0x198/0x1e4) from [<c0075db8>] (tick_setup_device+0xb8/0x100)
[    1.556829] [<c0075db8>] (tick_setup_device+0xb8/0x100) from [<c0075fc8>] (tick_check_new_device+0xf0/0x11c)
[    1.566634] [<c0075fc8>] (tick_check_new_device+0xf0/0x11c) from [<c007510c>] (clockevents_register_device+0xc0/0x134)
[    1.577299] [<c007510c>] (clockevents_register_device+0xc0/0x134) from [<c0075748>] (clockevents_config_and_register+0x2c/0x30)
[    1.588737] [<c0075748>] (clockevents_config_and_register+0x2c/0x30) from [<c001585c>] (twd_timer_setup+0x1bc/0x1fc)
[    1.599227] [<c001585c>] (twd_timer_setup+0x1bc/0x1fc) from [<c00146a0>] (percpu_timer_setup+0xa4/0xc8)
[    1.608600] [<c00146a0>] (percpu_timer_setup+0xa4/0xc8) from [<c0014ab8>] (secondary_start_kernel+0x118/0x144)
[    1.618576] [<c0014ab8>] (secondary_start_kernel+0x118/0x144) from [<000081ec>] (0x81ec)
[    1.630233] devtmpfs: initialized
[    1.648505] NET: Registered protocol family 16
[    1.654412] DMA: preallocated 256 KiB pool for atomic coherent allocations
[    1.667327] L310 cache controller enabled
[    1.667364] l2x0: 8 ways, CACHE_ID 0x410000c8, AUX_CTRL 0x72360000, Cache size: 524288 B
[    1.681910] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
[    1.684424] hw-breakpoint: maximum watchpoint size is 4 bytes.
[    1.763691] bio: create slab <bio-0> at 0
[    1.768952] SCSI subsystem initialized
[    1.771737] usbcore: registered new interface driver usbfs
[    1.773326] usbcore: registered new interface driver hub
[    1.778916] usbcore: registered new device driver usb
[    1.793685] Switched to clocksource arm_global_timer
[    1.968760] NET: Registered protocol family 2
[    1.974238] TCP established hash table entries: 8192 (order: 4, 65536 bytes)
[    1.976164] TCP bind hash table entries: 8192 (order: 6, 294912 bytes)
[    1.986656] TCP: Hash tables configured (established 8192 bind 8192)
[    1.989016] TCP: reno registered
[    1.992002] UDP hash table entries: 512 (order: 3, 40960 bytes)
[    1.998476] UDP-Lite hash table entries: 512 (order: 3, 40960 bytes)
[    2.006030] NET: Registered protocol family 1
[    2.010481] RPC: Registered named UNIX socket transport module.
[    2.014689] RPC: Registered udp transport module.
[    2.019326] RPC: Registered tcp transport module.
[    2.024133] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    2.031730] Trying to unpack rootfs image as initramfs...
[    2.854186] Freeing initrd memory: 5160K (dfaf5000 - dffff000)
[    2.867362] bounce pool size: 64 pages
[    2.871434] jffs2: version 2.2. (NAND) (SUMMARY) © 2001-2006 Red Hat, Inc.
[    2.873772] msgmni has been set to 1477
[    2.897430] io scheduler noop registered
[    2.897463] io scheduler deadline registered
[    2.900325] io scheduler cfq registered (default)
[    2.917021] console [ttyPS0] enabled, bootconsole disabled
[    2.961088] brd: module loaded
[    2.985608] loop: module loaded
[    3.007560] usbcore: registered new interface driver usb-storage
[    3.015956] mousedev: PS/2 mouse device common for all mice
[    3.024029] i2c /dev entries driver
[    3.031288] cpuidle: using governor ladder
[    3.035357] cpuidle: using governor menu
[    3.039170] Xilinx Zynq CpuIdle Driver started
[    3.044929] sdhci: Secure Digital Host Controller Interface driver
[    3.050969] sdhci: Copyright(c) Pierre Ossman
[    3.055475] sdhci-pltfm: SDHCI platform and OF driver helper
[    3.063301] usbcore: registered new interface driver usbhid
[    3.068990] usbhid: USB HID core driver
[    3.073385] TCP: cubic registered
[    3.076686] NET: Registered protocol family 17
[    3.081129] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
[    3.088817] Registering SWP/SWPB emulation handler
[    3.098336] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[    3.138029] Freeing unused kernel memory: 332K (c064a000 - c069d000)

Welcome to Buildroot 2013.08-git!

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-07-31 23:38                                   ` Sören Brinkmann
@ 2013-08-01 17:29                                     ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-01 17:29 UTC (permalink / raw)
  To: Sören Brinkmann, Stuart Menefy
  Cc: Stephen Boyd, John Stultz, Thomas Gleixner, Russell King,
	Michal Simek, linux-kernel, linux-arm-kernel

On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>> Hi Daniel,
>>>>>>>>>
>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>> (snip)
>>>>>>>>>>
>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>
>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>> always remain a mystery to me.
>>>>>>>>>
>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>
>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>
>>>>>>> So, the correct run results (full output attached).
>>>>>>>
>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>> broadcast device:
>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>> 	Broadcast device  
>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>
>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>> and the broadcast device is the global timer:
>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>> 	Broadcast device                                                                 
>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>
>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
>>>>>>> obtain this information for that case.
>>>>>>
>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
>>>>>
>>>>> Right, that works. I forgot about that option after you mentioned, that
>>>>> it is most likely not that useful.
>>>>>
>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
>>>>> the gt enabled and having maxcpus=1 set.
>>>>>
>>>>> /proc/timer_list:
>>>>> 	Tick Device: mode:     1
>>>>> 	Broadcast device
>>>>> 	Clock Event Device: arm_global_timer
>>>>> 	 max_delta_ns:   12884902005
>>>>> 	 min_delta_ns:   1000
>>>>> 	 mult:           715827876
>>>>> 	 shift:          31
>>>>> 	 mode:           3
>>>>
>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>
>>>> The previous timer_list output you gave me when removing the offending
>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>
>>>> Is it possible you try to get this output again right after onlining the
>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
>>>
>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>> and that didn't end well:
>>> 	# echo 1 > online && cat /proc/timer_list 
>>
>> Hmm, I was hoping to have a small delay before the kernel hangs but
>> apparently this is not the case... :(
>>
>> I suspect the global timer is shutdown at one moment but I don't
>> understand why and when.
>>
>> Can you add a stack trace in the "clockevents_shutdown" function with
>> the clockevent device name ? Perhaps, we may see at boot time an
>> interesting trace when it hangs.
> 
> I did this change:
> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> 	index 38959c8..3ab11c1 100644
> 	--- a/kernel/time/clockevents.c
> 	+++ b/kernel/time/clockevents.c
> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> 	  */
> 	 void clockevents_shutdown(struct clock_event_device *dev)
> 	 {
> 	+       pr_info("ce->name:%s\n", dev->name);
> 	+       dump_stack();
> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> 	        dev->next_event.tv64 = KTIME_MAX;
> 	 }
> 
> It is hit a few times during boot, so I attach a full boot log. I really
> don't know what to look for, but I hope you can spot something in it. I
> really appreciate you taking the time.

Thanks for the traces.

If you try without the ttc_clockevent configured in the kernel (but with
twd and gt), does it boot?
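
To recap what that flag triggers: with CPUIDLE_FLAG_TIMER_STOP set on a
state, the cpuidle core brackets idle entry roughly as below. This is a
condensed sketch of the 3.11-era logic, not the verbatim source, and the
function name is made up:

	/* Sketch: hand the tick over to the broadcast device around idle. */
	static int enter_idle_state_sketch(struct cpuidle_device *dev,
					   struct cpuidle_driver *drv, int index)
	{
		int ret;
		int broadcast = !!(drv->states[index].flags & CPUIDLE_FLAG_TIMER_STOP);

		if (broadcast)	/* this CPU's events go via the broadcast device */
			clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);

		ret = drv->states[index].enter(dev, drv, index);

		if (broadcast)	/* local timer is trusted again, switch back */
			clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);

		return ret;
	}

If the broadcast device misbehaves while a CPU is parked in that window,
that CPU never receives its wakeup event, which would look exactly like
the hang being described.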


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-01 17:29                                     ` Daniel Lezcano
@ 2013-08-01 17:43                                       ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-01 17:43 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stuart Menefy, Stephen Boyd, John Stultz, Thomas Gleixner,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> > On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> >> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> >>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> >>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>>>>>> Hi Daniel,
> >>>>>>>>>
> >>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>>>> (snip)
> >>>>>>>>>>
> >>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>>>
> >>>>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>>>> something useless, but nothing wrong in a sense that it should result in
> >>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>>>> always remain a mystery to me.
> >>>>>>>>>
> >>>>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>>>> hang pointing to some issue with that driver?
> >>>>>>>>
> >>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>>>
> >>>>>>> So, the correct run results (full output attached).
> >>>>>>>
> >>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>>>> broadcast device:
> >>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>> 	Broadcast device  
> >>>>>>> 	Clock Event Device: ttc_clockevent
> >>>>>>>
> >>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>>>> enable the global timer, the twd timers are still used as local timers
> >>>>>>> and the broadcast device is the global timer:
> >>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>> 	Broadcast device                                                                 
> >>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>
> >>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
> >>>>>>> obtain this information for that case.
> >>>>>>
> >>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
> >>>>>
> >>>>> Right, that works. I forgot about that option after you mentioned, that
> >>>>> it is most likely not that useful.
> >>>>>
> >>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
> >>>>> the gt enabled and having maxcpus=1 set.
> >>>>>
> >>>>> /proc/timer_list:
> >>>>> 	Tick Device: mode:     1
> >>>>> 	Broadcast device
> >>>>> 	Clock Event Device: arm_global_timer
> >>>>> 	 max_delta_ns:   12884902005
> >>>>> 	 min_delta_ns:   1000
> >>>>> 	 mult:           715827876
> >>>>> 	 shift:          31
> >>>>> 	 mode:           3
> >>>>
> >>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>>>
> >>>> The previous timer_list output you gave me when removing the offending
> >>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> >>>>
> >>>> Is it possible you try to get this output again right after onlining the
> >>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> >>>
> >>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> >>> and that didn't end well:
> >>> 	# echo 1 > online && cat /proc/timer_list 
> >>
> >> Hmm, I was hoping to have a small delay before the kernel hangs but
> >> apparently this is not the case... :(
> >>
> >> I suspect the global timer is shutdown at one moment but I don't
> >> understand why and when.
> >>
> >> Can you add a stack trace in the "clockevents_shutdown" function with
> >> the clockevent device name ? Perhaps, we may see at boot time an
> >> interesting trace when it hangs.
> > 
> > I did this change:
> > 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> > 	index 38959c8..3ab11c1 100644
> > 	--- a/kernel/time/clockevents.c
> > 	+++ b/kernel/time/clockevents.c
> > 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> > 	  */
> > 	 void clockevents_shutdown(struct clock_event_device *dev)
> > 	 {
> > 	+       pr_info("ce->name:%s\n", dev->name);
> > 	+       dump_stack();
> > 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> > 	        dev->next_event.tv64 = KTIME_MAX;
> > 	 }
> > 
> > It is hit a few times during boot, so I attach a full boot log. I really
> > don't know what to look for, but I hope you can spot something in it. I
> > really appreciate you taking the time.
> 
> Thanks for the traces.

Sure.

> 
> If you try without the ttc_clockevent configured in the kernel (but with
> twd and gt), does it boot ?

Absence of the TTC doesn't seem to make any difference. It hangs at the
same location.

	Thanks,
	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-01 17:43                                       ` Sören Brinkmann
@ 2013-08-01 17:48                                         ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-01 17:48 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stuart Menefy, Stephen Boyd, John Stultz, Thomas Gleixner,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>> (snip)
>>>>>>>>>>>>
>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>
>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>
>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>
>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>
>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>
>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>> broadcast device:
>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>> 	Broadcast device  
>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>
>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>
>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
>>>>>>>>> obtain this information for that case.
>>>>>>>>
>>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
>>>>>>>
>>>>>>> Right, that works. I forgot about that option after you mentioned, that
>>>>>>> it is most likely not that useful.
>>>>>>>
>>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
>>>>>>> the gt enabled and having maxcpus=1 set.
>>>>>>>
>>>>>>> /proc/timer_list:
>>>>>>> 	Tick Device: mode:     1
>>>>>>> 	Broadcast device
>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>> 	 min_delta_ns:   1000
>>>>>>> 	 mult:           715827876
>>>>>>> 	 shift:          31
>>>>>>> 	 mode:           3
>>>>>>
>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>
>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>
>>>>>> Is it possible you try to get this output again right after onlining the
>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
>>>>>
>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>> and that didn't end well:
>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>
>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>> apparently this is not the case... :(
>>>>
>>>> I suspect the global timer is shutdown at one moment but I don't
>>>> understand why and when.
>>>>
>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>> the clockevent device name ? Perhaps, we may see at boot time an
>>>> interesting trace when it hangs.
>>>
>>> I did this change:
>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>> 	index 38959c8..3ab11c1 100644
>>> 	--- a/kernel/time/clockevents.c
>>> 	+++ b/kernel/time/clockevents.c
>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>> 	  */
>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>> 	 {
>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>> 	+       dump_stack();
>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>> 	 }
>>>
>>> It is hit a few times during boot, so I attach a full boot log. I really
>>> don't know what to look for, but I hope you can spot something in it. I
>>> really appreciate you taking the time.
>>
>> Thanks for the traces.
> 
> Sure.
> 
>>
>> If you try without the ttc_clockevent configured in the kernel (but with
>> twd and gt), does it boot ?
> 
> Absence of the TTC doesn't seem to make any difference. It hangs at the
> same location.

Ok, IMO there is a problem with the broadcast device registration (maybe
vs. the twd).
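
That would fit the earlier trace, where clockevents_shutdown() is reached
from tick_device_uses_broadcast() while the twd registers. Simplified, and
paraphrased from the 3.11-era kernel/time/tick-broadcast.c rather than
quoted verbatim, that path does roughly:

	/* Paraphrase: called when a new per-cpu tick device registers. */
	static int uses_broadcast_sketch(struct clock_event_device *dev, int cpu)
	{
		struct clock_event_device *bc = tick_broadcast_device.evtdev;

		if (!tick_device_is_functional(dev)) {
			/* Dummy device: this CPU must be served by broadcast. */
			cpumask_set_cpu(cpu, tick_broadcast_mask);
			return 1;
		}

		/* A local timer without C3STOP survives idle, so this CPU
		 * does not need broadcasting. */
		if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
			cpumask_clear_cpu(cpu, tick_broadcast_mask);

		/* In periodic mode an empty mask shuts the broadcast device
		 * down -- the clockevents_shutdown() seen in the trace. */
		if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC &&
		    cpumask_empty(tick_broadcast_mask) && bc)
			clockevents_shutdown(bc);

		return 0;
	}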

I will check later (kid duty) :)

Thanks
  -- Daniel


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-01 17:48                                         ` Daniel Lezcano
@ 2013-08-06  1:28                                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-06  1:28 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stuart Menefy, Stephen Boyd, John Stultz, Thomas Gleixner,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 7172 bytes --]

Hi Daniel,

On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> > On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> >> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> >>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> >>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> >>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> >>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>>>>>>>> Hi Daniel,
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>> (snip)
> >>>>>>>>>>>>
> >>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>>>>>
> >>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
> >>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>>>>>> always remain a mystery to me.
> >>>>>>>>>>>
> >>>>>>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>>>>>> hang pointing to some issue with that driver?
> >>>>>>>>>>
> >>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>>>>>
> >>>>>>>>> So, the correct run results (full output attached).
> >>>>>>>>>
> >>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>>>>>> broadcast device:
> >>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>> 	Broadcast device  
> >>>>>>>>> 	Clock Event Device: ttc_clockevent
> >>>>>>>>>
> >>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>>>>>> enable the global timer, the twd timers are still used as local timers
> >>>>>>>>> and the broadcast device is the global timer:
> >>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>> 	Broadcast device                                                                 
> >>>>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>>>
> >>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
> >>>>>>>>> obtain this information for that case.
> >>>>>>>>
> >>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
> >>>>>>>
> >>>>>>> Right, that works. I forgot about that option after you mentioned, that
> >>>>>>> it is most likely not that useful.
> >>>>>>>
> >>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
> >>>>>>> the gt enabled and having maxcpus=1 set.
> >>>>>>>
> >>>>>>> /proc/timer_list:
> >>>>>>> 	Tick Device: mode:     1
> >>>>>>> 	Broadcast device
> >>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>> 	 max_delta_ns:   12884902005
> >>>>>>> 	 min_delta_ns:   1000
> >>>>>>> 	 mult:           715827876
> >>>>>>> 	 shift:          31
> >>>>>>> 	 mode:           3
> >>>>>>
> >>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>>>>>
> >>>>>> The previous timer_list output you gave me when removing the offending
> >>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> >>>>>>
> >>>>>> Is it possible you try to get this output again right after onlining the
> >>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> >>>>>
> >>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> >>>>> and that didn't end well:
> >>>>> 	# echo 1 > online && cat /proc/timer_list 
> >>>>
> >>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> >>>> apparently this is not the case... :(
> >>>>
> >>>> I suspect the global timer is shutdown at one moment but I don't
> >>>> understand why and when.
> >>>>
> >>>> Can you add a stack trace in the "clockevents_shutdown" function with
> >>>> the clockevent device name ? Perhaps, we may see at boot time an
> >>>> interesting trace when it hangs.
> >>>
> >>> I did this change:
> >>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> >>> 	index 38959c8..3ab11c1 100644
> >>> 	--- a/kernel/time/clockevents.c
> >>> 	+++ b/kernel/time/clockevents.c
> >>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> >>> 	  */
> >>> 	 void clockevents_shutdown(struct clock_event_device *dev)
> >>> 	 {
> >>> 	+       pr_info("ce->name:%s\n", dev->name);
> >>> 	+       dump_stack();
> >>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> >>> 	        dev->next_event.tv64 = KTIME_MAX;
> >>> 	 }
> >>>
> >>> It is hit a few times during boot, so I attach a full boot log. I really
> >>> don't know what to look for, but I hope you can spot something in it. I
> >>> really appreciate you taking the time.
> >>
> >> Thanks for the traces.
> > 
> > Sure.
> > 
> >>
> >> If you try without the ttc_clockevent configured in the kernel (but with
> >> twd and gt), does it boot ?
> > 
> > Absence of the TTC doesn't seem to make any difference. It hangs at the
> > same location.
> 
> Ok, IMO there is a problem with the broadcast device registration (may
> be vs twd).
> 
> I will check later (kid duty) :)

I was actually waiting for an update from your side and did something
else in the meantime, but I seem to have run into this again. I was
overhauling the cadence_ttc driver (patch attached, based on
tip/timers/core), and it seems to show the same behavior as enabling the
global_timer: with cpuidle off, it works; with cpuidle on, it hangs.
Removing the TIMER_STOP flag from the C2 state makes it boot again.
It works just fine on our 3.10 kernel.

Another thing I noticed - probably unrelated, but hard to tell: on
3.11-rc1 and later my system stalls for quite some time at the hand-off
to userspace. I.e., I see the 'Freeing unused kernel memory...' line and
sometimes the following 'Welcome to Buildroot...', and then it stops;
on good kernels it continues after a while and boots through, and on bad
ones it just hangs there.

	Sören


[-- Attachment #2: 0001-clocksource-cadence_ttc-Use-only-one-counter.patch --]
[-- Type: text/x-diff, Size: 7637 bytes --]

From db4ad00ab774555ba8113ca6fcb9ceace15cfbbc Mon Sep 17 00:00:00 2001
From: Soren Brinkmann <soren.brinkmann@xilinx.com>
Date: Mon, 5 Aug 2013 15:48:59 -0700
Subject: [PATCH] clocksource/cadence_ttc: Use only one counter

Currently the driver uses two of the three counters the TTC provides to
implement a clocksource and a clockevent device. By using the TTC's
match feature we can implement both use cases using a single counter
only.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
---
 drivers/clocksource/cadence_ttc_timer.c | 97 ++++++++++++---------------------
 1 file changed, 36 insertions(+), 61 deletions(-)

diff --git a/drivers/clocksource/cadence_ttc_timer.c b/drivers/clocksource/cadence_ttc_timer.c
index b2bb3a4b..d6ab818 100644
--- a/drivers/clocksource/cadence_ttc_timer.c
+++ b/drivers/clocksource/cadence_ttc_timer.c
@@ -46,6 +46,7 @@
 #define TTC_CNT_CNTRL_OFFSET		0x0C /* Counter Control Reg, RW */
 #define TTC_COUNT_VAL_OFFSET		0x18 /* Counter Value Reg, RO */
 #define TTC_INTR_VAL_OFFSET		0x24 /* Interval Count Reg, RW */
+#define TTC_MATCH1_OFFSET	0x30 /* Match reg, RW */
 #define TTC_ISR_OFFSET		0x54 /* Interrupt Status Reg, RO */
 #define TTC_IER_OFFSET		0x60 /* Interrupt Enable Reg, RW */
 
@@ -61,7 +62,10 @@
 #define PRESCALE		2048	/* The exponent must match this */
 #define CLK_CNTRL_PRESCALE	((PRESCALE_EXPONENT - 1) << 1)
 #define CLK_CNTRL_PRESCALE_EN	1
-#define CNT_CNTRL_RESET		(1 << 4)
+#define CNT_CNTRL_RESET		BIT(4)
+#define CNT_CNTRL_MATCH		BIT(3)
+
+#define TTC_INTERRUPT_MATCH1	BIT(1)
 
 /**
  * struct ttc_timer - This definition defines local timer structure
@@ -90,6 +94,7 @@ struct ttc_timer_clocksource {
 struct ttc_timer_clockevent {
 	struct ttc_timer		ttc;
 	struct clock_event_device	ce;
+	unsigned long			interval;
 };
 
 #define to_ttc_timer_clkevent(x) \
@@ -103,25 +108,25 @@ static void __iomem *ttc_sched_clock_val_reg;
  * @timer:	Pointer to the timer instance
  * @cycles:	Timer interval ticks
  **/
-static void ttc_set_interval(struct ttc_timer *timer,
-					unsigned long cycles)
+static void ttc_set_interval(struct ttc_timer *timer, unsigned long cycles)
 {
-	u32 ctrl_reg;
+	struct ttc_timer_clockevent *ttcce = container_of(timer,
+			struct ttc_timer_clockevent, ttc);
 
-	/* Disable the counter, set the counter value  and re-enable counter */
-	ctrl_reg = __raw_readl(timer->base_addr + TTC_CNT_CNTRL_OFFSET);
-	ctrl_reg |= TTC_CNT_CNTRL_DISABLE_MASK;
-	__raw_writel(ctrl_reg, timer->base_addr + TTC_CNT_CNTRL_OFFSET);
+	/* set interval */
+	u32 reg = __raw_readl(timer->base_addr + TTC_COUNT_VAL_OFFSET);
+	reg += cycles;
+	__raw_writel(reg, timer->base_addr + TTC_MATCH1_OFFSET);
 
-	__raw_writel(cycles, timer->base_addr + TTC_INTR_VAL_OFFSET);
+	/* enable match interrupt */
+	__raw_writel(TTC_INTERRUPT_MATCH1, timer->base_addr + TTC_IER_OFFSET);
 
-	/*
-	 * Reset the counter (0x10) so that it starts from 0, one-shot
-	 * mode makes this needed for timing to be right.
-	 */
-	ctrl_reg |= CNT_CNTRL_RESET;
-	ctrl_reg &= ~TTC_CNT_CNTRL_DISABLE_MASK;
-	__raw_writel(ctrl_reg, timer->base_addr + TTC_CNT_CNTRL_OFFSET);
+	/* enable timer */
+	reg = __raw_readl(timer->base_addr + TTC_CNT_CNTRL_OFFSET);
+	reg &= ~TTC_CNT_CNTRL_DISABLE_MASK;
+	__raw_writel(reg, timer->base_addr + TTC_CNT_CNTRL_OFFSET);
+
+	ttcce->interval = cycles;
 }
 
 /**
@@ -139,6 +144,8 @@ static irqreturn_t ttc_clock_event_interrupt(int irq, void *dev_id)
 
 	/* Acknowledge the interrupt and call event handler */
 	__raw_readl(timer->base_addr + TTC_ISR_OFFSET);
+	if (ttce->ce.mode == CLOCK_EVT_MODE_PERIODIC)
+		ttc_set_interval(timer, ttce->interval);
 
 	ttce->ce.event_handler(&ttce->ce);
 
@@ -192,7 +199,6 @@ static void ttc_set_mode(enum clock_event_mode mode,
 {
 	struct ttc_timer_clockevent *ttce = to_ttc_timer_clkevent(evt);
 	struct ttc_timer *timer = &ttce->ttc;
-	u32 ctrl_reg;
 
 	switch (mode) {
 	case CLOCK_EVT_MODE_PERIODIC:
@@ -203,18 +209,9 @@ static void ttc_set_mode(enum clock_event_mode mode,
 	case CLOCK_EVT_MODE_ONESHOT:
 	case CLOCK_EVT_MODE_UNUSED:
 	case CLOCK_EVT_MODE_SHUTDOWN:
-		ctrl_reg = __raw_readl(timer->base_addr +
-					TTC_CNT_CNTRL_OFFSET);
-		ctrl_reg |= TTC_CNT_CNTRL_DISABLE_MASK;
-		__raw_writel(ctrl_reg,
-				timer->base_addr + TTC_CNT_CNTRL_OFFSET);
+		__raw_writel(0, timer->base_addr + TTC_IER_OFFSET);
 		break;
 	case CLOCK_EVT_MODE_RESUME:
-		ctrl_reg = __raw_readl(timer->base_addr +
-					TTC_CNT_CNTRL_OFFSET);
-		ctrl_reg &= ~TTC_CNT_CNTRL_DISABLE_MASK;
-		__raw_writel(ctrl_reg,
-				timer->base_addr + TTC_CNT_CNTRL_OFFSET);
 		break;
 	}
 }
@@ -287,17 +284,6 @@ static void __init ttc_setup_clocksource(struct clk *clk, void __iomem *base)
 	ttccs->cs.mask = CLOCKSOURCE_MASK(16);
 	ttccs->cs.flags = CLOCK_SOURCE_IS_CONTINUOUS;
 
-	/*
-	 * Setup the clock source counter to be an incrementing counter
-	 * with no interrupt and it rolls over at 0xFFFF. Pre-scale
-	 * it by 32 also. Let it start running now.
-	 */
-	__raw_writel(0x0,  ttccs->ttc.base_addr + TTC_IER_OFFSET);
-	__raw_writel(CLK_CNTRL_PRESCALE | CLK_CNTRL_PRESCALE_EN,
-		     ttccs->ttc.base_addr + TTC_CLK_CNTRL_OFFSET);
-	__raw_writel(CNT_CNTRL_RESET,
-		     ttccs->ttc.base_addr + TTC_CNT_CNTRL_OFFSET);
-
 	err = clocksource_register_hz(&ttccs->cs,
 			clk_get_rate(ttccs->ttc.clk) / PRESCALE);
 	if (WARN_ON(err)) {
@@ -377,16 +363,6 @@ static void __init ttc_setup_clockevent(struct clk *clk,
 	ttcce->ce.irq = irq;
 	ttcce->ce.cpumask = cpu_possible_mask;
 
-	/*
-	 * Setup the clock event timer to be an interval timer which
-	 * is prescaled by 32 using the interval interrupt. Leave it
-	 * disabled for now.
-	 */
-	__raw_writel(0x23, ttcce->ttc.base_addr + TTC_CNT_CNTRL_OFFSET);
-	__raw_writel(CLK_CNTRL_PRESCALE | CLK_CNTRL_PRESCALE_EN,
-		     ttcce->ttc.base_addr + TTC_CLK_CNTRL_OFFSET);
-	__raw_writel(0x1,  ttcce->ttc.base_addr + TTC_IER_OFFSET);
-
 	err = request_irq(irq, ttc_clock_event_interrupt,
 			  IRQF_DISABLED | IRQF_TIMER,
 			  ttcce->ce.name, ttcce);
@@ -409,7 +385,7 @@ static void __init ttc_timer_init(struct device_node *timer)
 {
 	unsigned int irq;
 	void __iomem *timer_baseaddr;
-	struct clk *clk_cs, *clk_ce;
+	struct clk *clk;
 	static int initialized;
 	int clksel;
 
@@ -429,7 +405,7 @@ static void __init ttc_timer_init(struct device_node *timer)
 		BUG();
 	}
 
-	irq = irq_of_parse_and_map(timer, 1);
+	irq = irq_of_parse_and_map(timer, 0);
 	if (irq <= 0) {
 		pr_err("ERROR: invalid interrupt number\n");
 		BUG();
@@ -437,22 +413,21 @@ static void __init ttc_timer_init(struct device_node *timer)
 
 	clksel = __raw_readl(timer_baseaddr + TTC_CLK_CNTRL_OFFSET);
 	clksel = !!(clksel & TTC_CLK_CNTRL_CSRC_MASK);
-	clk_cs = of_clk_get(timer, clksel);
-	if (IS_ERR(clk_cs)) {
+	clk = of_clk_get(timer, clksel);
+	if (IS_ERR(clk)) {
 		pr_err("ERROR: timer input clock not found\n");
 		BUG();
 	}
 
-	clksel = __raw_readl(timer_baseaddr + 4 + TTC_CLK_CNTRL_OFFSET);
-	clksel = !!(clksel & TTC_CLK_CNTRL_CSRC_MASK);
-	clk_ce = of_clk_get(timer, clksel);
-	if (IS_ERR(clk_ce)) {
-		pr_err("ERROR: timer input clock not found\n");
-		BUG();
-	}
+	__raw_writel(CLK_CNTRL_PRESCALE | CLK_CNTRL_PRESCALE_EN,
+			timer_baseaddr + TTC_CLK_CNTRL_OFFSET);
+
+	/* start timer in overflow and match mode */
+	__raw_writel(CNT_CNTRL_RESET | CNT_CNTRL_MATCH,
+			timer_baseaddr + TTC_CNT_CNTRL_OFFSET);
 
-	ttc_setup_clocksource(clk_cs, timer_baseaddr);
-	ttc_setup_clockevent(clk_ce, timer_baseaddr + 4, irq);
+	ttc_setup_clocksource(clk, timer_baseaddr);
+	ttc_setup_clockevent(clk, timer_baseaddr, irq);
 
 	pr_info("%s #0 at %p, irq=%d\n", timer->name, timer_baseaddr, irq);
 }
-- 
1.8.3.4
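
The heart of the change is programming the next event as a match against
the free-running counter instead of stopping and reloading an interval.
In isolation the idea looks like this; it is an illustrative, runnable
simulation, and the two variables merely stand in for the TTC's count and
match registers (they are not the driver's API):

	#include <stdint.h>
	#include <stdio.h>

	static uint16_t count;	/* stands in for TTC_COUNT_VAL */
	static uint16_t match;	/* stands in for TTC_MATCH1 */

	/* Leave the counter running; just arm the match point. */
	static void program_next_event(uint16_t cycles)
	{
		/* 16-bit arithmetic wraps exactly like the hardware counter. */
		match = (uint16_t)(count + cycles);
	}

	int main(void)
	{
		count = 0xfff0;
		program_next_event(0x20);	/* event lands past the wrap */
		printf("match fires at count 0x%04x\n", match);	/* 0x0010 */
		return 0;
	}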


^ permalink raw reply related	[flat|nested] 142+ messages in thread

> > 
> > Absence of the TTC doesn't seem to make any difference. It hangs at the
> > same location.
> 
> Ok, IMO there is a problem with the broadcast device registration (may
> be vs twd).
> 
> I will check later (kid duty) :)

I was actually waiting for an update from your side and did something
else, but I seem to have run into this again. I was overhauling the
cadence_ttc (patch attached, based on tip/timers/core). And it seems to
show the same behavior as enabling the global_timer. With cpuidle off, it
works. With cpuidle on, it hangs. Removing the TIMER_STOP flag from the
C2 state makes it boot again.
It works just fine on our 3.10 kernel.
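
For reference, this is roughly what such a state table looks like. This
is only an illustrative sketch (names and latencies here are made up, it
is not the actual Zynq driver source):

	/* sketch: two-state cpuidle driver, second state flagged TIMER_STOP */
	static struct cpuidle_driver zynq_idle_driver = {
		.name = "zynq_idle",
		.states = {
			ARM_CPUIDLE_WFI_STATE,	/* C1: plain wfi */
			{			/* C2: wfi, but flagged */
				.enter			= zynq_enter_idle, /* hypothetical */
				.exit_latency		= 10,
				.target_residency	= 10000,
				.flags			= CPUIDLE_FLAG_TIME_VALID |
							  CPUIDLE_FLAG_TIMER_STOP,
				.name			= "C2",
				.desc			= "wfi",
			},
		},
		.state_count = 2,
	};

Dropping CPUIDLE_FLAG_TIMER_STOP from the C2 entry is the change that
makes it boot again; without the flag the tick core never switches the
CPUs over to the broadcast device around idle.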

Another thing I noticed - probably unrelated but hard to tell: On
3.11-rc1 and later my system stops for quite some time at the hand off
to userspace. I.e. I see the 'freeing unused kernel memory...' line and
sometimes the following 'Welcome to Buildroot...' and then it stops and
on good kernels it continues after a while and boots through and on bad
ones it just hangs there.

	Sören

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-clocksource-cadence_ttc-Use-only-one-counter.patch
Type: text/x-diff
Size: 7637 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130805/61292153/attachment-0001.bin>

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06  1:28                                           ` Sören Brinkmann
@ 2013-08-06  8:46                                             ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-06  8:46 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stuart Menefy, Stephen Boyd, John Stultz, Thomas Gleixner,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On 08/06/2013 03:28 AM, Sören Brinkmann wrote:
> Hi Daniel,
> 
> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>
>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>
>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>
>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>> broadcast device:
>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>
>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>
>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
>>>>>>>>>>> obtain this information for that case.
>>>>>>>>>>
>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
>>>>>>>>>
>>>>>>>>> Right, that works. I forgot about that option after you mentioned, that
>>>>>>>>> it is most likely not that useful.
>>>>>>>>>
>>>>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
>>>>>>>>> the gt enabled and having maxcpus=1 set.
>>>>>>>>>
>>>>>>>>> /proc/timer_list:
>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>> 	Broadcast device
>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>> 	 mult:           715827876
>>>>>>>>> 	 shift:          31
>>>>>>>>> 	 mode:           3
>>>>>>>>
>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>
>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>
>>>>>>>> Is it possible you try to get this output again right after onlining the
>>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
>>>>>>>
>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>> and that didn't end well:
>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>
>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>> apparently this is not the case... :(
>>>>>>
>>>>>> I suspect the global timer is shutdown at one moment but I don't
>>>>>> understand why and when.
>>>>>>
>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>> the clockevent device name ? Perhaps, we may see at boot time an
>>>>>> interesting trace when it hangs.
>>>>>
>>>>> I did this change:
>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>> 	index 38959c8..3ab11c1 100644
>>>>> 	--- a/kernel/time/clockevents.c
>>>>> 	+++ b/kernel/time/clockevents.c
>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>> 	  */
>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>> 	 {
>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>> 	+       dump_stack();
>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>> 	 }
>>>>>
>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>> really appreciate you taking the time.
>>>>
>>>> Thanks for the traces.
>>>
>>> Sure.
>>>
>>>>
>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>> twd and gt), does it boot ?
>>>
>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>> same location.
>>
>> Ok, IMO there is a problem with the broadcast device registration (may
>> be vs twd).
>>
>> I will check later (kid duty) :)
> 
> I was actually waiting for an update from your side and did something
> else, but I seem to have run into this again. I was overhauling the
> cadence_ttc (patch attached, based on tip/timers/core). And it seems to
> show the same behavior as enabling the global_timer. With cpuidle off, it
> works. With cpuidle on, it hangs. Removing the TIMER_STOP flag from the
> C2 state makes it boot again.
> It works just fine on our 3.10 kernel.

This is not necessarily related to the bug. If the patch you sent broke
the cadence_ttc driver, then when you use it (with TIMER_STOP), you will
be stuck. Removing the flag may mean you don't use the broadcast
timer, hence the bug is not surfacing.

Going back to the bug with the arm_global_timer, what is observed is that
the broadcast timer is *shut down* when the second cpu comes online.

I have to dig into kernel/time/clockevents.c|tick-*.c because IMO
the issue is coming from there, but first I have to reproduce the bug,
i.e. find a board of mine where I can add the arm_global_timer.
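
To recap how the flag is supposed to work, here is a simplified sketch
of what the ~3.10/3.11 cpuidle core does around a TIMER_STOP state (the
real code lives in drivers/cpuidle/cpuidle.c and
kernel/time/tick-broadcast.c):

	/* sketch, not the verbatim kernel code */
	static int enter_state_sketch(struct cpuidle_device *dev,
				      struct cpuidle_driver *drv, int index)
	{
		bool broadcast = drv->states[index].flags & CPUIDLE_FLAG_TIMER_STOP;
		int ret;

		if (broadcast)	/* local tick device goes to SHUTDOWN,
				 * the broadcast device takes over */
			clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
					   &dev->cpu);

		ret = drv->states[index].enter(dev, drv, index);

		if (broadcast)	/* local tick device is switched back in */
			clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
					   &dev->cpu);

		return ret;
	}

(For reading the timer_list dumps above: "mode" is enum clock_event_mode,
where 1 is CLOCK_EVT_MODE_SHUTDOWN and 3 is CLOCK_EVT_MODE_ONESHOT.)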

> Another thing I noticed - probably unrelated but hard to tell: On
> 3.11-rc1 and later my system stops for quite some time at the hand off
> to userspace. I.e. I see the 'freeing unused kernel memory...' line and
> sometimes the following 'Welcome to Buildroot...' and then it stops and
> on good kernels it continues after a while and boots through and on bad
> ones it just hangs there.

Did you try to dump the stacks with magic-sysrq? Or git bisect?

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06  8:46                                             ` Daniel Lezcano
@ 2013-08-06  9:18                                               ` Michal Simek
  -1 siblings, 0 replies; 142+ messages in thread
From: Michal Simek @ 2013-08-06  9:18 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Sören Brinkmann, Stuart Menefy, Stephen Boyd, John Stultz,
	Thomas Gleixner, Russell King, Michal Simek, linux-kernel,
	linux-arm-kernel

On 08/06/2013 10:46 AM, Daniel Lezcano wrote:
> On 08/06/2013 03:28 AM, Sören Brinkmann wrote:
>> Hi Daniel,
>>
>> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
>>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>>
>>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>>
>>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>>> broadcast device:
>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>>
>>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>>
>>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
>>>>>>>>>>>> obtain this information for that case.
>>>>>>>>>>>
>>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
>>>>>>>>>>
>>>>>>>>>> Right, that works. I forgot about that option after you mentioned, that
>>>>>>>>>> it is most likely not that useful.
>>>>>>>>>>
>>>>>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
>>>>>>>>>> the gt enabled and having maxcpus=1 set.
>>>>>>>>>>
>>>>>>>>>> /proc/timer_list:
>>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>>> 	Broadcast device
>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>>> 	 mult:           715827876
>>>>>>>>>> 	 shift:          31
>>>>>>>>>> 	 mode:           3
>>>>>>>>>
>>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>>
>>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>>
>>>>>>>>> Is it possible you try to get this output again right after onlining the
>>>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
>>>>>>>>
>>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>>> and that didn't end well:
>>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>>
>>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>>> apparently this is not the case... :(
>>>>>>>
>>>>>>> I suspect the global timer is shutdown at one moment but I don't
>>>>>>> understand why and when.
>>>>>>>
>>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>>> the clockevent device name ? Perhaps, we may see at boot time an
>>>>>>> interesting trace when it hangs.
>>>>>>
>>>>>> I did this change:
>>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>>> 	index 38959c8..3ab11c1 100644
>>>>>> 	--- a/kernel/time/clockevents.c
>>>>>> 	+++ b/kernel/time/clockevents.c
>>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>>> 	  */
>>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>>> 	 {
>>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>>> 	+       dump_stack();
>>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>>> 	 }
>>>>>>
>>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>>> really appreciate you taking the time.
>>>>>
>>>>> Thanks for the traces.
>>>>
>>>> Sure.
>>>>
>>>>>
>>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>>> twd and gt), does it boot ?
>>>>
>>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>>> same location.
>>>
>>> Ok, IMO there is a problem with the broadcast device registration (may
>>> be vs twd).
>>>
>>> I will check later (kid duty) :)
>>
>> I was actually waiting for an update from your side and did something
>> else, but I seem to have run into this again. I was overhauling the
>> cadence_ttc (patch attached, based on tip/timers/core). And it seems to
>> show the same behavior as enabling the global_timer. With cpuidle off, it
>> works. With cpuidle on, it hangs. Removing the TIMER_STOP flag from the
>> C2 state makes it boot again.
>> It works just fine on our 3.10 kernel.
> 
> This is not necessarily related to the bug. If the patch you sent broke
> the cadence_ttc driver, then when you use it (with TIMER_STOP), you will
> be stuck. Removing the flag may mean you don't use the broadcast
> timer, hence the bug is not surfacing.
> 
> Going back to the bug with the arm_global_timer, what is observed is that
> the broadcast timer is *shut down* when the second cpu comes online.
> 
> I have to dig into kernel/time/clockevents.c|tick-*.c because IMO
> the issue is coming from there, but first I have to reproduce the bug,
> i.e. find a board of mine where I can add the arm_global_timer.
> 
>> Another thing I noticed - probably unrelated but hard to tell: On
>> 3.11-rc1 and later my system stops for quite some time at the hand off
>> to userspace. I.e. I see the 'freeing unused kernel memory...' line and
>> sometimes the following 'Welcome to Buildroot...' and then it stops and
>> on good kernels it continues after a while and boots through and on bad
>> ones it just hangs there.
> 
> Did you try to dump the stacks with magic-sysrq? Or git bisect?

Sören: Are you able to replicate this issue on QEMU?
If yes, it would be best if you could provide QEMU, the kernel .config/
rootfs and a simple manual for Daniel on how to reach that fault.

Thanks,
Michal



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06  9:18                                               ` Michal Simek
@ 2013-08-06 12:30                                                 ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-06 12:30 UTC (permalink / raw)
  To: Michal Simek
  Cc: Sören Brinkmann, Stuart Menefy, Stephen Boyd, John Stultz,
	Thomas Gleixner, Russell King, linux-kernel, linux-arm-kernel

On 08/06/2013 11:18 AM, Michal Simek wrote:
> On 08/06/2013 10:46 AM, Daniel Lezcano wrote:
>> On 08/06/2013 03:28 AM, Sören Brinkmann wrote:
>>> Hi Daniel,
>>>
>>> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>>>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
>>>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>>>
>>>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>>>> broadcast device:
>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>>>
>>>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>>>
>>>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
>>>>>>>>>>>>> obtain this information for that case.
>>>>>>>>>>>>
>>>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
>>>>>>>>>>>
>>>>>>>>>>> Right, that works. I forgot about that option after you mentioned, that
>>>>>>>>>>> it is most likely not that useful.
>>>>>>>>>>>
>>>>>>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
>>>>>>>>>>> the gt enabled and having maxcpus=1 set.
>>>>>>>>>>>
>>>>>>>>>>> /proc/timer_list:
>>>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>>>> 	Broadcast device
>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>>>> 	 mult:           715827876
>>>>>>>>>>> 	 shift:          31
>>>>>>>>>>> 	 mode:           3
>>>>>>>>>>
>>>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>>>
>>>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>>>
>>>>>>>>>> Is it possible you try to get this output again right after onlining the
>>>>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
>>>>>>>>>
>>>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>>>> and that didn't end well:
>>>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>>>
>>>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>>>> apparently this is not the case... :(
>>>>>>>>
>>>>>>>> I suspect the global timer is shutdown at one moment but I don't
>>>>>>>> understand why and when.
>>>>>>>>
>>>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>>>> the clockevent device name ? Perhaps, we may see at boot time an
>>>>>>>> interesting trace when it hangs.
>>>>>>>
>>>>>>> I did this change:
>>>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>>>> 	index 38959c8..3ab11c1 100644
>>>>>>> 	--- a/kernel/time/clockevents.c
>>>>>>> 	+++ b/kernel/time/clockevents.c
>>>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>>>> 	  */
>>>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>>>> 	 {
>>>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>>>> 	+       dump_stack();
>>>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>>>> 	 }
>>>>>>>
>>>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>>>> really appreciate you taking the time.
>>>>>>
>>>>>> Thanks for the traces.
>>>>>
>>>>> Sure.
>>>>>
>>>>>>
>>>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>>>> twd and gt), does it boot ?
>>>>>
>>>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>>>> same location.
>>>>
>>>> Ok, IMO there is a problem with the broadcast device registration (may
>>>> be vs twd).
>>>>
>>>> I will check later (kid duty) :)
>>>
>>> I was actually waiting for an update from your side and did something
>>> else, but I seem to have run into this again. I was overhauling the
>>> cadence_ttc (patch attached, based on tip/timers/core). And it seems to
>>> show the same behavior as enabling the global_timer. With cpuidle off, it
>>> works. With cpuidle on, it hangs. Removing the TIMER_STOP flag from the
>>> C2 state makes it boot again.
>>> It works just fine on our 3.10 kernel.
>>
>> This is not necessarily related to the bug. If the patch you sent broke
>> the cadence_ttc driver, then when you use it (with TIMER_STOP), you will
>> be stuck. Removing the flag may mean you don't use the broadcast
>> timer, hence the bug is not surfacing.
>>
>> Going back to the bug with the arm_global_timer, what is observed is that
>> the broadcast timer is *shut down* when the second cpu comes online.
>>
>> I have to dig into kernel/time/clockevents.c|tick-*.c because IMO
>> the issue is coming from there, but first I have to reproduce the bug,
>> i.e. find a board of mine where I can add the arm_global_timer.
>>
>>> Another thing I noticed - probably unrelated but hard to tell: On
>>> 3.11-rc1 and later my system stops for quite some time at the hand off
>>> to userspace. I.e. I see the 'freeing unused kernel memory...' line and
>>> sometimes the following 'Welcome to Buildroot...' and then it stops and
>>> on good kernels it continues after a while and boots through and on bad
>>> ones it just hangs there.
>>
>> Did you try to dump the stacks with magic-sysrq? Or git bisect?
> 
> Sören: Are you able to replicate this issue on QEMU?
> If yes, it would be best if you could provide QEMU, the kernel .config/
> rootfs and a simple manual for Daniel on how to reach that fault.

I tried to download qemu for zynq but it fails:

git clone git://git.xilinx.com/qemu-xarm.git
Cloning into 'qemu-xarm'...
fatal: The remote end hung up unexpectedly

I am also looking for the option specified for the kernel:

"The kernel needs to be built with this feature turned on (in
menuconfig, System Type->Xilinx Specific Features -> Device Tree At
Fixed Address)."

... but I was not able to find it.

Any ideas?

Thanks
  -- Daniel

PS: apart from that, a well documented website!


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Enable arm_global_timer for Zynq brakes boot
@ 2013-08-06 12:30                                                 ` Daniel Lezcano
  0 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-06 12:30 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/06/2013 11:18 AM, Michal Simek wrote:
> On 08/06/2013 10:46 AM, Daniel Lezcano wrote:
>> On 08/06/2013 03:28 AM, S?ren Brinkmann wrote:
>>> Hi Daniel,
>>>
>>> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>>>> On 08/01/2013 07:43 PM, S?ren Brinkmann wrote:
>>>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>>>> On 08/01/2013 01:38 AM, S?ren Brinkmann wrote:
>>>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>>>> On 08/01/2013 12:18 AM, S?ren Brinkmann wrote:
>>>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>>>> On 07/31/2013 10:58 PM, S?ren Brinkmann wrote:
>>>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>> On 07/31/2013 12:34 AM, S?ren Brinkmann wrote:
>>>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>> On 07/30/2013 02:03 AM, S?ren Brinkmann wrote:
>>>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>>>
>>>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>>>> broadcast device:
>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>>>
>>>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>>>
>>>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
>>>>>>>>>>>>> obtain this information for that case.
>>>>>>>>>>>>
>>>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
>>>>>>>>>>>
>>>>>>>>>>> Right, that works. I forgot about that option after you mentioned, that
>>>>>>>>>>> it is most likely not that useful.
>>>>>>>>>>>
>>>>>>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
>>>>>>>>>>> the gt enabled and having maxcpus=1 set.
>>>>>>>>>>>
>>>>>>>>>>> /proc/timer_list:
>>>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>>>> 	Broadcast device
>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>>>> 	 mult:           715827876
>>>>>>>>>>> 	 shift:          31
>>>>>>>>>>> 	 mode:           3
>>>>>>>>>>
>>>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>>>
>>>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>>>
>>>>>>>>>> Is it possible you try to get this output again right after onlining the
>>>>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
>>>>>>>>>
>>>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>>>> and that didn't end well:
>>>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>>>
>>>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>>>> apparently this is not the case... :(
>>>>>>>>
>>>>>>>> I suspect the global timer is shutdown at one moment but I don't
>>>>>>>> understand why and when.
>>>>>>>>
>>>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>>>> the clockevent device name ? Perhaps, we may see at boot time an
>>>>>>>> interesting trace when it hangs.
>>>>>>>
>>>>>>> I did this change:
>>>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>>>> 	index 38959c8..3ab11c1 100644
>>>>>>> 	--- a/kernel/time/clockevents.c
>>>>>>> 	+++ b/kernel/time/clockevents.c
>>>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>>>> 	  */
>>>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>>>> 	 {
>>>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>>>> 	+       dump_stack();
>>>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>>>> 	 }
>>>>>>>
>>>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>>>> really appreciate you taking the time.
>>>>>>
>>>>>> Thanks for the traces.
>>>>>
>>>>> Sure.
>>>>>
>>>>>>
>>>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>>>> twd and gt), does it boot ?
>>>>>
>>>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>>>> same location.
>>>>
>>>> Ok, IMO there is a problem with the broadcast device registration (may
>>>> be vs twd).
>>>>
>>>> I will check later (kid duty) :)
>>>
>>> I was actually waiting for an update from your side and did something
>>> else, but I seem to have run into this again. I was overhauling the
>>> cadence_ttc (patch attached, based on tip/timers/core). And it seems to
>>> show the same behavior as enabling the global_timer. With cpuidle off, it
>>> works. With cpuidle, on it hangs. Removing the TIMER_STOP flag from the
>>> C2 state makes it boot again.
>>> It works just fine on our 3.10 kernel.
>>
>> This is not necessarily related to the bug. If the patch you sent broke
>> the cadence_ttc driver, you will be stuck when you use it (with
>> TIMER_STOP). Removing the flag may mean you don't use the broadcast
>> timer, hence the bug is not surfacing.
>>
>> Going back to the bug with the arm_global_timer, what is observed is that
>> the broadcast timer is *shut down* when the second cpu is brought online.
>>
>> I have to dig into kernel/time/clockevents.c|tick-*.c because IMO the
>> issue is coming from there, but first I have to reproduce the bug, so I
>> need to find a board where I can add the arm_global_timer.
>>
>>> Another thing I noticed - probably unrelated but hard to tell: on
>>> 3.11-rc1 and later my system stops for quite some time at the hand-off
>>> to userspace. I.e. I see the 'freeing unused kernel memory...' line and
>>> sometimes the following 'Welcome to Buildroot...', and then it stops;
>>> on good kernels it continues after a while and boots through, and on
>>> bad ones it just hangs there.
>>
>> did you try to dump the stacks with magic-sysrq ? Or git bisect ?
> 
> Soren: Are you able to replicate this issue on QEMU?
> If yes, it would be best if you could provide QEMU, the kernel .config,
> a rootfs, and a simple manual for Daniel on how to reach that fault.

I tried to download qemu for zynq but it fails:

git clone git://git.xilinx.com/qemu-xarm.git
Cloning into 'qemu-xarm'...
fatal: The remote end hung up unexpectedly

I am also looking for the option specified for the kernel:

"The kernel needs to be built with this feature turned on (in
menuconfig, System Type->Xilinx Specific Features -> Device Tree At
Fixed Address)."

... but I was not able to find it.

Any ideas?

Thanks
  -- Daniel

ps: apart from that, well documented website!
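
pps: for anyone following along, the broadcast device state discussed
above can be re-checked with something like this (just a sketch, assuming
a shell on the target):

	# grep -A 8 'Broadcast device' /proc/timer_list

where 'mode: 1' is CLOCK_EVT_MODE_SHUTDOWN and 'mode: 3' is
CLOCK_EVT_MODE_ONESHOT.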


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06 12:30                                                 ` Daniel Lezcano
@ 2013-08-06 12:41                                                   ` Michal Simek
  -1 siblings, 0 replies; 142+ messages in thread
From: Michal Simek @ 2013-08-06 12:41 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Michal Simek, Sören Brinkmann, Stuart Menefy, Stephen Boyd,
	John Stultz, Thomas Gleixner, Russell King, linux-kernel,
	linux-arm-kernel

On 08/06/2013 02:30 PM, Daniel Lezcano wrote:
> On 08/06/2013 11:18 AM, Michal Simek wrote:
>> On 08/06/2013 10:46 AM, Daniel Lezcano wrote:
>>> On 08/06/2013 03:28 AM, Sören Brinkmann wrote:
>>>> Hi Daniel,
>>>>
>>>> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>>>>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
>>>>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>>>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>>>>> broadcast device:
>>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way
>>>>>>>>>>>>>> to obtain this information for that case.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system boots up?
>>>>>>>>>>>>
>>>>>>>>>>>> Right, that works. I forgot about that option after you mentioned that
>>>>>>>>>>>> it is most likely not that useful.
>>>>>>>>>>>>
>>>>>>>>>>>> Anyway, these are the sysfs files with an unmodified cpuidle driver,
>>>>>>>>>>>> the gt enabled, and maxcpus=1 set.
>>>>>>>>>>>>
>>>>>>>>>>>> /proc/timer_list:
>>>>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>>>>> 	Broadcast device
>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>>>>> 	 mult:           715827876
>>>>>>>>>>>> 	 shift:          31
>>>>>>>>>>>> 	 mode:           3
>>>>>>>>>>>
>>>>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>>>>
>>>>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>>>>
>>>>>>>>>>> Is it possible for you to get this output again right after onlining
>>>>>>>>>>> cpu1, in order to check if the broadcast device switches to SHUTDOWN?
>>>>>>>>>>
>>>>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>>>>> and that didn't end well:
>>>>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>>>>
>>>>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>>>>> apparently this is not the case... :(
>>>>>>>>>
>>>>>>>>> I suspect the global timer is shut down at one moment, but I don't
>>>>>>>>> understand why and when.
>>>>>>>>>
>>>>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>>>>> the clockevent device name? Perhaps we may see an interesting trace
>>>>>>>>> at boot time when it hangs.
>>>>>>>>
>>>>>>>> I did this change:
>>>>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>>>>> 	index 38959c8..3ab11c1 100644
>>>>>>>> 	--- a/kernel/time/clockevents.c
>>>>>>>> 	+++ b/kernel/time/clockevents.c
>>>>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>>>>> 	  */
>>>>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>>>>> 	 {
>>>>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>>>>> 	+       dump_stack();
>>>>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>>>>> 	 }
>>>>>>>>
>>>>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>>>>> really appreciate you taking the time.
>>>>>>>
>>>>>>> Thanks for the traces.
>>>>>>
>>>>>> Sure.
>>>>>>
>>>>>>>
>>>>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>>>>> twd and gt), does it boot ?
>>>>>>
>>>>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>>>>> same location.
>>>>>
>>>>> Ok, IMO there is a problem with the broadcast device registration (maybe
>>>>> vs. twd).
>>>>>
>>>>> I will check later (kid duty) :)
>>>>
>>>> I was actually waiting for an update from your side and did something
>>>> else, but I seem to have run into this again. I was overhauling the
>>>> cadence_ttc (patch attached, based on tip/timers/core). And it seems to
>>>> show the same behavior as enabling the global_timer. With cpuidle off, it
>>>> works; with cpuidle on, it hangs. Removing the TIMER_STOP flag from the
>>>> C2 state makes it boot again.
>>>> It works just fine on our 3.10 kernel.
>>>
>>> This is not necessarily related to the bug. If the patch you sent broke
>>> the cadence_ttc driver, you will be stuck when you use it (with
>>> TIMER_STOP). Removing the flag may mean you don't use the broadcast
>>> timer, hence the bug is not surfacing.
>>>
>>> Going back to the bug with the arm_global_timer, what is observed is that
>>> the broadcast timer is *shut down* when the second cpu is brought online.
>>>
>>> I have to dig into kernel/time/clockevents.c|tick-*.c because IMO the
>>> issue is coming from there, but first I have to reproduce the bug, so I
>>> need to find a board where I can add the arm_global_timer.
>>>
>>>> Another thing I noticed - probably unrelated but hard to tell: on
>>>> 3.11-rc1 and later my system stops for quite some time at the hand-off
>>>> to userspace. I.e. I see the 'freeing unused kernel memory...' line and
>>>> sometimes the following 'Welcome to Buildroot...', and then it stops;
>>>> on good kernels it continues after a while and boots through, and on
>>>> bad ones it just hangs there.
>>>
>>> did you try to dump the stacks with magic-sysrq ? Or git bisect ?
>>
>> Soren: Are you able to replicate this issue on QEMU?
>> If yes, it would be best if you could provide QEMU, the kernel .config,
>> a rootfs, and a simple manual for Daniel on how to reach that fault.
> 
> I tried to download qemu for zynq but it fails:
> 
> git clone git://git.xilinx.com/qemu-xarm.git
> Cloning into 'qemu-xarm'...
> fatal: The remote end hung up unexpectedly

Not sure which site you have found, but
it should be just qemu.git:
https://github.com/Xilinx/qemu

or clone it from GitHub.
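
Something like this should work (untested sketch; assuming you want the
default branch):

	git clone https://github.com/Xilinx/qemu.git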

> 
> I am also looking for the option specified for the kernel:
> 
> "The kernel needs to be built with this feature turned on (in
> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
> Fixed Address)."


This also sounds like a very ancient tree.
This is the latest kernel tree - master-next is the latest devel branch.
https://github.com/Xilinx/linux-xlnx

Or there should be an option to use the latest kernel from kernel.org.
(I think Soren is using it)

Zynq is part of the multiplatform kernel and the cadence ttc driver is there;
the dts is also in the mainline kernel.
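
For example, with a recent mainline checkout, something along these lines
should produce a Zynq-capable kernel and dtb (just a sketch; the exact
defconfig and board dts name may differ):

	make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- multi_v7_defconfig
	make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- zImage zynq-zc702.dtb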

> ps: apart from that, well documented website!

Can you send me the link to it?

This should be the main page for it.
http://www.wiki.xilinx.com/

Thanks,
Michal



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06 12:41                                                   ` Michal Simek
@ 2013-08-06 13:08                                                     ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-06 13:08 UTC (permalink / raw)
  To: Michal Simek
  Cc: Sören Brinkmann, Stuart Menefy, Stephen Boyd, John Stultz,
	Thomas Gleixner, Russell King, linux-kernel, linux-arm-kernel

On 08/06/2013 02:41 PM, Michal Simek wrote:
> On 08/06/2013 02:30 PM, Daniel Lezcano wrote:
>> On 08/06/2013 11:18 AM, Michal Simek wrote:
>>> On 08/06/2013 10:46 AM, Daniel Lezcano wrote:
>>>> On 08/06/2013 03:28 AM, Sören Brinkmann wrote:
>>>>> Hi Daniel,
>>>>>
>>>>> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>>>>>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
>>>>>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>>>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>>>>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>>>>>> broadcast device:
>>>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way
>>>>>>>>>>>>>>> to obtain this information for that case.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system boots up?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Right, that works. I forgot about that option after you mentioned that
>>>>>>>>>>>>> it is most likely not that useful.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Anyway, these are the sysfs files with an unmodified cpuidle driver,
>>>>>>>>>>>>> the gt enabled, and maxcpus=1 set.
>>>>>>>>>>>>>
>>>>>>>>>>>>> /proc/timer_list:
>>>>>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>>>>>> 	Broadcast device
>>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>>>>>> 	 mult:           715827876
>>>>>>>>>>>>> 	 shift:          31
>>>>>>>>>>>>> 	 mode:           3
>>>>>>>>>>>>
>>>>>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>>>>>
>>>>>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>>>>>
>>>>>>>>>>>> Is it possible for you to get this output again right after onlining
>>>>>>>>>>>> cpu1, in order to check if the broadcast device switches to SHUTDOWN?
>>>>>>>>>>>
>>>>>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>>>>>> and that didn't end well:
>>>>>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>>>>>
>>>>>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>>>>>> apparently this is not the case... :(
>>>>>>>>>>
>>>>>>>>>> I suspect the global timer is shut down at one moment, but I don't
>>>>>>>>>> understand why and when.
>>>>>>>>>>
>>>>>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>>>>>> the clockevent device name? Perhaps we may see an interesting trace
>>>>>>>>>> at boot time when it hangs.
>>>>>>>>>
>>>>>>>>> I did this change:
>>>>>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>>>>>> 	index 38959c8..3ab11c1 100644
>>>>>>>>> 	--- a/kernel/time/clockevents.c
>>>>>>>>> 	+++ b/kernel/time/clockevents.c
>>>>>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>>>>>> 	  */
>>>>>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>>>>>> 	 {
>>>>>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>>>>>> 	+       dump_stack();
>>>>>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>>>>>> 	 }
>>>>>>>>>
>>>>>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>>>>>> really appreciate you taking the time.
>>>>>>>>
>>>>>>>> Thanks for the traces.
>>>>>>>
>>>>>>> Sure.
>>>>>>>
>>>>>>>>
>>>>>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>>>>>> twd and gt), does it boot ?
>>>>>>>
>>>>>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>>>>>> same location.
>>>>>>
>>>>>> Ok, IMO there is a problem with the broadcast device registration (maybe
>>>>>> vs. twd).
>>>>>>
>>>>>> I will check later (kid duty) :)
>>>>>
>>>>> I was actually waiting for an update from your side and did something
>>>>> else, but I seem to have run into this again. I was overhauling the
>>>>> cadence_ttc (patch attached, based on tip/timers/core). And it seems to
>>>>> show the same behavior as enabling the global_timer. With cpuidle off, it
>>>>> works; with cpuidle on, it hangs. Removing the TIMER_STOP flag from the
>>>>> C2 state makes it boot again.
>>>>> It works just fine on our 3.10 kernel.
>>>>
>>>> This is not necessarily related to the bug. If the patch you sent broke
>>>> the cadence_ttc driver, you will be stuck when you use it (with
>>>> TIMER_STOP). Removing the flag may mean you don't use the broadcast
>>>> timer, hence the bug is not surfacing.
>>>>
>>>> Going back to the bug with the arm_global_timer, what is observed is that
>>>> the broadcast timer is *shut down* when the second cpu is brought online.
>>>>
>>>> I have to dig into kernel/time/clockevents.c|tick-*.c because IMO the
>>>> issue is coming from there, but first I have to reproduce the bug, so I
>>>> need to find a board where I can add the arm_global_timer.
>>>>
>>>>> Another thing I noticed - probably unrelated but hard to tell: on
>>>>> 3.11-rc1 and later my system stops for quite some time at the hand-off
>>>>> to userspace. I.e. I see the 'freeing unused kernel memory...' line and
>>>>> sometimes the following 'Welcome to Buildroot...', and then it stops;
>>>>> on good kernels it continues after a while and boots through, and on
>>>>> bad ones it just hangs there.
>>>>
>>>> did you try to dump the stacks with magic-sysrq ? Or git bisect ?
>>>
>>> Soren: Are you able to replicate this issue on QEMU?
>>> If yes, it would be best if you could provide QEMU, the kernel .config,
>>> a rootfs, and a simple manual for Daniel on how to reach that fault.
>>
>> I tried to download qemu for zynq but it fails:
>>
>> git clone git://git.xilinx.com/qemu-xarm.git
>> Cloning into 'qemu-xarm'...
>> fatal: The remote end hung up unexpectedly
> 
> Not sure which site you have found, but
> it should be just qemu.git:
> https://github.com/Xilinx/qemu
>
> or clone it from GitHub.

Ok, cool, I was able to clone it.

>> I am also looking for the option specified for the kernel:
>>
>> "The kernel needs to be built with this feature turned on (in
>> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
>> Fixed Address)."

Ok.

> This also sounds like a very ancient tree.
> This is the latest kernel tree - master-next is the latest devel branch.
> https://github.com/Xilinx/linux-xlnx

Ok, cool. I have the right one.

> Or there should be an option to use the latest kernel from kernel.org.
> (I think Soren is using it)
> 
> Zynq is part of the multiplatform kernel and the cadence ttc driver is there;
> the dts is also in the mainline kernel.
> 
>> ps: apart from that, well documented website!
> 
> Can you send me the link to it?

http://xilinx.wikidot.com/zynq-qemu
http://xilinx.wikidot.com/zynq-linux


> This should be the main page for it.
> http://www.wiki.xilinx.com/

Thanks Michal !

  -- Daniel




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06 13:08                                                     ` Daniel Lezcano
@ 2013-08-06 13:18                                                       ` Michal Simek
  -1 siblings, 0 replies; 142+ messages in thread
From: Michal Simek @ 2013-08-06 13:18 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Michal Simek, Sören Brinkmann, Stuart Menefy, Stephen Boyd,
	John Stultz, Thomas Gleixner, Russell King, linux-kernel,
	linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 10955 bytes --]

On 08/06/2013 03:08 PM, Daniel Lezcano wrote:
> On 08/06/2013 02:41 PM, Michal Simek wrote:
>> On 08/06/2013 02:30 PM, Daniel Lezcano wrote:
>>> On 08/06/2013 11:18 AM, Michal Simek wrote:
>>>> On 08/06/2013 10:46 AM, Daniel Lezcano wrote:
>>>>> On 08/06/2013 03:28 AM, Sören Brinkmann wrote:
>>>>>> Hi Daniel,
>>>>>>
>>>>>> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>>>>>>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
>>>>>>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>>>>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>>>>>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>>>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>>>>>>> broadcast device:
>>>>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way
>>>>>>>>>>>>>>>> to obtain this information for that case.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system boots up?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Right, that works. I forgot about that option after you mentioned that
>>>>>>>>>>>>>> it is most likely not that useful.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Anyway, these are the sysfs files with an unmodified cpuidle driver,
>>>>>>>>>>>>>> the gt enabled, and maxcpus=1 set.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /proc/timer_list:
>>>>>>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>>>>>>> 	Broadcast device
>>>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>>>>>>> 	 mult:           715827876
>>>>>>>>>>>>>> 	 shift:          31
>>>>>>>>>>>>>> 	 mode:           3
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is it possible for you to get this output again right after onlining
>>>>>>>>>>>>> cpu1, in order to check if the broadcast device switches to SHUTDOWN?
>>>>>>>>>>>>
>>>>>>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>>>>>>> and that didn't end well:
>>>>>>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>>>>>>
>>>>>>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>>>>>>> apparently this is not the case... :(
>>>>>>>>>>>
>>>>>>>>>>> I suspect the global timer is shut down at one moment, but I don't
>>>>>>>>>>> understand why and when.
>>>>>>>>>>>
>>>>>>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>>>>>>> the clockevent device name? Perhaps we may see an interesting trace
>>>>>>>>>>> at boot time when it hangs.
>>>>>>>>>>
>>>>>>>>>> I did this change:
>>>>>>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>>>>>>> 	index 38959c8..3ab11c1 100644
>>>>>>>>>> 	--- a/kernel/time/clockevents.c
>>>>>>>>>> 	+++ b/kernel/time/clockevents.c
>>>>>>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>>>>>>> 	  */
>>>>>>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>>>>>>> 	 {
>>>>>>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>>>>>>> 	+       dump_stack();
>>>>>>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>>>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>>>>>>> 	 }
>>>>>>>>>>
>>>>>>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>>>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>>>>>>> really appreciate you taking the time.
>>>>>>>>>
>>>>>>>>> Thanks for the traces.
>>>>>>>>
>>>>>>>> Sure.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>>>>>>> twd and gt), does it boot ?
>>>>>>>>
>>>>>>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>>>>>>> same location.
>>>>>>>
>>>>>>> Ok, IMO there is a problem with the broadcast device registration (maybe
>>>>>>> vs. twd).
>>>>>>>
>>>>>>> I will check later (kid duty) :)
>>>>>>
>>>>>> I was actually waiting for an update from your side and did something
>>>>>> else, but I seem to have run into this again. I was overhauling the
>>>>>> cadence_ttc (patch attached, based on tip/timers/core). And it seems to
>>>>>> show the same behavior as enabling the global_timer. With cpuidle off, it
>>>>>> works; with cpuidle on, it hangs. Removing the TIMER_STOP flag from the
>>>>>> C2 state makes it boot again.
>>>>>> It works just fine on our 3.10 kernel.
>>>>>
>>>>> This is not necessarily related to the bug. If the patch you sent broke
>>>>> the cadence_ttc driver, you will be stuck when you use it (with
>>>>> TIMER_STOP). Removing the flag may mean you don't use the broadcast
>>>>> timer, hence the bug is not surfacing.
>>>>>
>>>>> Going back to the bug with the arm_global_timer, what is observed is that
>>>>> the broadcast timer is *shut down* when the second cpu is brought online.
>>>>>
>>>>> I have to dig into kernel/time/clockevents.c|tick-*.c because IMO the
>>>>> issue is coming from there, but first I have to reproduce the bug, so I
>>>>> need to find a board where I can add the arm_global_timer.
>>>>>
>>>>>> Another thing I noticed - probably unrelated but hard to tell: on
>>>>>> 3.11-rc1 and later my system stops for quite some time at the hand-off
>>>>>> to userspace. I.e. I see the 'freeing unused kernel memory...' line and
>>>>>> sometimes the following 'Welcome to Buildroot...', and then it stops;
>>>>>> on good kernels it continues after a while and boots through, and on
>>>>>> bad ones it just hangs there.
>>>>>
>>>>> did you try to dump the stacks with magic-sysrq ? Or git bisect ?
>>>>
>>>> Soren: Are you able to replicate this issue on QEMU?
>>>> If yes, it would be best if you could provide QEMU, the kernel .config,
>>>> a rootfs, and a simple manual for Daniel on how to reach that fault.
>>>
>>> I tried to download qemu for zynq but it fails:
>>>
>>> git clone git://git.xilinx.com/qemu-xarm.git
>>> Cloning into 'qemu-xarm'...
>>> fatal: The remote end hung up unexpectedly
>>
>> Not sure which site you have found, but
>> it should be just qemu.git:
>> https://github.com/Xilinx/qemu
>>
>> or clone it from GitHub.
> 
> Ok, cool, I was able to clone it.
> 
>>> I am also looking for the option specified for the kernel:
>>>
>>> "The kernel needs to be built with this feature turned on (in
>>> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
>>> Fixed Address)."
> 
> Ok.
> 
>> This also sounds like a very ancient tree.
>> This is the latest kernel tree - master-next is the latest devel branch.
>> https://github.com/Xilinx/linux-xlnx
> 
> Ok, cool. I have the right one.
> 
>> Or there should be an option to use the latest kernel from kernel.org.
>> (I think Soren is using it)
>>
>> Zynq is part of the multiplatform kernel and the cadence ttc driver is there;
>> the dts is also in the mainline kernel.
>>
>>> ps: apart from that, well documented website!
>>
>> Can you send me the link to it?
> 
> http://xilinx.wikidot.com/zynq-qemu
> http://xilinx.wikidot.com/zynq-linux

I will find out why it is still there.
I think it was moved to the new location.

> 
>> This should be the main page for it.
>> http://www.wiki.xilinx.com/
> 
> Thanks Michal !

Thank you too Daniel,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 142+ messages in thread

>>>>>>>>>> 	 {
>>>>>>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>>>>>>> 	+       dump_stack();
>>>>>>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>>>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>>>>>>> 	 }
>>>>>>>>>>
>>>>>>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>>>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>>>>>>> really appreciate you taking the time.
>>>>>>>>>
>>>>>>>>> Thanks for the traces.
>>>>>>>>
>>>>>>>> Sure.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>>>>>>> twd and gt), does it boot ?
>>>>>>>>
>>>>>>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>>>>>>> same location.
>>>>>>>
>>>>>>> Ok, IMO there is a problem with the broadcast device registration (may
>>>>>>> be vs twd).
>>>>>>>
>>>>>>> I will check later (kid duty) :)
>>>>>>
>>>>>> I was actually waiting for an update from your side and did something
>>>>>> else, but I seem to have run into this again. I was overhauling the
>>>>>> cadence_ttc (patch attached, based on tip/timers/core). And it seems to
>>>>>> show the same behavior as enabling the global_timer. With cpuidle off, it
>>>>>> works. With cpuidle, on it hangs. Removing the TIMER_STOP flag from the
>>>>>> C2 state makes it boot again.
>>>>>> It works just fine on our 3.10 kernel.
>>>>>
>>>>> This is not necessary related to the bug. If the patch you sent broke
>>>>> the cadence_ttc driver, when you use it (with the TIMER_STOP), you will
>>>>> be stuck. Removing the flag, may signifies you don't use the broadcast
>>>>> timer, hence the bug is not surfacing.
>>>>>
>>>>> Going back to the bug with the arm_global_timer, what is observed is the
>>>>> broadcast timer is *shutdown* when the second cpu is online.
>>>>>
>>>>> I have to dig into the kernel/time/clockevents.c|tick-*.c because IMO
>>>>> the issue is coming from there but before I have to reproduce the bug,
>>>>> so find a board I have where I can add the arm_global_timer.
>>>>>
>>>>>> Another thing I noticed - probably unrelated but hard to tell: On
>>>>>> 3.11-rc1 and later my system stops for quite some time at the hand off
>>>>>> to userspace. I.e. I see the 'freeing unused kernel memory...' line and
>>>>>> sometimes the following 'Welcome to Buildroot...' and then it stops and
>>>>>> on good kernels it continues after a while and boots through and on bad
>>>>>> ones it just hangs there.
>>>>>
>>>>> did you try to dump the stacks with magic-sysrq ? Or git bisect ?
>>>>
>>>> Soren: Are you able to replicate this issue on QEMU?
>>>> If yes, it should be the best if you can provide Qemu, kernel .config/
>>>> rootfs and simple manual to Daniel how to reach that fault.
>>>
>>> I tried to download qemu for zynq but it fails:
>>>
>>> git clone git://git.xilinx.com/qemu-xarm.git
>>> Cloning into 'qemu-xarm'...
>>> fatal: The remote end hung up unexpectedly
>>
>> Not sure which site have you found but
>> it should be just qemu.git
>> https://github.com/Xilinx/qemu
>>
>> or github clone.
> 
> Ok, cool I was able to clone it.
> 
>>> I am also looking for the option specified for the kernel:
>>>
>>> "The kernel needs to be built with this feature turned on (in
>>> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
>>> Fixed Address)."
> 
> Ok.
> 
>> This also sound like a very ancient tree.
>> This is the latest kernel tree - master-next is the latest devel branch.
>> https://github.com/Xilinx/linux-xlnx
> 
> Ok, cool. I have the right one.
> 
>> Or there should be an option to use the latest kernel from kernel.org.
>> (I think Soren is using it)
>>
>> Zynq is part of the multiplatform kernel and cadence ttc is there,
>> dts is also in the mainline kernel.
>>
>>> ps : apart that, well documented website !
>>
>> Can you send me the link to it?
> 
> http://xilinx.wikidot.com/zynq-qemu
> http://xilinx.wikidot.com/zynq-linux

I will find out why it is still there.
I think the content was moved to the new location.

> 
>> This should be the main page for it.
>> http://www.wiki.xilinx.com/
> 
> Thanks Michal !

Thank you too, Daniel,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 263 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130806/646e30d0/attachment.sig>

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06 13:18                                                       ` Michal Simek
@ 2013-08-06 16:09                                                         ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-06 16:09 UTC (permalink / raw)
  To: monstr
  Cc: Michal Simek, Sören Brinkmann, Stuart Menefy, Stephen Boyd,
	John Stultz, Thomas Gleixner, Russell King, linux-kernel,
	linux-arm-kernel

On 08/06/2013 03:18 PM, Michal Simek wrote:

[ ... ]

>>>>> Soren: Are you able to replicate this issue on QEMU?
>>>>> If yes, it should be the best if you can provide Qemu, kernel .config/
>>>>> rootfs and simple manual to Daniel how to reach that fault.
>>>>
>>>> I tried to download qemu for zynq but it fails:
>>>>
>>>> git clone git://git.xilinx.com/qemu-xarm.git
>>>> Cloning into 'qemu-xarm'...
>>>> fatal: The remote end hung up unexpectedly
>>>
>>> Not sure which site have you found but
>>> it should be just qemu.git
>>> https://github.com/Xilinx/qemu
>>>
>>> or github clone.
>>
>> Ok, cool I was able to clone it.
>>
>>>> I am also looking for the option specified for the kernel:
>>>>
>>>> "The kernel needs to be built with this feature turned on (in
>>>> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
>>>> Fixed Address)."
>>
>> Ok.
>>
>>> This also sound like a very ancient tree.
>>> This is the latest kernel tree - master-next is the latest devel branch.
>>> https://github.com/Xilinx/linux-xlnx
>>
>> Ok, cool. I have the right one.

Following the documentation, I was able to boot a kernel with qemu using
the linux-xlnx and qemu-xilinx trees.

But this kernel is outdated compared to the upstream one, so I tried to
boot a 3.11-rc4 kernel, without success. I did the following:

I used the default config file from linux-xlnx for the upstream kernel.

I compiled the kernel with:

make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-
UIMAGE_LOADADDR=0x8000 uImage

I generated the dtb with:

make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- dtbs

For qemu, I started qemu with:

./arm-softmmu/qemu-system-arm -M arm-generic-fdt -nographic -smp 2
-machine linux=on -serial mon:stdio -dtb zynq-zed.dtb -kernel
kernel/zImage -initrd filesystem/ramdisk.img

I tried with the dtb available for the upstream kernel:

zynq-zc706.dtb, zynq-zc702.dtb and zynq-zed.dtb

Did I miss something ?

Thanks
  -- Daniel



>>
>>> Or there should be an option to use the latest kernel from kernel.org.
>>> (I think Soren is using it)
>>>
>>> Zynq is part of the multiplatform kernel and cadence ttc is there,
>>> dts is also in the mainline kernel.
>>>
>>>> ps : apart that, well documented website !
>>>
>>> Can you send me the link to it?
>>
>> http://xilinx.wikidot.com/zynq-qemu
>> http://xilinx.wikidot.com/zynq-linux
> 
> I will find out why it is still there.
> I think the content was moved to the new location.
> 
>>
>>> This should be the main page for it.
>>> http://www.wiki.xilinx.com/


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Enable arm_global_timer for Zynq brakes boot
@ 2013-08-06 16:09                                                         ` Daniel Lezcano
  0 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-06 16:09 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/06/2013 03:18 PM, Michal Simek wrote:

[ ... ]

>>>>> Soren: Are you able to replicate this issue on QEMU?
>>>>> If yes, it should be the best if you can provide Qemu, kernel .config/
>>>>> rootfs and simple manual to Daniel how to reach that fault.
>>>>
>>>> I tried to download qemu for zynq but it fails:
>>>>
>>>> git clone git://git.xilinx.com/qemu-xarm.git
>>>> Cloning into 'qemu-xarm'...
>>>> fatal: The remote end hung up unexpectedly
>>>
>>> Not sure which site have you found but
>>> it should be just qemu.git
>>> https://github.com/Xilinx/qemu
>>>
>>> or github clone.
>>
>> Ok, cool I was able to clone it.
>>
>>>> I am also looking for the option specified for the kernel:
>>>>
>>>> "The kernel needs to be built with this feature turned on (in
>>>> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
>>>> Fixed Address)."
>>
>> Ok.
>>
>>> This also sound like a very ancient tree.
>>> This is the latest kernel tree - master-next is the latest devel branch.
>>> https://github.com/Xilinx/linux-xlnx
>>
>> Ok, cool. I have the right one.

Following the documentation, I was able to boot a kernel with qemu using
the linux-xlnx and qemu-xilinx trees.

But this kernel is outdated compared to the upstream one, so I tried to
boot a 3.11-rc4 kernel, without success. I did the following:

I used the default config file from linux-xlnx for the upstream kernel.

I compiled the kernel with:

make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-
UIMAGE_LOADADDR=0x8000 uImage

I generated the dtb with:

make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- dtbs

For qemu, I started qemu with:

./arm-softmmu/qemu-system-arm -M arm-generic-fdt -nographic -smp 2
-machine linux=on -serial mon:stdio -dtb zynq-zed.dtb -kernel
kernel/zImage -initrd filesystem/ramdisk.img

I tried with the dtb available for the upstream kernel:

zynq-zc706.dtb, zynq-zc702.dtb and zynq-zed.dtb

Did I miss something ?

Thanks
  -- Daniel



>>
>>> Or there should be an option to use the latest kernel from kernel.org.
>>> (I think Soren is using it)
>>>
>>> Zynq is part of the multiplatform kernel and cadence ttc is there,
>>> dts is also in the mainline kernel.
>>>
>>>> ps : apart that, well documented website !
>>>
>>> Can you send me the link to it?
>>
>> http://xilinx.wikidot.com/zynq-qemu
>> http://xilinx.wikidot.com/zynq-linux
> 
> I will find out why it is still there.
> I think the content was moved to the new location.
> 
>>
>>> This should be the main page for it.
>>> http://www.wiki.xilinx.com/


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06 16:09                                                         ` Daniel Lezcano
@ 2013-08-06 16:13                                                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-06 16:13 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: monstr, Michal Simek, Stuart Menefy, Stephen Boyd, John Stultz,
	Thomas Gleixner, Russell King, linux-kernel, linux-arm-kernel

On Tue, Aug 06, 2013 at 06:09:09PM +0200, Daniel Lezcano wrote:
> On 08/06/2013 03:18 PM, Michal Simek wrote:
> 
> [ ... ]
> 
> >>>>> Soren: Are you able to replicate this issue on QEMU?
> >>>>> If yes, it should be the best if you can provide Qemu, kernel .config/
> >>>>> rootfs and simple manual to Daniel how to reach that fault.
> >>>>
> >>>> I tried to download qemu for zynq but it fails:
> >>>>
> >>>> git clone git://git.xilinx.com/qemu-xarm.git
> >>>> Cloning into 'qemu-xarm'...
> >>>> fatal: The remote end hung up unexpectedly
> >>>
> >>> Not sure which site have you found but
> >>> it should be just qemu.git
> >>> https://github.com/Xilinx/qemu
> >>>
> >>> or github clone.
> >>
> >> Ok, cool I was able to clone it.
> >>
> >>>> I am also looking for the option specified for the kernel:
> >>>>
> >>>> "The kernel needs to be built with this feature turned on (in
> >>>> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
> >>>> Fixed Address)."
> >>
> >> Ok.
> >>
> >>> This also sound like a very ancient tree.
> >>> This is the latest kernel tree - master-next is the latest devel branch.
> >>> https://github.com/Xilinx/linux-xlnx
> >>
> >> Ok, cool. I have the right one.
> 
> Following the documentation, I was able to boot a kernel with qemu using
> the linux-xlnx and qemu-xilinx trees.
> 
> But this kernel is outdated compared to the upstream one, so I tried to
> boot a 3.11-rc4 kernel, without success. I did the following:
> 
> I used the default config file from linux-xlnx for the upstream kernel.
> 
> I compiled the kernel with:
> 
> make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-
> UIMAGE_LOADADDR=0x8000 uImage
> 
> I generated the dtb with:
> 
> make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- dtbs
> 
> For qemu, I started qemu with:
> 
> ./arm-softmmu/qemu-system-arm -M arm-generic-fdt -nographic -smp 2
> -machine linux=on -serial mon:stdio -dtb zynq-zed.dtb -kernel
> kernel/zImage -initrd filesystem/ramdisk.img
> 
> I tried with the dtb available for the upstream kernel:
> 
> zynq-zc706.dtb, zynq-zc702.dtb and zynq-zed.dtb
> 
> Did I miss something ?

Unfortunately the public github QEMU is behind our internal development.
IIRC, there were a couple of DT compatible strings QEMU didn't know.
Those are sometimes removed, making boot fail/hang.

Are you on IRC? It's probably easier to resolve this in a more direct
way (I'm 'sorenb' on freenode). Otherwise I could give you a few more
instructions for debugging this by mail?

	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Enable arm_global_timer for Zynq brakes boot
@ 2013-08-06 16:13                                                           ` Sören Brinkmann
  0 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-06 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 06, 2013 at 06:09:09PM +0200, Daniel Lezcano wrote:
> On 08/06/2013 03:18 PM, Michal Simek wrote:
> 
> [ ... ]
> 
> >>>>> Soren: Are you able to replicate this issue on QEMU?
> >>>>> If yes, it should be the best if you can provide Qemu, kernel .config/
> >>>>> rootfs and simple manual to Daniel how to reach that fault.
> >>>>
> >>>> I tried to download qemu for zynq but it fails:
> >>>>
> >>>> git clone git://git.xilinx.com/qemu-xarm.git
> >>>> Cloning into 'qemu-xarm'...
> >>>> fatal: The remote end hung up unexpectedly
> >>>
> >>> Not sure which site have you found but
> >>> it should be just qemu.git
> >>> https://github.com/Xilinx/qemu
> >>>
> >>> or github clone.
> >>
> >> Ok, cool I was able to clone it.
> >>
> >>>> I am also looking for the option specified for the kernel:
> >>>>
> >>>> "The kernel needs to be built with this feature turned on (in
> >>>> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
> >>>> Fixed Address)."
> >>
> >> Ok.
> >>
> >>> This also sound like a very ancient tree.
> >>> This is the latest kernel tree - master-next is the latest devel branch.
> >>> https://github.com/Xilinx/linux-xlnx
> >>
> >> Ok, cool. I have the right one.
> 
> Following the documentation, I was able to boot a kernel with qemu using
> the linux-xlnx and qemu-xilinx trees.
> 
> But this kernel is outdated compared to the upstream one, so I tried to
> boot a 3.11-rc4 kernel, without success. I did the following:
> 
> I used the default config file from linux-xlnx for the upstream kernel.
> 
> I compiled the kernel with:
> 
> make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-
> UIMAGE_LOADADDR=0x8000 uImage
> 
> I generated the dtb with:
> 
> make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- dtbs
> 
> For qemu, I started qemu with:
> 
> ./arm-softmmu/qemu-system-arm -M arm-generic-fdt -nographic -smp 2
> -machine linux=on -serial mon:stdio -dtb zynq-zed.dtb -kernel
> kernel/zImage -initrd filesystem/ramdisk.img
> 
> I tried with the dtb available for the upstream kernel:
> 
> zynq-zc706.dtb, zynq-zc702.dtb and zynq-zed.dtb
> 
> Did I miss something ?

Unfortunately the public github QEMU is behind our internal development.
IIRC, there were a couple of DT compatible strings QEMU didn't know.
Those are sometimes removed, making boot fail/hang.

Are you on IRC? It's probably easier to resolve this in a more direct
way (I'm 'sorenb' on freenode). Otherwise I could give you a few more
instructions for debugging this by mail?

	Sören

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06  8:46                                             ` Daniel Lezcano
@ 2013-08-06 16:18                                               ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-06 16:18 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stuart Menefy, Stephen Boyd, John Stultz, Thomas Gleixner,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

On Tue, Aug 06, 2013 at 10:46:54AM +0200, Daniel Lezcano wrote:
> On 08/06/2013 03:28 AM, Sören Brinkmann wrote:
> > Hi Daniel,
> > 
> > On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> >> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> >>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> >>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> >>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> >>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> >>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> >>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>>>>>>>>>> Hi Daniel,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
> >>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>>>>>>>> always remain a mystery to me.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>>>>>>>> hang pointing to some issue with that driver?
> >>>>>>>>>>>>
> >>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>>>>>>>
> >>>>>>>>>>> So, the correct run results (full output attached).
> >>>>>>>>>>>
> >>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>>>>>>>> broadcast device:
> >>>>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>>>> 	Broadcast device  
> >>>>>>>>>>> 	Clock Event Device: ttc_clockevent
> >>>>>>>>>>>
> >>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
> >>>>>>>>>>> and the broadcast device is the global timer:
> >>>>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>>>> 	Broadcast device                                                                 
> >>>>>>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>>>>>
> >>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
> >>>>>>>>>>> obtain this information for that case.
> >>>>>>>>>>
> >>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
> >>>>>>>>>
> >>>>>>>>> Right, that works. I forgot about that option after you mentioned, that
> >>>>>>>>> it is most likely not that useful.
> >>>>>>>>>
> >>>>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
> >>>>>>>>> the gt enabled and having maxcpus=1 set.
> >>>>>>>>>
> >>>>>>>>> /proc/timer_list:
> >>>>>>>>> 	Tick Device: mode:     1
> >>>>>>>>> 	Broadcast device
> >>>>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>>> 	 max_delta_ns:   12884902005
> >>>>>>>>> 	 min_delta_ns:   1000
> >>>>>>>>> 	 mult:           715827876
> >>>>>>>>> 	 shift:          31
> >>>>>>>>> 	 mode:           3
> >>>>>>>>
> >>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>>>>>>>
> >>>>>>>> The previous timer_list output you gave me when removing the offending
> >>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> >>>>>>>>
> >>>>>>>> Is it possible you try to get this output again right after onlining the
> >>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> >>>>>>>
> >>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> >>>>>>> and that didn't end well:
> >>>>>>> 	# echo 1 > online && cat /proc/timer_list 
> >>>>>>
> >>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> >>>>>> apparently this is not the case... :(
> >>>>>>
> >>>>>> I suspect the global timer is shutdown at one moment but I don't
> >>>>>> understand why and when.
> >>>>>>
> >>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
> >>>>>> the clockevent device name ? Perhaps, we may see at boot time an
> >>>>>> interesting trace when it hangs.
> >>>>>
> >>>>> I did this change:
> >>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> >>>>> 	index 38959c8..3ab11c1 100644
> >>>>> 	--- a/kernel/time/clockevents.c
> >>>>> 	+++ b/kernel/time/clockevents.c
> >>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> >>>>> 	  */
> >>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
> >>>>> 	 {
> >>>>> 	+       pr_info("ce->name:%s\n", dev->name);
> >>>>> 	+       dump_stack();
> >>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> >>>>> 	        dev->next_event.tv64 = KTIME_MAX;
> >>>>> 	 }
> >>>>>
> >>>>> It is hit a few times during boot, so I attach a full boot log. I really
> >>>>> don't know what to look for, but I hope you can spot something in it. I
> >>>>> really appreciate you taking the time.
> >>>>
> >>>> Thanks for the traces.
> >>>
> >>> Sure.
> >>>
> >>>>
> >>>> If you try without the ttc_clockevent configured in the kernel (but with
> >>>> twd and gt), does it boot ?
> >>>
> >>> Absence of the TTC doesn't seem to make any difference. It hangs at the
> >>> same location.
> >>
> >> Ok, IMO there is a problem with the broadcast device registration (may
> >> be vs twd).
> >>
> >> I will check later (kid duty) :)
> > 
> > I was actually waiting for an update from your side and did something
> > else, but I seem to have run into this again. I was overhauling the
> > cadence_ttc (patch attached, based on tip/timers/core). And it seems to
> > show the same behavior as enabling the global_timer. With cpuidle off, it
> > works. With cpuidle, on it hangs. Removing the TIMER_STOP flag from the
> > C2 state makes it boot again.
> > It works just fine on our 3.10 kernel.
> 
> This is not necessary related to the bug. If the patch you sent broke
> the cadence_ttc driver, when you use it (with the TIMER_STOP), you will
> be stuck. Removing the flag, may signifies you don't use the broadcast
> timer, hence the bug is not surfacing.
> 
> Going back to the bug with the arm_global_timer, what is observed is the
> broadcast timer is *shutdown* when the second cpu is online.
> 
> I have to dig into the kernel/time/clockevents.c|tick-*.c because IMO
> the issue is coming from there but before I have to reproduce the bug,
> so find a board I have where I can add the arm_global_timer.
> 
> > Another thing I noticed - probably unrelated but hard to tell: On
> > 3.11-rc1 and later my system stops for quite some time at the hand off
> > to userspace. I.e. I see the 'freeing unused kernel memory...' line and
> > sometimes the following 'Welcome to Buildroot...' and then it stops and
> > on good kernels it continues after a while and boots through and on bad
> > ones it just hangs there.
> 
> did you try to dump the stacks with magic-sysrq ? Or git bisect ?

That's my plan for today: trying to find a reasonable approach for
bisecting some of this stuff.

	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Enable arm_global_timer for Zynq brakes boot
@ 2013-08-06 16:18                                               ` Sören Brinkmann
  0 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-06 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 06, 2013 at 10:46:54AM +0200, Daniel Lezcano wrote:
> On 08/06/2013 03:28 AM, Sören Brinkmann wrote:
> > Hi Daniel,
> > 
> > On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> >> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> >>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> >>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> >>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> >>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> >>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> >>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>>>>>>>>>> Hi Daniel,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
> >>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>>>>>>>> always remain a mystery to me.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>>>>>>>> hang pointing to some issue with that driver?
> >>>>>>>>>>>>
> >>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>>>>>>>
> >>>>>>>>>>> So, the correct run results (full output attached).
> >>>>>>>>>>>
> >>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>>>>>>>> broadcast device:
> >>>>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>>>> 	Broadcast device  
> >>>>>>>>>>> 	Clock Event Device: ttc_clockevent
> >>>>>>>>>>>
> >>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
> >>>>>>>>>>> and the broadcast device is the global timer:
> >>>>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>>>> 	Broadcast device                                                                 
> >>>>>>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>>>>>
> >>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
> >>>>>>>>>>> obtain this information for that case.
> >>>>>>>>>>
> >>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
> >>>>>>>>>
> >>>>>>>>> Right, that works. I forgot about that option after you mentioned, that
> >>>>>>>>> it is most likely not that useful.
> >>>>>>>>>
> >>>>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
> >>>>>>>>> the gt enabled and having maxcpus=1 set.
> >>>>>>>>>
> >>>>>>>>> /proc/timer_list:
> >>>>>>>>> 	Tick Device: mode:     1
> >>>>>>>>> 	Broadcast device
> >>>>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>>> 	 max_delta_ns:   12884902005
> >>>>>>>>> 	 min_delta_ns:   1000
> >>>>>>>>> 	 mult:           715827876
> >>>>>>>>> 	 shift:          31
> >>>>>>>>> 	 mode:           3
> >>>>>>>>
> >>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>>>>>>>
> >>>>>>>> The previous timer_list output you gave me when removing the offending
> >>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> >>>>>>>>
> >>>>>>>> Is it possible you try to get this output again right after onlining the
> >>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> >>>>>>>
> >>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> >>>>>>> and that didn't end well:
> >>>>>>> 	# echo 1 > online && cat /proc/timer_list 
> >>>>>>
> >>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> >>>>>> apparently this is not the case... :(
> >>>>>>
> >>>>>> I suspect the global timer is shutdown at one moment but I don't
> >>>>>> understand why and when.
> >>>>>>
> >>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
> >>>>>> the clockevent device name ? Perhaps, we may see at boot time an
> >>>>>> interesting trace when it hangs.
> >>>>>
> >>>>> I did this change:
> >>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> >>>>> 	index 38959c8..3ab11c1 100644
> >>>>> 	--- a/kernel/time/clockevents.c
> >>>>> 	+++ b/kernel/time/clockevents.c
> >>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> >>>>> 	  */
> >>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
> >>>>> 	 {
> >>>>> 	+       pr_info("ce->name:%s\n", dev->name);
> >>>>> 	+       dump_stack();
> >>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> >>>>> 	        dev->next_event.tv64 = KTIME_MAX;
> >>>>> 	 }
> >>>>>
> >>>>> It is hit a few times during boot, so I attach a full boot log. I really
> >>>>> don't know what to look for, but I hope you can spot something in it. I
> >>>>> really appreciate you taking the time.
> >>>>
> >>>> Thanks for the traces.
> >>>
> >>> Sure.
> >>>
> >>>>
> >>>> If you try without the ttc_clockevent configured in the kernel (but with
> >>>> twd and gt), does it boot ?
> >>>
> >>> Absence of the TTC doesn't seem to make any difference. It hangs at the
> >>> same location.
> >>
> >> Ok, IMO there is a problem with the broadcast device registration (may
> >> be vs twd).
> >>
> >> I will check later (kid duty) :)
> > 
> > I was actually waiting for an update from your side and did something
> > else, but I seem to have run into this again. I was overhauling the
> > cadence_ttc (patch attached, based on tip/timers/core). And it seems to
> > show the same behavior as enabling the global_timer. With cpuidle off, it
> > works. With cpuidle, on it hangs. Removing the TIMER_STOP flag from the
> > C2 state makes it boot again.
> > It works just fine on our 3.10 kernel.
> 
> This is not necessary related to the bug. If the patch you sent broke
> the cadence_ttc driver, when you use it (with the TIMER_STOP), you will
> be stuck. Removing the flag, may signifies you don't use the broadcast
> timer, hence the bug is not surfacing.
> 
> Going back to the bug with the arm_global_timer, what is observed is the
> broadcast timer is *shutdown* when the second cpu is online.
> 
> I have to dig into the kernel/time/clockevents.c|tick-*.c because IMO
> the issue is coming from there but before I have to reproduce the bug,
> so find a board I have where I can add the arm_global_timer.
> 
> > Another thing I noticed - probably unrelated but hard to tell: On
> > 3.11-rc1 and later my system stops for quite some time at the hand off
> > to userspace. I.e. I see the 'freeing unused kernel memory...' line and
> > sometimes the following 'Welcome to Buildroot...' and then it stops and
> > on good kernels it continues after a while and boots through and on bad
> > ones it just hangs there.
> 
> did you try to dump the stacks with magic-sysrq ? Or git bisect ?

That's my plan for today: trying to find a reasonable approach for
bisecting some of this stuff.

	Sören

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-06 16:09                                                         ` Daniel Lezcano
@ 2013-08-06 16:25                                                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-06 16:25 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: monstr, Michal Simek, Stuart Menefy, Stephen Boyd, John Stultz,
	Thomas Gleixner, Russell King, linux-kernel, linux-arm-kernel

On Tue, Aug 06, 2013 at 06:09:09PM +0200, Daniel Lezcano wrote:
> On 08/06/2013 03:18 PM, Michal Simek wrote:
> 
> [ ... ]
> 
> >>>>> Soren: Are you able to replicate this issue on QEMU?
> >>>>> If yes, it should be the best if you can provide Qemu, kernel .config/
> >>>>> rootfs and simple manual to Daniel how to reach that fault.
> >>>>
> >>>> I tried to download qemu for zynq but it fails:
> >>>>
> >>>> git clone git://git.xilinx.com/qemu-xarm.git
> >>>> Cloning into 'qemu-xarm'...
> >>>> fatal: The remote end hung up unexpectedly
> >>>
> >>> Not sure which site have you found but
> >>> it should be just qemu.git
> >>> https://github.com/Xilinx/qemu
> >>>
> >>> or github clone.
> >>
> >> Ok, cool I was able to clone it.
> >>
> >>>> I am also looking for the option specified for the kernel:
> >>>>
> >>>> "The kernel needs to be built with this feature turned on (in
> >>>> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
> >>>> Fixed Address)."
> >>
> >> Ok.
> >>
> >>> This also sound like a very ancient tree.
> >>> This is the latest kernel tree - master-next is the latest devel branch.
> >>> https://github.com/Xilinx/linux-xlnx
> >>
> >> Ok, cool. I have the right one.
> 
> Following the documentation, I was able to boot a kernel with qemu using
> the linux-xlnx and qemu-xilinx trees.
> 
> But this kernel is outdated compared to the upstream one, so I tried to
> boot a 3.11-rc4 kernel, without success. I did the following:
> 
> I used the default config file from linux-xlnx for the upstream kernel.
> 
> I compiled the kernel with:
> 
> make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-
> UIMAGE_LOADADDR=0x8000 uImage
> 
> I generated the dtb with:
> 
> make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- dtbs
> 
> For qemu, I started qemu with:
> 
> ./arm-softmmu/qemu-system-arm -M arm-generic-fdt -nographic -smp 2
> -machine linux=on -serial mon:stdio -dtb zynq-zed.dtb -kernel
> kernel/zImage -initrd filesystem/ramdisk.img
> 
> I tried with the dtb available for the upstream kernel:
> 
> zynq-zc706.dtb, zynq-zc702.dtb and zynq-zed.dtb
> 
> Did I miss something ?

Some debugging hints in case you wanna go through this.
Add this additional option to configure:
 --extra-cflags="-DFDT_GENERIC_UTIL_ERR_DEBUG=1"

That'll print out a lot of messages when the dtb is parsed. It's likely
that QEMU invalidates some vital node due to its compatible string being
unknown. In that case you can simply add it to the list of known devices
in 
	hw/core/fdt_generic_devices.c
The list is pretty much at the end of that file. I'll try to get it running
here and might be able to send you a patch.
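(For illustration only: a schematic C sketch with made-up names, not the
actual code in hw/core/fdt_generic_devices.c. Conceptually the parser
consults a table of known compatible strings, and a node whose string it
cannot match is invalidated.)

	/* Schematic sketch -- all names here are hypothetical. */
	#include <string.h>
	#include <stdbool.h>

	static const char *known_compats[] = {
		"arm,cortex-a9-gic",
		"xlnx,xuartps",
		/* adding the missing compatible string here would keep the node */
	};

	static bool fdt_node_is_known(const char *compat)
	{
		size_t i;

		for (i = 0; i < sizeof(known_compats) / sizeof(known_compats[0]); i++)
			if (!strcmp(known_compats[i], compat))
				return true;
		return false;	/* unknown: node is dropped, boot may fail/hang */
	}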

	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Enable arm_global_timer for Zynq brakes boot
@ 2013-08-06 16:25                                                           ` Sören Brinkmann
  0 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-06 16:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 06, 2013 at 06:09:09PM +0200, Daniel Lezcano wrote:
> On 08/06/2013 03:18 PM, Michal Simek wrote:
> 
> [ ... ]
> 
> >>>>> Soren: Are you able to replicate this issue on QEMU?
> >>>>> If yes, it should be the best if you can provide Qemu, kernel .config/
> >>>>> rootfs and simple manual to Daniel how to reach that fault.
> >>>>
> >>>> I tried to download qemu for zynq but it fails:
> >>>>
> >>>> git clone git://git.xilinx.com/qemu-xarm.git
> >>>> Cloning into 'qemu-xarm'...
> >>>> fatal: The remote end hung up unexpectedly
> >>>
> >>> Not sure which site have you found but
> >>> it should be just qemu.git
> >>> https://github.com/Xilinx/qemu
> >>>
> >>> or github clone.
> >>
> >> Ok, cool I was able to clone it.
> >>
> >>>> I am also looking for the option specified for the kernel:
> >>>>
> >>>> "The kernel needs to be built with this feature turned on (in
> >>>> menuconfig, System Type->Xilinx Specific Features -> Device Tree At
> >>>> Fixed Address)."
> >>
> >> Ok.
> >>
> >>> This also sound like a very ancient tree.
> >>> This is the latest kernel tree - master-next is the latest devel branch.
> >>> https://github.com/Xilinx/linux-xlnx
> >>
> >> Ok, cool. I have the right one.
> 
> Following the documentation, I was able to boot a kernel with qemu using
> the linux-xlnx and qemu-xilinx trees.
> 
> But this kernel is outdated compared to the upstream one, so I tried to
> boot a 3.11-rc4 kernel, without success. I did the following:
> 
> I used the default config file from linux-xlnx for the upstream kernel.
> 
> I compiled the kernel with:
> 
> make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-
> UIMAGE_LOADADDR=0x8000 uImage
> 
> I generated the dtb with:
> 
> make -j 5 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- dtbs
> 
> For qemu, I started qemu with:
> 
> ./arm-softmmu/qemu-system-arm -M arm-generic-fdt -nographic -smp 2
> -machine linux=on -serial mon:stdio -dtb zynq-zed.dtb -kernel
> kernel/zImage -initrd filesystem/ramdisk.img
> 
> I tried with the dtb available for the upstream kernel:
> 
> zynq-zc706.dtb, zynq-zc702.dtb and zynq-zed.dtb
> 
> Did I miss something ?

Some debugging hints in case you wanna go through this.
Add this additional option to configure:
 --extra-cflags="-DFDT_GENERIC_UTIL_ERR_DEBUG=1"

That'll print out a lot of messages when the dtb is parsed. It's likely
that QEMU invalidates some vital node due to its compatible string being
unknown. In that case you can simply add it to the list of known devices
in 
	hw/core/fdt_generic_devices.c
The list is pretty much at the end of that file. I'll try to get it running
here and might be able to send you a patch.

	Sören

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-01 17:48                                         ` Daniel Lezcano
@ 2013-08-08 17:11                                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-08 17:11 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stuart Menefy, Stephen Boyd, John Stultz, Thomas Gleixner,
	Russell King, Michal Simek, linux-kernel, linux-arm-kernel

Hi Daniel,

On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> > On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> >> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> >>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> >>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> >>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> >>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>>>>>>>> Hi Daniel,
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>> (snip)
> >>>>>>>>>>>>
> >>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>>>>>
> >>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
> >>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>>>>>> always remain a mystery to me.
> >>>>>>>>>>>
> >>>>>>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>>>>>> hang pointing to some issue with that driver?
> >>>>>>>>>>
> >>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>>>>>
> >>>>>>>>> So, the correct run results (full output attached).
> >>>>>>>>>
> >>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>>>>>> broadcast device:
> >>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>> 	Broadcast device  
> >>>>>>>>> 	Clock Event Device: ttc_clockevent
> >>>>>>>>>
> >>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>>>>>> enable the global timer, the twd timers are still used as local timers
> >>>>>>>>> and the broadcast device is the global timer:
> >>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>> 	Broadcast device                                                                 
> >>>>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>>>
> >>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
> >>>>>>>>> obtain this information for that case.
> >>>>>>>>
> >>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
> >>>>>>>
> >>>>>>> Right, that works. I forgot about that option after you mentioned, that
> >>>>>>> it is most likely not that useful.
> >>>>>>>
> >>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
> >>>>>>> the gt enabled and having maxcpus=1 set.
> >>>>>>>
> >>>>>>> /proc/timer_list:
> >>>>>>> 	Tick Device: mode:     1
> >>>>>>> 	Broadcast device
> >>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>> 	 max_delta_ns:   12884902005
> >>>>>>> 	 min_delta_ns:   1000
> >>>>>>> 	 mult:           715827876
> >>>>>>> 	 shift:          31
> >>>>>>> 	 mode:           3
> >>>>>>
> >>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>>>>>
> >>>>>> The previous timer_list output you gave me when removing the offending
> >>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> >>>>>>
> >>>>>> Is it possible you try to get this output again right after onlining the
> >>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> >>>>>
> >>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> >>>>> and that didn't end well:
> >>>>> 	# echo 1 > online && cat /proc/timer_list 
> >>>>
> >>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> >>>> apparently this is not the case... :(
> >>>>
> >>>> I suspect the global timer is shutdown at one moment but I don't
> >>>> understand why and when.
> >>>>
> >>>> Can you add a stack trace in the "clockevents_shutdown" function with
> >>>> the clockevent device name ? Perhaps, we may see at boot time an
> >>>> interesting trace when it hangs.
> >>>
> >>> I did this change:
> >>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> >>> 	index 38959c8..3ab11c1 100644
> >>> 	--- a/kernel/time/clockevents.c
> >>> 	+++ b/kernel/time/clockevents.c
> >>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> >>> 	  */
> >>> 	 void clockevents_shutdown(struct clock_event_device *dev)
> >>> 	 {
> >>> 	+       pr_info("ce->name:%s\n", dev->name);
> >>> 	+       dump_stack();
> >>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> >>> 	        dev->next_event.tv64 = KTIME_MAX;
> >>> 	 }
> >>>
> >>> It is hit a few times during boot, so I attach a full boot log. I really
> >>> don't know what to look for, but I hope you can spot something in it. I
> >>> really appreciate you taking the time.
> >>
> >> Thanks for the traces.
> > 
> > Sure.
> > 
> >>
> >> If you try without the ttc_clockevent configured in the kernel (but with
> >> twd and gt), does it boot ?
> > 
> > Absence of the TTC doesn't seem to make any difference. It hangs at the
> > same location.
> 
> Ok, IMO there is a problem with the broadcast device registration (may
> be vs twd).

I have an idea, but no real evidence to prove it:
Some of the registers in the arm_global_timer are banked per CPU, i.e.
some code must be executed on the CPU the timer is associated with
(struct clock_event_device.cpumask) to have the intended effect.
As far as I can tell, there is no guarantee that the set_mode()
and program_next_event() calls execute on the correct CPU.
If this is correct, shutting down the timer for the CPU entering
idle might actually shut down the timer for the running CPU, if
set_mode() executes on the CPU which is _not_ about to enter idle.

I tried to prove this by adding some really ugly smp_call_any() wrappers
in kernel/time/clockevents.c for the calls to set_mode() and
program_next_event(), but that ends in all kinds of deadlocks.
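(To make this concrete, here is a minimal sketch of the idea, assuming the
3.11-era set_mode() callback. The wrapper names below are hypothetical, not
an actual kernel API. It also shows where the deadlocks come from:
smp_call_function_single() with wait=1 must not be called from a context
that already has interrupts disabled.)

	#include <linux/smp.h>
	#include <linux/cpumask.h>
	#include <linux/clockchips.h>

	struct set_mode_args {
		struct clock_event_device *dev;
		enum clock_event_mode mode;
	};

	/* runs on the CPU that owns the banked timer registers */
	static void do_set_mode(void *data)
	{
		struct set_mode_args *args = data;

		args->dev->set_mode(args->mode, args->dev);
	}

	static void clockevents_set_mode_on_cpu(struct clock_event_device *dev,
						enum clock_event_mode mode)
	{
		struct set_mode_args args = { .dev = dev, .mode = mode };
		int cpu = cpumask_first(dev->cpumask);

		if (cpu == smp_processor_id()) {
			do_set_mode(&args);
		} else {
			/* blocks until completion; deadlocks if the caller
			 * already runs with interrupts disabled */
			smp_call_function_single(cpu, do_set_mode, &args, 1);
		}
	}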

	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Enable arm_global_timer for Zynq brakes boot
@ 2013-08-08 17:11                                           ` Sören Brinkmann
  0 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-08 17:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Daniel,

On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> > On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> >> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> >>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> >>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> >>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> >>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>>>>>>>> Hi Daniel,
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>> (snip)
> >>>>>>>>>>>>
> >>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>>>>>
> >>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
> >>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>>>>>> always remain a mystery to me.
> >>>>>>>>>>>
> >>>>>>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>>>>>> hang pointing to some issue with that driver?
> >>>>>>>>>>
> >>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>>>>>
> >>>>>>>>> So, the correct run results (full output attached).
> >>>>>>>>>
> >>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>>>>>> broadcast device:
> >>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>> 	Broadcast device  
> >>>>>>>>> 	Clock Event Device: ttc_clockevent
> >>>>>>>>>
> >>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>>>>>> enable the global timer, the twd timers are still used as local timers
> >>>>>>>>> and the broadcast device is the global timer:
> >>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>> 	Broadcast device                                                                 
> >>>>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>>>
> >>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
> >>>>>>>>> obtain this information for that case.
> >>>>>>>>
> >>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
> >>>>>>>
> >>>>>>> Right, that works. I forgot about that option after you mentioned, that
> >>>>>>> it is most likely not that useful.
> >>>>>>>
> >>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
> >>>>>>> the gt enabled and having maxcpus=1 set.
> >>>>>>>
> >>>>>>> /proc/timer_list:
> >>>>>>> 	Tick Device: mode:     1
> >>>>>>> 	Broadcast device
> >>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>> 	 max_delta_ns:   12884902005
> >>>>>>> 	 min_delta_ns:   1000
> >>>>>>> 	 mult:           715827876
> >>>>>>> 	 shift:          31
> >>>>>>> 	 mode:           3
> >>>>>>
> >>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>>>>>
> >>>>>> The previous timer_list output you gave me when removing the offending
> >>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> >>>>>>
> >>>>>> Is it possible you try to get this output again right after onlining the
> >>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> >>>>>
> >>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> >>>>> and that didn't end well:
> >>>>> 	# echo 1 > online && cat /proc/timer_list 
> >>>>
> >>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> >>>> apparently this is not the case... :(
> >>>>
> >>>> I suspect the global timer is shut down at some point, but I don't
> >>>> understand why and when.
> >>>>
> >>>> Can you add a stack trace in the "clockevents_shutdown" function, along with
> >>>> the clockevent device name ? Perhaps we may see an interesting
> >>>> trace at boot time when it hangs.
> >>>
> >>> I did this change:
> >>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> >>> 	index 38959c8..3ab11c1 100644
> >>> 	--- a/kernel/time/clockevents.c
> >>> 	+++ b/kernel/time/clockevents.c
> >>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> >>> 	  */
> >>> 	 void clockevents_shutdown(struct clock_event_device *dev)
> >>> 	 {
> >>> 	+       pr_info("ce->name:%s\n", dev->name);
> >>> 	+       dump_stack();
> >>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> >>> 	        dev->next_event.tv64 = KTIME_MAX;
> >>> 	 }
> >>>
> >>> It is hit a few times during boot, so I attach a full boot log. I really
> >>> don't know what to look for, but I hope you can spot something in it. I
> >>> really appreciate you taking the time.
> >>
> >> Thanks for the traces.
> > 
> > Sure.
> > 
> >>
> >> If you try without the ttc_clockevent configured in the kernel (but with
> >> twd and gt), does it boot ?
> > 
> > Absence of the TTC doesn't seem to make any difference. It hangs at the
> > same location.
> 
> Ok, IMO there is a problem with the broadcast device registration (maybe
> vs the twd).

I have an idea, but no real evidence to prove it:
Some of the registers in the arm_global_timer are banked per CPU, i.e.
some code must be executed on the CPU the timer is associated with
(struct clock_event_device.cpumask) to have the intended effect.
As far as I can tell, there is no guarantee that the set_mode()
and program_next_event() calls execute on the correct CPU.
If this is correct, shutting down the timer for the CPU entering
idle might actually shut down the timer for the running CPU, if
set_mode() executes on the CPU which is _not_ about to enter idle.

I tried to prove this by adding some really ugly smp_call_any() wrappers
in kernel/time/clockevents.c for the calls to set_mode() and
program_next_event(), but that ends in all kinds of deadlocks.
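
A minimal sketch of the wrapper idea (illustrative only; the actual
attempt is posted later in this thread), assuming set_mode() must run
on the CPU named in dev->cpumask because the registers it touches are
banked:

	#include <linux/smp.h>
	#include <linux/clockchips.h>

	static void set_mode_on_owner(void *info)
	{
		struct clock_event_device *dev = info;

		/* runs on the owning CPU, so it hits that CPU's register bank */
		dev->set_mode(CLOCK_EVT_MODE_SHUTDOWN, dev);
	}

	/* instead of calling dev->set_mode() directly: */
	smp_call_function_single(cpumask_first(dev->cpumask),
				 set_mode_on_owner, dev, 1);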

	Sören

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-08 17:11                                           ` Sören Brinkmann
@ 2013-08-08 17:16                                             ` Mark Rutland
  -1 siblings, 0 replies; 142+ messages in thread
From: Mark Rutland @ 2013-08-08 17:16 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Daniel Lezcano, Russell King, Stephen Boyd, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Thu, Aug 08, 2013 at 06:11:26PM +0100, Sören Brinkmann wrote:
> Hi Daniel,
> 
> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> > On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> > > On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> > >> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> > >>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> > >>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> > >>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> > >>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> > >>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> > >>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> > >>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> > >>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> > >>>>>>>>>>> Hi Daniel,
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> > >>>>>>>>>>> (snip)
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> > >>>>>>>>>>>> timer will be stopped when entering the idle state. In this case, the
> > >>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switch to a
> > >>>>>>>>>>>> broadcast timer, and will call clockevents_notify(EXIT) when exiting the
> > >>>>>>>>>>>> idle state, switching the local timer back into use.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> > >>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> > >>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> > >>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> > >>>>>>>>>>> something useless, but nothing so wrong that it should result in
> > >>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> > >>>>>>>>>>> always remain a mystery to me.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Actually this more or less leads to the question: what is this
> > >>>>>>>>>>> 'broadcast timer'? I guess that is some clockevent device which is
> > >>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> > >>>>>>>>>>> hang pointing to some issue with that driver?
> > >>>>>>>>>>
> > >>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> > >>>>>>>>>
> > >>>>>>>>> So, the correct run results (full output attached).
> > >>>>>>>>>
> > >>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> > >>>>>>>>> broadcast device:
> > >>>>>>>>>     Tick Device: mode:     1
> > >>>>>>>>>     Broadcast device
> > >>>>>>>>>     Clock Event Device: ttc_clockevent
> > >>>>>>>>>
> > >>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> > >>>>>>>>> enable the global timer, the twd timers are still used as local timers
> > >>>>>>>>> and the broadcast device is the global timer:
> > >>>>>>>>>     Tick Device: mode:     1
> > >>>>>>>>>     Broadcast device
> > >>>>>>>>>     Clock Event Device: arm_global_timer
> > >>>>>>>>>
> > >>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way to
> > >>>>>>>>> obtain this information for that case.
> > >>>>>>>>
> > >>>>>>>> Can't you use the maxcpus=1 option to ensure the system boots up ?
> > >>>>>>>
> > >>>>>>> Right, that works. I forgot about that option after you mentioned that
> > >>>>>>> it is most likely not that useful.
> > >>>>>>>
> > >>>>>>> Anyway, these are those sysfs files with an unmodified cpuidle driver,
> > >>>>>>> the gt enabled, and maxcpus=1 set.
> > >>>>>>>
> > >>>>>>> /proc/timer_list:
> > >>>>>>>       Tick Device: mode:     1
> > >>>>>>>       Broadcast device
> > >>>>>>>       Clock Event Device: arm_global_timer
> > >>>>>>>        max_delta_ns:   12884902005
> > >>>>>>>        min_delta_ns:   1000
> > >>>>>>>        mult:           715827876
> > >>>>>>>        shift:          31
> > >>>>>>>        mode:           3
> > >>>>>>
> > >>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT).
> > >>>>>>
> > >>>>>> In the previous timer_list output you gave me, with the offending
> > >>>>>> cpuidle flag removed, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> > >>>>>>
> > >>>>>> Is it possible for you to get this output again right after onlining
> > >>>>>> cpu1, in order to check whether the broadcast device switches to SHUTDOWN ?
> > >>>>>
> > >>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> > >>>>> and that didn't end well:
> > >>>>>         # echo 1 > online && cat /proc/timer_list
> > >>>>
> > >>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> > >>>> apparently this is not the case... :(
> > >>>>
> > >>>> I suspect the global timer is shut down at some point, but I don't
> > >>>> understand why and when.
> > >>>>
> > >>>> Can you add a stack trace in the "clockevents_shutdown" function, along with
> > >>>> the clockevent device name ? Perhaps we may see an interesting
> > >>>> trace at boot time when it hangs.
> > >>>
> > >>> I did this change:
> > >>>   diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> > >>>   index 38959c8..3ab11c1 100644
> > >>>   --- a/kernel/time/clockevents.c
> > >>>   +++ b/kernel/time/clockevents.c
> > >>>   @@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> > >>>     */
> > >>>    void clockevents_shutdown(struct clock_event_device *dev)
> > >>>    {
> > >>>   +       pr_info("ce->name:%s\n", dev->name);
> > >>>   +       dump_stack();
> > >>>           clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> > >>>           dev->next_event.tv64 = KTIME_MAX;
> > >>>    }
> > >>>
> > >>> It is hit a few times during boot, so I attach a full boot log. I really
> > >>> don't know what to look for, but I hope you can spot something in it. I
> > >>> really appreciate you taking the time.
> > >>
> > >> Thanks for the traces.
> > >
> > > Sure.
> > >
> > >>
> > >> If you try without the ttc_clockevent configured in the kernel (but with
> > >> twd and gt), does it boot ?
> > >
> > > Absence of the TTC doesn't seem to make any difference. It hangs at the
> > > same location.
> >
> > Ok, IMO there is a problem with the broadcast device registration (maybe
> > vs the twd).
> 
> I have an idea, but no real evidence to prove it:
> Some of the registers in the arm_global_timer are banked per CPU, i.e.
> some code must be executed on the CPU the timer is associated with
> (struct clock_event_device.cpumask) to have the intended effect.
> As far as I can tell, there is no guarantee that the set_mode()
> and program_next_event() calls execute on the correct CPU.

I believe the core clockevents code enforces that, or all other percpu
clockevent_device drivers would be horrifically broken.
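
(Roughly speaking, the guarantee comes from per-CPU tick devices being
registered and subsequently programmed from the CPU they belong to; the
TWD, for instance, registers its per-CPU clockevent from the local CPU,
along these lines:)

	/* per-CPU setup, run on each CPU as it comes up (cf. smp_twd.c) */
	clockevents_config_and_register(clk, twd_timer_rate,
					0xf, 0xffffffff);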

Thanks,
Mark.

> If this is correct, shutting down the timer for the CPU entering
> idle might actually shut down the timer for the running CPU, if
> set_mode() executes on the CPU which is _not_ about to enter idle.
> 
> I tried to prove this by adding some really ugly smp_call_any() wrappers
> in kernel/time/clockevents.c for the calls to set_mode() and
> program_next_event(), but that ends in all kinds of deadlocks.
> 
>         Sören
> 
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-08 17:16                                             ` Mark Rutland
@ 2013-08-08 17:22                                               ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-08 17:22 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Sören Brinkmann, Daniel Lezcano, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/08/13 10:16, Mark Rutland wrote:
> On Thu, Aug 08, 2013 at 06:11:26PM +0100, Sören Brinkmann wrote:
>> Hi Daniel,
>>
>> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
>>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>>> timer will be stopped when entering the idle state. In this case, the
>>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switch to a
>>>>>>>>>>>>>>> broadcast timer, and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>>> idle state, switching the local timer back into use.
>>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>>> something useless, but nothing so wrong that it should result in
>>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually this more or less leads to the question: what is this
>>>>>>>>>>>>>> 'broadcast timer'? I guess that is some clockevent device which is
>>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>>
>>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>>> broadcast device:
>>>>>>>>>>>>     Tick Device: mode:     1
>>>>>>>>>>>>     Broadcast device
>>>>>>>>>>>>     Clock Event Device: ttc_clockevent
>>>>>>>>>>>>
>>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>>>     Tick Device: mode:     1
>>>>>>>>>>>>     Broadcast device
>>>>>>>>>>>>     Clock Event Device: arm_global_timer
>>>>>>>>>>>>
>>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way to
>>>>>>>>>>>> obtain this information for that case.
>>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system boots up ?
>>>>>>>>>> Right, that works. I forgot about that option after you mentioned that
>>>>>>>>>> it is most likely not that useful.
>>>>>>>>>>
>>>>>>>>>> Anyway, these are those sysfs files with an unmodified cpuidle driver,
>>>>>>>>>> the gt enabled, and maxcpus=1 set.
>>>>>>>>>>
>>>>>>>>>> /proc/timer_list:
>>>>>>>>>>       Tick Device: mode:     1
>>>>>>>>>>       Broadcast device
>>>>>>>>>>       Clock Event Device: arm_global_timer
>>>>>>>>>>        max_delta_ns:   12884902005
>>>>>>>>>>        min_delta_ns:   1000
>>>>>>>>>>        mult:           715827876
>>>>>>>>>>        shift:          31
>>>>>>>>>>        mode:           3
>>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT).
>>>>>>>>>
>>>>>>>>> In the previous timer_list output you gave me, with the offending
>>>>>>>>> cpuidle flag removed, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>>
>>>>>>>>> Is it possible for you to get this output again right after onlining
>>>>>>>>> cpu1, in order to check whether the broadcast device switches to SHUTDOWN ?
>>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>>> and that didn't end well:
>>>>>>>>         # echo 1 > online && cat /proc/timer_list
>>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>>> apparently this is not the case... :(
>>>>>>>
>>>>>>> I suspect the global timer is shut down at some point, but I don't
>>>>>>> understand why and when.
>>>>>>>
>>>>>>> Can you add a stack trace in the "clockevents_shutdown" function, along with
>>>>>>> the clockevent device name ? Perhaps we may see an interesting
>>>>>>> trace at boot time when it hangs.
>>>>>> I did this change:
>>>>>>   diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>>>   index 38959c8..3ab11c1 100644
>>>>>>   --- a/kernel/time/clockevents.c
>>>>>>   +++ b/kernel/time/clockevents.c
>>>>>>   @@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>>>     */
>>>>>>    void clockevents_shutdown(struct clock_event_device *dev)
>>>>>>    {
>>>>>>   +       pr_info("ce->name:%s\n", dev->name);
>>>>>>   +       dump_stack();
>>>>>>           clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>>>           dev->next_event.tv64 = KTIME_MAX;
>>>>>>    }
>>>>>>
>>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>>> really appreciate you taking the time.
>>>>> Thanks for the traces.
>>>> Sure.
>>>>
>>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>>> twd and gt), does it boot ?
>>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>>> same location.
>>> Ok, IMO there is a problem with the broadcast device registration (maybe
>>> vs the twd).
>> I have an idea, but no real evidence to prove it:
>> Some of the registers in the arm_global_timer are banked per CPU, i.e.
>> some code must be executed on the CPU the timer is associated with
>> (struct clock_event_device.cpumask) to have the intended effect.
>> As far as I can tell, there is no guarantee that the set_mode()
>> and program_next_event() calls execute on the correct CPU.
> I believe the core clockevents code enforces that, or all other percpu
> clockevent_device drivers would be horrifically broken.

Maybe the problem here is that a per-cpu device is being used as the
broadcast source? I can't recall, but I think the broadcast programming
can bounce around CPUs, depending on which CPU is the first to enter
broadcast mode. At least I don't think this configuration has ever
been tested (for example, look at how tick_do_broadcast_on_off() enables
the broadcast timer on whatever CPU goes into deep idle first).
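
For reference, a rough sketch of that path (simplified and quoted from
memory, not verbatim from the 3.11-era tick broadcast code):

	/*
	 * Simplified picture of what happens when a CPU enters deep idle:
	 *
	 *   cpuidle (state has CPUIDLE_FLAG_TIMER_STOP)
	 *     -> clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu)
	 *       -> tick_broadcast_oneshot_control()
	 *         -> clockevents_program_event(broadcast_dev, ...)
	 *
	 * The broadcast device gets (re)programmed on whichever CPU idles
	 * first, which is fine for a truly global device but not for one
	 * whose comparator registers are banked per CPU.
	 */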

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-08 17:16                                             ` Mark Rutland
@ 2013-08-08 17:22                                               ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-08 17:22 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Daniel Lezcano, Russell King, Stephen Boyd, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 11184 bytes --]

On Thu, Aug 08, 2013 at 06:16:50PM +0100, Mark Rutland wrote:
> On Thu, Aug 08, 2013 at 06:11:26PM +0100, Sören Brinkmann wrote:
> > Hi Daniel,
> > 
> > On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> > > On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> > > > On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> > > >> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> > > >>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> > > >>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> > > >>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> > > >>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> > > >>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> > > >>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> > > >>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> > > >>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> > > >>>>>>>>>>> Hi Daniel,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> > > >>>>>>>>>>> (snip)
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> > > >>>>>>>>>>>> timer will be stopped when entering the idle state. In this case, the
> > > >>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switch to a
> > > >>>>>>>>>>>> broadcast timer, and will call clockevents_notify(EXIT) when exiting the
> > > >>>>>>>>>>>> idle state, switching the local timer back into use.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> > > >>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> > > >>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> > > >>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> > > >>>>>>>>>>> something useless, but nothing so wrong that it should result in
> > > >>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> > > >>>>>>>>>>> always remain a mystery to me.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Actually this more or less leads to the question: what is this
> > > >>>>>>>>>>> 'broadcast timer'? I guess that is some clockevent device which is
> > > >>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> > > >>>>>>>>>>> hang pointing to some issue with that driver?
> > > >>>>>>>>>>
> > > >>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> > > >>>>>>>>>
> > > >>>>>>>>> So, the correct run results (full output attached).
> > > >>>>>>>>>
> > > >>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> > > >>>>>>>>> broadcast device:
> > > >>>>>>>>>     Tick Device: mode:     1
> > > >>>>>>>>>     Broadcast device
> > > >>>>>>>>>     Clock Event Device: ttc_clockevent
> > > >>>>>>>>>
> > > >>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> > > >>>>>>>>> enable the global timer, the twd timers are still used as local timers
> > > >>>>>>>>> and the broadcast device is the global timer:
> > > >>>>>>>>>     Tick Device: mode:     1
> > > >>>>>>>>>     Broadcast device
> > > >>>>>>>>>     Clock Event Device: arm_global_timer
> > > >>>>>>>>>
> > > >>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way to
> > > >>>>>>>>> obtain this information for that case.
> > > >>>>>>>>
> > > >>>>>>>> Can't you use the maxcpus=1 option to ensure the system boots up ?
> > > >>>>>>>
> > > >>>>>>> Right, that works. I forgot about that option after you mentioned that
> > > >>>>>>> it is most likely not that useful.
> > > >>>>>>>
> > > >>>>>>> Anyway, these are those sysfs files with an unmodified cpuidle driver,
> > > >>>>>>> the gt enabled, and maxcpus=1 set.
> > > >>>>>>>
> > > >>>>>>> /proc/timer_list:
> > > >>>>>>>       Tick Device: mode:     1
> > > >>>>>>>       Broadcast device
> > > >>>>>>>       Clock Event Device: arm_global_timer
> > > >>>>>>>        max_delta_ns:   12884902005
> > > >>>>>>>        min_delta_ns:   1000
> > > >>>>>>>        mult:           715827876
> > > >>>>>>>        shift:          31
> > > >>>>>>>        mode:           3
> > > >>>>>>
> > > >>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT).
> > > >>>>>>
> > > >>>>>> In the previous timer_list output you gave me, with the offending
> > > >>>>>> cpuidle flag removed, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> > > >>>>>>
> > > >>>>>> Is it possible for you to get this output again right after onlining
> > > >>>>>> cpu1, in order to check whether the broadcast device switches to SHUTDOWN ?
> > > >>>>>
> > > >>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> > > >>>>> and that didn't end well:
> > > >>>>>         # echo 1 > online && cat /proc/timer_list
> > > >>>>
> > > >>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> > > >>>> apparently this is not the case... :(
> > > >>>>
> > > >>>> I suspect the global timer is shut down at some point, but I don't
> > > >>>> understand why and when.
> > > >>>>
> > > >>>> Can you add a stack trace in the "clockevents_shutdown" function, along with
> > > >>>> the clockevent device name ? Perhaps we may see an interesting
> > > >>>> trace at boot time when it hangs.
> > > >>>
> > > >>> I did this change:
> > > >>>   diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> > > >>>   index 38959c8..3ab11c1 100644
> > > >>>   --- a/kernel/time/clockevents.c
> > > >>>   +++ b/kernel/time/clockevents.c
> > > >>>   @@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> > > >>>     */
> > > >>>    void clockevents_shutdown(struct clock_event_device *dev)
> > > >>>    {
> > > >>>   +       pr_info("ce->name:%s\n", dev->name);
> > > >>>   +       dump_stack();
> > > >>>           clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> > > >>>           dev->next_event.tv64 = KTIME_MAX;
> > > >>>    }
> > > >>>
> > > >>> It is hit a few times during boot, so I attach a full boot log. I really
> > > >>> don't know what to look for, but I hope you can spot something in it. I
> > > >>> really appreciate you taking the time.
> > > >>
> > > >> Thanks for the traces.
> > > >
> > > > Sure.
> > > >
> > > >>
> > > >> If you try without the ttc_clockevent configured in the kernel (but with
> > > >> twd and gt), does it boot ?
> > > >
> > > > Absence of the TTC doesn't seem to make any difference. It hangs at the
> > > > same location.
> > >
> > > Ok, IMO there is a problem with the broadcast device registration (maybe
> > > vs the twd).
> > 
> > I have an idea, but no real evidence to prove it:
> > Some of the registers in the arm_global_timer are banked per CPU, i.e.
> > some code must be executed on the CPU the timer is associated with
> > (struct clock_event_device.cpumask) to have the intended effect.
> > As far as I can tell, there is no guarantee that the set_mode()
> > and program_next_event() calls execute on the correct CPU.
> 
> I believe the core clockevents code enforces that, or all other percpu
> clockevent_device drivers would be horrifically broken.

Well, I have some evidence. I booted into the system (bootlog attached).
It seems to be pure luck that it does not deadlock.
Along the way I also had some additional print statements in the global
timer driver, which seemed to back up my suspicion.

This is the ugly hack I added to the clockevents core:

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 38959c8..419c973 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -59,6 +59,20 @@ u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt)
 }
 EXPORT_SYMBOL_GPL(clockevent_delta2ns);
 
+struct ce_set_mode_args {
+	struct clock_event_device *dev;
+	enum clock_event_mode mode;
+};
+
+static struct ce_set_mode_args ce_mode_args;
+
+static void ce_set_mode_xcall(void *info)
+{
+	struct ce_set_mode_args *args = info;
+
+	args->dev->set_mode(args->mode, args->dev);
+}
+
 /**
  * clockevents_set_mode - set the operating mode of a clock event device
  * @dev:	device to modify
@@ -70,7 +84,10 @@ void clockevents_set_mode(struct clock_event_device *dev,
 				 enum clock_event_mode mode)
 {
 	if (dev->mode != mode) {
-		dev->set_mode(mode, dev);
+		ce_mode_args.mode = mode;
+		ce_mode_args.dev = dev;
+		smp_call_function_any(dev->cpumask, ce_set_mode_xcall,
+				&ce_mode_args, 1);
 		dev->mode = mode;
 
 		/*
@@ -96,6 +113,20 @@ void clockevents_shutdown(struct clock_event_device *dev)
 	dev->next_event.tv64 = KTIME_MAX;
 }
 
+struct ce_prog_eve_args {
+	struct clock_event_device *dev;
+	unsigned long clc;
+};
+
+static struct ce_prog_eve_args ce_prog_eve_args;
+
+static void  ce_prog_event_xcall(void *info)
+{
+	struct ce_prog_eve_args *args = info;
+
+	args->dev->set_next_event(args->clc, args->dev);
+}
+
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST
 
 /* Limit min_delta to a jiffie */
@@ -141,6 +172,7 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
 	unsigned long long clc;
 	int64_t delta;
 	int i;
+	int cpu;
 
 	for (i = 0;;) {
 		delta = dev->min_delta_ns;
@@ -151,8 +183,11 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
 
 		dev->retries++;
 		clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
-		if (dev->set_next_event((unsigned long) clc, dev) == 0)
-			return 0;
+		ce_prog_eve_args.clc = clc;
+		ce_prog_eve_args.dev = dev;
+		smp_call_function_any(dev->cpumask, ce_prog_event_xcall,
+				&ce_prog_eve_args, 1);
+		return 0;
 
 		if (++i > 2) {
 			/*
@@ -179,6 +214,7 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
 {
 	unsigned long long clc;
 	int64_t delta;
+	int ret;
 
 	delta = dev->min_delta_ns;
 	dev->next_event = ktime_add_ns(ktime_get(), delta);
@@ -188,7 +224,13 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
 
 	dev->retries++;
 	clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
-	return dev->set_next_event((unsigned long) clc, dev);
+	ce_prog_eve_args.clc = clc;
+	ce_prog_eve_args.dev = dev;
+	smp_call_function_any(dev->cpumask, ce_prog_event_xcall,
+			&ce_prog_eve_args, 1);
+	ret = 0;
+
+	return ret;
 }
 
 #endif /* CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST */
@@ -230,7 +272,11 @@ int clockevents_program_event(struct clock_event_device *dev, ktime_t expires,
 	delta = max(delta, (int64_t) dev->min_delta_ns);
 
 	clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
-	rc = dev->set_next_event((unsigned long) clc, dev);
+	ce_prog_eve_args.clc = clc;
+	ce_prog_eve_args.dev = dev;
+	smp_call_function_any(dev->cpumask, ce_prog_event_xcall,
+			&ce_prog_eve_args, 1);
+	rc = 0;
 
 	return (rc && force) ? clockevents_program_min_delta(dev) : rc;
 }


 	Sören
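
(For context on the deadlocks, and on the warning in the attached boot
log: the cross-calls above can be issued with interrupts disabled on the
tick paths, and smp_call_function_single() warns about exactly that.
Roughly, from the 3.11-era kernel/smp.c, this is the check that shows up
as kernel/smp.c:244 in the log below:)

	/* Can deadlock when called with interrupts disabled */
	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
		     && !oops_in_progress);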


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: boot.log --]
[-- Type: text/plain; charset="utf-8", Size: 21366 bytes --]



U-Boot 2013.04 (Jun 19 2013 - 08:17:40)

I2C:   ready
Memory: ECC disabled
DRAM:  1 GiB
WARNING: Caches not enabled
MMC:   zynq_sdhci: 0
SF: Detected N25Q128A with page size 128 KiB, total 32 MiB
*** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Net:   Gem.e000b000
Hit any key to stop autoboot:  3 2 1 0
TFTPing Linux to RAM...
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'uImage'.
Load address: 0x3000000
Loading: #################################################################
	 #################################################################
	 #####################################
	 4.1 MiB/s
done
Bytes transferred = 2449256 (255f68 hex)
Gem.e000b000:7 is connected to Gem.e000b000.  Reconnecting to Gem.e000b000
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'devicetree.dtb'.
Load address: 0x2a00000
Loading: #
	 127.9 KiB/s
done
Bytes transferred = 2759 (ac7 hex)
Gem.e000b000:7 is connected to Gem.e000b000.  Reconnecting to Gem.e000b000
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'uramdisk.image.gz'.
Load address: 0x2000000
Loading: #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 ###################################
	 4.2 MiB/s
done
Bytes transferred = 5284624 (50a310 hex)
## Booting kernel from Legacy Image at 03000000 ...
   Image Name:   Linux-3.11.0-rc3-00001-gceefecf-
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2449192 Bytes = 2.3 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 02000000 ...
   Image Name:   
   Image Type:   ARM Linux RAMDisk Image (gzip compressed)
   Data Size:    5284560 Bytes = 5 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 02a00000
   Booting using the fdt blob at 0x02a00000
   Loading Kernel Image ... OK
OK
   Loading Ramdisk to 1faf5000, end 1ffff2d0 ... OK
   Loading Device Tree to 1faf1000, end 1faf4ac6 ... OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 3.11.0-rc3-00001-gceefecf-dirty (sorenb@xsjandreislx) (gcc version 4.7.2 (Sourcery CodeBench Lite 2012.09-104) ) #93 SMP PREEMPT Thu Aug 8 10:12:39 PDT 2013
[    0.000000] CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] Machine: Xilinx Zynq Platform, model: Zynq ZC706 Development Board
[    0.000000] bootconsole [earlycon0] enabled
[    0.000000] cma: CMA: reserved 16 MiB at 2e800000
[    0.000000] Memory policy: ECC disabled, Data cache writealloc
[    0.000000] PERCPU: Embedded 9 pages/cpu @c149c000 s14720 r8192 d13952 u36864
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 260624
[    0.000000] Kernel command line: console=ttyPS0,115200 earlyprintk
[    0.000000] PID hash table entries: 4096 (order: 2, 16384 bytes)
[    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[    0.000000] Memory: 1005012K/1048576K available (4887K kernel code, 307K rwdata, 1564K rodata, 338K init, 5699K bss, 43564K reserved, 270336K highmem)
[    0.000000] Virtual kernel memory layout:
[    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
[    0.000000]     fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
[    0.000000]     vmalloc : 0xf0000000 - 0xff000000   ( 240 MB)
[    0.000000]     lowmem  : 0xc0000000 - 0xef800000   ( 760 MB)
[    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
[    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)
[    0.000000]       .text : 0xc0008000 - 0xc065513c   (6453 kB)
[    0.000000]       .init : 0xc0656000 - 0xc06aa980   ( 339 kB)
[    0.000000]       .data : 0xc06ac000 - 0xc06f8c20   ( 308 kB)
[    0.000000]        .bss : 0xc06f8c20 - 0xc0c89aa4   (5700 kB)
[    0.000000] Preemptible hierarchical RCU implementation.
[    0.000000] 	RCU lockdep checking is enabled.
[    0.000000] 	Additional per-CPU info printed with stalls.
[    0.000000] 	RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
[    0.000000] NR_IRQS:16 nr_irqs:16 16
[    0.000000] slcr mapped to f0004000
[    0.000000] Zynq clock init
[    0.000000] sched_clock: 32 bits at 333MHz, resolution 3ns, wraps every 12884ms
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at kernel/smp.c:244 smp_call_function_single+0xac/0x1e8()
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gceefecf-dirty #93
[    0.000000] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    0.000000] [<c0012d58>] (show_stack+0x20/0x24) from [<c0465858>] (dump_stack+0x80/0xc4)
[    0.000000] [<c0465858>] (dump_stack+0x80/0xc4) from [<c0023a80>] (warn_slowpath_common+0x78/0x9c)
[    0.000000] [<c0023a80>] (warn_slowpath_common+0x78/0x9c) from [<c0023ad0>] (warn_slowpath_null+0x2c/0x34)
[    0.000000] [<c0023ad0>] (warn_slowpath_null+0x2c/0x34) from [<c0084d58>] (smp_call_function_single+0xac/0x1e8)
[    0.000000] [<c0084d58>] (smp_call_function_single+0xac/0x1e8) from [<c008552c>] (smp_call_function_any+0xe0/0x11c)
[    0.000000] [<c008552c>] (smp_call_function_any+0xe0/0x11c) from [<c0075458>] (clockevents_set_mode+0x40/0x7c)
[    0.000000] [<c0075458>] (clockevents_set_mode+0x40/0x7c) from [<c00754b4>] (clockevents_shutdown+0x20/0x30)
[    0.000000] [<c00754b4>] (clockevents_shutdown+0x20/0x30) from [<c0075850>] (clockevents_exchange_device+0x94/0xb8)
[    0.000000] [<c0075850>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fb0>] (tick_check_new_device+0xc8/0x11c)
[    0.000000] [<c0075fb0>] (tick_check_new_device+0xc8/0x11c) from [<c007515c>] (clockevents_register_device+0xc0/0x134)
[    0.000000] [<c007515c>] (clockevents_register_device+0xc0/0x134) from [<c0075758>] (clockevents_config_and_register+0x2c/0x30)
[    0.000000] [<c0075758>] (clockevents_config_and_register+0x2c/0x30) from [<c037a24c>] (gt_clockevents_init+0x8c/0xb4)
[    0.000000] [<c037a24c>] (gt_clockevents_init+0x8c/0xb4) from [<c06820bc>] (global_timer_of_register+0x1e4/0x294)
[    0.000000] [<c06820bc>] (global_timer_of_register+0x1e4/0x294) from [<c0681a7c>] (clocksource_of_init+0x2c/0x54)
[    0.000000] [<c0681a7c>] (clocksource_of_init+0x2c/0x54) from [<c065eb98>] (zynq_timer_init+0x14/0x18)
[    0.000000] [<c065eb98>] (zynq_timer_init+0x14/0x18) from [<c0659ec8>] (time_init+0x24/0x34)
[    0.000000] [<c0659ec8>] (time_init+0x24/0x34) from [<c0656774>] (start_kernel+0x1cc/0x32c)
[    0.000000] [<c0656774>] (start_kernel+0x1cc/0x32c) from [<00008074>] (0x8074)
[    0.000000] ---[ end trace 1b75b31a2719ed1c ]---
[    0.000000] ttc0 #0 at f0006000, irq=43
[    0.000000] Console: colour dummy device 80x30
[    0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.000000] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.000000] ... MAX_LOCK_DEPTH:          48
[    0.000000] ... MAX_LOCKDEP_KEYS:        8191
[    0.000000] ... CLASSHASH_SIZE:          4096
[    0.000000] ... MAX_LOCKDEP_ENTRIES:     16384
[    0.000000] ... MAX_LOCKDEP_CHAINS:      32768
[    0.000000] ... CHAINHASH_SIZE:          16384
[    0.000000]  memory used by lock dependency info: 3695 kB
[    0.000000]  per task-struct memory footprint: 1152 bytes
[    0.269533] Calibrating delay loop... 1325.46 BogoMIPS (lpj=6627328)
[    0.312154] pid_max: default: 32768 minimum: 301
[    0.315192] Mount-cache hash table entries: 512
[    0.326370] CPU: Testing write buffer coherency: ok
[    0.326521] ftrace: allocating 16144 entries in 48 pages
[    0.367009] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.367298] Setting up static identity map for 0xc046c318 - 0xc046c370
[    0.430300] CPU1: Booted secondary processor
[    0.517282] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.518605] Brought up 2 CPUs
[    0.526120] SMP: Total of 2 processors activated (2657.48 BogoMIPS).
[    0.532544] CPU: All CPU(s) started in SVC mode.
[    0.541709] devtmpfs: initialized
[    0.560097] NET: Registered protocol family 16
[    0.566004] DMA: preallocated 256 KiB pool for atomic coherent allocations
[    0.579037] L310 cache controller enabled
[    0.579074] l2x0: 8 ways, CACHE_ID 0x410000c8, AUX_CTRL 0x72360000, Cache size: 524288 B
[    0.593862] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
[    0.596377] hw-breakpoint: maximum watchpoint size is 4 bytes.
[    0.676943] bio: create slab <bio-0> at 0
[    0.682624] SCSI subsystem initialized
[    0.685426] usbcore: registered new interface driver usbfs
[    0.687002] usbcore: registered new interface driver hub
[    0.692621] usbcore: registered new device driver usb
[    0.707311] Switched to clocksource arm_global_timer
[    0.882154] NET: Registered protocol family 2
[    0.887819] TCP established hash table entries: 8192 (order: 4, 65536 bytes)
[    0.889741] TCP bind hash table entries: 8192 (order: 6, 294912 bytes)
[    0.899642] TCP: Hash tables configured (established 8192 bind 8192)
[    0.902596] TCP: reno registered
[    0.905581] UDP hash table entries: 512 (order: 3, 40960 bytes)
[    0.912006] UDP-Lite hash table entries: 512 (order: 3, 40960 bytes)
[    0.919667] NET: Registered protocol family 1
[    0.924041] RPC: Registered named UNIX socket transport module.
[    0.928796] RPC: Registered udp transport module.
[    0.932906] RPC: Registered tcp transport module.
[    0.937728] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.945378] Trying to unpack rootfs image as initramfs...
[    1.766936] Freeing initrd memory: 5160K (dfaf5000 - dffff000)
[    1.779937] bounce pool size: 64 pages
[    1.784050] jffs2: version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
[    1.786266] msgmni has been set to 1476
[    1.810310] io scheduler noop registered
[    1.810343] io scheduler deadline registered
[    1.813214] io scheduler cfq registered (default)
[    1.833806] e000ñ[    1.835799] console [ttyPS0] enabled, bootconsole disabled
[    1.835799] console [ttyPS0] enabled, bootconsole disabled
[    1.879905] brd: module loaded
[    1.904214] loop: module loaded
[    1.924904] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    1.932898] usbcore: registered new interface driver usb-storage
[    1.941629] mousedev: PS/2 mouse device common for all mice
[    1.949576] i2c /dev entries driver
[    1.956841] cpuidle: using governor ladder
[    1.960931] cpuidle: using governor menu
[    1.964752] Xilinx Zynq CpuIdle Driver started
[    1.970481] sdhci: Secure Digital Host Controller Interface driver
[    1.976524] sdhci: Copyright(c) Pierre Ossman
[    1.981168] sdhci-pltfm: SDHCI platform and OF driver helper
[    1.989126] usbcore: registered new interface driver usbhid
[    1.994567] usbhid: USB HID core driver
[    1.999236] TCP: cubic registered
[    2.002456] NET: Registered protocol family 17
[    2.006950] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
[    2.014555] Registering SWP/SWPB emulation handler
[    2.023942] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[    2.063376] Freeing unused kernel memory: 336K (c0656000 - c06aa000)

Welcome to Buildroot 2013.08-git!

Starting Media Directory...
Starting File System Check on Root Device...
Starting Remount API VFS...
Started Runtime Directory                                              [  OK  ]
Started Lock Directory                                                 [  OK  ]
Started Set Up Additional Binary Formats                               [  OK  ]
Starting Apply Kernel Variables...
Starting Journal Service...
Started Journal Service                                                [  OK  ]
Started POSIX Message Queue File System                                [  OK  ]
Started Huge Pages File System                                         [  OK  ]
Started Security File System                                           [  OK  ]
Starting Debug File System...
Started Load Kernel Modules                                            [  OK  ]
Started FUSE Control File System                                       [  OK  ]
Started Configuration File System                                      [  OK  ]
Starting Setup Virtual Console...
Starting udev Coldplug all Devices...
Starting udev Kernel Device Manager...
Started Media Directory                                                [  OK  ]
[   14.455828] udevd[470]: starting version 182
Started udev Kernel Device Manager                                     [  OK  ]
Started File System Check on Root Device                               [  OK  ]
Started Remount API VFS                                                [  OK  ]
Started Apply Kernel Variables                                         [  OK  ]
Started Debug File System                                              [  OK  ]
Started Setup Virtual Console                                          [  OK  ]
Starting Remount Root FS...
Started Remount Root FS                                                [  OK  ]
Starting Load Random Seed...
Starting Recreate Volatile Files and Directories...
Started udev Coldplug all Devices                                      [  OK  ]
Started Load Random Seed                                               [  OK  ]
Started Recreate Volatile Files and Directories                        [  OK  ]
Starting Permit User Sessions...
Starting Login Service...
Starting D-Bus System Message Bus...
Started Permit User Sessions                                           [  OK  ]
Starting Serial Getty on ttyPS0...
Started Serial Getty on ttyPS0                                         [  OK  ]
Started D-Bus System Message Bus                                       [  OK  ]
Started Login Service                                                  [  OK  ]

Zynq Busybox System
zynq login: root
Password: 
# ls
# cat /proc/interrupts
           CPU0       CPU1       
 27:        262          1       GIC  27  gt
 29:        739        732       GIC  29  twd
 43:          0          0       GIC  43  ttc_clockevent
 82:        217          0       GIC  82  xuartps
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0         51  Timer broadcast interrupts
IPI2:       1236       1291  Rescheduling interrupts
IPI3:          0          0  Function call interrupts
IPI4:         86         76  Single function call interrupts
IPI5:          0          0  CPU stop interrupts
Err:          0
# cat /proc/timer_list
Timer List Version: v0.7
HRTIMER_MAX_CLOCK_BASES: 4
now at 87282945949 nsecs

cpu: 0
 clock 0:
  .base:       c149c718
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <c149e670>, menu_hrtimer_notify, S:01, hrtimer_start, swapper/0/0
 # expires at 87283071244-87283071244 nsecs [in 125988 to 125988 nsecs]
 #1: <c149cac8>, tick_sched_timer, S:01, hrtimer_start, swapper/0/0
 # expires at 96080000000-96080000000 nsecs [in 8797054744 to 8797054744 nsecs]
 clock 1:
  .base:       c149c750
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       c149c788
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       c149c7c0
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 87283238554 nsecs
  .hres_active    : 1
  .nr_events      : 782
  .nr_retries     : 1
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 2
  .last_tick      : 87280000000 nsecs
  .tick_stopped   : 1
  .idle_jiffies   : 4294946023
  .idle_calls     : 1543
  .idle_sleeps    : 253
  .idle_entrytime : 87283083871 nsecs
  .idle_waketime  : 87283074220 nsecs
  .idle_exittime  : 87274123411 nsecs
  .idle_sleeptime : 84313098543 nsecs
  .iowait_sleeptime: 0 nsecs
  .last_jiffies   : 4294946024
  .next_jiffies   : 4294946904
  .idle_expires   : 96080000000 nsecs
jiffies: 4294946024

cpu: 1
 clock 0:
  .base:       c14a5718
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <c14a5ac8>, tick_sched_timer, S:01, hrtimer_start_range_ns, swapper/1/0
 # expires at 87290000000-87290000000 nsecs [in 7054744 to 7054744 nsecs]
 #1: <ed886d80>, timerfd_tmrproc, S:01, hrtimer_start, systemd/1
 # expires at 102213360049-102213360049 nsecs [in 14930414793 to 14930414793 nsecs]
 clock 1:
  .base:       c14a5750
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       c14a5788
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       c14a57c0
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 87290000000 nsecs
  .hres_active    : 1
  .nr_events      : 787
  .nr_retries     : 0
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 2
  .last_tick      : 86900000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4294945985
  .idle_calls     : 1208
  .idle_sleeps    : 180
  .idle_entrytime : 87275940901 nsecs
  .idle_waketime  : 86909367960 nsecs
  .idle_exittime  : 87273080476 nsecs
  .idle_sleeptime : 84847843417 nsecs
  .iowait_sleeptime: 0 nsecs
  .last_jiffies   : 4294946023
  .next_jiffies   : 4294946024
  .idle_expires   : 123360000000 nsecs
jiffies: 4294946024

Tick Device: mode:     1
Broadcast device
Clock Event Device: arm_global_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     96080000000 nsecs
 set_next_event: gt_clockevent_set_next_event
 set_mode:       gt_clockevent_set_mode
 event_handler:  tick_handle_oneshot_broadcast
 retries:        0

tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000000

Tick Device: mode:     1
Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     87283401937 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0

Tick Device: mode:     1
Per CPU device: 1
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     87290000000 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0

# Dependency failed. Aborted start of Serial Getty on getty              [ ABORT ]


^ permalink raw reply related	[flat|nested] 142+ messages in thread

* Enable arm_global_timer for Zynq brakes boot
@ 2013-08-08 17:22                                               ` Sören Brinkmann
  0 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-08 17:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 08, 2013 at 06:16:50PM +0100, Mark Rutland wrote:
> On Thu, Aug 08, 2013 at 06:11:26PM +0100, Sören Brinkmann wrote:
> > Hi Daniel,
> > 
> > On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> > > On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> > > > On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> > > >> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> > > >>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> > > >>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> > > >>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> > > >>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> > > >>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> > > >>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> > > >>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> > > >>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> > > >>>>>>>>>>> Hi Daniel,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> > > >>>>>>>>>>> (snip)
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> > > >>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> > > >>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> > > >>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> > > >>>>>>>>>>>> idle state, switching the local timer back in use.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> > > >>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> > > >>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> > > >>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> > > >>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
> > > >>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> > > >>>>>>>>>>> always remain a mystery to me.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Actually this more or less leads to the question: What is this
> > > >>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> > > >>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> > > >>>>>>>>>>> hang pointing to some issue with that driver?
> > > >>>>>>>>>>
> > > >>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> > > >>>>>>>>>
> > > >>>>>>>>> So, the correct run results (full output attached).
> > > >>>>>>>>>
> > > >>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> > > >>>>>>>>> broadcast device:
> > > >>>>>>>>>     Tick Device: mode:     1
> > > >>>>>>>>>     Broadcast device
> > > >>>>>>>>>     Clock Event Device: ttc_clockevent
> > > >>>>>>>>>
> > > >>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> > > >>>>>>>>> enable the global timer, the twd timers are still used as local timers
> > > >>>>>>>>> and the broadcast device is the global timer:
> > > >>>>>>>>>     Tick Device: mode:     1
> > > >>>>>>>>>     Broadcast device
> > > >>>>>>>>>     Clock Event Device: arm_global_timer
> > > >>>>>>>>>
> > > >>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way to
> > > >>>>>>>>> obtain this information for that case.
> > > >>>>>>>>
> > > >>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
> > > >>>>>>>
> > > >>>>>>> Right, that works. I forgot about that option after you mentioned that
> > > >>>>>>> it is most likely not that useful.
> > > >>>>>>>
> > > >>>>>>> Anyway, these are those sysfs files with an unmodified cpuidle driver and
> > > >>>>>>> the gt enabled and having maxcpus=1 set.
> > > >>>>>>>
> > > >>>>>>> /proc/timer_list:
> > > >>>>>>>       Tick Device: mode:     1
> > > >>>>>>>       Broadcast device
> > > >>>>>>>       Clock Event Device: arm_global_timer
> > > >>>>>>>        max_delta_ns:   12884902005
> > > >>>>>>>        min_delta_ns:   1000
> > > >>>>>>>        mult:           715827876
> > > >>>>>>>        shift:          31
> > > >>>>>>>        mode:           3
> > > >>>>>>
> > > >>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> > > >>>>>>
> > > >>>>>> The previous timer_list output you gave me when removing the offending
> > > >>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> > > >>>>>>
> > > >>>>>> Is it possible you try to get this output again right after onlining the
> > > >>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> > > >>>>>
> > > >>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> > > >>>>> and that didn't end well:
> > > >>>>>         # echo 1 > online && cat /proc/timer_list
> > > >>>>
> > > >>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> > > >>>> apparently this is not the case... :(
> > > >>>>
> > > >>>> I suspect the global timer is shutdown at one moment but I don't
> > > >>>> understand why and when.
> > > >>>>
> > > >>>> Can you add a stack trace in the "clockevents_shutdown" function with
> > > >>>> the clockevent device name ? Perhaps, we may see at boot time an
> > > >>>> interesting trace when it hangs.
> > > >>>
> > > >>> I did this change:
> > > >>>   diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> > > >>>   index 38959c8..3ab11c1 100644
> > > >>>   --- a/kernel/time/clockevents.c
> > > >>>   +++ b/kernel/time/clockevents.c
> > > >>>   @@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> > > >>>     */
> > > >>>    void clockevents_shutdown(struct clock_event_device *dev)
> > > >>>    {
> > > >>>   +       pr_info("ce->name:%s\n", dev->name);
> > > >>>   +       dump_stack();
> > > >>>           clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> > > >>>           dev->next_event.tv64 = KTIME_MAX;
> > > >>>    }
> > > >>>
> > > >>> It is hit a few times during boot, so I attach a full boot log. I really
> > > >>> don't know what to look for, but I hope you can spot something in it. I
> > > >>> really appreciate you taking the time.
> > > >>
> > > >> Thanks for the traces.
> > > >
> > > > Sure.
> > > >
> > > >>
> > > >> If you try without the ttc_clockevent configured in the kernel (but with
> > > >> twd and gt), does it boot ?
> > > >
> > > > Absence of the TTC doesn't seem to make any difference. It hangs at the
> > > > same location.
> > >
> > > Ok, IMO there is a problem with the broadcast device registration (may
> > > be vs twd).
> > 
> > I have an idea, but no real evidence to prove it:
> > Some of the registers in the arm_global_timer are banked per CPU. I.e.
> > some code must be executed on the CPU the timer is associated with
> > (struct clock_event_device.cpumask) to have the intended effect.
> > As far as I can tell, there is no guarantee that the set_mode()
> > and program_next_event() calls execute on the correct CPU.
> 
> I believe the core clockevents code enforces that, or all other percpu
> clockevent_device drivers would be horrifically broken.

Well, I have some evidence. I booted into the system (boot log attached);
it seems to be a matter of luck whether or not it deadlocks.
In between I also had some additional print statements in the global
timer driver, which seemed to back up my suspicion.

This is the ugly hack I added to the clockevents core:

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 38959c8..419c973 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -59,6 +59,20 @@ u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt)
 }
 EXPORT_SYMBOL_GPL(clockevent_delta2ns);
 
+struct ce_set_mode_args {
+	struct clock_event_device *dev;
+	enum clock_event_mode mode;
+};
+
+static struct ce_set_mode_args ce_mode_args;
+
+static void ce_set_mode_xcall(void *info)
+{
+	struct ce_set_mode_args *args = info;
+
+	args->dev->set_mode(args->mode, args->dev);
+}
+
 /**
  * clockevents_set_mode - set the operating mode of a clock event device
  * @dev:	device to modify
@@ -70,7 +84,10 @@ void clockevents_set_mode(struct clock_event_device *dev,
 				 enum clock_event_mode mode)
 {
 	if (dev->mode != mode) {
-		dev->set_mode(mode, dev);
+		ce_mode_args.mode = mode;
+		ce_mode_args.dev = dev;
+		smp_call_function_any(dev->cpumask, ce_set_mode_xcall,
+				&ce_mode_args, 1);
 		dev->mode = mode;
 
 		/*
@@ -96,6 +113,20 @@ void clockevents_shutdown(struct clock_event_device *dev)
 	dev->next_event.tv64 = KTIME_MAX;
 }
 
+struct ce_prog_eve_args {
+	struct clock_event_device *dev;
+	unsigned long clc;
+};
+
+static struct ce_prog_eve_args ce_prog_eve_args;
+
+static void  ce_prog_event_xcall(void *info)
+{
+	struct ce_prog_eve_args *args = info;
+
+	args->dev->set_next_event(args->clc, args->dev);
+}
+
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST
 
 /* Limit min_delta to a jiffie */
@@ -141,6 +172,7 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
 	unsigned long long clc;
 	int64_t delta;
 	int i;
+	int cpu;
 
 	for (i = 0;;) {
 		delta = dev->min_delta_ns;
@@ -151,8 +183,11 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
 
 		dev->retries++;
 		clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
-		if (dev->set_next_event((unsigned long) clc, dev) == 0)
-			return 0;
+		ce_prog_eve_args.clc = clc;
+		ce_prog_eve_args.dev = dev;
+		smp_call_function_any(dev->cpumask, ce_prog_event_xcall,
+				&ce_prog_eve_args, 1);
+		return 0;
 
 		if (++i > 2) {
 			/*
@@ -179,6 +214,7 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
 {
 	unsigned long long clc;
 	int64_t delta;
+	int ret;
 
 	delta = dev->min_delta_ns;
 	dev->next_event = ktime_add_ns(ktime_get(), delta);
@@ -188,7 +224,13 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
 
 	dev->retries++;
 	clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
-	return dev->set_next_event((unsigned long) clc, dev);
+	ce_prog_eve_args.clc = clc;
+	ce_prog_eve_args.dev = dev;
+	smp_call_function_any(dev->cpumask, ce_prog_event_xcall,
+			&ce_prog_eve_args, 1);
+	ret = 0;
+
+	return ret;
 }
 
 #endif /* CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST */
@@ -230,7 +272,11 @@ int clockevents_program_event(struct clock_event_device *dev, ktime_t expires,
 	delta = max(delta, (int64_t) dev->min_delta_ns);
 
 	clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
-	rc = dev->set_next_event((unsigned long) clc, dev);
+	ce_prog_eve_args.clc = clc;
+	ce_prog_eve_args.dev = dev;
+	smp_call_function_any(dev->cpumask, ce_prog_event_xcall,
+			&ce_prog_eve_args, 1);
+	rc = 0;
 
 	return (rc && force) ? clockevents_program_min_delta(dev) : rc;
 }
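
(For what it's worth: with this hack applied, the attached boot log already
shows a WARNING out of smp_call_function_single() at kernel/smp.c:244,
triggered via clockevents_set_mode() from time_init(). That path runs with
interrupts disabled, which is one reason why cross-calling from the
clockevents core in atomic context cannot work reliably.)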


	Sören

-------------- next part --------------


U-Boot 2013.04 (Jun 19 2013 - 08:17:40)

I2C:   ready
Memory: ECC disabled
DRAM:  1 GiB
WARNING: Caches not enabled
MMC:   zynq_sdhci: 0
SF: Detected N25Q128A with page size 128 KiB, total 32 MiB
*** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Net:   Gem.e000b000
Hit any key to stop autoboot:  0
TFTPing Linux to RAM...
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'uImage'.
Load address: 0x3000000
Loading: #################################################################
	 #################################################################
	 #####################################
	 4.1 MiB/s
done
Bytes transferred = 2449256 (255f68 hex)
Gem.e000b000:7 is connected to Gem.e000b000.  Reconnecting to Gem.e000b000
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'devicetree.dtb'.
Load address: 0x2a00000
Loading: #
	 127.9 KiB/s
done
Bytes transferred = 2759 (ac7 hex)
Gem.e000b000:7 is connected to Gem.e000b000.  Reconnecting to Gem.e000b000
Gem.e000b000 Waiting for PHY auto negotiation to complete..... done
Using Gem.e000b000 device
TFTP from server 10.10.70.101; our IP address is 10.10.70.102
Filename 'uramdisk.image.gz'.
Load address: 0x2000000
Loading: #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 ###################################
	 4.2 MiB/s
done
Bytes transferred = 5284624 (50a310 hex)
## Booting kernel from Legacy Image at 03000000 ...
   Image Name:   Linux-3.11.0-rc3-00001-gceefecf-
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2449192 Bytes = 2.3 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 02000000 ...
   Image Name:   
   Image Type:   ARM Linux RAMDisk Image (gzip compressed)
   Data Size:    5284560 Bytes = 5 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 02a00000
   Booting using the fdt blob at 0x02a00000
   Loading Kernel Image ... OK
OK
   Loading Ramdisk to 1faf5000, end 1ffff2d0 ... OK
   Loading Device Tree to 1faf1000, end 1faf4ac6 ... OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 3.11.0-rc3-00001-gceefecf-dirty (sorenb at xsjandreislx) (gcc version 4.7.2 (Sourcery CodeBench Lite 2012.09-104) ) #93 SMP PREEMPT Thu Aug 8 10:12:39 PDT 2013
[    0.000000] CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] Machine: Xilinx Zynq Platform, model: Zynq ZC706 Development Board
[    0.000000] bootconsole [earlycon0] enabled
[    0.000000] cma: CMA: reserved 16 MiB at 2e800000
[    0.000000] Memory policy: ECC disabled, Data cache writealloc
[    0.000000] PERCPU: Embedded 9 pages/cpu @c149c000 s14720 r8192 d13952 u36864
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 260624
[    0.000000] Kernel command line: console=ttyPS0,115200 earlyprintk
[    0.000000] PID hash table entries: 4096 (order: 2, 16384 bytes)
[    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[    0.000000] Memory: 1005012K/1048576K available (4887K kernel code, 307K rwdata, 1564K rodata, 338K init, 5699K bss, 43564K reserved, 270336K highmem)
[    0.000000] Virtual kernel memory layout:
[    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
[    0.000000]     fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
[    0.000000]     vmalloc : 0xf0000000 - 0xff000000   ( 240 MB)
[    0.000000]     lowmem  : 0xc0000000 - 0xef800000   ( 760 MB)
[    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
[    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)
[    0.000000]       .text : 0xc0008000 - 0xc065513c   (6453 kB)
[    0.000000]       .init : 0xc0656000 - 0xc06aa980   ( 339 kB)
[    0.000000]       .data : 0xc06ac000 - 0xc06f8c20   ( 308 kB)
[    0.000000]        .bss : 0xc06f8c20 - 0xc0c89aa4   (5700 kB)
[    0.000000] Preemptible hierarchical RCU implementation.
[    0.000000] 	RCU lockdep checking is enabled.
[    0.000000] 	Additional per-CPU info printed with stalls.
[    0.000000] 	RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
[    0.000000] NR_IRQS:16 nr_irqs:16 16
[    0.000000] slcr mapped to f0004000
[    0.000000] Zynq clock init
[    0.000000] sched_clock: 32 bits at 333MHz, resolution 3ns, wraps every 12884ms
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at kernel/smp.c:244 smp_call_function_single+0xac/0x1e8()
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc3-00001-gceefecf-dirty #93
[    0.000000] [<c0016a88>] (unwind_backtrace+0x0/0x128) from [<c0012d58>] (show_stack+0x20/0x24)
[    0.000000] [<c0012d58>] (show_stack+0x20/0x24) from [<c0465858>] (dump_stack+0x80/0xc4)
[    0.000000] [<c0465858>] (dump_stack+0x80/0xc4) from [<c0023a80>] (warn_slowpath_common+0x78/0x9c)
[    0.000000] [<c0023a80>] (warn_slowpath_common+0x78/0x9c) from [<c0023ad0>] (warn_slowpath_null+0x2c/0x34)
[    0.000000] [<c0023ad0>] (warn_slowpath_null+0x2c/0x34) from [<c0084d58>] (smp_call_function_single+0xac/0x1e8)
[    0.000000] [<c0084d58>] (smp_call_function_single+0xac/0x1e8) from [<c008552c>] (smp_call_function_any+0xe0/0x11c)
[    0.000000] [<c008552c>] (smp_call_function_any+0xe0/0x11c) from [<c0075458>] (clockevents_set_mode+0x40/0x7c)
[    0.000000] [<c0075458>] (clockevents_set_mode+0x40/0x7c) from [<c00754b4>] (clockevents_shutdown+0x20/0x30)
[    0.000000] [<c00754b4>] (clockevents_shutdown+0x20/0x30) from [<c0075850>] (clockevents_exchange_device+0x94/0xb8)
[    0.000000] [<c0075850>] (clockevents_exchange_device+0x94/0xb8) from [<c0075fb0>] (tick_check_new_device+0xc8/0x11c)
[    0.000000] [<c0075fb0>] (tick_check_new_device+0xc8/0x11c) from [<c007515c>] (clockevents_register_device+0xc0/0x134)
[    0.000000] [<c007515c>] (clockevents_register_device+0xc0/0x134) from [<c0075758>] (clockevents_config_and_register+0x2c/0x30)
[    0.000000] [<c0075758>] (clockevents_config_and_register+0x2c/0x30) from [<c037a24c>] (gt_clockevents_init+0x8c/0xb4)
[    0.000000] [<c037a24c>] (gt_clockevents_init+0x8c/0xb4) from [<c06820bc>] (global_timer_of_register+0x1e4/0x294)
[    0.000000] [<c06820bc>] (global_timer_of_register+0x1e4/0x294) from [<c0681a7c>] (clocksource_of_init+0x2c/0x54)
[    0.000000] [<c0681a7c>] (clocksource_of_init+0x2c/0x54) from [<c065eb98>] (zynq_timer_init+0x14/0x18)
[    0.000000] [<c065eb98>] (zynq_timer_init+0x14/0x18) from [<c0659ec8>] (time_init+0x24/0x34)
[    0.000000] [<c0659ec8>] (time_init+0x24/0x34) from [<c0656774>] (start_kernel+0x1cc/0x32c)
[    0.000000] [<c0656774>] (start_kernel+0x1cc/0x32c) from [<00008074>] (0x8074)
[    0.000000] ---[ end trace 1b75b31a2719ed1c ]---
[    0.000000] ttc0 #0 at f0006000, irq=43
[    0.000000] Console: colour dummy device 80x30
[    0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.000000] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.000000] ... MAX_LOCK_DEPTH:          48
[    0.000000] ... MAX_LOCKDEP_KEYS:        8191
[    0.000000] ... CLASSHASH_SIZE:          4096
[    0.000000] ... MAX_LOCKDEP_ENTRIES:     16384
[    0.000000] ... MAX_LOCKDEP_CHAINS:      32768
[    0.000000] ... CHAINHASH_SIZE:          16384
[    0.000000]  memory used by lock dependency info: 3695 kB
[    0.000000]  per task-struct memory footprint: 1152 bytes
[    0.269533] Calibrating delay loop... 1325.46 BogoMIPS (lpj=6627328)

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-08 17:11                                           ` Sören Brinkmann
@ 2013-08-09 10:32                                             ` Srinivas KANDAGATLA
  -1 siblings, 0 replies; 142+ messages in thread
From: Srinivas KANDAGATLA @ 2013-08-09 10:32 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Daniel Lezcano, Russell King, Stephen Boyd, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/08/13 18:11, Sören Brinkmann wrote:
> Hi Daniel,
> 
> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>
>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>
>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>
>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>> broadcast device:
>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>
>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>
>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way to
>>>>>>>>>>> obtain this information for that case.
>>>>>>>>>>
>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
>>>>>>>>>
>>>>>>>>> Right, that works. I forgot about that option after you mentioned that
>>>>>>>>> it is most likely not that useful.
>>>>>>>>>
>>>>>>>>> Anyway, these are those sysfs files with an unmodified cpuidle driver and
>>>>>>>>> the gt enabled and having maxcpus=1 set.
>>>>>>>>>
>>>>>>>>> /proc/timer_list:
>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>> 	Broadcast device
>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>> 	 mult:           715827876
>>>>>>>>> 	 shift:          31
>>>>>>>>> 	 mode:           3
>>>>>>>>
>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>
>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>
>>>>>>>> Is it possible you try to get this output again right after onlining the
>>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
>>>>>>>
>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>> and that didn't end well:
>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>
>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>> apparently this is not the case... :(
>>>>>>
>>>>>> I suspect the global timer is shutdown at one moment but I don't
>>>>>> understand why and when.
>>>>>>
>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>> the clockevent device name ? Perhaps, we may see at boot time an
>>>>>> interesting trace when it hangs.
>>>>>
>>>>> I did this change:
>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>> 	index 38959c8..3ab11c1 100644
>>>>> 	--- a/kernel/time/clockevents.c
>>>>> 	+++ b/kernel/time/clockevents.c
>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>> 	  */
>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>> 	 {
>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>> 	+       dump_stack();
>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>> 	 }
>>>>>
>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>> really appreciate you taking the time.
>>>>
>>>> Thanks for the traces.
>>>
>>> Sure.
>>>
>>>>
>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>> twd and gt), does it boot ?
>>>
>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>> same location.
>>
>> Ok, IMO there is a problem with the broadcast device registration (may
>> be vs twd).
> 
> I have an idea, but no real evidence to prove it:
> Some of the registers in the arm_global_timer are banked per CPU. I.e.
> some code must be executed on the CPU the timer is associated with
> (struct clock_event_device.cpumask) to have the intended effect.
> As far as I can tell, there is no guarantee that the set_mode()
> and program_next_event() calls execute on the correct CPU.
> If this was correct, shutting down the timer for the CPU entering
> idle might actually shut down the timer for the running CPU, if
> set_mode() executes on the CPU which is _not_ about to enter idle.

Hi Sören,
I am able to reproduce a similar issue on the STiH415 SoC by enabling
both the global_timer and the twd and using a cpuidle driver like Zynq's.

When CPU0 goes idle, I noticed that the global timer used for
broadcast is actually programmed on the wrong CPU.
My traces from a printk like this

	printk("DEBUG: %s on CPU:%d CPUMASK:%s\n", __FUNCTION__,
	       smp_processor_id(), scpumask);

show:

DEBUG: gt_clockevent_set_mode on CPU:1 CPUMASK: 0
DEBUG: gt_clockevent_set_next_event on CPU:1 CPUMASK:0

This indicates that setting the mode and next_event for a clockevent with
cpumask 0 is executed on cpu1, which will generate a global timer
interrupt on cpu1 rather than cpu0.

This might be the reason for cpu0 not coming out of the cpu_idle_loop.
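
In case it helps, this is roughly what the instrumentation looks like (a
minimal, untested sketch against the 3.11-era arm_global_timer driver; the
buffer size and the use of cpumask_scnprintf() are my choices here, and the
original body of the function is elided):

	static void gt_clockevent_set_mode(enum clock_event_mode mode,
					   struct clock_event_device *clk)
	{
		char scpumask[32];

		/* render the clockevent's target CPU mask as a string */
		cpumask_scnprintf(scpumask, sizeof(scpumask), clk->cpumask);
		/* log which CPU actually executes the callback */
		printk("DEBUG: %s on CPU:%d CPUMASK:%s\n", __FUNCTION__,
		       smp_processor_id(), scpumask);
		/* ... unchanged mode handling follows ... */
	}

The same print in gt_clockevent_set_next_event() produces the second line
above.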

Thanks,
srini
> 
> I tried to prove this by adding some really ugly smp_call_function_any()
> wrappers in kernel/time/clockevents.c for the calls to set_mode() and
> program_next_event(), but that ends in all kinds of deadlocks.
> 
> 	Sören
> 
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Enable arm_global_timer for Zynq brakes boot
@ 2013-08-09 10:32                                             ` Srinivas KANDAGATLA
  0 siblings, 0 replies; 142+ messages in thread
From: Srinivas KANDAGATLA @ 2013-08-09 10:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/08/13 18:11, S?ren Brinkmann wrote:
> Hi Daniel,
> 
> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>> On 08/01/2013 07:43 PM, S?ren Brinkmann wrote:
>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>> On 08/01/2013 01:38 AM, S?ren Brinkmann wrote:
>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>> On 08/01/2013 12:18 AM, S?ren Brinkmann wrote:
>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>> On 07/31/2013 10:58 PM, S?ren Brinkmann wrote:
>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>> On 07/31/2013 12:34 AM, S?ren Brinkmann wrote:
>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>> On 07/30/2013 02:03 AM, S?ren Brinkmann wrote:
>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>> something useless, but nothing wrong in a sense that it should result in
>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>
>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>
>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>
>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>> broadcast device:
>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>
>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>
>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see way to
>>>>>>>>>>> obtain this information for that case.
>>>>>>>>>>
>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system to boot up ?
>>>>>>>>>
>>>>>>>>> Right, that works. I forgot about that option after you mentioned, that
>>>>>>>>> it is most likely not that useful.
>>>>>>>>>
>>>>>>>>> Anyway, this are those sysfs files with an unmodified cpuidle driver and
>>>>>>>>> the gt enabled and having maxcpus=1 set.
>>>>>>>>>
>>>>>>>>> /proc/timer_list:
>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>> 	Broadcast device
>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>> 	 mult:           715827876
>>>>>>>>> 	 shift:          31
>>>>>>>>> 	 mode:           3
>>>>>>>>
>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>
>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>
>>>>>>>> Is it possible you try to get this output again right after onlining the
>>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
>>>>>>>
>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>> and that didn't end well:
>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>
>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>> apparently this is not the case... :(
>>>>>>
>>>>>> I suspect the global timer is shutdown at one moment but I don't
>>>>>> understand why and when.
>>>>>>
>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>> the clockevent device name ? Perhaps, we may see at boot time an
>>>>>> interesting trace when it hangs.
>>>>>
>>>>> I did this change:
>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>> 	index 38959c8..3ab11c1 100644
>>>>> 	--- a/kernel/time/clockevents.c
>>>>> 	+++ b/kernel/time/clockevents.c
>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>> 	  */
>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>> 	 {
>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>> 	+       dump_stack();
>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>> 	 }
>>>>>
>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>> really appreciate you taking the time.
>>>>
>>>> Thanks for the traces.
>>>
>>> Sure.
>>>
>>>>
>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>> twd and gt), does it boot ?
>>>
>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>> same location.
>>
>> Ok, IMO there is a problem with the broadcast device registration (may
>> be vs twd).
> 
> I have an idea, but no real evidence to prove it:
> Some of the registers in the arm_global_timer are banked per CPU. I.e.
> some code must be executed on the CPU the timer is associated with
> (struct clock_event_device.cpumask) to have the intended effect
> As far as I can tell, there is no guarantee, that the set_mode()
> and program_next_event() calls execute on the correct CPU.
> If this was correct, shutting down the timer for the CPU entering
> idle might actually shut down the timer for the running CPU, if
> set_mode() executes on the CPU which is _not_ about to enter idle.

Hi Sören,
I am able to reproduce a similar issue on the StiH415 SoC by enabling both
the global_timer and twd and using a cpuidle driver like Zynq's.

When CPU0 goes idle, I noticed that the global timer used for
broadcast is actually scheduled on the wrong CPU.
My traces for a printk like this
	printk("DEBUG: %s on CPU:%d CPUMASK:%s\n", __FUNCTION__,
smp_processor_id(), scpumask);

show:

DEBUG: gt_clockevent_set_mode on CPU:1 CPUMASK:0
DEBUG: gt_clockevent_set_next_event on CPU:1 CPUMASK:0

This indicates that setting the mode and next_event for a clockevent with
cpumask 0 is scheduled on cpu1, which will generate a global timer
interrupt on cpu1 rather than cpu0.

This might be the reason cpu0 never comes out of the cpu_idle_loop.
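
For reference, a self-contained version of such a debug print could
look like the sketch below. The scpumask buffer and its construction
via cpumask_scnprintf() are assumptions here; the actual debug patch
was not posted in this thread.

	/*
	 * Hedged sketch: print the CPU actually executing a clockevent
	 * callback next to the cpumask the device belongs to. Call it
	 * from the gt set_mode/set_next_event callbacks, passing
	 * __func__ (or __FUNCTION__) and the clock_event_device.
	 */
	static void gt_debug_cpu(const char *func,
				 struct clock_event_device *evt)
	{
		char scpumask[64];

		cpumask_scnprintf(scpumask, sizeof(scpumask), evt->cpumask);
		printk("DEBUG: %s on CPU:%d CPUMASK:%s\n", func,
		       smp_processor_id(), scpumask);
	}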

Thanks,
srini
> 
> I tried to prove this by adding some really ugly smp_call_any() wrappers
> in kernel/time/clockevents.c for the calls to set_mode() and
> program_next_event(), but that ends in all kinds of deadlocks.
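
A wrapper of the kind described above might look roughly like the
sketch below. smp_call_any() is not a real kernel API, so this sketch
uses smp_call_function_single() instead; calling it from the tick/idle
path, often with interrupts disabled, is exactly the sort of thing
that deadlocks.

	struct ce_mode_args {
		struct clock_event_device *dev;
		enum clock_event_mode mode;
	};

	static void ce_set_mode_remote(void *data)
	{
		struct ce_mode_args *args = data;

		args->dev->set_mode(args->mode, args->dev);
	}

	/* Hypothetical: force set_mode() to run on a CPU in dev->cpumask. */
	static void ce_set_mode_on_owner(struct clock_event_device *dev,
					 enum clock_event_mode mode)
	{
		struct ce_mode_args args = { .dev = dev, .mode = mode };
		int cpu = cpumask_first(dev->cpumask);

		if (cpu == smp_processor_id())
			dev->set_mode(mode, dev);
		else
			smp_call_function_single(cpu, ce_set_mode_remote,
						 &args, 1);
	}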
> 
> 	Sören
> 

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-08 17:22                                               ` Stephen Boyd
@ 2013-08-09 10:58                                                 ` Mark Rutland
  -1 siblings, 0 replies; 142+ messages in thread
From: Mark Rutland @ 2013-08-09 10:58 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Sören Brinkmann, Daniel Lezcano, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Thu, Aug 08, 2013 at 06:22:36PM +0100, Stephen Boyd wrote:
> On 08/08/13 10:16, Mark Rutland wrote:
> > On Thu, Aug 08, 2013 at 06:11:26PM +0100, Sören Brinkmann wrote:
> >> Hi Daniel,
> >>
> >> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> >>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> >>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> >>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> >>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> >>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> >>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> >>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>>>>>>>>>>> Hi Daniel,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>>>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>>>>>>>>> something useless, but nothing wrong in the sense that it should result in
> >>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>>>>>>>>> always remain a mystery to me.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>>>>>>>>> hang pointing to some issue with that driver?
> >>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>>>>>>>> So, the correct run results (full output attached).
> >>>>>>>>>>>>
> >>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>>>>>>>>> broadcast device:
> >>>>>>>>>>>>     Tick Device: mode:     1
> >>>>>>>>>>>>     Broadcast device
> >>>>>>>>>>>>     Clock Event Device: ttc_clockevent
> >>>>>>>>>>>>
> >>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
> >>>>>>>>>>>> and the broadcast device is the global timer:
> >>>>>>>>>>>>     Tick Device: mode:     1
> >>>>>>>>>>>>     Broadcast device
> >>>>>>>>>>>>     Clock Event Device: arm_global_timer
> >>>>>>>>>>>>
> >>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way to
> >>>>>>>>>>>> obtain this information for that case.
> >>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system boots up?
> >>>>>>>>>> Right, that works. I forgot about that option after you mentioned that
> >>>>>>>>>> it is most likely not that useful.
> >>>>>>>>>>
> >>>>>>>>>> Anyway, here are those sysfs files with an unmodified cpuidle driver and
> >>>>>>>>>> the gt enabled and maxcpus=1 set.
> >>>>>>>>>>
> >>>>>>>>>> /proc/timer_list:
> >>>>>>>>>>       Tick Device: mode:     1
> >>>>>>>>>>       Broadcast device
> >>>>>>>>>>       Clock Event Device: arm_global_timer
> >>>>>>>>>>        max_delta_ns:   12884902005
> >>>>>>>>>>        min_delta_ns:   1000
> >>>>>>>>>>        mult:           715827876
> >>>>>>>>>>        shift:          31
> >>>>>>>>>>        mode:           3
> >>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>>>>>>>>
> >>>>>>>>> The previous timer_list output you gave me when removing the offending
> >>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> >>>>>>>>>
> >>>>>>>>> Is it possible you try to get this output again right after onlining the
> >>>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> >>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> >>>>>>>> and that didn't end well:
> >>>>>>>>         # echo 1 > online && cat /proc/timer_list
> >>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> >>>>>>> apparently this is not the case... :(
> >>>>>>>
> >>>>>>> I suspect the global timer is shutdown at one moment but I don't
> >>>>>>> understand why and when.
> >>>>>>>
> >>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
> >>>>>>> the clockevent device name ? Perhaps, we may see at boot time an
> >>>>>>> interesting trace when it hangs.
> >>>>>> I did this change:
> >>>>>>   diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> >>>>>>   index 38959c8..3ab11c1 100644
> >>>>>>   --- a/kernel/time/clockevents.c
> >>>>>>   +++ b/kernel/time/clockevents.c
> >>>>>>   @@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> >>>>>>     */
> >>>>>>    void clockevents_shutdown(struct clock_event_device *dev)
> >>>>>>    {
> >>>>>>   +       pr_info("ce->name:%s\n", dev->name);
> >>>>>>   +       dump_stack();
> >>>>>>           clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> >>>>>>           dev->next_event.tv64 = KTIME_MAX;
> >>>>>>    }
> >>>>>>
> >>>>>> It is hit a few times during boot, so I attach a full boot log. I really
> >>>>>> don't know what to look for, but I hope you can spot something in it. I
> >>>>>> really appreciate you taking the time.
> >>>>> Thanks for the traces.
> >>>> Sure.
> >>>>
> >>>>> If you try without the ttc_clockevent configured in the kernel (but with
> >>>>> twd and gt), does it boot ?
> >>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
> >>>> same location.
> >>> Ok, IMO there is a problem with the broadcast device registration
> >>> (maybe vs. twd).
> >> I have an idea, but no real evidence to prove it:
> >> Some of the registers in the arm_global_timer are banked per CPU. I.e.
> >> some code must be executed on the CPU the timer is associated with
> >> (struct clock_event_device.cpumask) to have the intended effect.
> >> As far as I can tell, there is no guarantee that the set_mode()
> >> and program_next_event() calls execute on the correct CPU.
> > I believe the core clockevents code enforces that, or all other percpu
> > clockevent_device drivers would be horrifically broken.
> 
> Maybe the problem here is that a per-cpu device is being used for the
> broadcast source? I can't recall but I think the broadcast programming
> can bounce around CPUs depending on which CPU is the one to enter
> broadcast mode first? At least I don't think this configuration has ever
> been tested (for example, look at how tick_do_broadcast_on_off() enables
> the broadcast timer on whatever CPU goes into deep idle first).

Ah, yes. That does look problematic.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-09 10:32                                             ` Srinivas KANDAGATLA
@ 2013-08-09 14:19                                               ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-09 14:19 UTC (permalink / raw)
  To: srinivas.kandagatla
  Cc: Sören Brinkmann, Russell King, Stephen Boyd, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/09/2013 12:32 PM, Srinivas KANDAGATLA wrote:
> On 08/08/13 18:11, Sören Brinkmann wrote:
>> Hi Daniel,
>>
>> On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
>>> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
>>>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
>>>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
>>>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
>>>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
>>>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
>>>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
>>>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
>>>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
>>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
>>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
>>>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
>>>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
>>>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
>>>>>>>>>>>>>>> idle state, switching the local timer back in use.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
>>>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
>>>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
>>>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
>>>>>>>>>>>>>> something useless, but nothing wrong in the sense that it should result in
>>>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
>>>>>>>>>>>>>> always remain a mystery to me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually this more or less leads to the question: What is this
>>>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
>>>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
>>>>>>>>>>>>>> hang pointing to some issue with that driver?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
>>>>>>>>>>>>
>>>>>>>>>>>> So, the correct run results (full output attached).
>>>>>>>>>>>>
>>>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
>>>>>>>>>>>> broadcast device:
>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>> 	Broadcast device  
>>>>>>>>>>>> 	Clock Event Device: ttc_clockevent
>>>>>>>>>>>>
>>>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
>>>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
>>>>>>>>>>>> and the broadcast device is the global timer:
>>>>>>>>>>>> 	Tick Device: mode:     1                                                         
>>>>>>>>>>>> 	Broadcast device                                                                 
>>>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>>>>
>>>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way to
>>>>>>>>>>>> obtain this information for that case.
>>>>>>>>>>>
>>>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system boots up?
>>>>>>>>>>
>>>>>>>>>> Right, that works. I forgot about that option after you mentioned that
>>>>>>>>>> it is most likely not that useful.
>>>>>>>>>>
>>>>>>>>>> Anyway, here are those sysfs files with an unmodified cpuidle driver and
>>>>>>>>>> the gt enabled and maxcpus=1 set.
>>>>>>>>>>
>>>>>>>>>> /proc/timer_list:
>>>>>>>>>> 	Tick Device: mode:     1
>>>>>>>>>> 	Broadcast device
>>>>>>>>>> 	Clock Event Device: arm_global_timer
>>>>>>>>>> 	 max_delta_ns:   12884902005
>>>>>>>>>> 	 min_delta_ns:   1000
>>>>>>>>>> 	 mult:           715827876
>>>>>>>>>> 	 shift:          31
>>>>>>>>>> 	 mode:           3
>>>>>>>>>
>>>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
>>>>>>>>>
>>>>>>>>> The previous timer_list output you gave me when removing the offending
>>>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
>>>>>>>>>
>>>>>>>>> Is it possible you try to get this output again right after onlining the
>>>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
>>>>>>>>
>>>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
>>>>>>>> and that didn't end well:
>>>>>>>> 	# echo 1 > online && cat /proc/timer_list 
>>>>>>>
>>>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
>>>>>>> apparently this is not the case... :(
>>>>>>>
>>>>>>> I suspect the global timer is shutdown at one moment but I don't
>>>>>>> understand why and when.
>>>>>>>
>>>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
>>>>>>> the clockevent device name ? Perhaps, we may see at boot time an
>>>>>>> interesting trace when it hangs.
>>>>>>
>>>>>> I did this change:
>>>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>>>>>> 	index 38959c8..3ab11c1 100644
>>>>>> 	--- a/kernel/time/clockevents.c
>>>>>> 	+++ b/kernel/time/clockevents.c
>>>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
>>>>>> 	  */
>>>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
>>>>>> 	 {
>>>>>> 	+       pr_info("ce->name:%s\n", dev->name);
>>>>>> 	+       dump_stack();
>>>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>>>>>> 	        dev->next_event.tv64 = KTIME_MAX;
>>>>>> 	 }
>>>>>>
>>>>>> It is hit a few times during boot, so I attach a full boot log. I really
>>>>>> don't know what to look for, but I hope you can spot something in it. I
>>>>>> really appreciate you taking the time.
>>>>>
>>>>> Thanks for the traces.
>>>>
>>>> Sure.
>>>>
>>>>>
>>>>> If you try without the ttc_clockevent configured in the kernel (but with
>>>>> twd and gt), does it boot ?
>>>>
>>>> Absence of the TTC doesn't seem to make any difference. It hangs at the
>>>> same location.
>>>
>>> Ok, IMO there is a problem with the broadcast device registration
>>> (maybe vs. twd).
>>
>> I have an idea, but no real evidence to prove it:
>> Some of the registers in the arm_global_timer are banked per CPU. I.e.
>> some code must be executed on the CPU the timer is associated with
>> (struct clock_event_device.cpumask) to have the intended effect.
>> As far as I can tell, there is no guarantee that the set_mode()
>> and program_next_event() calls execute on the correct CPU.
>> If this was correct, shutting down the timer for the CPU entering
>> idle might actually shut down the timer for the running CPU, if
>> set_mode() executes on the CPU which is _not_ about to enter idle.
> 
> Hi Sören,
> I am able to reproduce a similar issue on the StiH415 SoC by enabling both
> the global_timer and twd and using a cpuidle driver like Zynq's.
> 
> When CPU0 goes idle, I noticed that the global timer used for
> broadcast is actually scheduled on the wrong CPU.
> My traces for a printk like this
> 	printk("DEBUG: %s on CPU:%d CPUMASK:%s\n", __FUNCTION__,
> smp_processor_id(), scpumask);
> 
> show:
> 
> DEBUG: gt_clockevent_set_mode on CPU:1 CPUMASK:0
> DEBUG: gt_clockevent_set_next_event on CPU:1 CPUMASK:0
> 
> This indicates that setting the mode and next_event for a clockevent with
> cpumask 0 is scheduled on cpu1, which will generate a global timer
> interrupt on cpu1 rather than cpu0.
> 
> This might be the reason cpu0 never comes out of the cpu_idle_loop.

Yes, but at least the broadcast mechanism should send an IPI to cpu0 to
wake it up, no? As Stephen stated, this kind of configuration has
never been tested before, so the tick broadcast code is not handling this
case properly IMHO.




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-09 10:32                                             ` Srinivas KANDAGATLA
@ 2013-08-09 16:03                                               ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-09 16:03 UTC (permalink / raw)
  To: Srinivas KANDAGATLA
  Cc: Daniel Lezcano, Russell King, Stephen Boyd, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Fri, Aug 09, 2013 at 11:32:42AM +0100, Srinivas KANDAGATLA wrote:
> On 08/08/13 18:11, Sören Brinkmann wrote:
> > Hi Daniel,
> > 
> > On Thu, Aug 01, 2013 at 07:48:04PM +0200, Daniel Lezcano wrote:
> >> On 08/01/2013 07:43 PM, Sören Brinkmann wrote:
> >>> On Thu, Aug 01, 2013 at 07:29:12PM +0200, Daniel Lezcano wrote:
> >>>> On 08/01/2013 01:38 AM, Sören Brinkmann wrote:
> >>>>> On Thu, Aug 01, 2013 at 01:01:27AM +0200, Daniel Lezcano wrote:
> >>>>>> On 08/01/2013 12:18 AM, Sören Brinkmann wrote:
> >>>>>>> On Wed, Jul 31, 2013 at 11:08:51PM +0200, Daniel Lezcano wrote:
> >>>>>>>> On 07/31/2013 10:58 PM, Sören Brinkmann wrote:
> >>>>>>>>> On Wed, Jul 31, 2013 at 10:49:06PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>> On 07/31/2013 12:34 AM, Sören Brinkmann wrote:
> >>>>>>>>>>> On Tue, Jul 30, 2013 at 10:47:15AM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>>> On 07/30/2013 02:03 AM, Sören Brinkmann wrote:
> >>>>>>>>>>>>> Hi Daniel,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Jul 29, 2013 at 02:51:49PM +0200, Daniel Lezcano wrote:
> >>>>>>>>>>>>> (snip)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> the CPUIDLE_FLAG_TIMER_STOP flag tells the cpuidle framework the local
> >>>>>>>>>>>>>> timer will be stopped when entering to the idle state. In this case, the
> >>>>>>>>>>>>>> cpuidle framework will call clockevents_notify(ENTER) and switches to a
> >>>>>>>>>>>>>> broadcast timer and will call clockevents_notify(EXIT) when exiting the
> >>>>>>>>>>>>>> idle state, switching the local timer back in use.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've been thinking about this, trying to understand how this makes my
> >>>>>>>>>>>>> boot attempts on Zynq hang. IIUC, the wrongly provided TIMER_STOP flag
> >>>>>>>>>>>>> would make the timer core switch to a broadcast device even though it
> >>>>>>>>>>>>> wouldn't be necessary. But shouldn't it still work? It sounds like we do
> >>>>>>>>>>>>> something useless, but nothing wrong in the sense that it should result in
> >>>>>>>>>>>>> breakage. I guess I'm missing something obvious. This timer system will
> >>>>>>>>>>>>> always remain a mystery to me.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Actually this more or less leads to the question: What is this
> >>>>>>>>>>>>> 'broadcast timer'. I guess that is some clockevent device which is
> >>>>>>>>>>>>> common to all cores? (that would be the cadence_ttc for Zynq). Is the
> >>>>>>>>>>>>> hang pointing to some issue with that driver?
> >>>>>>>>>>>>
> >>>>>>>>>>>> If you look at the /proc/timer_list, which timer is used for broadcasting ?
> >>>>>>>>>>>
> >>>>>>>>>>> So, the correct run results (full output attached).
> >>>>>>>>>>>
> >>>>>>>>>>> The vanilla kernel uses the twd timers as local timers and the TTC as
> >>>>>>>>>>> broadcast device:
> >>>>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>>>> 	Broadcast device  
> >>>>>>>>>>> 	Clock Event Device: ttc_clockevent
> >>>>>>>>>>>
> >>>>>>>>>>> When I remove the offending CPUIDLE flag and add the DT fragment to
> >>>>>>>>>>> enable the global timer, the twd timers are still used as local timers
> >>>>>>>>>>> and the broadcast device is the global timer:
> >>>>>>>>>>> 	Tick Device: mode:     1                                                         
> >>>>>>>>>>> 	Broadcast device                                                                 
> >>>>>>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>>>>>
> >>>>>>>>>>> Again, since boot hangs in the actually broken case, I don't see a way to
> >>>>>>>>>>> obtain this information for that case.
> >>>>>>>>>>
> >>>>>>>>>> Can't you use the maxcpus=1 option to ensure the system boots up?
> >>>>>>>>>
> >>>>>>>>> Right, that works. I forgot about that option after you mentioned that
> >>>>>>>>> it is most likely not that useful.
> >>>>>>>>>
> >>>>>>>>> Anyway, here are those sysfs files with an unmodified cpuidle driver and
> >>>>>>>>> the gt enabled and maxcpus=1 set.
> >>>>>>>>>
> >>>>>>>>> /proc/timer_list:
> >>>>>>>>> 	Tick Device: mode:     1
> >>>>>>>>> 	Broadcast device
> >>>>>>>>> 	Clock Event Device: arm_global_timer
> >>>>>>>>> 	 max_delta_ns:   12884902005
> >>>>>>>>> 	 min_delta_ns:   1000
> >>>>>>>>> 	 mult:           715827876
> >>>>>>>>> 	 shift:          31
> >>>>>>>>> 	 mode:           3
> >>>>>>>>
> >>>>>>>> Here the mode is 3 (CLOCK_EVT_MODE_ONESHOT)
> >>>>>>>>
> >>>>>>>> The previous timer_list output you gave me when removing the offending
> >>>>>>>> cpuidle flag, it was 1 (CLOCK_EVT_MODE_SHUTDOWN).
> >>>>>>>>
> >>>>>>>> Is it possible you try to get this output again right after onlining the
> >>>>>>>> cpu1 in order to check if the broadcast device switches to SHUTDOWN ?
> >>>>>>>
> >>>>>>> How do I do that? I tried to online CPU1 after booting with maxcpus=1
> >>>>>>> and that didn't end well:
> >>>>>>> 	# echo 1 > online && cat /proc/timer_list 
> >>>>>>
> >>>>>> Hmm, I was hoping to have a small delay before the kernel hangs but
> >>>>>> apparently this is not the case... :(
> >>>>>>
> >>>>>> I suspect the global timer is shutdown at one moment but I don't
> >>>>>> understand why and when.
> >>>>>>
> >>>>>> Can you add a stack trace in the "clockevents_shutdown" function with
> >>>>>> the clockevent device name ? Perhaps, we may see at boot time an
> >>>>>> interesting trace when it hangs.
> >>>>>
> >>>>> I did this change:
> >>>>> 	diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> >>>>> 	index 38959c8..3ab11c1 100644
> >>>>> 	--- a/kernel/time/clockevents.c
> >>>>> 	+++ b/kernel/time/clockevents.c
> >>>>> 	@@ -92,6 +92,8 @@ void clockevents_set_mode(struct clock_event_device *dev,
> >>>>> 	  */
> >>>>> 	 void clockevents_shutdown(struct clock_event_device *dev)
> >>>>> 	 {
> >>>>> 	+       pr_info("ce->name:%s\n", dev->name);
> >>>>> 	+       dump_stack();
> >>>>> 	        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
> >>>>> 	        dev->next_event.tv64 = KTIME_MAX;
> >>>>> 	 }
> >>>>>
> >>>>> It is hit a few times during boot, so I attach a full boot log. I really
> >>>>> don't know what to look for, but I hope you can spot something in it. I
> >>>>> really appreciate you taking the time.
> >>>>
> >>>> Thanks for the traces.
> >>>
> >>> Sure.
> >>>
> >>>>
> >>>> If you try without the ttc_clockevent configured in the kernel (but with
> >>>> twd and gt), does it boot ?
> >>>
> >>> Absence of the TTC doesn't seem to make any difference. It hangs at the
> >>> same location.
> >>
> >> Ok, IMO there is a problem with the broadcast device registration
> >> (maybe vs. twd).
> > 
> > I have an idea, but no real evidence to prove it:
> > Some of the registers in the arm_global_timer are banked per CPU. I.e.
> > some code must be executed on the CPU the timer is associated with
> > (struct clock_event_device.cpumask) to have the intended effect.
> > As far as I can tell, there is no guarantee that the set_mode()
> > and program_next_event() calls execute on the correct CPU.
> > If this was correct, shutting down the timer for the CPU entering
> > idle might actually shut down the timer for the running CPU, if
> > set_mode() executes on the CPU which is _not_ about to enter idle.
> 
> Hi Sören,
> I am able to reproduce a similar issue on the StiH415 SoC by enabling both
> the global_timer and twd and using a cpuidle driver like Zynq's.
> 
> When CPU0 goes idle, I noticed that the global timer used for
> broadcast is actually scheduled on the wrong CPU.
> My traces for a printk like this
> 	printk("DEBUG: %s on CPU:%d CPUMASK:%s\n", __FUNCTION__,
> smp_processor_id(), scpumask);
> 
> show:
> 
> DEBUG: gt_clockevent_set_mode on CPU:1 CPUMASK:0
> DEBUG: gt_clockevent_set_next_event on CPU:1 CPUMASK:0
> 
> This indicates that setting the mode and next_event for a clockevent with
> cpumask 0 is scheduled on cpu1, which will generate a global timer
> interrupt on cpu1 rather than cpu0.
> 
> This might be the reason cpu0 never comes out of the cpu_idle_loop.

Thanks for reproducing.
I think so too. I had similar debug prints in set_mode and
program_next_event, which compared the passed struct clock_event_device
pointer with the this_cpu_ptr(gt_evt) pointer, and which clearly
indicated this mismatch.
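
A minimal sketch of such a check, assuming the driver's per-cpu
clockevent variable is named gt_evt as in the mainline arm_global_timer
driver (the exact debug patch was not posted here):

	/*
	 * Warn when a callback is handed a clock_event_device that does
	 * not belong to the CPU it is currently running on.
	 */
	static void gt_check_cpu(struct clock_event_device *evt)
	{
		if (evt != this_cpu_ptr(gt_evt))
			pr_warn("gt: evt for CPU%d handled on CPU%d\n",
				cpumask_first(evt->cpumask),
				smp_processor_id());
	}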

	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-09 14:19                                               ` Daniel Lezcano
@ 2013-08-09 17:27                                                 ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-09 17:27 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: srinivas.kandagatla, S??ren Brinkmann, Russell King,
	Michal Simek, linux-kernel, Stuart Menefy, John Stultz,
	Thomas Gleixner, linux-arm-kernel

On 08/09, Daniel Lezcano wrote:
> 
> Yes, but at least the broadcast mechanism should send an IPI to cpu0 to
> wake it up, no? As Stephen stated, this kind of configuration has
> never been tested before, so the tick broadcast code is not handling this
> case properly IMHO.
> 

If you have a per-cpu tick device that isn't suffering from
FEAT_C3_STOP, why wouldn't you use that for the tick instead of a
per-cpu tick device that has FEAT_C3_STOP? It sounds like there
is a bug in the preference logic, or you should boost the rating
of the arm global timer above the twd. Does this patch help? It
should make the arm global timer the tick device and turn whatever
cadence timer you have into the broadcast device.

---8<----
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 218bcb5..d3539e5 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -77,6 +77,9 @@ static bool tick_check_broadcast_device(struct clock_event_device *curdev,
 	    !(newdev->features & CLOCK_EVT_FEAT_ONESHOT))
 		return false;
 
+	if (cpumask_equal(newdev->cpumask, cpumask_of(smp_processor_id())))
+		return false;
+
 	return !curdev || newdev->rating > curdev->rating;
 }
 
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 64522ec..1628b9f 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -245,6 +245,15 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
 	}
 
 	/*
+	 * Prefer tick devices that don't suffer from FEAT_C3_STOP
+	 * regardless of their rating
+	 */
+	if (curdev && cpumask_equal(curdev->cpumask, newdev->cpumask) &&
+	    !(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
+	    (curdev->features & CLOCK_EVT_FEAT_C3STOP))
+		return true;
+
+	/*
 	 * Use the higher rated one, but prefer a CPU local device with a lower
 	 * rating than a non-CPU local device
 	 */
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-09 17:27                                                 ` Stephen Boyd
@ 2013-08-09 17:48                                                   ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-09 17:48 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
> On 08/09, Daniel Lezcano wrote:
> > 
> > yes, but at least the broadcast mechanism should send an IPI to cpu0 to
> > wake it up, no? As Stephen stated, this kind of configuration has
> > never been tested before so the tick broadcast code is not handling this
> > case properly IMHO.
> > 
> 
> If you have a per-cpu tick device that isn't suffering from
> FEAT_C3_STOP why wouldn't you use that for the tick versus a
> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
> is a bug in the preference logic or you should boost the rating
> of the arm global timer above the twd. Does this patch help? It
> should make the arm global timer the tick device and whatever the
> cadence timer you have into the broadcast device.

I'm not sure I'm getting this right. But neither the cadence_ttc nor the
arm_global_timer has the FEAT_C3_STOP flag set. So, shouldn't they be
treated equally even with your change?

	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-09 17:48                                                   ` Sören Brinkmann
@ 2013-08-09 18:45                                                     ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-09 18:45 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/09/13 10:48, Sören Brinkmann wrote:
> On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
>> On 08/09, Daniel Lezcano wrote:
>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
>>> wake it up, no? As Stephen stated, this kind of configuration has
>>> never been tested before so the tick broadcast code is not handling this
>>> case properly IMHO.
>>>
>> If you have a per-cpu tick device that isn't suffering from
>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
>> is a bug in the preference logic or you should boost the rating
>> of the arm global timer above the twd. Does this patch help? It
>> should make the arm global timer the tick device and whatever the
>> cadence timer you have into the broadcast device.
> I'm not sure I'm getting this right. But neither the cadence_ttc nor the
> arm_global_timer has the FEAT_C3_STOP flag set. So, shouldn't they be
> treated equally even with your change?

The cadence_ttc is a global clockevent, i.e. the irq can interrupt any
CPU, and it has a rating of 200. The arm global timer is a per-cpu
clockevent with a rating of 300. The TWD is a per-cpu clockevent with a
rating of 350. Because the arm global timer is rated higher than the
cadence_ttc but lower than the TWD, the arm global timer will fill in the
broadcast spot and the TWD will fill in the tick position. We really
want the arm global timer to fill in the tick position and the
cadence_ttc to fill in the broadcast spot (although the cadence_ttc
should never be needed because the arm global timer doesn't need help in
deep idle states).
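
For reference, a minimal sketch of how those three devices present
themselves to the clockevents core, using only the ratings and cpumask
shapes described above (illustrative values, not the actual driver
definitions):

	/*
	 * Sketch only: ratings/cpumasks as described above; the
	 * periodic/oneshot feature bits and handlers are omitted.
	 */
	static void zynq_clockevents_sketch(int cpu)
	{
		struct clock_event_device ttc = {
			.name    = "ttc_clockevent",
			.rating  = 200,
			.cpumask = cpu_possible_mask,	/* global: irq can hit any CPU */
		};
		struct clock_event_device gt = {
			.name    = "arm_global_timer",
			.rating  = 300,
			.cpumask = cpumask_of(cpu),	/* per-cpu, no FEAT_C3_STOP */
		};
		struct clock_event_device twd = {
			.name     = "local_timer",
			.rating   = 350,
			.features = CLOCK_EVT_FEAT_C3STOP,	/* stops in deep idle */
			.cpumask  = cpumask_of(cpu),	/* per-cpu */
		};
		(void)ttc; (void)gt; (void)twd;
	}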

Unless I got lost in all the combinations of tests you've done so far?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-09 17:27                                                 ` Stephen Boyd
@ 2013-08-12 10:53                                                   ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-12 10:53 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: srinivas.kandagatla, Sören Brinkmann, Russell King,
	Michal Simek, linux-kernel, Stuart Menefy, John Stultz,
	Thomas Gleixner, linux-arm-kernel

On 08/09/2013 07:27 PM, Stephen Boyd wrote:
> On 08/09, Daniel Lezcano wrote:
>>
>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
>> wake it up, no? As Stephen stated, this kind of configuration has
>> never been tested before so the tick broadcast code is not handling this
>> case properly IMHO.
>>
> 
> If you have a per-cpu tick device that isn't suffering from
> FEAT_C3_STOP why wouldn't you use that for the tick versus a
> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
> is a bug in the preference logic or you should boost the rating
> of the arm global timer above the twd. Does this patch help? It
> should make the arm global timer the tick device and whatever the
> cadence timer you have into the broadcast device.
> 
> ---8<----
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index 218bcb5..d3539e5 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -77,6 +77,9 @@ static bool tick_check_broadcast_device(struct clock_event_device *curdev,
>  	    !(newdev->features & CLOCK_EVT_FEAT_ONESHOT))
>  		return false;
>  
> +	if (cpumask_equal(newdev->cpumask, cpumask_of(smp_processor_id())))
> +		return false;

Yes, that makes sense to prevent local timer devices from being the broadcast one.

In the case of the arm global timer, each CPU will register its own
timer, so the test will be OK. But is it possible that cpu0 registers
the timers for the other CPUs?

>  	return !curdev || newdev->rating > curdev->rating;
>  }
>  
> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
> index 64522ec..1628b9f 100644
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -245,6 +245,15 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
>  	}
>  
>  	/*
> +	 * Prefer tick devices that don't suffer from FEAT_C3_STOP
> +	 * regardless of their rating
> +	 */
> +	if (curdev && cpumask_equal(curdev->cpumask, newdev->cpumask) &&
> +	    !(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
> +	    (curdev->features & CLOCK_EVT_FEAT_C3STOP))
> +		return true;

That sounds reasonable, but what is the acceptable gap between the
ratings? I am wondering if there isn't too much heuristic logic in the
tick code...


> +
> +	/*
>  	 * Use the higher rated one, but prefer a CPU local device with a lower
>  	 * rating than a non-CPU local device
>  	 */
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-09 17:27                                                 ` Stephen Boyd
@ 2013-08-12 16:03                                                   ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-12 16:03 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
> On 08/09, Daniel Lezcano wrote:
> > 
> > yes, but at least the broadcast mechanism should send an IPI to cpu0 to
> > wake it up, no? As Stephen stated, this kind of configuration has
> > never been tested before so the tick broadcast code is not handling this
> > case properly IMHO.
> > 
> 
> If you have a per-cpu tick device that isn't suffering from
> FEAT_C3_STOP why wouldn't you use that for the tick versus a
> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
> is a bug in the preference logic or you should boost the rating
> of the arm global timer above the twd. Does this patch help? It
> should make the arm global timer the tick device and whatever the
> cadence timer you have into the broadcast device.

I finally got to test your patch. Unfortunately, it makes the system
hang even earlier:

	Starting kernel ...
	
	Uncompressing Linux... done, booting the kernel.
	[    0.000000] Booting Linux on physical CPU 0x0
	[    0.000000] Linux version 3.11.0-rc3-00002-g391ac9b (sorenb@xsjandreislx) (gcc version 4.7.2 (Sourcery CodeBench Lite 2012.09-104) ) #98 SMP PREEMPT Mon Aug 12 08:59:34 PDT 2013
	[    0.000000] CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d
	[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
	[    0.000000] Machine: Xilinx Zynq Platform, model: Zynq ZC706 Development Board
	[    0.000000] bootconsole [earlycon0] enabled
	[    0.000000] cma: CMA: reserved 16 MiB at 2e800000
	[    0.000000] Memory policy: ECC disabled, Data cache writealloc
	[    0.000000] PERCPU: Embedded 9 pages/cpu @c149c000 s14720 r8192 d13952 u36864
	[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 260624
	[    0.000000] Kernel command line: console=ttyPS0,115200 earlyprintk
	[    0.000000] PID hash table entries: 4096 (order: 2, 16384 bytes)
	[    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
	[    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
	[    0.000000] Memory: 1004928K/1048576K available (4891K kernel code, 307K rwdata, 1564K rodata, 338K init, 5699K bss, 43648K reserved, 270336K highmem)
	[    0.000000] Virtual kernel memory layout:
	[    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
	[    0.000000]     fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
	[    0.000000]     vmalloc : 0xf0000000 - 0xff000000   ( 240 MB)
	[    0.000000]     lowmem  : 0xc0000000 - 0xef800000   ( 760 MB)
	[    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
	[    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)
	[    0.000000]       .text : 0xc0008000 - 0xc0656128   (6457 kB)
	[    0.000000]       .init : 0xc0657000 - 0xc06ab980   ( 339 kB)
	[    0.000000]       .data : 0xc06ac000 - 0xc06f8c20   ( 308 kB)
	[    0.000000]        .bss : 0xc06f8c20 - 0xc0c89aa4   (5700 kB)
	[    0.000000] Preemptible hierarchical RCU implementation.
	[    0.000000]  RCU lockdep checking is enabled.
	[    0.000000]  Additional per-CPU info printed with stalls.
	[    0.000000]  RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
	[    0.000000] NR_IRQS:16 nr_irqs:16 16
	[    0.000000] slcr mapped to f0004000
	[    0.000000] Zynq clock init
	[    0.000000] sched_clock: 32 bits at 333MHz, resolution 3ns, wraps every 12884ms
	[    0.000000] ttc0 #0 at f0006000, irq=43
	[    0.000000] Console: colour dummy device 80x30
	[    0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
	[    0.000000] ... MAX_LOCKDEP_SUBCLASSES:  8
	[    0.000000] ... MAX_LOCK_DEPTH:          48
	[    0.000000] ... MAX_LOCKDEP_KEYS:        8191
	[    0.000000] ... CLASSHASH_SIZE:          4096
	[    0.000000] ... MAX_LOCKDEP_ENTRIES:     16384
	[    0.000000] ... MAX_LOCKDEP_CHAINS:      32768
	[    0.000000] ... CHAINHASH_SIZE:          16384
	[    0.000000]  memory used by lock dependency info: 3695 kB
	[    0.000000]  per task-struct memory footprint: 1152 bytes
	[    0.057541] Calibrating delay loop... 1325.46 BogoMIPS (lpj=6627328)
	[    0.100248] pid_max: default: 32768 minimum: 301
	[    0.103294] Mount-cache hash table entries: 512
	[    0.114364] CPU: Testing write buffer coherency: ok
	[    0.114513] ftrace: allocating 16143 entries in 48 pages
	[    0.155012] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000


	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:03                                                   ` Sören Brinkmann
@ 2013-08-12 16:08                                                     ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-12 16:08 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/12/2013 06:03 PM, Sören Brinkmann wrote:
> On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
>> On 08/09, Daniel Lezcano wrote:
>>>
>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
>>> wake it up, no? As Stephen stated, this kind of configuration has
>>> never been tested before so the tick broadcast code is not handling this
>>> case properly IMHO.
>>>
>>
>> If you have a per-cpu tick device that isn't suffering from
>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
>> is a bug in the preference logic or you should boost the rating
>> of the arm global timer above the twd. Does this patch help? It
>> should make the arm global timer the tick device and whatever the
>> cadence timer you have into the broadcast device.
> 
> I finally got to test your patch. Unfortunately, it makes the system
> hang even earlier:

[ ... ]

Can you boot with maxcpus=1 and then give the result of timer_list?


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:08                                                     ` Daniel Lezcano
@ 2013-08-12 16:17                                                       ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-12 16:17 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Mon, Aug 12, 2013 at 06:08:46PM +0200, Daniel Lezcano wrote:
> On 08/12/2013 06:03 PM, Sören Brinkmann wrote:
> > On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
> >> On 08/09, Daniel Lezcano wrote:
> >>>
> >>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
> >>> wake it up, no? As Stephen stated, this kind of configuration has
> >>> never been tested before so the tick broadcast code is not handling this
> >>> case properly IMHO.
> >>>
> >>
> >> If you have a per-cpu tick device that isn't suffering from
> >> FEAT_C3_STOP why wouldn't you use that for the tick versus a
> >> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
> >> is a bug in the preference logic or you should boost the rating
> >> of the arm global timer above the twd. Does this patch help? It
> >> should make the arm global timer the tick device and whatever the
> >> cadence timer you have into the broadcast device.
> > 
> > I finally got to test your patch. Unfortunately, it makes the system
> > hang even earlier:
> 
> [ ... ]
> 
> Can you boot with maxcpus=1 and then give the result of timer_list?

That doesn't seem to help anymore. It hangs too. Same picture w/ and w/o
'maxcpus', except for
	[    0.000000] Kernel command line: console=ttyPS0,115200 earlyprintk maxcpus=1


	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:03                                                   ` Sören Brinkmann
@ 2013-08-12 16:20                                                     ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-12 16:20 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/12/13 09:03, Sören Brinkmann wrote:
> On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
>> On 08/09, Daniel Lezcano wrote:
>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
>>> wake it up, no? As Stephen stated, this kind of configuration has
>>> never been tested before so the tick broadcast code is not handling this
>>> case properly IMHO.
>>>
>> If you have a per-cpu tick device that isn't suffering from
>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
>> is a bug in the preference logic or you should boost the rating
>> of the arm global timer above the twd. Does this patch help? It
>> should make the arm global timer the tick device and whatever the
>> cadence timer you have into the broadcast device.
> I finally got to test your patch. Unfortunately, it makes the system
> hang even earlier:

Sorry it had a bug depending on the registration order. Can you try this
one (tabs are probably spaces, sorry)? I will go read through this
thread to see if we already covered the registration order.

---8<----
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 218bcb5..d3539e5 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -77,6 +77,9 @@ static bool tick_check_broadcast_device(struct clock_event_device *curdev,
            !(newdev->features & CLOCK_EVT_FEAT_ONESHOT))
                return false;
 
+       if (cpumask_equal(newdev->cpumask, cpumask_of(smp_processor_id())))
+               return false;
+
        return !curdev || newdev->rating > curdev->rating;
 }
 
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 64522ec..dd08f3b 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -244,12 +244,22 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
                        return false;
        }
 
+       if (!curdev)
+               return true;
+
+       /* Always prefer a tick device that doesn't suffer from FEAT_C3STOP */
+       if (!(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
+            (curdev->features & CLOCK_EVT_FEAT_C3STOP))
+               return true;
+       if ((newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
+           !(curdev->features & CLOCK_EVT_FEAT_C3STOP))
+               return false;
+
        /*
         * Use the higher rated one, but prefer a CPU local device with a lower
         * rating than a non-CPU local device
         */
-       return !curdev ||
-               newdev->rating > curdev->rating ||
+       return newdev->rating > curdev->rating ||
               !cpumask_equal(curdev->cpumask, newdev->cpumask);
 }
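
(On Zynq this should then work out as follows: on each CPU the twd,
rating 350 but carrying FEAT_C3_STOP, loses the per-cpu tick slot to
the arm global timer, rating 300 and free of FEAT_C3_STOP, because the
feature check now runs before the rating comparison; and the
tick-broadcast hunk keeps the per-cpu global timer out of the
broadcast slot, leaving that role to the cadence_ttc.)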
 


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply related	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 10:53                                                   ` Daniel Lezcano
@ 2013-08-12 16:23                                                     ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-12 16:23 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: srinivas.kandagatla, Sören Brinkmann, Russell King,
	Michal Simek, linux-kernel, Stuart Menefy, John Stultz,
	Thomas Gleixner, linux-arm-kernel

On 08/12/13 03:53, Daniel Lezcano wrote:
> On 08/09/2013 07:27 PM, Stephen Boyd wrote:
>> On 08/09, Daniel Lezcano wrote:
>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
>>> wake it up, no? As Stephen stated, this kind of configuration has
>>> never been tested before so the tick broadcast code is not handling this
>>> case properly IMHO.
>>>
>> If you have a per-cpu tick device that isn't suffering from
>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
>> is a bug in the preference logic or you should boost the rating
>> of the arm global timer above the twd. Does this patch help? It
>> should make the arm global timer the tick device and whatever the
>> cadence timer you have into the broadcast device.
>>
>> ---8<----
>> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
>> index 218bcb5..d3539e5 100644
>> --- a/kernel/time/tick-broadcast.c
>> +++ b/kernel/time/tick-broadcast.c
>> @@ -77,6 +77,9 @@ static bool tick_check_broadcast_device(struct clock_event_device *curdev,
>>  	    !(newdev->features & CLOCK_EVT_FEAT_ONESHOT))
>>  		return false;
>>  
>> +	if (cpumask_equal(newdev->cpumask, cpumask_of(smp_processor_id())))
>> +		return false;
> Yes, that makes sense to prevent local timer devices from being the broadcast one.
>
> In the case of the arm global timer, each CPU will register its own
> timer, so the test will be OK. But is it possible that cpu0 registers
> the timers for the other CPUs?

As far as I know every tick device has to be registered on the CPU that
it will be used on. See tick_check_percpu().
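
For reference, that check looks roughly like this in 3.11's
kernel/time/tick-common.c (quoted from memory, so treat it as a sketch):

	static bool tick_check_percpu(struct clock_event_device *curdev,
				      struct clock_event_device *newdev, int cpu)
	{
		/* A device that cannot fire on this CPU at all is out */
		if (!cpumask_test_cpu(cpu, newdev->cpumask))
			return false;
		/* A strictly per-cpu device is always acceptable */
		if (cpumask_equal(newdev->cpumask, cpumask_of(cpu)))
			return true;
		/* A global device must allow its irq to move to this CPU */
		if (newdev->irq >= 0 && !irq_can_set_affinity(newdev->irq))
			return false;
		/* Don't replace an existing cpu-local device with a global one */
		if (curdev && cpumask_equal(curdev->cpumask, cpumask_of(cpu)))
			return false;
		return true;
	}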

>
>>  	return !curdev || newdev->rating > curdev->rating;
>>  }
>>  
>> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
>> index 64522ec..1628b9f 100644
>> --- a/kernel/time/tick-common.c
>> +++ b/kernel/time/tick-common.c
>> @@ -245,6 +245,15 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
>>  	}
>>  
>>  	/*
>> +	 * Prefer tick devices that don't suffer from FEAT_C3_STOP
>> +	 * regardless of their rating
>> +	 */
>> +	if (curdev && cpumask_equal(curdev->cpumask, newdev->cpumask) &&
>> +	    !(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
>> +	    (curdev->features & CLOCK_EVT_FEAT_C3STOP))
>> +		return true;
> That sounds reasonable, but what is the acceptable gap between the
> ratings? I am wondering if there isn't too much heuristic logic in the
> tick code...

Yes I wonder if we should just change the ratings of the clockevents.
But it feels to me like the rating should only matter if the two are
equal in features. Otherwise we can use the features to determine what
we want.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:20                                                     ` Stephen Boyd
@ 2013-08-12 16:24                                                       ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-12 16:24 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Mon, Aug 12, 2013 at 09:20:19AM -0700, Stephen Boyd wrote:
> On 08/12/13 09:03, Sören Brinkmann wrote:
> > On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
> >> On 08/09, Daniel Lezcano wrote:
> >>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
> >>> wake it up, no? As Stephen stated, this kind of configuration has
> >>> never been tested before so the tick broadcast code is not handling this
> >>> case properly IMHO.
> >>>
> >> If you have a per-cpu tick device that isn't suffering from
> >> FEAT_C3_STOP why wouldn't you use that for the tick versus a
> >> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
> >> is a bug in the preference logic or you should boost the rating
> >> of the arm global timer above the twd. Does this patch help? It
> >> should make the arm global timer the tick device and whatever the
> >> cadence timer you have into the broadcast device.
> > I finally got to test your patch. Unfortunately, it makes the system
> > hang even earlier:
> 
> Sorry it had a bug depending on the registration order. Can you try this
> one (tabs are probably spaces, sorry)? I will go read through this
> thread to see if we already covered the registration order.

What is the base for your patch? I based my GT enable patch on 3.11-rc3
and for consistency in our debugging, I didn't move it elsewhere since.
Your patch doesn't apply cleanly on it. I'll see if I can work it out; just
let me know if it depends on something not available in 3.11-rc3.

	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:20                                                     ` Stephen Boyd
@ 2013-08-12 16:32                                                       ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-12 16:32 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Mon, Aug 12, 2013 at 09:20:19AM -0700, Stephen Boyd wrote:
> On 08/12/13 09:03, Sören Brinkmann wrote:
> > On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
> >> On 08/09, Daniel Lezcano wrote:
> >>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
> >>> wake it up, no? As Stephen stated, this kind of configuration has
> >>> never been tested before so the tick broadcast code is not handling this
> >>> case properly IMHO.
> >>>
> >> If you have a per-cpu tick device that isn't suffering from
> >> FEAT_C3_STOP why wouldn't you use that for the tick versus a
> >> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
> >> is a bug in the preference logic or you should boost the rating
> >> of the arm global timer above the twd. Does this patch help? It
> >> should make the arm global timer the tick device and whatever the
> >> cadence timer you have into the broadcast device.
> > I finally got to test your patch. Unfortunately, it makes the system
> > hang even earlier:
> 
> Sorry it had a bug depending on the registration order. Can you try this
> one (tabs are probably spaces, sorry)? I will go read through this
> thread to see if we already covered the registration order.

That did it! Booted straight into the system. The broadcast device is
now the TTC instead of the GT.

	Tick Device: mode:     1
	Broadcast device
	Clock Event Device: ttc_clockevent
	 max_delta_ns:   1207932479
	 min_delta_ns:   18432
	 mult:           233015
	 shift:          32
	 mode:           1
	 next_event:     9223372036854775807 nsecs
	 set_next_event: ttc_set_next_event
	 set_mode:       ttc_set_mode
	 event_handler:  tick_handle_oneshot_broadcast
	 retries:        0
	
	tick_broadcast_mask: 00000000
	tick_broadcast_oneshot_mask: 00000000
	
	Tick Device: mode:     1
	Per CPU device: 0
	Clock Event Device: arm_global_timer
	 max_delta_ns:   12884902005
	 min_delta_ns:   1000
	 mult:           715827876
	 shift:          31
	 mode:           3
	 next_event:     24216749370 nsecs
	 set_next_event: gt_clockevent_set_next_event
	 set_mode:       gt_clockevent_set_mode
	 event_handler:  hrtimer_interrupt
	 retries:        0
	
	Tick Device: mode:     1
	Per CPU device: 1
	Clock Event Device: arm_global_timer
	 max_delta_ns:   12884902005
	 min_delta_ns:   1000
	 mult:           715827876
	 shift:          31
	 mode:           3
	 next_event:     24220000000 nsecs
	 set_next_event: gt_clockevent_set_next_event
	 set_mode:       gt_clockevent_set_mode
	 event_handler:  hrtimer_interrupt
	 retries:        0


	# cat /proc/interrupts 
	           CPU0       CPU1       
	 27:       1668       1640       GIC  27  gt
	 29:          0          0       GIC  29  twd
	 43:          0          0       GIC  43  ttc_clockevent
	 82:        536          0       GIC  82  xuartps
	IPI0:          0          0  CPU wakeup interrupts
	IPI1:          0          0  Timer broadcast interrupts
	IPI2:       1264       1322  Rescheduling interrupts
	IPI3:          0          0  Function call interrupts
	IPI4:         24         70  Single function call interrupts
	IPI5:          0          0  CPU stop interrupts
	Err:          0


	Thanks,
	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:24                                                       ` Sören Brinkmann
@ 2013-08-12 16:40                                                         ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-12 16:40 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/12/13 09:24, Sören Brinkmann wrote:
> On Mon, Aug 12, 2013 at 09:20:19AM -0700, Stephen Boyd wrote:
>> On 08/12/13 09:03, Sören Brinkmann wrote:
>>> On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
>>>> On 08/09, Daniel Lezcano wrote:
>>>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
>>>>> wake it up, no? As Stephen stated, this kind of configuration has
>>>>> never been tested before so the tick broadcast code is not handling this
>>>>> case properly IMHO.
>>>>>
>>>> If you have a per-cpu tick device that isn't suffering from
>>>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
>>>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
>>>> is a bug in the preference logic or you should boost the rating
>>>> of the arm global timer above the twd. Does this patch help? It
>>>> should make the arm global timer the tick device and whatever the
>>>> cadence timer you have into the broadcast device.
>>> I finally got to test your patch. Unfortunately, it makes the system
>>> hang even earlier:
>> Sorry it had a bug depending on the registration order. Can you try this
>> one (tabs are probably spaces, sorry)? I will go read through this
>> thread to see if we already covered the registration order.
> What is the base for your patch? I based my GT enable patch on 3.11-rc3
> and for consistency in our debugging, I didn't move it elsewhere since.
> Your patch doesn't apply cleanly on it. I'll see if I can work it out; just
> let me know if it depends on something not available in 3.11-rc3.

I applied this on 3.11-rc4. I don't think anything has changed there
between rc3 and rc4, so you're probably running into the whitespace
problem.
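
(If the mailer did turn the tabs into spaces, applying the raw message
with git apply --ignore-whitespace, or with patch -l, usually still
gets it in.)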

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:40                                                         ` Stephen Boyd
@ 2013-08-12 16:43                                                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-12 16:43 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Mon, Aug 12, 2013 at 09:40:52AM -0700, Stephen Boyd wrote:
> On 08/12/13 09:24, Sören Brinkmann wrote:
> > On Mon, Aug 12, 2013 at 09:20:19AM -0700, Stephen Boyd wrote:
> >> On 08/12/13 09:03, Sören Brinkmann wrote:
> >>> On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
> >>>> On 08/09, Daniel Lezcano wrote:
> >>>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
> >>>>> wake it up, no? As Stephen stated, this kind of configuration has
> >>>>> never been tested before, so the tick broadcast code is not handling
> >>>>> this case properly IMHO.
> >>>>>
> >>>> If you have a per-cpu tick device that isn't suffering from
> >>>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
> >>>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
> >>>> is a bug in the preference logic or you should boost the rating
> >>>> of the arm global timer above the twd. Does this patch help? It
> >>>> should make the arm global timer the tick device and whatever the
> >>>> cadence timer you have into the broadcast device.
> >>> I finally got to test your patch. Unfortunately, it makes the system
> >>> hang even earlier:
> >> Sorry it had a bug depending on the registration order. Can you try this
> >> one (tabs are probably spaces, sorry)? I will go read through this
> >> thread to see if we already covered the registration order.
> > What is the base for your patch? I based my GT enable patch on 3.11-rc3
> > and for consistency in our debugging, I didn't move it elsewhere since.
> > Your patch doesn't apply cleanly on it. I'll see if I can work it out;
> > just let me know if it depends on something not available in 3.11-rc3.
> 
> I applied this on 3.11-rc4. I don't think anything has changed there
> between rc3 and rc4, so you're probably running into the whitespace problem.

That, or the scissors are my problem. I've never worked with scissors
patches before, and what happened when I ran 'git am' on the patch looks
kinda odd. Anyway, I just merged it in manually.

	Sören



* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:32                                                       ` Sören Brinkmann
@ 2013-08-12 16:49                                                         ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-12 16:49 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/12/2013 06:32 PM, Sören Brinkmann wrote:
> On Mon, Aug 12, 2013 at 09:20:19AM -0700, Stephen Boyd wrote:
>> On 08/12/13 09:03, Sören Brinkmann wrote:
>>> On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
>>>> On 08/09, Daniel Lezcano wrote:
>>>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
>>>>> wake it up, no? As Stephen stated, this kind of configuration has
>>>>> never been tested before, so the tick broadcast code is not handling
>>>>> this case properly IMHO.
>>>>>
>>>> If you have a per-cpu tick device that isn't suffering from
>>>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
>>>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
>>>> is a bug in the preference logic or you should boost the rating
>>>> of the arm global timer above the twd. Does this patch help? It
>>>> should make the arm global timer the tick device and whatever the
>>>> cadence timer you have into the broadcast device.
>>> I finally got to test your patch. Unfortunately, it makes the system
>>> hang even earlier:
>>
>> Sorry it had a bug depending on the registration order. Can you try this
>> one (tabs are probably spaces, sorry)? I will go read through this
>> thread to see if we already covered the registration order.
> 
> That did it! Booted straight into the system. 

Good news :)

> The broadcast device is
> the TTC instead of GT, now.
> 
> 	Tick Device: mode:     1
> 	Broadcast device
> 	Clock Event Device: ttc_clockevent
> 	 max_delta_ns:   1207932479
> 	 min_delta_ns:   18432
> 	 mult:           233015
> 	 shift:          32
> 	 mode:           1
> 	 next_event:     9223372036854775807 nsecs
> 	 set_next_event: ttc_set_next_event
> 	 set_mode:       ttc_set_mode
> 	 event_handler:  tick_handle_oneshot_broadcast
> 	 retries:        0
> 	
> 	tick_broadcast_mask: 00000000
> 	tick_broadcast_oneshot_mask: 00000000

At first glance, the timer broadcast mechanism is not in use, right? Can
you try with the cpuidle flag even if it is not needed?
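
(For context: CPUIDLE_FLAG_TIMER_STOP is what ties a cpuidle state to the
broadcast machinery. In the 3.11-era cpuidle core the idle entry path does
roughly the following; a paraphrased sketch, not a verbatim copy:)

	/* Sketch of the CPUIDLE_FLAG_TIMER_STOP handling in the 3.11-era
	 * cpuidle core: states carrying the flag hand the CPU's tick over
	 * to the broadcast device for the duration of the idle period. */
	bool broadcast = !!(drv->states[next_state].flags &
			    CPUIDLE_FLAG_TIMER_STOP);

	if (broadcast)	/* local tick may stop: arm the broadcast device */
		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);

	entered_state = cpuidle_enter_state(dev, drv, next_state);

	if (broadcast)	/* back from idle: reclaim the local tick */
		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);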

Thanks
  -- Daniel


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:23                                                     ` Stephen Boyd
@ 2013-08-12 16:53                                                       ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-12 16:53 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: srinivas.kandagatla, Sören Brinkmann, Russell King,
	Michal Simek, linux-kernel, Stuart Menefy, John Stultz,
	Thomas Gleixner, linux-arm-kernel

On 08/12/2013 06:23 PM, Stephen Boyd wrote:
> On 08/12/13 03:53, Daniel Lezcano wrote:
>> On 08/09/2013 07:27 PM, Stephen Boyd wrote:
>>> On 08/09, Daniel Lezcano wrote:
>>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
>>>> wake it up, no? As Stephen stated, this kind of configuration has
>>>> never been tested before, so the tick broadcast code is not handling
>>>> this case properly IMHO.
>>>>
>>> If you have a per-cpu tick device that isn't suffering from
>>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
>>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
>>> is a bug in the preference logic or you should boost the rating
>>> of the arm global timer above the twd. Does this patch help? It
>>> should make the arm global timer the tick device and whatever the
>>> cadence timer you have into the broadcast device.
>>>
>>> ---8<----
>>> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
>>> index 218bcb5..d3539e5 100644
>>> --- a/kernel/time/tick-broadcast.c
>>> +++ b/kernel/time/tick-broadcast.c
>>> @@ -77,6 +77,9 @@ static bool tick_check_broadcast_device(struct clock_event_device *curdev,
>>>  	    !(newdev->features & CLOCK_EVT_FEAT_ONESHOT))
>>>  		return false;
>>>  
>>> +	if (cpumask_equal(newdev->cpumask, cpumask_of(smp_processor_id())))
>>> +		return false;
>> Yes, that makes sense to prevent a local timer device from becoming the
>> broadcast one.
>>
>> In the case of the arm global timer, each cpu will register its own
>> timer, so the test will be ok, but is it possible that cpu0 registers
>> the timers for the other cpus?
> 
> As far as I know every tick device has to be registered on the CPU that
> it will be used on. See tick_check_percpu().

Ah, ok I see. Thx for the pointer.
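
(For readers following along: tick_check_percpu() in
kernel/time/tick-common.c, as of v3.11, performs roughly the checks below.
This is a paraphrased sketch for orientation, not a verbatim copy.)

	static bool tick_check_percpu(struct clock_event_device *curdev,
				      struct clock_event_device *newdev, int cpu)
	{
		/* The new device must be able to interrupt this CPU at all. */
		if (!cpumask_test_cpu(cpu, newdev->cpumask))
			return false;
		/* A device bound to exactly this CPU is always acceptable. */
		if (cpumask_equal(newdev->cpumask, cpumask_of(cpu)))
			return true;
		/* Otherwise its irq must be steerable to this CPU... */
		if (newdev->irq >= 0 && !irq_can_set_affinity(newdev->irq))
			return false;
		/* ...and it must not displace an existing CPU-local device. */
		if (curdev && cpumask_equal(curdev->cpumask, cpumask_of(cpu)))
			return false;
		return true;
	}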

>>>  	return !curdev || newdev->rating > curdev->rating;
>>>  }
>>>  
>>> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
>>> index 64522ec..1628b9f 100644
>>> --- a/kernel/time/tick-common.c
>>> +++ b/kernel/time/tick-common.c
>>> @@ -245,6 +245,15 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
>>>  	}
>>>  
>>>  	/*
>>> +	 * Prefer tick devices that don't suffer from FEAT_C3_STOP
>>> +	 * regardless of their rating
>>> +	 */
>>> +	if (curdev && cpumask_equal(curdev->cpumask, newdev->cpumask) &&
>>> +	    !(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
>>> +	    (curdev->features & CLOCK_EVT_FEAT_C3STOP))
>>> +		return true;
>>> That sounds reasonable, but what is the acceptable gap between the
>>> ratings? I am wondering if there aren't too many heuristics in the
>>> tick code...
> 
> Yes I wonder if we should just change the ratings of the clockevents.
> But it feels to me like the rating should only matter if the two are
> equal in features. Otherwise we can use the features to determine what
> we want.

Is it desirable for a real-time system? (I am not an expert in this area,
so maybe this question makes no sense :)


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:49                                                         ` Daniel Lezcano
@ 2013-08-12 16:53                                                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-12 16:53 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Mon, Aug 12, 2013 at 06:49:17PM +0200, Daniel Lezcano wrote:
> On 08/12/2013 06:32 PM, Sören Brinkmann wrote:
> > On Mon, Aug 12, 2013 at 09:20:19AM -0700, Stephen Boyd wrote:
> >> On 08/12/13 09:03, Sören Brinkmann wrote:
> >>> On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
> >>>> On 08/09, Daniel Lezcano wrote:
> >>>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
> >>>>> wake it up, no? As Stephen stated, this kind of configuration has
> >>>>> never been tested before, so the tick broadcast code is not handling
> >>>>> this case properly IMHO.
> >>>>>
> >>>> If you have a per-cpu tick device that isn't suffering from
> >>>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
> >>>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
> >>>> is a bug in the preference logic or you should boost the rating
> >>>> of the arm global timer above the twd. Does this patch help? It
> >>>> should make the arm global timer the tick device and whatever the
> >>>> cadence timer you have into the broadcast device.
> >>> I finally got to test your patch. Unfortunately, it makes the system
> >>> hang even earlier:
> >>
> >> Sorry it had a bug depending on the registration order. Can you try this
> >> one (tabs are probably spaces, sorry)? I will go read through this
> >> thread to see if we already covered the registration order.
> > 
> > That did it! Booted straight into the system. 
> 
> Good news :)
> 
> > The broadcast device is
> > the TTC instead of GT, now.
> > 
> > 	Tick Device: mode:     1
> > 	Broadcast device
> > 	Clock Event Device: ttc_clockevent
> > 	 max_delta_ns:   1207932479
> > 	 min_delta_ns:   18432
> > 	 mult:           233015
> > 	 shift:          32
> > 	 mode:           1
> > 	 next_event:     9223372036854775807 nsecs
> > 	 set_next_event: ttc_set_next_event
> > 	 set_mode:       ttc_set_mode
> > 	 event_handler:  tick_handle_oneshot_broadcast
> > 	 retries:        0
> > 	
> > 	tick_broadcast_mask: 00000000
> > 	tick_broadcast_oneshot_mask: 00000000
> 
> At first glance, the timer broadcast mechanism is not in use, right? Can
> you try with the cpuidle flag even if it is not needed?

It's actually present. I have a clean 3.11-rc3 and the only changes are
my patch to enable the GT and Stephen's fix.
The cpuidle stats show both idle states being used.

	Sören



* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 16:53                                                           ` Sören Brinkmann
@ 2013-08-12 17:02                                                             ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-12 17:02 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Stephen Boyd, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/12/2013 06:53 PM, Sören Brinkmann wrote:
> On Mon, Aug 12, 2013 at 06:49:17PM +0200, Daniel Lezcano wrote:
>> On 08/12/2013 06:32 PM, Sören Brinkmann wrote:
>>> On Mon, Aug 12, 2013 at 09:20:19AM -0700, Stephen Boyd wrote:
>>>> On 08/12/13 09:03, Sören Brinkmann wrote:
>>>>> On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
>>>>>> On 08/09, Daniel Lezcano wrote:
>>>>>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
>>>>>>> wake it up, no? As Stephen stated, this kind of configuration has
>>>>>>> never been tested before, so the tick broadcast code is not handling
>>>>>>> this case properly IMHO.
>>>>>>>
>>>>>> If you have a per-cpu tick device that isn't suffering from
>>>>>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
>>>>>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
>>>>>> is a bug in the preference logic or you should boost the rating
>>>>>> of the arm global timer above the twd. Does this patch help? It
>>>>>> should make the arm global timer the tick device and whatever the
>>>>>> cadence timer you have into the broadcast device.
>>>>> I finally got to test your patch. Unfortunately, it makes the system
>>>>> hang even earlier:
>>>>
>>>> Sorry it had a bug depending on the registration order. Can you try this
>>>> one (tabs are probably spaces, sorry)? I will go read through this
>>>> thread to see if we already covered the registration order.
>>>
>>> That did it! Booted straight into the system. 
>>
>> Good news :)
>>
>>> The broadcast device is
>>> the TTC instead of GT, now.
>>>
>>> 	Tick Device: mode:     1
>>> 	Broadcast device
>>> 	Clock Event Device: ttc_clockevent
>>> 	 max_delta_ns:   1207932479
>>> 	 min_delta_ns:   18432
>>> 	 mult:           233015
>>> 	 shift:          32
>>> 	 mode:           1
>>> 	 next_event:     9223372036854775807 nsecs
>>> 	 set_next_event: ttc_set_next_event
>>> 	 set_mode:       ttc_set_mode
>>> 	 event_handler:  tick_handle_oneshot_broadcast
>>> 	 retries:        0
>>> 	
>>> 	tick_broadcast_mask: 00000000
>>> 	tick_broadcast_oneshot_mask: 00000000
>>
>> At first glance, the timer broadcast mechanism is not in use, right? Can
>> you try with the cpuidle flag even if it is not needed?
> 
> It's actually present. I have a clean 3.11-rc3 and the only changes are
> my patch to enable the GT and Stephen's fix.
> The cpuidle stats show both idle states being used.

Ah, right. The tick_broadcast_mask is not set because the arm global
timer does not have the CLOCK_EVT_FEAT_C3STOP feature flag set.
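
(The check in question lives in tick_device_uses_broadcast() in
kernel/time/tick-broadcast.c; paraphrased, not verbatim, the relevant
branch for a functional per-CPU tick device is:)

	/* A working per-CPU tick device is tracked in tick_broadcast_mask
	 * only if it stops in deep idle (CLOCK_EVT_FEAT_C3STOP). */
	if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
		/* Timer keeps ticking in idle: no broadcast wakeup needed. */
		cpumask_clear_cpu(cpu, tick_broadcast_mask);
	else
		/* Timer may stop: this CPU relies on the broadcast device. */
		tick_device_setup_broadcast_func(dev);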

Thanks
  -- Daniel


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-12 17:02                                                             ` Daniel Lezcano
@ 2013-08-16 17:28                                                               ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-16 17:28 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Stephen Boyd, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On Mon, Aug 12, 2013 at 07:02:39PM +0200, Daniel Lezcano wrote:
> On 08/12/2013 06:53 PM, Sören Brinkmann wrote:
> > On Mon, Aug 12, 2013 at 06:49:17PM +0200, Daniel Lezcano wrote:
> >> On 08/12/2013 06:32 PM, Sören Brinkmann wrote:
> >>> On Mon, Aug 12, 2013 at 09:20:19AM -0700, Stephen Boyd wrote:
> >>>> On 08/12/13 09:03, Sören Brinkmann wrote:
> >>>>> On Fri, Aug 09, 2013 at 10:27:57AM -0700, Stephen Boyd wrote:
> >>>>>> On 08/09, Daniel Lezcano wrote:
> >>>>>>> yes, but at least the broadcast mechanism should send an IPI to cpu0 to
> >>>>>>> wake it up, no? As Stephen stated, this kind of configuration has
> >>>>>>> never been tested before, so the tick broadcast code is not handling
> >>>>>>> this case properly IMHO.
> >>>>>>>
> >>>>>> If you have a per-cpu tick device that isn't suffering from
> >>>>>> FEAT_C3_STOP why wouldn't you use that for the tick versus a
> >>>>>> per-cpu tick device that has FEAT_C3_STOP? It sounds like there
> >>>>>> is a bug in the preference logic or you should boost the rating
> >>>>>> of the arm global timer above the twd. Does this patch help? It
> >>>>>> should make the arm global timer the tick device and whatever the
> >>>>>> cadence timer you have into the broadcast device.
> >>>>> I finally got to test your patch. Unfortunately, it makes the system
> >>>>> hang even earlier:
> >>>>
> >>>> Sorry it had a bug depending on the registration order. Can you try this
> >>>> one (tabs are probably spaces, sorry)? I will go read through this
> >>>> thread to see if we already covered the registration order.
> >>>
> >>> That did it! Booted straight into the system. 
> >>
> >> Good news :)
> >>
> >>> The broadcast device is
> >>> the TTC instead of GT, now.
> >>>
> >>> 	Tick Device: mode:     1
> >>> 	Broadcast device
> >>> 	Clock Event Device: ttc_clockevent
> >>> 	 max_delta_ns:   1207932479
> >>> 	 min_delta_ns:   18432
> >>> 	 mult:           233015
> >>> 	 shift:          32
> >>> 	 mode:           1
> >>> 	 next_event:     9223372036854775807 nsecs
> >>> 	 set_next_event: ttc_set_next_event
> >>> 	 set_mode:       ttc_set_mode
> >>> 	 event_handler:  tick_handle_oneshot_broadcast
> >>> 	 retries:        0
> >>> 	
> >>> 	tick_broadcast_mask: 00000000
> >>> 	tick_broadcast_oneshot_mask: 00000000
> >>
> >> At first glance, the timer broadcast mechanism is not in use, right? Can
> >> you try with the cpuidle flag even if it is not needed?
> > 
> > It's actually present. I have a clean 3.11-rc3 and the only changes are
> > my patch to enable the GT and Stephen's fix.
> > The cpuidle stats show both idle states being used.
> 
> >> Ah, right. The tick_broadcast_mask is not set because the arm global
> >> timer does not have the CLOCK_EVT_FEAT_C3STOP feature flag set.

Just to check in. Do you want any additional testing done? Or can I
expect Stephen's fix to get merged, so Zynq can use the GT?

	Thanks,
	Sören



* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-16 17:28                                                               ` Sören Brinkmann
@ 2013-08-19 23:00                                                                 ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-19 23:00 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/16/13 10:28, Sören Brinkmann wrote:
> On Mon, Aug 12, 2013 at 07:02:39PM +0200, Daniel Lezcano wrote:
>> On 08/12/2013 06:53 PM, Sören Brinkmann wrote:
>>> It's actually present. I have a clean 3.11-rc3 and the only changes are
>>> my patch to enable the GT and Stephen's fix.
>>> The cpuidle stats show both idle states being used.
>> Ah, right. The tick_broadcast_mask is not set because the arm global
>> timer does not have the CLOCK_EVT_FEAT_C3STOP feature flag set.
> Just to check in. Do you want any additional testing done? Or can I
> expect Stephen's fix to get merged, so Zynq can use the GT?
>

I was curious: can you use just the first hunk of the patch, the one that
applies to tick-broadcast.c, to fix the problem? I think the answer is yes.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-19 23:00                                                                 ` Stephen Boyd
@ 2013-08-19 23:30                                                                   ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-19 23:30 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

Hi Stephen,

On Mon, Aug 19, 2013 at 04:00:36PM -0700, Stephen Boyd wrote:
> On 08/16/13 10:28, Sören Brinkmann wrote:
> > On Mon, Aug 12, 2013 at 07:02:39PM +0200, Daniel Lezcano wrote:
> >> On 08/12/2013 06:53 PM, Sören Brinkmann wrote:
> >>> It's actually present. I have a clean 3.11-rc3 and the only changes are
> >>> my patch to enable the GT and Stephen's fix.
> >>> The cpuidle stats show both idle states being used.
> >> Ah, right. The tick_broadcast_mask is not set because the arm global
> >> timer does not have the CLOCK_EVT_FEAT_C3STOP feature flag set.
> > Just to check in. Do you want any additional testing done? Or can I
> > expect Stephen's fix to get merged, so Zynq can use the GT?
> >
> 
> I was curious: can you use just the first hunk of the patch, the one that
> applies to tick-broadcast.c, to fix the problem? I think the answer is yes.

Yes, that seems to be enough.

# cat /proc/interrupts 
           CPU0       CPU1       
 27:         14          1       GIC  27  gt
 29:        664        759       GIC  29  twd
 43:        725          0       GIC  43  ttc_clockevent
 82:        214          0       GIC  82  xuartps
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0         58  Timer broadcast interrupts
IPI2:       1224       1120  Rescheduling interrupts
IPI3:          0          0  Function call interrupts
IPI4:         44         50  Single function call interrupts
IPI5:          0          0  CPU stop interrupts
Err:          0

Timer list:
Tick Device: mode:     1
Broadcast device
Clock Event Device: ttc_clockevent
 max_delta_ns:   1207932479
 min_delta_ns:   18432
 mult:           233015
 shift:          32
 mode:           3
 next_event:     60080000000 nsecs
 set_next_event: ttc_set_next_event
 set_mode:       ttc_set_mode
 event_handler:  tick_handle_oneshot_broadcast
 retries:        0

tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000000

Tick Device: mode:     1
Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     59075238755 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0

Tick Device: mode:     1
Per CPU device: 1
Clock Event Device: local_timer
 max_delta_ns:   12884902005
 min_delta_ns:   1000
 mult:           715827876
 shift:          31
 mode:           3
 next_event:     59080000000 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:        0


	Sören



* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-19 23:30                                                                   ` Sören Brinkmann
@ 2013-08-20  0:57                                                                     ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-20  0:57 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Daniel Lezcano, srinivas.kandagatla, Russell King, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Thomas Gleixner,
	linux-arm-kernel

On 08/19, Sören Brinkmann wrote:
> Hi Stephen,
> 
> On Mon, Aug 19, 2013 at 04:00:36PM -0700, Stephen Boyd wrote:
> > On 08/16/13 10:28, Sören Brinkmann wrote:
> > > On Mon, Aug 12, 2013 at 07:02:39PM +0200, Daniel Lezcano wrote:
> > >> On 08/12/2013 06:53 PM, Sören Brinkmann wrote:
> > >>> It's actually present. I have a clean 3.11-rc3 and the only changes are
> > >>> my patch to enable the GT and Stephen's fix.
> > >>> The cpuidle stats show both idle states being used.
> > >> Ah, right. The tick_broadcast_mask is not set because the arm global
> > >> timer does not have the CLOCK_EVT_FEAT_C3STOP feature flag set.
> > > Just to check in. Do you want any additional testing done? Or can I
> > > expect Stephen's fix to get merged, so Zynq can use the GT?
> > >
> > 
> > I was curious: can you use just the first hunk of the patch, the one that
> > applies to tick-broadcast.c, to fix the problem? I think the answer is yes.
> 
> Yes, that seems to be enough.
> 

Great, thank you. I will split the patch into two pieces. That way
we can discuss the merit of always using a timer that doesn't
suffer from FEAT_C3_STOP over a timer that does.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

* Re: Enable arm_global_timer for Zynq brakes boot
  2013-08-20  0:57                                                                     ` Stephen Boyd
@ 2013-08-20 15:13                                                                       ` Daniel Lezcano
  -1 siblings, 0 replies; 142+ messages in thread
From: Daniel Lezcano @ 2013-08-20 15:13 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Sören Brinkmann, srinivas.kandagatla, Russell King,
	Michal Simek, linux-kernel, Stuart Menefy, John Stultz,
	Thomas Gleixner, linux-arm-kernel

On 08/20/2013 02:57 AM, Stephen Boyd wrote:
> On 08/19, Sören Brinkmann wrote:
>> Hi Stephen,
>>
>> On Mon, Aug 19, 2013 at 04:00:36PM -0700, Stephen Boyd wrote:
>>> On 08/16/13 10:28, Sören Brinkmann wrote:
>>>> On Mon, Aug 12, 2013 at 07:02:39PM +0200, Daniel Lezcano wrote:
>>>>> On 08/12/2013 06:53 PM, Sören Brinkmann wrote:
>>>>>> It's actually present. I have a clean 3.11-rc3 and the only changes are
>>>>>> my patch to enable the GT and Stephen's fix.
>>>>>> The cpuidle stats show both idle states being used.
>>>>> Ah, right. The tick_broadcast_mask is not set because the arm global
>>>>> timer does not have the CLOCK_EVT_FEAT_C3STOP feature flag set.
>>>> Just to check in. Do you want any additional testing done? Or can I
>>>> expect Stephen's fix to get merged, so Zynq can use the GT?
>>>>
>>>
>>> I was curious: can you use just the first hunk of the patch, the one that
>>> applies to tick-broadcast.c, to fix the problem? I think the answer is yes.
>>
>> Yes, that seems to be enough.
>>
> 
> Great, thank you. I will split the patch into two pieces. That way
> we can discuss the merit of always using a timer that doesn't
> suffer from FEAT_C3_STOP over a timer that does.

Yes, that sounds like a good idea.

  -- Daniel


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


* [PATCH 1/2] tick: broadcast: Deny per-cpu clockevents from being broadcast sources
  2013-08-20 15:13                                                                       ` Daniel Lezcano
@ 2013-08-22 17:06                                                                         ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-22 17:06 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Michal Simek, Russell King, linux-kernel, Stuart Menefy,
	John Stultz, Thomas Gleixner, linux-arm-kernel,
	srinivas.kandagatla, Sören Brinkmann

On most ARM systems the per-cpu clockevents are truly per-cpu in
the sense that they can't be controlled on any other CPU besides
the CPU that they interrupt. If one of these clockevents were to
become a broadcast source we will run into a lot of trouble
because the broadcast source is enabled on the first CPU to go
into deep idle (if that CPU suffers from FEAT_C3_STOP) and that
could be a different CPU than what the clockevent is interrupting
(or even worse the CPU that the clockevent interrupts could be
offline).

Theoretically it's possible to support per-cpu clockevents as the
broadcast source but so far we haven't needed this and supporting
it is rather complicated. Let's just deny the possibility for now
until this becomes a reality (let's hope it never does!).

Reported-by: Sören Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---
 kernel/time/tick-broadcast.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 218bcb5..d3539e5 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -77,6 +77,9 @@ static bool tick_check_broadcast_device(struct clock_event_device *curdev,
 	    !(newdev->features & CLOCK_EVT_FEAT_ONESHOT))
 		return false;
 
+	if (cpumask_equal(newdev->cpumask, cpumask_of(smp_processor_id())))
+		return false;
+
 	return !curdev || newdev->rating > curdev->rating;
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply related	[flat|nested] 142+ messages in thread
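
For illustration, a minimal sketch of the kind of per-cpu registration that
the check above turns away as a broadcast source; the device name, frequency,
and omitted callbacks are invented, not taken from this patch:

#include <linux/clockchips.h>
#include <linux/percpu.h>
#include <linux/smp.h>

/* Hypothetical per-cpu clockevent. Its cpumask covers only the CPU that
 * registers it, so tick_check_broadcast_device() now returns false for
 * it and it can never become the broadcast source. A real driver would
 * also fill in the set_mode/set_next_event callbacks.
 */
static DEFINE_PER_CPU(struct clock_event_device, example_evt);

static void example_register_on_this_cpu(void)
{
	struct clock_event_device *evt = this_cpu_ptr(&example_evt);

	evt->name = "example-local-timer";
	evt->features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP;
	evt->rating = 350;
	evt->cpumask = cpumask_of(smp_processor_id());

	/* 24 MHz is an invented frequency for this sketch */
	clockevents_config_and_register(evt, 24000000, 1, 0xffffffff);
}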

* [PATCH 2/2] clockevents: Prefer clockevents that don't suffer from FEAT_C3_STOP
  2013-08-22 17:06                                                                         ` Stephen Boyd
@ 2013-08-22 17:06                                                                           ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-22 17:06 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Michal Simek, Russell King, linux-kernel, Stuart Menefy,
	John Stultz, Thomas Gleixner, linux-arm-kernel,
	srinivas.kandagatla, Sören Brinkmann

On some ARM systems there are two sets of per-cpu timers: the TWD
timers and the global timers. The TWD timers are rated at 350 and
the global timers are rated at 300 but the TWD timers suffer from
FEAT_C3_STOP while the arm global timers don't. The tick device
selection logic doesn't consider the FEAT_C3_STOP flag so we'll
always end up with the TWD as the tick device although we don't
want that.

Extend the preference logic to take the FEAT_C3_STOP flag into
account and always prefer tick devices that don't suffer from
FEAT_C3_STOP even if their rating is lower than tick devices that
do suffer from FEAT_C3_STOP. This will remove the need to do any
broadcasting on such ARM systems.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---
 kernel/time/tick-common.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 64522ec..3ae437d 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -244,12 +244,22 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
 			return false;
 	}
 
+	if (!curdev)
+		return true;
+
+	/* Always prefer a tick device that doesn't suffer from FEAT_C3STOP */
+	if (!(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
+			(curdev->features & CLOCK_EVT_FEAT_C3STOP))
+		return true;
+	if ((newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
+			!(curdev->features & CLOCK_EVT_FEAT_C3STOP))
+		return false;
+
 	/*
 	 * Use the higher rated one, but prefer a CPU local device with a lower
 	 * rating than a non-CPU local device
 	 */
-	return !curdev ||
-		newdev->rating > curdev->rating ||
+	return newdev->rating > curdev->rating ||
 	       !cpumask_equal(curdev->cpumask, newdev->cpumask);
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply related	[flat|nested] 142+ messages in thread
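
To make the new ordering concrete, here is a small stand-alone sketch in
plain user-space C that mirrors the hunk above (the struct, flag value, and
names are invented, and it does not call into the kernel; the cpumask
tie-breaker is omitted). For the Zynq pair from the commit message it picks
the lower-rated global timer:

#include <stdbool.h>
#include <stdio.h>

#define FEAT_C3STOP (1u << 0)	/* stand-in for CLOCK_EVT_FEAT_C3STOP */

struct dev { const char *name; int rating; unsigned int features; };

/* Mirrors the preference order added by this patch. */
static bool prefer_new(const struct dev *cur, const struct dev *new)
{
	if (!cur)
		return true;
	if (!(new->features & FEAT_C3STOP) && (cur->features & FEAT_C3STOP))
		return true;
	if ((new->features & FEAT_C3STOP) && !(cur->features & FEAT_C3STOP))
		return false;
	return new->rating > cur->rating;
}

int main(void)
{
	struct dev twd = { "twd", 350, FEAT_C3STOP };
	struct dev gt = { "arm_global_timer", 300, 0 };

	/* Prints "tick device: arm_global_timer": C3STOP overrides rating. */
	printf("tick device: %s\n", prefer_new(&twd, &gt) ? gt.name : twd.name);
	return 0;
}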

* Re: [PATCH 2/2] clockevents: Prefer clockevents that don't suffer from FEAT_C3_STOP
  2013-08-22 17:06                                                                           ` Stephen Boyd
@ 2013-08-22 17:33                                                                             ` Santosh Shilimkar
  -1 siblings, 0 replies; 142+ messages in thread
From: Santosh Shilimkar @ 2013-08-22 17:33 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, Russell King, srinivas.kandagatla, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Sören Brinkmann,
	Thomas Gleixner, linux-arm-kernel

On Thursday 22 August 2013 01:06 PM, Stephen Boyd wrote:
> On some ARM systems there are two sets of per-cpu timers: the TWD
> timers and the global timers. The TWD timers are rated at 350 and
> the global timers are rated at 300 but the TWD timers suffer from
> FEAT_C3_STOP while the arm global timers don't. The tick device
> selection logic doesn't consider the FEAT_C3_STOP flag so we'll
> always end up with the TWD as the tick device although we don't
> want that.
> 
> Extend the preference logic to take the FEAT_C3_STOP flag into
> account and always prefer tick devices that don't suffer from
> FEAT_C3_STOP even if their rating is lower than tick devices that
> do suffer from FEAT_C3_STOP. This will remove the need to do any
> broadcasting on such ARM systems.
> 
> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
> ---
>  kernel/time/tick-common.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
> index 64522ec..3ae437d 100644
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -244,12 +244,22 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
>  			return false;
>  	}
>  
> +	if (!curdev)
> +		return true;
> +
> +	/* Always prefer a tick device that doesn't suffer from FEAT_C3STOP */
> +	if (!(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
> +			(curdev->features & CLOCK_EVT_FEAT_C3STOP))
> +		return true;
> +	if ((newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
> +			!(curdev->features & CLOCK_EVT_FEAT_C3STOP))
> +		return false;
> +
I don't think preferring the clockevent which doesn't suffer
from FEAT_C3STOP is a good idea if the quality of the time sources
is not the same. Generally the global timers are slow and far away from
the CPU (in cycle cost). So as long as we don't get impacted by low power
states, the tick should run off the local timers, which are faster to access
as well as high resolution.

Regards,
Santosh




^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [PATCH 2/2] clockevents: Prefer clockevents that don't suffer from FEAT_C3_STOP
  2013-08-22 17:33                                                                             ` Santosh Shilimkar
@ 2013-08-22 17:40                                                                               ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-22 17:40 UTC (permalink / raw)
  To: Santosh Shilimkar
  Cc: Daniel Lezcano, Russell King, srinivas.kandagatla, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Sören Brinkmann,
	Thomas Gleixner, linux-arm-kernel

On 08/22/13 10:33, Santosh Shilimkar wrote:
> On Thursday 22 August 2013 01:06 PM, Stephen Boyd wrote:
>> On some ARM systems there are two sets of per-cpu timers: the TWD
>> timers and the global timers. The TWD timers are rated at 350 and
>> the global timers are rated at 300 but the TWD timers suffer from
>> FEAT_C3_STOP while the arm global timers don't. The tick device
>> selection logic doesn't consider the FEAT_C3_STOP flag so we'll
>> always end up with the TWD as the tick device although we don't
>> want that.
>>
>> Extend the preference logic to take the FEAT_C3_STOP flag into
>> account and always prefer tick devices that don't suffer from
>> FEAT_C3_STOP even if their rating is lower than tick devices that
>> do suffer from FEAT_C3_STOP. This will remove the need to do any
>> broadcasting on such ARM systems.
>>
>> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
>> ---
>>  kernel/time/tick-common.c | 14 ++++++++++++--
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
>> index 64522ec..3ae437d 100644
>> --- a/kernel/time/tick-common.c
>> +++ b/kernel/time/tick-common.c
>> @@ -244,12 +244,22 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
>>  			return false;
>>  	}
>>  
>> +	if (!curdev)
>> +		return true;
>> +
>> +	/* Always prefer a tick device that doesn't suffer from FEAT_C3STOP */
>> +	if (!(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
>> +			(curdev->features & CLOCK_EVT_FEAT_C3STOP))
>> +		return true;
>> +	if ((newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
>> +			!(curdev->features & CLOCK_EVT_FEAT_C3STOP))
>> +		return false;
>> +
> I don't think preferring the clockevent which doesn't suffer
> from FEAT_C3STOP is a good idea if the quality of the time sources
> is not the same. Generally the global timers are slow and far away from
> the CPU (in cycle cost). So as long as we don't get impacted by low power
> states, the tick should run off the local timers, which are faster to access
> as well as high resolution.
>

Fair enough. I have no data either way to convince anyone that this is a
good or bad idea so this patch should have probably been an RFC. Are you
hinting at something like switching to a per-cpu timer that doesn't
suffer from FEAT_C3_STOP when a CPU goes into a deep idle state?
Interesting idea but I think I'll leave that to someone else if they
really care to do that.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [PATCH 2/2] clockevents: Prefer clockevents that don't suffer from FEAT_C3_STOP
  2013-08-22 17:40                                                                               ` Stephen Boyd
@ 2013-08-22 17:48                                                                                 ` Santosh Shilimkar
  -1 siblings, 0 replies; 142+ messages in thread
From: Santosh Shilimkar @ 2013-08-22 17:48 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, Russell King, srinivas.kandagatla, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Sören Brinkmann,
	Thomas Gleixner, linux-arm-kernel

On Thursday 22 August 2013 01:40 PM, Stephen Boyd wrote:
> On 08/22/13 10:33, Santosh Shilimkar wrote:
>> On Thursday 22 August 2013 01:06 PM, Stephen Boyd wrote:
>>> On some ARM systems there are two sets of per-cpu timers: the TWD
>>> timers and the global timers. The TWD timers are rated at 350 and
>>> the global timers are rated at 300 but the TWD timers suffer from
>>> FEAT_C3_STOP while the arm global timers don't. The tick device
>>> selection logic doesn't consider the FEAT_C3_STOP flag so we'll
>>> always end up with the TWD as the tick device although we don't
>>> want that.
>>>
>>> Extend the preference logic to take the FEAT_C3_STOP flag into
>>> account and always prefer tick devices that don't suffer from
>>> FEAT_C3_STOP even if their rating is lower than tick devices that
>>> do suffer from FEAT_C3_STOP. This will remove the need to do any
>>> broadcasting on such ARM systems.
>>>
>>> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
>>> ---
>>>  kernel/time/tick-common.c | 14 ++++++++++++--
>>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
>>> index 64522ec..3ae437d 100644
>>> --- a/kernel/time/tick-common.c
>>> +++ b/kernel/time/tick-common.c
>>> @@ -244,12 +244,22 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
>>>  			return false;
>>>  	}
>>>  
>>> +	if (!curdev)
>>> +		return true;
>>> +
>>> +	/* Always prefer a tick device that doesn't suffer from FEAT_C3STOP */
>>> +	if (!(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
>>> +			(curdev->features & CLOCK_EVT_FEAT_C3STOP))
>>> +		return true;
>>> +	if ((newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
>>> +			!(curdev->features & CLOCK_EVT_FEAT_C3STOP))
>>> +		return false;
>>> +
>> I don't think preferring the clockevent which doesn't suffer
>> from FEAT_C3STOP is a good idea if the quality of the time sources
>> is not the same. Generally the global timers are slow and far away from
>> the CPU (in cycle cost). So as long as we don't get impacted by low power
>> states, the tick should run off the local timers, which are faster to access
>> as well as high resolution.
>>
> 
> Fair enough. I have no data either way to convince anyone that this is a
> good or bad idea so this patch should have probably been an RFC. Are you
> hinting at something like switching to a per-cpu timer that doesn't
> suffer from FEAT_C3_STOP when a CPU goes into a deep idle state?
> Interesting idea but I think I'll leave that to someone else if they
> really care to do that.
> 
If the per-CPU timers don't suffer from C3_STOP, there is no need to
switch at all and the per-CPU tick continues to run on the CPU-local
timers. We need to switch to a broadcast device (not affected by C3_STOP)
only for the cases where the per-CPU timer wakeups don't work in CPU
low power states.

Are we talking about hardware where one of the CPUs in a cluster
has a local timer which is not affected by C3_STOP whereas the rest
of the CPU-local timers are? If yes, it must be crazy hardware and
we should not care too much about it.

Regards,
Santosh



^ permalink raw reply	[flat|nested] 142+ messages in thread
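
As background for the switching case described above: a cpuidle driver marks
such a state with CPUIDLE_FLAG_TIMER_STOP, which is what makes the tick core
hand the tick over to a broadcast device (one without CLOCK_EVT_FEAT_C3STOP)
for the duration of the state. A sketch of such a state-table entry; the
names, latencies, and callback body are invented:

#include <linux/cpuidle.h>

/* Hypothetical enter callback; the actual wfi/power-down sequence is
 * omitted in this sketch.
 */
static int example_enter_deep(struct cpuidle_device *dev,
			      struct cpuidle_driver *drv, int index)
{
	return index;
}

static struct cpuidle_state example_deep_state = {
	.name			= "example-deep",
	.desc			= "invented deep idle state",
	.exit_latency		= 300,		/* us, made up */
	.target_residency	= 1000,		/* us, made up */
	.flags			= CPUIDLE_FLAG_TIME_VALID |
				  CPUIDLE_FLAG_TIMER_STOP,
	/* local tick stops in this state, so the tick core must broadcast */
	.enter			= example_enter_deep,
};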

* Re: [PATCH 2/2] clockevents: Prefer clockevents that don't suffer from FEAT_C3_STOP
  2013-08-22 17:48                                                                                 ` Santosh Shilimkar
@ 2013-08-22 18:31                                                                                   ` Stephen Boyd
  -1 siblings, 0 replies; 142+ messages in thread
From: Stephen Boyd @ 2013-08-22 18:31 UTC (permalink / raw)
  To: Santosh Shilimkar
  Cc: Daniel Lezcano, Russell King, srinivas.kandagatla, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Sören Brinkmann,
	Thomas Gleixner, linux-arm-kernel

On 08/22, Santosh Shilimkar wrote:
> On Thursday 22 August 2013 01:40 PM, Stephen Boyd wrote:
> > On 08/22/13 10:33, Santosh Shilimkar wrote:
> >> On Thursday 22 August 2013 01:06 PM, Stephen Boyd wrote:
> >>> On some ARM systems there are two sets of per-cpu timers: the TWD
> >>> timers and the global timers. The TWD timers are rated at 350 and
> >>> the global timers are rated at 300 but the TWD timers suffer from
> >>> FEAT_C3_STOP while the arm global timers don't. The tick device
> >>> selection logic doesn't consider the FEAT_C3_STOP flag so we'll
> >>> always end up with the TWD as the tick device although we don't
> >>> want that.
> >>>
> >>> Extend the preference logic to take the FEAT_C3_STOP flag into
> >>> account and always prefer tick devices that don't suffer from
> >>> FEAT_C3_STOP even if their rating is lower than tick devices that
> >>> do suffer from FEAT_C3_STOP. This will remove the need to do any
> >>> broadcasting on such ARM systems.
> >>>
> >>> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
> >>> ---
> >>>  kernel/time/tick-common.c | 14 ++++++++++++--
> >>>  1 file changed, 12 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
> >>> index 64522ec..3ae437d 100644
> >>> --- a/kernel/time/tick-common.c
> >>> +++ b/kernel/time/tick-common.c
> >>> @@ -244,12 +244,22 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
> >>>  			return false;
> >>>  	}
> >>>  
> >>> +	if (!curdev)
> >>> +		return true;
> >>> +
> >>> +	/* Always prefer a tick device that doesn't suffer from FEAT_C3STOP */
> >>> +	if (!(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
> >>> +			(curdev->features & CLOCK_EVT_FEAT_C3STOP))
> >>> +		return true;
> >>> +	if ((newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
> >>> +			!(curdev->features & CLOCK_EVT_FEAT_C3STOP))
> >>> +		return false;
> >>> +
> >> I don't think preferring the clockevent which doesn't suffer
> >> from FEAT_C3STOP is a good idea if the quality of the time sources
> >> is not the same. Generally the global timers are slow and far away from
> >> the CPU (in cycle cost). So as long as we don't get impacted by low power
> >> states, the tick should run off the local timers, which are faster to access
> >> as well as high resolution.
> >>
> > 
> > Fair enough. I have no data either way to convince anyone that this is a
> > good or bad idea so this patch should have probably been an RFC. Are you
> > hinting at something like switching to a per-cpu timer that doesn't
> > suffer from FEAT_C3_STOP when a CPU goes into a deep idle state?
> > Interesting idea but I think I'll leave that to someone else if they
> > really care to do that.
> > 
> If the per-CPU timers don't suffer from C3_STOP, there is no need to
> switch at all and the per-CPU tick continues to run on the CPU-local
> timers. We need to switch to a broadcast device (not affected by C3_STOP)
> only for the cases where the per-CPU timer wakeups don't work in CPU
> low power states.
> 
> Are we talking about hardware where one of the CPUs in a cluster
> has a local timer which is not affected by C3_STOP whereas the rest
> of the CPU-local timers are? If yes, it must be crazy hardware and
> we should not care too much about it.

We're talking about a device with the TWD and the arm global
timers where each CPU has a TWD and an arm global timer and the
TWD stops during deep idle but the global timer doesn't.
Put another way, two timers per CPU where one doesn't work all
the time.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 142+ messages in thread
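
Schematically, that layout gives every CPU two registered tick candidates
which differ only in rating and in the C3STOP flag. A sketch with the ratings
and flags as described in this thread; the initializers themselves are
illustrative, not the actual driver code:

#include <linux/clockchips.h>

/* One instance of each exists per CPU on such a system. */
static struct clock_event_device twd_like = {
	.name		= "local_timer",	/* TWD: stops in deep idle */
	.rating		= 350,
	.features	= CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
			  CLOCK_EVT_FEAT_C3STOP,
};

static struct clock_event_device gt_like = {
	.name		= "arm_global_timer",	/* keeps counting in deep idle */
	.rating		= 300,
	.features	= CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT,
};

With the plain rating comparison the TWD wins and deep idle then requires
broadcast; with patch 2/2 the global timer wins and no broadcast is needed.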

* Re: [PATCH 2/2] clockevents: Prefer clockevents that don't suffer from FEAT_C3_STOP
  2013-08-22 18:31                                                                                   ` Stephen Boyd
@ 2013-08-22 21:06                                                                                     ` Santosh Shilimkar
  -1 siblings, 0 replies; 142+ messages in thread
From: Santosh Shilimkar @ 2013-08-22 21:06 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, Russell King, srinivas.kandagatla, Michal Simek,
	linux-kernel, Stuart Menefy, John Stultz, Sören Brinkmann,
	Thomas Gleixner, linux-arm-kernel

On Thursday 22 August 2013 02:31 PM, Stephen Boyd wrote:
> On 08/22, Santosh Shilimkar wrote:
>> On Thursday 22 August 2013 01:40 PM, Stephen Boyd wrote:
>>> On 08/22/13 10:33, Santosh Shilimkar wrote:
>>>> On Thursday 22 August 2013 01:06 PM, Stephen Boyd wrote:
>>>>> On some ARM systems there are two sets of per-cpu timers: the TWD
>>>>> timers and the global timers. The TWD timers are rated at 350 and
>>>>> the global timers are rated at 300 but the TWD timers suffer from
>>>>> FEAT_C3_STOP while the arm global timers don't. The tick device
>>>>> selection logic doesn't consider the FEAT_C3_STOP flag so we'll
>>>>> always end up with the TWD as the tick device although we don't
>>>>> want that.
>>>>>
>>>>> Extend the preference logic to take the FEAT_C3_STOP flag into
>>>>> account and always prefer tick devices that don't suffer from
>>>>> FEAT_C3_STOP even if their rating is lower than tick devices that
>>>>> do suffer from FEAT_C3_STOP. This will remove the need to do any
>>>>> broadcasting on such ARM systems.
>>>>>
>>>>> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
>>>>> ---
>>>>>  kernel/time/tick-common.c | 14 ++++++++++++--
>>>>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
>>>>> index 64522ec..3ae437d 100644
>>>>> --- a/kernel/time/tick-common.c
>>>>> +++ b/kernel/time/tick-common.c
>>>>> @@ -244,12 +244,22 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
>>>>>  			return false;
>>>>>  	}
>>>>>  
>>>>> +	if (!curdev)
>>>>> +		return true;
>>>>> +
>>>>> +	/* Always prefer a tick device that doesn't suffer from FEAT_C3STOP */
>>>>> +	if (!(newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
>>>>> +			(curdev->features & CLOCK_EVT_FEAT_C3STOP))
>>>>> +		return true;
>>>>> +	if ((newdev->features & CLOCK_EVT_FEAT_C3STOP) &&
>>>>> +			!(curdev->features & CLOCK_EVT_FEAT_C3STOP))
>>>>> +		return false;
>>>>> +
>>>> I don't think preferring the clockevent which doesn't suffer
>>>> from FEAT_C3STOP is a good idea if the quality of the time sources
>>>> is not the same. Generally the global timers are slow and far away from
>>>> the CPU (in cycle cost). So as long as we don't get impacted by low power
>>>> states, the tick should run off the local timers, which are faster to access
>>>> as well as high resolution.
>>>>
>>>
>>> Fair enough. I have no data either way to convince anyone that this is a
>>> good or bad idea so this patch should have probably been an RFC. Are you
>>> hinting at something like switching to a per-cpu timer that doesn't
>>> suffer from FEAT_C3_STOP when a CPU goes into a deep idle state?
>>> Interesting idea but I think I'll leave that to someone else if they
>>> really care to do that.
>>>
>> If the per-CPU timers don't suffer from C3_STOP, there is no need to
>> switch at all and the per-CPU tick continues to run on the CPU-local
>> timers. We need to switch to a broadcast device (not affected by C3_STOP)
>> only for the cases where the per-CPU timer wakeups don't work in CPU
>> low power states.
>> 
>> Are we talking about hardware where one of the CPUs in a cluster
>> has a local timer which is not affected by C3_STOP whereas the rest
>> of the CPU-local timers are? If yes, it must be crazy hardware and
>> we should not care too much about it.
> 
> We're talking about a device with the TWD and the arm global
> timers where each CPU has a TWD and an arm global timer and the
> TWD stops during deep idle but the global timer doesn't.
> Put another way, two timers per CPU where one doesn't work all
> the time.
> 
This global timer is really not a global timer then, since it's
per-CPU. So in this case, you don't need to use the TWDs and
hence have no need for broadcast. You should rather not register the
TWDs, since you have a better per-CPU timer which stays alive in
the low power states as well, and that should be registered
as the per-CPU clockevent. If the current ARM timer code
doesn't support this, it should be enhanced to be able to
achieve the above.

Regards,
Santosh






^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [PATCH 1/2] tick: broadcast: Deny per-cpu clockevents from being broadcast sources
  2013-08-22 17:06                                                                         ` Stephen Boyd
@ 2013-08-22 23:38                                                                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-08-22 23:38 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, Michal Simek, Russell King, linux-kernel,
	Stuart Menefy, John Stultz, Thomas Gleixner, linux-arm-kernel,
	srinivas.kandagatla

On Thu, Aug 22, 2013 at 10:06:40AM -0700, Stephen Boyd wrote:
> On most ARM systems the per-cpu clockevents are truly per-cpu in
> the sense that they can't be controlled on any other CPU besides
> the CPU that they interrupt. If one of these clockevents were to
> become a broadcast source, we would run into a lot of trouble,
> because the broadcast source is enabled on the first CPU to go
> into deep idle (if that CPU suffers from FEAT_C3_STOP) and that
> could be a different CPU than the one the clockevent interrupts
> (or, even worse, the CPU that the clockevent interrupts could be
> offline).
> 
> Theoretically it's possible to support per-cpu clockevents as the
> broadcast source, but so far we haven't needed this and supporting
> it is rather complicated. Let's just deny the possibility for now,
> until this becomes a reality (let's hope it never does!).
> 
> Reported-by: Sören Brinkmann <soren.brinkmann@xilinx.com>
> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Sören Brinkmann <soren.brinkmann@xilinx.com>

This fixes the issue I reported when enabling the global timer on Zynq.
The global timer is prevented from becoming the broadcast device and my
system boots.

	Thanks,
	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [PATCH 1/2] tick: broadcast: Deny per-cpu clockevents from being broadcast sources
  2013-08-22 17:06                                                                         ` Stephen Boyd
@ 2013-09-05 16:53                                                                           ` Sören Brinkmann
  -1 siblings, 0 replies; 142+ messages in thread
From: Sören Brinkmann @ 2013-09-05 16:53 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Daniel Lezcano, Michal Simek, Russell King, linux-kernel,
	Stuart Menefy, John Stultz, Thomas Gleixner, linux-arm-kernel,
	srinivas.kandagatla

On Thu, Aug 22, 2013 at 10:06:40AM -0700, Stephen Boyd wrote:
> On most ARM systems the per-cpu clockevents are truly per-cpu in
> the sense that they can't be controlled on any other CPU besides
> the CPU that they interrupt. If one of these clockevents were to
> become a broadcast source, we would run into a lot of trouble,
> because the broadcast source is enabled on the first CPU to go
> into deep idle (if that CPU suffers from FEAT_C3_STOP) and that
> could be a different CPU than the one the clockevent interrupts
> (or, even worse, the CPU that the clockevent interrupts could be
> offline).
> 
> Theoretically it's possible to support per-cpu clockevents as the
> broadcast source, but so far we haven't needed this and supporting
> it is rather complicated. Let's just deny the possibility for now,
> until this becomes a reality (let's hope it never does!).
> 
> Reported-by: Sören Brinkmann <soren.brinkmann@xilinx.com>
> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

Has this been merged anywhere?

	Thanks,
	Sören



^ permalink raw reply	[flat|nested] 142+ messages in thread

end of thread, other threads:[~2013-09-05 16:53 UTC | newest]

Thread overview: 142+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20130717210417.GP13667@xsjandreislx>
     [not found] ` <51E72DCA.9070500@codeaurora.org>
     [not found]   ` <f5b76049-7e5b-4dd0-9863-19cfd1d599b9@TX2EHSMHS006.ehs.local>
     [not found]     ` <51E7435B.3060605@codeaurora.org>
     [not found]       ` <ff4d40be-7177-4a55-ab2f-dff11fa18642@DB9EHSMHS009.ehs.local>
     [not found]         ` <51ED8DF2.60600@codeaurora.org>
     [not found]           ` <20130722201348.GI453@xsjandreislx>
2013-07-22 22:41             ` Enable arm_global_timer for Zynq brakes boot Sören Brinkmann
2013-07-22 22:41               ` Sören Brinkmann
2013-07-29 12:51               ` Daniel Lezcano
2013-07-29 12:51                 ` Daniel Lezcano
2013-07-29 15:58                 ` Sören Brinkmann
2013-07-29 15:58                   ` Sören Brinkmann
2013-07-30  0:03                 ` Sören Brinkmann
2013-07-30  0:03                   ` Sören Brinkmann
2013-07-30  8:47                   ` Daniel Lezcano
2013-07-30  8:47                     ` Daniel Lezcano
2013-07-30 22:14                     ` Sören Brinkmann
2013-07-30 22:14                       ` Sören Brinkmann
2013-07-30 22:23                       ` Sören Brinkmann
2013-07-30 22:23                         ` Sören Brinkmann
2013-07-30 22:34                     ` Sören Brinkmann
2013-07-30 22:34                       ` Sören Brinkmann
2013-07-31  7:27                       ` Daniel Lezcano
2013-07-31  7:27                         ` Daniel Lezcano
2013-07-31 16:26                         ` Sören Brinkmann
2013-07-31 16:26                           ` Sören Brinkmann
2013-07-31  9:34                       ` Daniel Lezcano
2013-07-31  9:34                         ` Daniel Lezcano
2013-07-31 16:43                         ` Sören Brinkmann
2013-07-31 16:43                           ` Sören Brinkmann
2013-07-31 20:49                       ` Daniel Lezcano
2013-07-31 20:49                         ` Daniel Lezcano
2013-07-31 20:58                         ` Sören Brinkmann
2013-07-31 20:58                           ` Sören Brinkmann
2013-07-31 21:08                           ` Daniel Lezcano
2013-07-31 21:08                             ` Daniel Lezcano
2013-07-31 22:18                             ` Sören Brinkmann
2013-07-31 22:18                               ` Sören Brinkmann
2013-07-31 23:01                               ` Daniel Lezcano
2013-07-31 23:01                                 ` Daniel Lezcano
2013-07-31 23:38                                 ` Sören Brinkmann
2013-07-31 23:38                                   ` Sören Brinkmann
2013-08-01 17:29                                   ` Daniel Lezcano
2013-08-01 17:29                                     ` Daniel Lezcano
2013-08-01 17:43                                     ` Sören Brinkmann
2013-08-01 17:43                                       ` Sören Brinkmann
2013-08-01 17:48                                       ` Daniel Lezcano
2013-08-01 17:48                                         ` Daniel Lezcano
2013-08-06  1:28                                         ` Sören Brinkmann
2013-08-06  1:28                                           ` Sören Brinkmann
2013-08-06  8:46                                           ` Daniel Lezcano
2013-08-06  8:46                                             ` Daniel Lezcano
2013-08-06  9:18                                             ` Michal Simek
2013-08-06  9:18                                               ` Michal Simek
2013-08-06 12:30                                               ` Daniel Lezcano
2013-08-06 12:30                                                 ` Daniel Lezcano
2013-08-06 12:41                                                 ` Michal Simek
2013-08-06 12:41                                                   ` Michal Simek
2013-08-06 13:08                                                   ` Daniel Lezcano
2013-08-06 13:08                                                     ` Daniel Lezcano
2013-08-06 13:18                                                     ` Michal Simek
2013-08-06 13:18                                                       ` Michal Simek
2013-08-06 16:09                                                       ` Daniel Lezcano
2013-08-06 16:09                                                         ` Daniel Lezcano
2013-08-06 16:13                                                         ` Sören Brinkmann
2013-08-06 16:13                                                           ` Sören Brinkmann
2013-08-06 16:25                                                         ` Sören Brinkmann
2013-08-06 16:25                                                           ` Sören Brinkmann
2013-08-06 16:18                                             ` Sören Brinkmann
2013-08-06 16:18                                               ` Sören Brinkmann
2013-08-08 17:11                                         ` Sören Brinkmann
2013-08-08 17:11                                           ` Sören Brinkmann
2013-08-08 17:16                                           ` Mark Rutland
2013-08-08 17:16                                             ` Mark Rutland
2013-08-08 17:22                                             ` Stephen Boyd
2013-08-08 17:22                                               ` Stephen Boyd
2013-08-09 10:58                                               ` Mark Rutland
2013-08-09 10:58                                                 ` Mark Rutland
2013-08-08 17:22                                             ` Sören Brinkmann
2013-08-08 17:22                                               ` Sören Brinkmann
2013-08-09 10:32                                           ` Srinivas KANDAGATLA
2013-08-09 10:32                                             ` Srinivas KANDAGATLA
2013-08-09 14:19                                             ` Daniel Lezcano
2013-08-09 14:19                                               ` Daniel Lezcano
2013-08-09 17:27                                               ` Stephen Boyd
2013-08-09 17:27                                                 ` Stephen Boyd
2013-08-09 17:48                                                 ` Sören Brinkmann
2013-08-09 17:48                                                   ` Sören Brinkmann
2013-08-09 18:45                                                   ` Stephen Boyd
2013-08-09 18:45                                                     ` Stephen Boyd
2013-08-12 10:53                                                 ` Daniel Lezcano
2013-08-12 10:53                                                   ` Daniel Lezcano
2013-08-12 16:23                                                   ` Stephen Boyd
2013-08-12 16:23                                                     ` Stephen Boyd
2013-08-12 16:53                                                     ` Daniel Lezcano
2013-08-12 16:53                                                       ` Daniel Lezcano
2013-08-12 16:03                                                 ` Sören Brinkmann
2013-08-12 16:08                                                   ` Daniel Lezcano
2013-08-12 16:17                                                     ` Sören Brinkmann
2013-08-12 16:20                                                   ` Stephen Boyd
2013-08-12 16:24                                                     ` Sören Brinkmann
2013-08-12 16:40                                                       ` Stephen Boyd
2013-08-12 16:43                                                         ` Sören Brinkmann
2013-08-12 16:32                                                     ` Sören Brinkmann
2013-08-12 16:49                                                       ` Daniel Lezcano
2013-08-12 16:53                                                         ` Sören Brinkmann
2013-08-12 17:02                                                           ` Daniel Lezcano
2013-08-16 17:28                                                             ` Sören Brinkmann
2013-08-19 23:00                                                               ` Stephen Boyd
2013-08-19 23:30                                                                 ` Sören Brinkmann
2013-08-20  0:57                                                                   ` Stephen Boyd
2013-08-20 15:13                                                                     ` Daniel Lezcano
2013-08-22 17:06                                                                       ` [PATCH 1/2] tick: broadcast: Deny per-cpu clockevents from being broadcast sources Stephen Boyd
2013-08-22 17:06                                                                         ` [PATCH 2/2] clockevents: Prefer clockevents that don't suffer from FEAT_C3_STOP Stephen Boyd
2013-08-22 17:33                                                                           ` Santosh Shilimkar
2013-08-22 17:40                                                                             ` Stephen Boyd
2013-08-22 17:48                                                                               ` Santosh Shilimkar
2013-08-22 18:31                                                                                 ` Stephen Boyd
2013-08-22 21:06                                                                                   ` Santosh Shilimkar
2013-08-22 23:38                                                                         ` [PATCH 1/2] tick: broadcast: Deny per-cpu clockevents from being broadcast sources Sören Brinkmann
2013-09-05 16:53                                                                         ` Sören Brinkmann
2013-08-09 16:03                                             ` Enable arm_global_timer for Zynq brakes boot Sören Brinkmann
