linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [BUG] ARM Architected timers appear broken in 3.9-rc1
@ 2013-03-07 11:04 Jon Medhurst (Tixy)
  2013-03-07 15:26 ` Mark Rutland
  0 siblings, 1 reply; 4+ messages in thread
From: Jon Medhurst (Tixy) @ 2013-03-07 11:04 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark

When booting Versatile Express TC2 I am getting a null pointer
dereference which appears to be related to the recent changes in
architected timer support. 

The interesting part of the call stack is below (the full log is
attached). It shows that whilst entering CPU idle the code is calling
the NULL set_next_event function on the dummy timer setup by
broadcast_timer_setup.

I've also attached the kernel config I'm using and the code is at
git://git.linaro.org/people/tixy/kernel.git branch arch-timer-bug.

[    5.250325] PC is at 0x0
[    5.250336] LR is at clockevents_program_event+0x99/0xf4
[...]
[    5.250448] [<c004e1d5>] (clockevents_program_event+0x99/0xf4) from [<c004e8a9>] (tick_broadcast_set_event+0x35/0x3c)
[    5.250459] [<c004e8a9>] (tick_broadcast_set_event+0x35/0x3c) from [<c004ee25>] (tick_broadcast_oneshot_control+0x9d/0xd8)
[    5.250468] [<c004ee25>] (tick_broadcast_oneshot_control+0x9d/0xd8) from [<c004e72f>] (tick_notify+0x18b/0x2d0)
[    5.250477] [<c004e72f>] (tick_notify+0x18b/0x2d0) from [<c0036679>] (notifier_call_chain+0x45/0x54)
[    5.250485] [<c0036679>] (notifier_call_chain+0x45/0x54) from [<c0036713>] (raw_notifier_call_chain+0x17/0x1c)
[    5.250493] [<c0036713>] (raw_notifier_call_chain+0x17/0x1c) from [<c004de3d>] (clockevents_notify+0x25/0x10c)
[    5.250503] [<c004de3d>] (clockevents_notify+0x25/0x10c) from [<c02ef51b>] (bl_enter_powerdown+0x2b/0x9c)
[    5.250513] [<c02ef51b>] (bl_enter_powerdown+0x2b/0x9c) from [<c02edec5>] (cpuidle_enter+0xd/0x10)
[    5.250523] [<c02edec5>] (cpuidle_enter+0xd/0x10) from [<c02ee1e9>] (cpuidle_enter_state+0x11/0x50)
[    5.250532] [<c02ee1e9>] (cpuidle_enter_state+0x11/0x50) from [<c02ee293>] (cpuidle_idle_call+0x6b/0x10c)
[    5.250543] [<c02ee293>] (cpuidle_idle_call+0x6b/0x10c) from [<c000da41>] (cpu_idle+0x3d/0xd8)
[    5.250550] [<c000da41>] (cpu_idle+0x3d/0xd8) from [<803d90c9>] (0x803d90c9)

-- 
Tixy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.tar.gz
Type: application/x-compressed-tar
Size: 13863 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130307/f967ba01/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: arch-timer-bug.log.tar.gz
Type: application/x-compressed-tar
Size: 7807 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130307/f967ba01/attachment-0003.bin>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [BUG] ARM Architected timers appear broken in 3.9-rc1
  2013-03-07 11:04 [BUG] ARM Architected timers appear broken in 3.9-rc1 Jon Medhurst (Tixy)
@ 2013-03-07 15:26 ` Mark Rutland
  2013-03-07 16:11   ` Thomas Gleixner
  2013-03-07 17:54   ` Jon Medhurst (Tixy)
  0 siblings, 2 replies; 4+ messages in thread
From: Mark Rutland @ 2013-03-07 15:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Mar 07, 2013 at 11:04:49AM +0000, Jon Medhurst (Tixy) wrote:
> Hi Mark

Hi,

> 
> When booting Versatile Express TC2 I am getting a null pointer
> dereference which appears to be related to the recent changes in
> architected timer support. 
> 
> The interesting part of the call stack is below (the full log is
> attached). It shows that whilst entering CPU idle the code is calling
> the NULL set_next_event function on the dummy timer setup by
> broadcast_timer_setup.

As you say, we're calling the NULL set_next_event on the dummy timer,
because the dummy is registered as the broadcast device, which makes no
sense (as that means it's responsible for broadcasting to itself).

As far as I can see, the issue is that the dummy timer has a higher
rating (400) than the sp804 (300) that would otherwise be used as the
broadcast source, and tick_check_broadcast_device does not reject clocks
with CLOCK_EVT_FEAT_DUMMY (which should *never* be used as a broadcast
source). 

Giving the dummy timer a lower rating than the sp804 would prevent this,
but it might make more sense to do something like the below and
explicitly prevent dummy timers from being registered as broadcast
sources (this seems to work for me with your kernel, though I hit an
unrelated BUG() later due to bL_platform_power_ops *power_ops not being
set).

Thomas, what do you think?

Thanks,
Mark.
---->8----
>From 7432019cdfea9b2808e3b04489cd71a89266ca8f Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland@arm.com>
Date: Thu, 7 Mar 2013 15:09:24 +0000
Subject: [PATCH] clockevents: don't allow dummy broadcast timers

Currently tick_check_broadcast_device doesn't reject clock_event_devices
with CLOCK_EVT_FEAT_DUMMY, and may select them in preference to real
hardware if they have a higher rating value. In this situation, the
dummy timer is responsible for broadcasting to itself, and the core
clockevents code may attempt to call non-existent callbacks for
programming the dummy, eventually leading to a panic.

This patch makes tick_check_broadcast_device always reject dummy timers,
preventing this problem.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
---
 kernel/time/tick-broadcast.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 2fb8cb8..7f32fe0 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -67,7 +67,8 @@ static void tick_broadcast_start_periodic(struct clock_event_device *bc)
  */
 int tick_check_broadcast_device(struct clock_event_device *dev)
 {
-	if ((tick_broadcast_device.evtdev &&
+	if ((dev->features & CLOCK_EVT_FEAT_DUMMY) ||
+	    (tick_broadcast_device.evtdev &&
 	     tick_broadcast_device.evtdev->rating >= dev->rating) ||
 	     (dev->features & CLOCK_EVT_FEAT_C3STOP))
 		return 0;
-- 
1.8.1.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [BUG] ARM Architected timers appear broken in 3.9-rc1
  2013-03-07 15:26 ` Mark Rutland
@ 2013-03-07 16:11   ` Thomas Gleixner
  2013-03-07 17:54   ` Jon Medhurst (Tixy)
  1 sibling, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2013-03-07 16:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 7 Mar 2013, Mark Rutland wrote:
> From: Mark Rutland <mark.rutland@arm.com>
> Date: Thu, 7 Mar 2013 15:09:24 +0000
> Subject: [PATCH] clockevents: don't allow dummy broadcast timers
> 
> Currently tick_check_broadcast_device doesn't reject clock_event_devices
> with CLOCK_EVT_FEAT_DUMMY, and may select them in preference to real
> hardware if they have a higher rating value. In this situation, the
> dummy timer is responsible for broadcasting to itself, and the core
> clockevents code may attempt to call non-existent callbacks for
> programming the dummy, eventually leading to a panic.

Sigh, yes. I so wish, that this whole broadcast business would go
away.
 
> This patch makes tick_check_broadcast_device always reject dummy timers,
> preventing this problem.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> ---
>  kernel/time/tick-broadcast.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index 2fb8cb8..7f32fe0 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -67,7 +67,8 @@ static void tick_broadcast_start_periodic(struct clock_event_device *bc)
>   */
>  int tick_check_broadcast_device(struct clock_event_device *dev)
>  {
> -	if ((tick_broadcast_device.evtdev &&
> +	if ((dev->features & CLOCK_EVT_FEAT_DUMMY) ||
> +	    (tick_broadcast_device.evtdev &&
>  	     tick_broadcast_device.evtdev->rating >= dev->rating) ||
>  	     (dev->features & CLOCK_EVT_FEAT_C3STOP))
>  		return 0;
> -- 
> 1.8.1.1
> 
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [BUG] ARM Architected timers appear broken in 3.9-rc1
  2013-03-07 15:26 ` Mark Rutland
  2013-03-07 16:11   ` Thomas Gleixner
@ 2013-03-07 17:54   ` Jon Medhurst (Tixy)
  1 sibling, 0 replies; 4+ messages in thread
From: Jon Medhurst (Tixy) @ 2013-03-07 17:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2013-03-07 at 15:26 +0000, Mark Rutland wrote:
> On Thu, Mar 07, 2013 at 11:04:49AM +0000, Jon Medhurst (Tixy) wrote: 
> > When booting Versatile Express TC2 I am getting a null pointer
> > dereference which appears to be related to the recent changes in
> > architected timer support. 
> > 
> > The interesting part of the call stack is below (the full log is
> > attached). It shows that whilst entering CPU idle the code is calling
> > the NULL set_next_event function on the dummy timer setup by
> > broadcast_timer_setup.
> 
> As you say, we're calling the NULL set_next_event on the dummy timer,
> because the dummy is registered as the broadcast device, which makes no
> sense (as that means it's responsible for broadcasting to itself).
> 
> As far as I can see, the issue is that the dummy timer has a higher
> rating (400) than the sp804 (300) that would otherwise be used as the
> broadcast source, and tick_check_broadcast_device does not reject clocks
> with CLOCK_EVT_FEAT_DUMMY (which should *never* be used as a broadcast
> source). 
> 
> Giving the dummy timer a lower rating than the sp804 would prevent this,
> but it might make more sense to do something like the below and
> explicitly prevent dummy timers from being registered as broadcast
> sources (this seems to work for me with your kernel, though I hit an
> unrelated BUG() later due to bL_platform_power_ops *power_ops not being
> set).

The patch works for me too and I can boot successfully to a command
prompt. I think the BUG you are hitting could be due to different
device-trees, I'm using vexpress-v2p-ca15-tc2.dts in the branch I
pointed you to.

Thanks Mark for so quickly looking into this.

Tixy.

> ---->8----
> From 7432019cdfea9b2808e3b04489cd71a89266ca8f Mon Sep 17 00:00:00 2001
> From: Mark Rutland <mark.rutland@arm.com>
> Date: Thu, 7 Mar 2013 15:09:24 +0000
> Subject: [PATCH] clockevents: don't allow dummy broadcast timers
> 
> Currently tick_check_broadcast_device doesn't reject clock_event_devices
> with CLOCK_EVT_FEAT_DUMMY, and may select them in preference to real
> hardware if they have a higher rating value. In this situation, the
> dummy timer is responsible for broadcasting to itself, and the core
> clockevents code may attempt to call non-existent callbacks for
> programming the dummy, eventually leading to a panic.
> 
> This patch makes tick_check_broadcast_device always reject dummy timers,
> preventing this problem.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> ---
>  kernel/time/tick-broadcast.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index 2fb8cb8..7f32fe0 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -67,7 +67,8 @@ static void tick_broadcast_start_periodic(struct clock_event_device *bc)
>   */
>  int tick_check_broadcast_device(struct clock_event_device *dev)
>  {
> -	if ((tick_broadcast_device.evtdev &&
> +	if ((dev->features & CLOCK_EVT_FEAT_DUMMY) ||
> +	    (tick_broadcast_device.evtdev &&
>  	     tick_broadcast_device.evtdev->rating >= dev->rating) ||
>  	     (dev->features & CLOCK_EVT_FEAT_C3STOP))
>  		return 0;

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2013-03-07 17:54 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-07 11:04 [BUG] ARM Architected timers appear broken in 3.9-rc1 Jon Medhurst (Tixy)
2013-03-07 15:26 ` Mark Rutland
2013-03-07 16:11   ` Thomas Gleixner
2013-03-07 17:54   ` Jon Medhurst (Tixy)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).