* MX28 poweroff issue
@ 2012-07-03 22:30 Marek Vasut
  2012-07-03 22:46 ` Russell King - ARM Linux
  2012-07-04  1:10 ` Fabio Estevam
  0 siblings, 2 replies; 25+ messages in thread
From: Marek Vasut @ 2012-07-03 22:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

I recently confirmed this issue when doing "poweroff". The issue was initially 
reported by Detlev Zundel.

The console output (trimmed) can be seen below:

Unmounting remote filesystems...
Stopping portmap daemon...
Deactivating swap...
Unmounting local filesystems...
[   59.840000] System halted.
[   84.100000] BUG: soft lockup - CPU#0 stuck for 23s! [halt:584]
[   84.100000] Modules linked in:
[   84.100000] irq event stamp: 27684
[   84.100000] hardirqs last  enabled at (27683): [<c000e758>] __irq_svc+0x58/0x60
[   84.100000] hardirqs last disabled at (27684): [<c000e734>] __irq_svc+0x34/0x60
[   84.100000] softirqs last  enabled at (27682): [<c002332c>] irq_exit+0x8c/0x94
[   84.100000] softirqs last disabled at (27675): [<c002332c>] irq_exit+0x8c/0x94
[   84.100000] 
[   84.100000] Pid: 584, comm:                 halt
[   84.100000] CPU: 0    Not tainted  (3.5.0-rc5-next-20120703-00028-g0ec2c3e-dirty #1578)
[   84.100000] PC is at machine_halt+0x0/0x4
[   84.100000] LR is at sys_reboot+0x160/0x1d0
[   84.100000] pc : [<c000fd08>]    lr : [<c0031c88>]    psr: 60000013
[   84.100000] sp : c7745e90  ip : c06a68c4  fp : 00000000
[   84.100000] r10: 00000000  r9 : c7744000  r8 : c000ec88
[   84.100000] r7 : 00000001  r6 : c7744000  r5 : 4321fedc  r4 : 4321fedc
[   84.100000] r3 : 22222222  r2 : 00000000  r1 : c77e9318  r0 : 00000005
[   84.100000] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   84.100000] Control: 0005317f  Table: 46474000  DAC: 00000015
[   84.100000] [<c001378c>] (unwind_backtrace+0x0/0xf0) from [<c0070cd4>] (watchdog_timer_fn+0x114/0x14c)
[   84.100000] [<c0070cd4>] (watchdog_timer_fn+0x114/0x14c) from [<c004052c>] (__run_hrtimer+0x7c/0x1ec)
[   84.100000] [<c004052c>] (__run_hrtimer+0x7c/0x1ec) from [<c0040e6c>] (hrtimer_interrupt+0xf8/0x280)
[   84.100000] [<c0040e6c>] (hrtimer_interrupt+0xf8/0x280) from [<c0017f28>] (mxs_timer_interrupt+0x1c/0x28)
[   84.100000] [<c0017f28>] (mxs_timer_interrupt+0x1c/0x28) from [<c00714d0>] (handle_irq_event_percpu+0x5c/0x274)
[   84.100000] [<c00714d0>] (handle_irq_event_percpu+0x5c/0x274) from [<c0071724>] (handle_irq_event+0x3c/0x5c)
[   84.100000] [<c0071724>] (handle_irq_event+0x3c/0x5c) from [<c0073930>] (handle_level_irq+0x8c/0xe8)
[   84.100000] [<c0073930>] (handle_level_irq+0x8c/0xe8) from [<c0070e38>] (generic_handle_irq+0x28/0x3c)
[   84.100000] [<c0070e38>] (generic_handle_irq+0x28/0x3c) from [<c000f9ac>] (handle_IRQ+0x30/0x84)
[   84.100000] [<c000f9ac>] (handle_IRQ+0x30/0x84) from [<c000e738>] (__irq_svc+0x38/0x60)
[   84.100000] [<c000e738>] (__irq_svc+0x38/0x60) from [<c000fd08>] (machine_halt+0x0/0x4)
[   84.100000] [<c000fd08>] (machine_halt+0x0/0x4) from [<c0031c88>] (sys_reboot+0x160/0x1d0)
[   84.100000] [<c0031c88>] (sys_reboot+0x160/0x1d0) from [<c000eae0>] (ret_fast_syscall+0x0/0x38)

Best regards,
Marek Vasut

* MX28 poweroff issue
  2012-07-03 22:30 MX28 poweroff issue Marek Vasut
@ 2012-07-03 22:46 ` Russell King - ARM Linux
  2012-07-04  0:13   ` Marek Vasut
  2012-07-04  1:10 ` Fabio Estevam
  1 sibling, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2012-07-03 22:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 04, 2012 at 12:30:36AM +0200, Marek Vasut wrote:
> I recently confirmed this issue when doing "poweroff". The issue was
> initially reported by Detlev Zundel.

My guess is it's because we don't disable interrupts on halt - because
the kernel used to allow halt followed by another ctrl-alt-reboot to
cause the thing to reboot.

That's long since gone, so we might as well make the CPU spin endlessly
waiting for someone to pull the power on halt...

* MX28 poweroff issue
  2012-07-03 22:46 ` Russell King - ARM Linux
@ 2012-07-04  0:13   ` Marek Vasut
  0 siblings, 0 replies; 25+ messages in thread
From: Marek Vasut @ 2012-07-04  0:13 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Russell King - ARM Linux,

> On Wed, Jul 04, 2012 at 12:30:36AM +0200, Marek Vasut wrote:
> > I recently confirmed this issue when doing "poweroff". The issue was
> > initially reported by Detlev Zundel.
> 
> My guess is it's because we don't disable interrupts on halt - because
> the kernel used to allow halt followed by another ctrl-alt-reboot to
> cause the thing to reboot.

This I consider correct.

> That's long since gone, so we might as well make the CPU spin endlessly
> waiting for someone to pull the power on halt...

The backtrace indicates that the problem is caused by the timer interrupt. Maybe
struct sys_timer should be extended with an .exit callback to shut down the timer?

Well, the other option is to disable interrupts, indeed.

Russell, you're the wiser one here, so shall I submit a patch that disables
interrupts before spinning?

Best regards,
Marek Vasut

* MX28 poweroff issue
  2012-07-03 22:30 MX28 poweroff issue Marek Vasut
  2012-07-03 22:46 ` Russell King - ARM Linux
@ 2012-07-04  1:10 ` Fabio Estevam
  2012-07-04  2:12   ` Marek Vasut
  1 sibling, 1 reply; 25+ messages in thread
From: Fabio Estevam @ 2012-07-04  1:10 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Marek,

On Tue, Jul 3, 2012 at 7:30 PM, Marek Vasut <marex@denx.de> wrote:
> Hello,
>
> I recently confirmed this issue when doing "poweroff". The issue was initially
> reported by Detlev Zundel.
>
> The console output (trimmed) can be seen below:

Is this an ARM926EJS issue?

I just tried it on an mx31 and I don't see this issue:

root@freescale /$ poweroff
starting pid 723, tty '': '/etc/rc.d/rcS stop'
Stopping inetd:
Terminated
Unmounting filesystems
umount: tmpfs busy - remounted read-only
umount: devtmpfs busy - remounted read-only
chown: /home/user/.rhosts: Read-only file system
chown: /home/user: Read-only file system
chown: /home/user: Read-only file system
cat: can't open '/proc/devices': No such file or directory
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system poweroff
System halted.

Regards,

Fabio Estevam

* MX28 poweroff issue
  2012-07-04  1:10 ` Fabio Estevam
@ 2012-07-04  2:12   ` Marek Vasut
  2012-07-04  3:14     ` Fabio Estevam
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Vasut @ 2012-07-04  2:12 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Fabio Estevam,

> Hi Marek,
> 
> On Tue, Jul 3, 2012 at 7:30 PM, Marek Vasut <marex@denx.de> wrote:
> > Hello,
> > 
> > I recently confirmed this issue when doing "poweroff". The issue was
> > initially reported by Detlev Zundel.
> 
> > The console output (trimmed) can be seen below:
> Is this an ARM926EJS issue?
> 
> Just tried it on a mx31 and I don't get such issue:
> 
> root at freescale /$ poweroff
> starting pid 723, tty '': '/etc/rc.d/rcS stop'
> Stopping inetd:
> Terminated
> Unmounting filesystems
> umount: tmpfs busy - remounted read-only
> umount: devtmpfs busy - remounted read-only
> chown: /home/user/.rhosts: Read-only file system
> chown: /home/user: Read-only file system
> chown: /home/user: Read-only file system
> cat: can't open '/proc/devices': No such file or directory
> The system is going down NOW!
> Sent SIGTERM to all processes
> Sent SIGKILL to all processes
> Requesting system poweroff
> System halted.

Dunno, can you try on mx28 please?

> Regards,
> 
> Fabio Estevam

Best regards,
Marek Vasut

* MX28 poweroff issue
  2012-07-04  2:12   ` Marek Vasut
@ 2012-07-04  3:14     ` Fabio Estevam
  2012-07-04  3:47       ` Marek Vasut
  2012-07-04  7:38       ` Attila Kinali
  0 siblings, 2 replies; 25+ messages in thread
From: Fabio Estevam @ 2012-07-04  3:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 3, 2012 at 11:12 PM, Marek Vasut <marex@denx.de> wrote:

> Dunno, can you try on mx28 please?

I get the same crash as you reported.

On a mx27 (also ARM926EJS) I do not see this problem.

Regards,

Fabio Estevam

* MX28 poweroff issue
  2012-07-04  3:14     ` Fabio Estevam
@ 2012-07-04  3:47       ` Marek Vasut
  2012-07-04  7:07         ` Uwe Kleine-König
  2012-07-04  7:38       ` Attila Kinali
  1 sibling, 1 reply; 25+ messages in thread
From: Marek Vasut @ 2012-07-04  3:47 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Fabio Estevam,

> On Tue, Jul 3, 2012 at 11:12 PM, Marek Vasut <marex@denx.de> wrote:
> > Dunno, can you try on mx28 please?
> 
> I get the same crash as you reported.
> 
> On a mx27 (also ARM926EJS) I do not see this problem.

Thanks for verifying this, Fabio. The mx28 generates a timer interrupt after it
should be sitting in the endless loop, hence this crash. I think it's
because the mx28 (mxs et al.) is using the new clock framework (compared to mxc)
and the timer isn't deinitialized by then ...

Disabling interrupts might be just the right solution, though I'm not sure
whether it would hide bugs ...

> Regards,
> 
> Fabio Estevam

Best regards,
Marek Vasut

* MX28 poweroff issue
  2012-07-04  3:47       ` Marek Vasut
@ 2012-07-04  7:07         ` Uwe Kleine-König
  2012-07-04  8:20           ` Russell King - ARM Linux
  0 siblings, 1 reply; 25+ messages in thread
From: Uwe Kleine-König @ 2012-07-04  7:07 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Marek,

On Wed, Jul 04, 2012 at 05:47:36AM +0200, Marek Vasut wrote:
> Dear Fabio Estevam,
> 
> > On Tue, Jul 3, 2012 at 11:12 PM, Marek Vasut <marex@denx.de> wrote:
> > > Dunno, can you try on mx28 please?
> > 
> > I get the same crash as you reported.
> > 
> > On a mx27 (also ARM926EJS) I do not see this problem.
> 
> Thanks for verifying this, Fabio. The mx28 generates timer interrupt after it 
> should be sitting in the endless loop, therefore this crash. I think it's 
> because the mx28 (mxs at al) is using the new clock framework (compared to mxc) 
> and the timer isn't deinited by then ...
On an i.MX35 with v3.5-rc4 (which by now also uses the new clock framework,
but is an ARM11) I don't see this crash.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

* MX28 poweroff issue
  2012-07-04  3:14     ` Fabio Estevam
  2012-07-04  3:47       ` Marek Vasut
@ 2012-07-04  7:38       ` Attila Kinali
  2012-07-04 14:31         ` Marek Vasut
  1 sibling, 1 reply; 25+ messages in thread
From: Attila Kinali @ 2012-07-04  7:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 4 Jul 2012 00:14:12 -0300
Fabio Estevam <festevam@gmail.com> wrote:

> On Tue, Jul 3, 2012 at 11:12 PM, Marek Vasut <marex@denx.de> wrote:
> 
> > Dunno, can you try on mx28 please?
> 
> I get the same crash as you reported.
> 
> On a mx27 (also ARM926EJS) I do not see this problem.

I have the same issue on the mx23 (using 3.5-rc5):

Requesting system halt
[  139.330000] System halted.
[  164.080000] BUG: soft lockup - CPU#0 stuck for 22s! [init:61]
[  164.080000] Modules linked in:
[  164.080000] irq event stamp: 24804
[  164.080000] hardirqs last  enabled at (24803): [<c000ee58>] __irq_svc+0x58/0x60
[  164.080000] hardirqs last disabled at (24804): [<c000ee34>] __irq_svc+0x34/0x60
[  164.080000] softirqs last  enabled at (24802): [<c00244e4>] irq_exit+0x8c/0x94
[  164.080000] softirqs last disabled at (24795): [<c00244e4>] irq_exit+0x8c/0x94
[  164.080000] 
[  164.080000] Pid: 61, comm:                 init
[  164.080000] CPU: 0    Not tainted  (3.5.0-rc5-svn13642 #15)
[  164.080000] PC is at machine_halt+0x0/0x4
[  164.080000] LR is at sys_reboot+0x180/0x1dc
[  164.080000] pc : [<c00101ec>]    lr : [<c0032ed4>]    psr: 60000013
[  164.080000] sp : c39a9e90  ip : c39f0880  fp : 00000000
[  164.080000] r10: 00000000  r9 : c39a8000  r8 : c043eddc
[  164.080000] r7 : 00000008  r6 : 4321fedc  r5 : cdef0123  r4 : c39a8000
[  164.080000] r3 : 00000000  r2 : 00000000  r1 : c39f0b68  r0 : 00000005
[  164.080000] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  164.080000] Control: 0005317f  Table: 43984000  DAC: 00000015
[  164.080000] [<c0014990>] (unwind_backtrace+0x0/0xf4) from [<c00686a0>] (watchdog_timer_fn+0x130/0x16c)
[  164.080000] [<c00686a0>] (watchdog_timer_fn+0x130/0x16c) from [<c00400f8>] (__run_hrtimer+0x7c/0x1e4)
[  164.080000] [<c00400f8>] (__run_hrtimer+0x7c/0x1e4) from [<c00404d4>] (hrtimer_interrupt+0x108/0x2e4)
[  164.080000] [<c00404d4>] (hrtimer_interrupt+0x108/0x2e4) from [<c0018f4c>] (mxs_timer_interrupt+0x1c/0x28)
[  164.080000] [<c0018f4c>] (mxs_timer_interrupt+0x1c/0x28) from [<c0068dd8>] (handle_irq_event_percpu+0x5c/0x274)
[  164.080000] [<c0068dd8>] (handle_irq_event_percpu+0x5c/0x274) from [<c006902c>] (handle_irq_event+0x3c/0x5c)
[  164.080000] [<c006902c>] (handle_irq_event+0x3c/0x5c) from [<c006b5dc>] (handle_level_irq+0x8c/0x118)
[  164.080000] [<c006b5dc>] (handle_level_irq+0x8c/0x118) from [<c00689a4>] (generic_handle_irq+0x34/0x40)
[  164.080000] [<c00689a4>] (generic_handle_irq+0x34/0x40) from [<c00100dc>] (handle_IRQ+0x30/0x84)
[  164.080000] [<c00100dc>] (handle_IRQ+0x30/0x84) from [<c000ee38>] (__irq_svc+0x38/0x60)
[  164.080000] [<c000ee38>] (__irq_svc+0x38/0x60) from [<c00101ec>] (machine_halt+0x0/0x4)
[  164.080000] [<c00101ec>] (machine_halt+0x0/0x4) from [<c0032ed4>] (sys_reboot+0x180/0x1dc)
[  164.080000] [<c0032ed4>] (sys_reboot+0x180/0x1dc) from [<c000f1e0>] (ret_fast_syscall+0x0/0x38)

			Attila Kinali

-- 
It is upon moral qualities that a society is ultimately founded. All 
the prosperity and technological sophistication in the world is of no 
use without that foundation.
                 -- Miss Matheson, The Diamond Age, Neil Stephenson

* MX28 poweroff issue
  2012-07-04  7:07         ` Uwe Kleine-König
@ 2012-07-04  8:20           ` Russell King - ARM Linux
  2012-07-04  8:45             ` Uwe Kleine-König
  0 siblings, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2012-07-04  8:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 04, 2012 at 09:07:51AM +0200, Uwe Kleine-König wrote:
> Hi Marek,
> 
> On Wed, Jul 04, 2012 at 05:47:36AM +0200, Marek Vasut wrote:
> > Dear Fabio Estevam,
> > 
> > > On Tue, Jul 3, 2012 at 11:12 PM, Marek Vasut <marex@denx.de> wrote:
> > > > Dunno, can you try on mx28 please?
> > > 
> > > I get the same crash as you reported.
> > > 
> > > On a mx27 (also ARM926EJS) I do not see this problem.
> > 
> > Thanks for verifying this, Fabio. The mx28 generates timer interrupt after it 
> > should be sitting in the endless loop, therefore this crash. I think it's 
> > because the mx28 (mxs at al) is using the new clock framework (compared to mxc) 
> > and the timer isn't deinited by then ...
> On an i.MX35 with v3.5-rc4 (which is using the new clock framework in
> the meantime, but is an ARM11) I don't see this crash.

Are those of you who aren't seeing it waiting around 30 seconds after "System halted"?

* MX28 poweroff issue
  2012-07-04  8:20           ` Russell King - ARM Linux
@ 2012-07-04  8:45             ` Uwe Kleine-König
  2012-07-04 14:30               ` Marek Vasut
  0 siblings, 1 reply; 25+ messages in thread
From: Uwe Kleine-König @ 2012-07-04  8:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 04, 2012 at 09:20:49AM +0100, Russell King - ARM Linux wrote:
> On Wed, Jul 04, 2012 at 09:07:51AM +0200, Uwe Kleine-K?nig wrote:
> > Hi Marek,
> > 
> > On Wed, Jul 04, 2012 at 05:47:36AM +0200, Marek Vasut wrote:
> > > Dear Fabio Estevam,
> > > 
> > > > On Tue, Jul 3, 2012 at 11:12 PM, Marek Vasut <marex@denx.de> wrote:
> > > > > Dunno, can you try on mx28 please?
> > > > 
> > > > I get the same crash as you reported.
> > > > 
> > > > On a mx27 (also ARM926EJS) I do not see this problem.
> > > 
> > > Thanks for verifying this, Fabio. The mx28 generates timer interrupt after it 
> > > should be sitting in the endless loop, therefore this crash. I think it's 
> > > because the mx28 (mxs at al) is using the new clock framework (compared to mxc) 
> > > and the timer isn't deinited by then ...
> > On an i.MX35 with v3.5-rc4 (which is using the new clock framework in
> > the meantime, but is an ARM11) I don't see this crash.
> 
> Are you who aren't seeing it waiting around 30 seconds after "System halted" ?
I waited longer and had

	CONFIG_DETECT_HUNG_TASK=y
	CONFIG_LOCKUP_DETECTOR=y

in my .config.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

* MX28 poweroff issue
  2012-07-04  8:45             ` Uwe Kleine-König
@ 2012-07-04 14:30               ` Marek Vasut
  0 siblings, 0 replies; 25+ messages in thread
From: Marek Vasut @ 2012-07-04 14:30 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Uwe Kleine-König,

> On Wed, Jul 04, 2012 at 09:20:49AM +0100, Russell King - ARM Linux wrote:
> > On Wed, Jul 04, 2012 at 09:07:51AM +0200, Uwe Kleine-K?nig wrote:
> > > Hi Marek,
> > > 
> > > On Wed, Jul 04, 2012 at 05:47:36AM +0200, Marek Vasut wrote:
> > > > Dear Fabio Estevam,
> > > > 
> > > > > On Tue, Jul 3, 2012 at 11:12 PM, Marek Vasut <marex@denx.de> wrote:
> > > > > > Dunno, can you try on mx28 please?
> > > > > 
> > > > > I get the same crash as you reported.
> > > > > 
> > > > > On a mx27 (also ARM926EJS) I do not see this problem.
> > > > 
> > > > Thanks for verifying this, Fabio. The mx28 generates timer interrupt
> > > > after it should be sitting in the endless loop, therefore this
> > > > crash. I think it's because the mx28 (mxs at al) is using the new
> > > > clock framework (compared to mxc) and the timer isn't deinited by
> > > > then ...
> > > 
> > > On an i.MX35 with v3.5-rc4 (which is using the new clock framework in
> > > the meantime, but is an ARM11) I don't see this crash.
> > 
> > Are you who aren't seeing it waiting around 30 seconds after "System
> > halted" ?
> 
> I waited longer and had
> 
> 	CONFIG_DETECT_HUNG_TASK=y
> 	CONFIG_LOCKUP_DETECTOR=y
> 
> in my .config.

Good, so we can narrow this down to being an mxs-specific bug, I guess?

> Best regards
> Uwe

Best regards,
Marek Vasut

* MX28 poweroff issue
  2012-07-04  7:38       ` Attila Kinali
@ 2012-07-04 14:31         ` Marek Vasut
  2012-07-04 14:53           ` Shawn Guo
  2012-07-04 15:19           ` Russell King - ARM Linux
  0 siblings, 2 replies; 25+ messages in thread
From: Marek Vasut @ 2012-07-04 14:31 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Attila Kinali,

> On Wed, 4 Jul 2012 00:14:12 -0300
> 
> Fabio Estevam <festevam@gmail.com> wrote:
> > On Tue, Jul 3, 2012 at 11:12 PM, Marek Vasut <marex@denx.de> wrote:
> > > Dunno, can you try on mx28 please?
> > 
> > I get the same crash as you reported.
> > 
> > On a mx27 (also ARM926EJS) I do not see this problem.
> 
> I have the same issue on the mx23 (using 3.5-rc5):

Thank you, all of you who tested it on random devices and pointed out stuff :-)

Shawn, do you have any suggestions on how to proceed? Shall we use Russell's
approach?

Best regards,
Marek Vasut

* MX28 poweroff issue
  2012-07-04 14:31         ` Marek Vasut
@ 2012-07-04 14:53           ` Shawn Guo
  2012-07-04 15:19           ` Russell King - ARM Linux
  1 sibling, 0 replies; 25+ messages in thread
From: Shawn Guo @ 2012-07-04 14:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 04, 2012 at 04:31:29PM +0200, Marek Vasut wrote:
> Shawn, do you have any suggestions on how to proceed? Shall we use Russells 
> approach?
> 
Yeah, he's the best person to suggest a solution :)

-- 
Regards,
Shawn

* MX28 poweroff issue
  2012-07-04 14:31         ` Marek Vasut
  2012-07-04 14:53           ` Shawn Guo
@ 2012-07-04 15:19           ` Russell King - ARM Linux
  2012-07-05 16:08             ` Shawn Guo
  1 sibling, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2012-07-04 15:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 04, 2012 at 04:31:29PM +0200, Marek Vasut wrote:
> Dear Attila Kinali,
> 
> > On Wed, 4 Jul 2012 00:14:12 -0300
> > 
> > Fabio Estevam <festevam@gmail.com> wrote:
> > > On Tue, Jul 3, 2012 at 11:12 PM, Marek Vasut <marex@denx.de> wrote:
> > > > Dunno, can you try on mx28 please?
> > > 
> > > I get the same crash as you reported.
> > > 
> > > On a mx27 (also ARM926EJS) I do not see this problem.
> > 
> > I have the same issue on the mx23 (using 3.5-rc5):
> 
> Thank you, all of you who tested it on random devices and pointed out stuff :-)
> 
> Shawn, do you have any suggestions on how to proceed? Shall we use Russells 
> approach?

If it's specific to mx28 and mx23 and nothing else, the cause needs to
be found.  Maybe we need it tested on other (non-MX) platforms too?

* MX28 poweroff issue
  2012-07-04 15:19           ` Russell King - ARM Linux
@ 2012-07-05 16:08             ` Shawn Guo
  2012-07-05 16:23               ` Marek Vasut
  2012-07-05 20:10               ` Russell King - ARM Linux
  0 siblings, 2 replies; 25+ messages in thread
From: Shawn Guo @ 2012-07-05 16:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 04, 2012 at 04:19:44PM +0100, Russell King - ARM Linux wrote:
> On Wed, Jul 04, 2012 at 04:31:29PM +0200, Marek Vasut wrote:
> > Dear Attila Kinali,
> > 
> > > On Wed, 4 Jul 2012 00:14:12 -0300
> > > 
> > > Fabio Estevam <festevam@gmail.com> wrote:
> > > > On a mx27 (also ARM926EJS) I do not see this problem.
> > > 
> > > I have the same issue on the mx23 (using 3.5-rc5):
> > 
> > Thank you, all of you who tested it on random devices and pointed out stuff :-)
> > 
> > Shawn, do you have any suggestions on how to proceed? Shall we use Russells 
> > approach?
> 
> If it's specific to mx28 and mx23 and nothing else, the cause needs to
> be found.  Maybe we need it tested on other (non-MX) platforms too?

Though people reported that imx27 does not have the problem, I'm not
so sure that it's a mach-mxs (mx23 and mx28) specific issue.  I have
not figured out why imx27 does not run into it, but I have some
findings here.

Let's look at the dump again.

[   59.840000] System halted.
[   84.100000] BUG: soft lockup - CPU#0 stuck for 23s! [halt:584]
...
[   84.100000] [<c0070cd4>] (watchdog_timer_fn+0x114/0x14c) from [<c004052c>] (__run_hrtimer+0x7c/0x1ec)

The issue is eventually reported in the function watchdog_timer_fn
(kernel/watchdog.c):

/* watchdog kicker functions */
static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
{
	...

        /* check for a softlockup
         * This is done by making sure a high priority task is
         * being scheduled.  The task touches the watchdog to
         * indicate it is getting cpu time.  If it hasn't then
         * this is a good indication some task is hogging the cpu
         */
        duration = is_softlockup(touch_ts);
        if (unlikely(duration)) {
                /*
                 * If a virtual machine is stopped by the host it can look to
                 * the watchdog like a soft lockup, check to see if the host
                 * stopped the vm before we issue the warning
                 */
                if (kvm_check_and_clear_guest_paused())
                        return HRTIMER_RESTART;

                /* only warn once */
                if (__this_cpu_read(soft_watchdog_warn) == true)
                        return HRTIMER_RESTART;

                printk(KERN_EMERG "BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
                        smp_processor_id(), duration,
                        current->comm, task_pid_nr(current));

	...
}

As Russell already said, interrupts are not disabled on halt.  The
mxs_timer_irq handler eventually triggers this watchdog, as the cpu
gets stuck on that while(1) on halt.
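
To make the mechanism concrete, here is a minimal user-space sketch of the
check, not the kernel's implementation. The names model_is_softlockup(),
first_report() and the 20-second MODEL_THRESH_S are assumptions made for
illustration; the kernel derives its threshold from its own configuration.
The sketch only shows the logic: a periodic timer callback compares the
current time with the timestamp the watchdog thread last touched, and
reports once the gap exceeds the threshold.

```c
#include <assert.h>

/* Hypothetical threshold in seconds, chosen for illustration only. */
#define MODEL_THRESH_S 20UL

/*
 * Return how long the watchdog thread has been starved (in seconds),
 * or 0 if it touched its timestamp recently enough.
 */
static unsigned long model_is_softlockup(unsigned long now,
					 unsigned long touch_ts)
{
	if (now > touch_ts + MODEL_THRESH_S)
		return now - touch_ts;
	return 0;
}

/*
 * Simulate a timer that fires every `period` seconds while the CPU spins
 * and the watchdog thread, which last ran at `touch_ts`, never runs again.
 * Return the time of the first soft lockup report, or 0 if no report
 * happens up to and including `limit`.
 */
static unsigned long first_report(unsigned long touch_ts,
				  unsigned long period,
				  unsigned long limit)
{
	unsigned long now;

	assert(period > 0);

	for (now = touch_ts + period; now <= limit; now += period) {
		if (model_is_softlockup(now, touch_ts))
			return now;
	}
	return 0;
}
```

With the watchdog thread last running at t=60 and a timer firing every 4
seconds, first_report(60, 4, 200) returns 84 in this model: the first tick
more than 20 seconds after the last touch.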

I'm not sure if the change below is the right fix, but it does remove
the issue for mach-mxs.

Regards,
Shawn

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 19c95ea..e41edea 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -247,7 +247,8 @@ void machine_shutdown(void)
 void machine_halt(void)
 {
        machine_shutdown();
-       while (1);
+       while (1)
+               msleep(1);
 }

* MX28 poweroff issue
  2012-07-05 16:08             ` Shawn Guo
@ 2012-07-05 16:23               ` Marek Vasut
  2012-07-05 20:32                 ` Fabio Estevam
  2012-07-05 20:10               ` Russell King - ARM Linux
  1 sibling, 1 reply; 25+ messages in thread
From: Marek Vasut @ 2012-07-05 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Shawn Guo,

[...]

> diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
> index 19c95ea..e41edea 100644
> --- a/arch/arm/kernel/process.c
> +++ b/arch/arm/kernel/process.c
> @@ -247,7 +247,8 @@ void machine_shutdown(void)
>  void machine_halt(void)
>  {
>         machine_shutdown();
> -       while (1);
> +       while (1)
> +               msleep(1);

Won't mdelay() be better?

It removes the warning, but can't some other stray interrupt trigger it again?

>  }

Best regards,
Marek Vasut

* MX28 poweroff issue
  2012-07-05 16:08             ` Shawn Guo
  2012-07-05 16:23               ` Marek Vasut
@ 2012-07-05 20:10               ` Russell King - ARM Linux
  2012-07-06  4:49                 ` Shawn Guo
  2012-07-06  7:32                 ` Lothar Waßmann
  1 sibling, 2 replies; 25+ messages in thread
From: Russell King - ARM Linux @ 2012-07-05 20:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 06, 2012 at 12:08:58AM +0800, Shawn Guo wrote:
> On Wed, Jul 04, 2012 at 04:19:44PM +0100, Russell King - ARM Linux wrote:
> > If it's specific to mx28 and mx23 and nothing else, the cause needs to
> > be found.  Maybe we need it tested on other (non-MX) platforms too?
> 
> Though people reported that imx27 does not have the problem, I'm not
> so sure about it's a mach-mxs (mx23 and mx28) specific issue.  I have
> not figured it out why imx27 does not run into it, but I got some
> finding here.
> 
> Let's look at the dump again.
> 
> [   59.840000] System halted.
> [   84.100000] BUG: soft lockup - CPU#0 stuck for 23s! [halt:584]
> ...
> [   84.100000] [<c0070cd4>] (watchdog_timer_fn+0x114/0x14c) from [<c004052c>]
> (__run_hrtimer+0x7c/0x1ec)
> 
> It reports the issue eventually in function watchdog_timer_fn
> (kernel/watchdog.c):

Yes, the general idea is that if the timer is running, and the watchdog
is running, and it detects that its event thread doesn't occasionally
run, it will report a lockup.

As other platforms seem not to exhibit the problem when we halt and
endlessly spin with IRQs enabled, the question needs to be asked: what
is different with MX23/MX28, and why is it different?

Yes, we can mask the problem by disabling interrupts - just like x86
does - but that doesn't tell us why we have this apparent difference
in behaviours.  I think we need to understand that before we start
heading down the path of disabling interrupts to get rid of this
problem.

So, I think we need some analysis of what's going on here with platforms
that do _not_ exhibit the problem.  Eg, does the timer tick get shut
down?  Does NO_HZ have a bearing on it?  What about timer mode (periodic
vs one shot?)

* MX28 poweroff issue
  2012-07-05 16:23               ` Marek Vasut
@ 2012-07-05 20:32                 ` Fabio Estevam
  2012-07-05 20:59                   ` Fabio Estevam
  0 siblings, 1 reply; 25+ messages in thread
From: Fabio Estevam @ 2012-07-05 20:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Marek,

On Thu, Jul 5, 2012 at 1:23 PM, Marek Vasut <marex@denx.de> wrote:

> Won't mdelay() be better?
>
> It removes the warning, but can't some other stray interrupt trigger it again?

Could you please try the change below? I do not have access to my mx28
right now.

diff --git a/arch/arm/mach-mxs/timer.c b/arch/arm/mach-mxs/timer.c
index 02d36de..6c7f81e 100644
--- a/arch/arm/mach-mxs/timer.c
+++ b/arch/arm/mach-mxs/timer.c
@@ -149,6 +149,9 @@ static const char *clock_event_mode_label[] const = {
 static void mxs_set_mode(enum clock_event_mode mode,
 				struct clock_event_device *evt)
 {
+	unsigned long flags;
+
+	local_irq_save(flags);
 	/* Disable interrupt in timer module */
 	timrot_irq_disable();

@@ -173,13 +176,16 @@ static void mxs_set_mode(enum clock_event_mode mode,

 	/* Remember timer mode */
 	mxs_clockevent_mode = mode;
+	local_irq_restore(flags);

 	switch (mode) {
 	case CLOCK_EVT_MODE_PERIODIC:
 		pr_err("%s: Periodic mode is not implemented\n", __func__);
 		break;
 	case CLOCK_EVT_MODE_ONESHOT:
+		local_irq_save(flags);
 		timrot_irq_enable();
+		local_irq_restore(flags);
 		break;
 	case CLOCK_EVT_MODE_SHUTDOWN:
 	case CLOCK_EVT_MODE_UNUSED:
--

* MX28 poweroff issue
  2012-07-05 20:32                 ` Fabio Estevam
@ 2012-07-05 20:59                   ` Fabio Estevam
  0 siblings, 0 replies; 25+ messages in thread
From: Fabio Estevam @ 2012-07-05 20:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 5, 2012 at 5:32 PM, Fabio Estevam <festevam@gmail.com> wrote:
> Hi Marek,
>
> On Thu, Jul 5, 2012 at 1:23 PM, Marek Vasut <marex@denx.de> wrote:
>
>> Won't mdelay() be better?
>>
>> It removes the warning, but can't some other stray interrupt trigger it again?
>
> Could you please try the change below? I do not have access to my mx28
> right now.

Never mind. It still does not work. I will try again tomorrow.

* MX28 poweroff issue
  2012-07-05 20:10               ` Russell King - ARM Linux
@ 2012-07-06  4:49                 ` Shawn Guo
  2012-07-06  7:32                 ` Lothar Waßmann
  1 sibling, 0 replies; 25+ messages in thread
From: Shawn Guo @ 2012-07-06  4:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 05, 2012 at 09:10:02PM +0100, Russell King - ARM Linux wrote:
> Yes, the general idea is that if the timer is running, and the watchdog
> is running, and it detects that it's event thread doesn't occasionally
> run, it will report a lockup.
> 
> As other platforms seem to not exhibit the problem when we halt, and
> endlessly spin with IRQs enabled, the question needs to be asked: what
> is different with MX23/MX28 and why is it different.
> 
Ha, it turns out that the only reason is mxs_defconfig has
CONFIG_LOCKUP_DETECTOR enabled.

> Yes, we can mask the problem by disabling interrupts - just like x86
> does - but that doesn't tell us why we have this apparant difference
> in behaviours.  I think we need to understand that before we start
> heading down the path of disabling interrupts to get rid of this
> problem.
> 
> So, I think we need some analysis of what's going on here with platforms
> that do _not_ exhibit the problem.

I just reproduced the problem on imx53 with CONFIG_LOCKUP_DETECTOR
enabled in imx_v6_v7_defconfig.  I think if we enable the config for
imx_v4_v5_defconfig, we will see the problem on imx27.

Presumably, other platforms behave the same way.  So would you consider
the following change to be the right fix?

Regards,
Shawn

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 19c95ea..693b744 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -247,6 +247,7 @@ void machine_shutdown(void)
 void machine_halt(void)
 {
        machine_shutdown();
+       local_irq_disable();
        while (1);
 }

@@ -268,6 +269,7 @@ void machine_restart(char *cmd)

        /* Whoops - the platform was unable to reboot. Tell the user! */
        printk("Reboot failed -- System halted\n");
+       local_irq_disable();
        while (1);
 }
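
To illustrate why masking interrupts silences the report, here is a rough
userspace model of the detector as described above. It is only a sketch,
not kernel code: struct sim_cpu, sim_run() and the one-second tick are
hypothetical names and simplifications made up for this example. The
model assumes that the watchdog thread can only refresh its timestamp
when the running task yields, and that the timer callback which checks
the timestamp only runs while IRQs are enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the soft lockup detector; not kernel code. */
struct sim_cpu {
	bool irqs_enabled;	/* false models local_irq_disable() */
	bool task_yields;	/* false models the halt loop "while (1);" */
	int last_touch;		/* time the watchdog thread last ran, in seconds */
};

/*
 * Step simulated time one second at a time.  Returns the time at which a
 * soft lockup would be reported, or -1 if nothing is reported within
 * `duration` seconds.
 */
int sim_run(struct sim_cpu *cpu, int duration, int threshold)
{
	assert(threshold > 0);

	for (int now = 1; now <= duration; now++) {
		/* The watchdog thread only gets CPU time if the task yields. */
		if (cpu->task_yields)
			cpu->last_touch = now;

		/* The check runs from a timer interrupt, so it needs IRQs on. */
		if (cpu->irqs_enabled && now - cpu->last_touch > threshold)
			return now;
	}
	return -1;
}
```

With a spinning task and IRQs enabled, sim_run() reports a lockup as soon
as the threshold is exceeded. With irqs_enabled set to false the check
never runs, so nothing is reported even though the task still spins.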

* MX28 poweroff issue
  2012-07-05 20:10               ` Russell King - ARM Linux
  2012-07-06  4:49                 ` Shawn Guo
@ 2012-07-06  7:32                 ` Lothar Waßmann
  2012-07-19 17:04                   ` Marek Vasut
  1 sibling, 1 reply; 25+ messages in thread
From: Lothar Waßmann @ 2012-07-06  7:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

Russell King - ARM Linux writes:
> On Fri, Jul 06, 2012 at 12:08:58AM +0800, Shawn Guo wrote:
> > On Wed, Jul 04, 2012 at 04:19:44PM +0100, Russell King - ARM Linux wrote:
> > > If it's specific to mx28 and mx23 and nothing else, the cause needs to
> > > be found.  Maybe we need it tested on other (non-MX) platforms too?
> > 
> > Though people reported that imx27 does not have the problem, I'm not
> > so sure that it's a mach-mxs (mx23 and mx28) specific issue.  I have
> > not figured out why imx27 does not run into it, but I have some
> > findings here.
> > 
> > Let's look at the dump again.
> > 
> > [   59.840000] System halted.
> > [   84.100000] BUG: soft lockup - CPU#0 stuck for 23s! [halt:584]
> > ...
> > [   84.100000] [<c0070cd4>] (watchdog_timer_fn+0x114/0x14c) from [<c004052c>]
> > (__run_hrtimer+0x7c/0x1ec)
> > 
> > It reports the issue eventually in function watchdog_timer_fn
> > (kernel/watchdog.c):
> 
> Yes, the general idea is that if the timer is running, and the watchdog
> is running, and it detects that its event thread doesn't occasionally
> run, it will report a lockup.
> 
> As other platforms seem to not exhibit the problem when we halt, and
> endlessly spin with IRQs enabled, the question needs to be asked: what
> is different with MX23/MX28, and why is it different?
> 
I could reproduce this on an i.MX53 (TX53) platform too. Maybe the
softlockup watchdog was not enabled on those platforms which didn't
show this behaviour.


Lothar Waßmann
-- 
___________________________________________________________

Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996

www.karo-electronics.de | info at karo-electronics.de
___________________________________________________________

* MX28 poweroff issue
  2012-07-06  7:32                 ` Lothar Waßmann
@ 2012-07-19 17:04                   ` Marek Vasut
  2012-07-20  0:48                     ` Shawn Guo
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Vasut @ 2012-07-19 17:04 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Lothar Waßmann,

> Hi,
> 
> Russell King - ARM Linux writes:
> > On Fri, Jul 06, 2012 at 12:08:58AM +0800, Shawn Guo wrote:
> > > On Wed, Jul 04, 2012 at 04:19:44PM +0100, Russell King - ARM Linux wrote:
> > > > If it's specific to mx28 and mx23 and nothing else, the cause needs
> > > > to be found.  Maybe we need it tested on other (non-MX) platforms
> > > > too?
> > > 
> > > > Though people reported that imx27 does not have the problem, I'm not
> > > > so sure that it's a mach-mxs (mx23 and mx28) specific issue.  I have
> > > > not figured out why imx27 does not run into it, but I have some
> > > > findings here.
> > > 
> > > Let's look at the dump again.
> > > 
> > > [   59.840000] System halted.
> > > [   84.100000] BUG: soft lockup - CPU#0 stuck for 23s! [halt:584]
> > > ...
> > > [   84.100000] [<c0070cd4>] (watchdog_timer_fn+0x114/0x14c) from
> > > [<c004052c>] (__run_hrtimer+0x7c/0x1ec)
> > > 
> > > > It reports the issue eventually in function watchdog_timer_fn
> > > > (kernel/watchdog.c):
> > > 
> > > Yes, the general idea is that if the timer is running, and the watchdog
> > > is running, and it detects that its event thread doesn't occasionally
> > > run, it will report a lockup.
> > > 
> > > As other platforms seem to not exhibit the problem when we halt, and
> > > endlessly spin with IRQs enabled, the question needs to be asked: what
> > > is different with MX23/MX28, and why is it different?
> 
> I could reproduce this on an i.MX53 (TX53) platform too. Maybe the
> softlockup watchdog was not enabled on those platforms which didn't
> show this behaviour.

So what's the result? Where did this discussion lead?

Best regards,
Marek Vasut

* MX28 poweroff issue
  2012-07-19 17:04                   ` Marek Vasut
@ 2012-07-20  0:48                     ` Shawn Guo
  2012-07-20  1:53                       ` Marek Vasut
  0 siblings, 1 reply; 25+ messages in thread
From: Shawn Guo @ 2012-07-20  0:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 19, 2012 at 07:04:30PM +0200, Marek Vasut wrote:
> > I could reproduce this on an i.MX53 (TX53) platform too. Maybe the
> > softlockup watchdog was not enabled on those platforms which didn't
> > show this behaviour.
> 
> So what's the result? Where did this discussion lead?
> 
I have submitted a fix [1] to Russell's patch system.  We will know
his take next week, when he gets back from his vacation.

Regards,
Shawn

[1] http://thread.gmane.org/gmane.linux.ports.arm.kernel/176127

* MX28 poweroff issue
  2012-07-20  0:48                     ` Shawn Guo
@ 2012-07-20  1:53                       ` Marek Vasut
  0 siblings, 0 replies; 25+ messages in thread
From: Marek Vasut @ 2012-07-20  1:53 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Shawn Guo,

> On Thu, Jul 19, 2012 at 07:04:30PM +0200, Marek Vasut wrote:
> > > I could reproduce this on an i.MX53 (TX53) platform too. Maybe the
> > > softlockup watchdog was not enabled on those platforms which didn't
> > > show this behaviour.
> > 
> > So what's the result? Where did this discussion lead?
> 
> I have submitted a fix [1] to Russell's patch system.  We will know
> his take next week, when he gets back from his vacation.

Ah ok, thanks!

> 
> Regards,
> Shawn
> 
> [1] http://thread.gmane.org/gmane.linux.ports.arm.kernel/176127

Best regards,
Marek Vasut

Thread overview: 25+ messages
2012-07-03 22:30 MX28 poweroff issue Marek Vasut
2012-07-03 22:46 ` Russell King - ARM Linux
2012-07-04  0:13   ` Marek Vasut
2012-07-04  1:10 ` Fabio Estevam
2012-07-04  2:12   ` Marek Vasut
2012-07-04  3:14     ` Fabio Estevam
2012-07-04  3:47       ` Marek Vasut
2012-07-04  7:07         ` Uwe Kleine-König
2012-07-04  8:20           ` Russell King - ARM Linux
2012-07-04  8:45             ` Uwe Kleine-König
2012-07-04 14:30               ` Marek Vasut
2012-07-04  7:38       ` Attila Kinali
2012-07-04 14:31         ` Marek Vasut
2012-07-04 14:53           ` Shawn Guo
2012-07-04 15:19           ` Russell King - ARM Linux
2012-07-05 16:08             ` Shawn Guo
2012-07-05 16:23               ` Marek Vasut
2012-07-05 20:32                 ` Fabio Estevam
2012-07-05 20:59                   ` Fabio Estevam
2012-07-05 20:10               ` Russell King - ARM Linux
2012-07-06  4:49                 ` Shawn Guo
2012-07-06  7:32                 ` Lothar Waßmann
2012-07-19 17:04                   ` Marek Vasut
2012-07-20  0:48                     ` Shawn Guo
2012-07-20  1:53                       ` Marek Vasut
