All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Florian Fainelli <f.fainelli@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-nfs <linux-nfs@vger.kernel.org>,
	"trond.myklebust" <trond.myklebust@primarydata.com>,
	netdev <netdev@vger.kernel.org>
Subject: Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
Date: Tue, 4 Mar 2014 17:43:10 -0800	[thread overview]
Message-ID: <20140305014310.GC3334@linux.vnet.ibm.com> (raw)
In-Reply-To: <CAGVrzcbRycBy0w64R9pV=JG6M3aJeARbOnh-xRrumYzzVDgWGQ@mail.gmail.com>

On Tue, Mar 04, 2014 at 05:16:27PM -0800, Florian Fainelli wrote:
> 2014-03-04 17:03 GMT-08:00 Florian Fainelli <f.fainelli@gmail.com>:
> > 2014-03-04 16:48 GMT-08:00 Eric Dumazet <eric.dumazet@gmail.com>:
> >> On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
> >>> Hi all,
> >>>
> >>> I am seeing the following RCU stalls messages appearing on an ARMv7
> >>> 4xCPUs system running 3.14-rc4:
> >>>
> >>> [   42.974327] INFO: rcu_sched detected stalls on CPUs/tasks:
> >>> [   42.979839]  (detected by 0, t=2102 jiffies, g=4294967082,
> >>> c=4294967081, q=516)
> >>> [   42.987169] INFO: Stall ended before state dump start
> >>>
> >>> this is happening under the following conditions:
> >>>
> >>> - the attached bumper.c binary alters various kernel thread priorities
> >>> based on the contents of bumpup.cfg and
> >>> - malloc_crazy is running from a NFS share
> >>> - malloc_crazy.c is running in a loop allocating chunks of memory but
> >>> never freeing it
> >>>
> >>> when the priorities are altered, instead of getting the OOM killer to
> >>> be invoked, the RCU stalls are happening. Taking NFS out of the
> >>> equation does not allow me to reproduce the problem even with the
> >>> priorities altered.
> >>>
> >>> This "problem" seems to have been there for quite a while now since I
> >>> was able to get 3.8.13 to trigger that bug as well, with a slightly
> >>> more detailed RCU debugging trace which points the finger at kswapd0.
> >>>
> >>> You should be able to get that reproduced under QEMU with the
> >>> Versatile Express platform emulating a Cortex A15 CPU and the attached
> >>> files.
> >>>
> >>> Any help or suggestions would be greatly appreciated. Thanks!
> >>
> >> Do you have a more complete trace, including stack traces ?
> >
> > Attatched is what I get out of SysRq-t, which is the only thing I have
> > (note that the kernel is built with CONFIG_RCU_CPU_STALL_INFO=y):
> 
> QEMU for Versatile Express w/ 2 CPUs yields something slightly
> different than the real HW platform this is happening with, but it
> does produce the RCU stall anyway:
> 
> [  125.762946] BUG: soft lockup - CPU#1 stuck for 53s! [malloc_crazy:91]

This soft-lockup condition can result in RCU CPU stall warnings.  Fix
the problem causing the soft lockup, and I bet that your RCU CPU stall
warnings go away.

							Thanx, Paul

> [  125.766841] Modules linked in:
> [  125.768389]
> [  125.769199] CPU: 1 PID: 91 Comm: malloc_crazy Not tainted 3.14.0-rc4 #39
> [  125.769883] task: edbded00 ti: c089c000 task.ti: c089c000
> [  125.771743] PC is at load_balance+0x4b0/0x760
> [  125.772069] LR is at cpumask_next_and+0x44/0x5c
> [  125.772387] pc : [<c004ff58>]    lr : [<c01db940>]    psr: 60000113
> [  125.772387] sp : c089db48  ip : 80000113  fp : edfd8cf4
> [  125.773128] r10: c0de871c  r9 : ed893840  r8 : 00000000
> [  125.773452] r7 : c0de8458  r6 : edfd8840  r5 : edfd8840  r4 : ed89389c
> [  125.773825] r3 : 000012d8  r2 : 80000113  r1 : 00000023  r0 : 00000000
> [  125.774332] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [  125.774753] Control: 30c7387d  Table: 80835d40  DAC: 00000000
> [  125.775266] CPU: 1 PID: 91 Comm: malloc_test_bcm Not tainted 3.14.0-rc4 #39
> [  125.776392] [<c0015624>] (unwind_backtrace) from [<c00111a4>]
> (show_stack+0x10/0x14)
> [  125.777026] [<c00111a4>] (show_stack) from [<c04c1bd4>]
> (dump_stack+0x84/0x94)
> [  125.777429] [<c04c1bd4>] (dump_stack) from [<c007e7f0>]
> (watchdog_timer_fn+0x144/0x17c)
> [  125.777865] [<c007e7f0>] (watchdog_timer_fn) from [<c003f58c>]
> (__run_hrtimer.isra.32+0x54/0xe4)
> [  125.778333] [<c003f58c>] (__run_hrtimer.isra.32) from [<c003fea4>]
> (hrtimer_interrupt+0x11c/0x2d0)
> [  125.778814] [<c003fea4>] (hrtimer_interrupt) from [<c03c4f80>]
> (arch_timer_handler_virt+0x28/0x30)
> [  125.779297] [<c03c4f80>] (arch_timer_handler_virt) from
> [<c006280c>] (handle_percpu_devid_irq+0x6c/0x84)
> [  125.779734] [<c006280c>] (handle_percpu_devid_irq) from
> [<c005edec>] (generic_handle_irq+0x2c/0x3c)
> [  125.780145] [<c005edec>] (generic_handle_irq) from [<c000eb7c>]
> (handle_IRQ+0x40/0x90)
> [  125.780513] [<c000eb7c>] (handle_IRQ) from [<c0008568>]
> (gic_handle_irq+0x2c/0x5c)
> [  125.780867] [<c0008568>] (gic_handle_irq) from [<c0011d00>]
> (__irq_svc+0x40/0x50)
> [  125.781312] Exception stack(0xc089db00 to 0xc089db48)
> [  125.781787] db00: 00000000 00000023 80000113 000012d8 ed89389c
> edfd8840 edfd8840 c0de8458
> [  125.782234] db20: 00000000 ed893840 c0de871c edfd8cf4 80000113
> c089db48 c01db940 c004ff58
> [  125.782594] db40: 60000113 ffffffff
> [  125.782864] [<c0011d00>] (__irq_svc) from [<c004ff58>]
> (load_balance+0x4b0/0x760)
> [  125.783215] [<c004ff58>] (load_balance) from [<c005035c>]
> (rebalance_domains+0x154/0x284)
> [  125.783595] [<c005035c>] (rebalance_domains) from [<c00504c0>]
> (run_rebalance_domains+0x34/0x164)
> [  125.784000] [<c00504c0>] (run_rebalance_domains) from [<c0025aac>]
> (__do_softirq+0x110/0x24c)
> [  125.784388] [<c0025aac>] (__do_softirq) from [<c0025e6c>]
> (irq_exit+0xac/0xf4)
> [  125.784726] [<c0025e6c>] (irq_exit) from [<c000eb80>] (handle_IRQ+0x44/0x90)
> [  125.785059] [<c000eb80>] (handle_IRQ) from [<c0008568>]
> (gic_handle_irq+0x2c/0x5c)
> [  125.785412] [<c0008568>] (gic_handle_irq) from [<c0011d00>]
> (__irq_svc+0x40/0x50)
> [  125.785742] Exception stack(0xc089dcf8 to 0xc089dd40)
> [  125.785983] dce0:
>     ee4e38c0 00000000
> [  125.786360] dd00: 000200da 00000001 ee4e38a0 c0de2340 2d201000
> edfe3358 c05c0c18 00000001
> [  125.786737] dd20: c05c0c2c c0e1e180 00000000 c089dd40 c0086050
> c0086140 40000113 ffffffff
> [  125.787120] [<c0011d00>] (__irq_svc) from [<c0086140>]
> (get_page_from_freelist+0x2bc/0x638)
> [  125.787507] [<c0086140>] (get_page_from_freelist) from [<c0087018>]
> (__alloc_pages_nodemask+0x114/0x8f4)
> [  125.787949] [<c0087018>] (__alloc_pages_nodemask) from [<c00a20c8>]
> (handle_mm_fault+0x9f8/0xcdc)
> [  125.788357] [<c00a20c8>] (handle_mm_fault) from [<c001793c>]
> (do_page_fault+0x194/0x27c)
> [  125.788726] [<c001793c>] (do_page_fault) from [<c000844c>]
> (do_DataAbort+0x30/0x90)
> [  125.789080] [<c000844c>] (do_DataAbort) from [<c0011e34>]
> (__dabt_usr+0x34/0x40)
> [  125.789403] Exception stack(0xc089dfb0 to 0xc089dff8)
> [  125.789652] dfa0:                                     ae55e008
> 11111111 000dc000 ae582000
> [  125.790029] dfc0: be967d18 00000000 00008390 00000000 00000000
> 00000000 b6f29000 be967d04
> [  125.790404] dfe0: 11111111 be967ce8 00008550 b6e5ce48 20000010 ffffffff
> [  125.791282] BUG: soft lockup - CPU#0 stuck for 53s! [kworker/0:1:41]
> [  125.791710] Modules linked in:
> [  125.792046]
> [  125.792836] CPU: 0 PID: 41 Comm: kworker/0:1 Not tainted 3.14.0-rc4 #39
> [  125.793832] task: ed893840 ti: c0826000 task.ti: c0826000
> [  125.795257] PC is at finish_task_switch+0x40/0xec
> [  125.795562] LR is at __schedule+0x1fc/0x51c
> [  125.795856] pc : [<c0044724>]    lr : [<c04c3274>]    psr: 600f0013
> [  125.795856] sp : c0827ec8  ip : 00000000  fp : c0827edc
> [  125.796416] r10: eda56e00  r9 : c0deac28  r8 : 00000000
> [  125.796698] r7 : c0de871c  r6 : c0826020  r5 : ed893840  r4 : 00000001
> [  125.797028] r3 : eda56e00  r2 : 000012d7  r1 : ed854780  r0 : edfd8840
> [  125.797488] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM
> Segment kernel
> [  125.797866] Control: 30c7387d  Table: 808353c0  DAC: fffffffd
> [  125.798321] CPU: 0 PID: 41 Comm: kworker/0:1 Not tainted 3.14.0-rc4 #39
> [  125.800604] [<c0015624>] (unwind_backtrace) from [<c00111a4>]
> (show_stack+0x10/0x14)
> [  125.801205] [<c00111a4>] (show_stack) from [<c04c1bd4>]
> (dump_stack+0x84/0x94)
> [  125.801786] [<c04c1bd4>] (dump_stack) from [<c007e7f0>]
> (watchdog_timer_fn+0x144/0x17c)
> [  125.802238] [<c007e7f0>] (watchdog_timer_fn) from [<c003f58c>]
> (__run_hrtimer.isra.32+0x54/0xe4)
> [  125.802679] [<c003f58c>] (__run_hrtimer.isra.32) from [<c003fea4>]
> (hrtimer_interrupt+0x11c/0x2d0)
> [  125.803108] [<c003fea4>] (hrtimer_interrupt) from [<c03c4f80>]
> (arch_timer_handler_virt+0x28/0x30)
> [  125.803530] [<c03c4f80>] (arch_timer_handler_virt) from
> [<c006280c>] (handle_percpu_devid_irq+0x6c/0x84)
> [  125.803965] [<c006280c>] (handle_percpu_devid_irq) from
> [<c005edec>] (generic_handle_irq+0x2c/0x3c)
> [  125.804380] [<c005edec>] (generic_handle_irq) from [<c000eb7c>]
> (handle_IRQ+0x40/0x90)
> [  125.804774] [<c000eb7c>] (handle_IRQ) from [<c0008568>]
> (gic_handle_irq+0x2c/0x5c)
> [  125.805181] [<c0008568>] (gic_handle_irq) from [<c0011d00>]
> (__irq_svc+0x40/0x50)
> [  125.805706] Exception stack(0xc0827e80 to 0xc0827ec8)
> [  125.806185] 7e80: edfd8840 ed854780 000012d7 eda56e00 00000001
> ed893840 c0826020 c0de871c
> [  125.806623] 7ea0: 00000000 c0deac28 eda56e00 c0827edc 00000000
> c0827ec8 c04c3274 c0044724
> [  125.807032] 7ec0: 600f0013 ffffffff
> [  125.807328] [<c0011d00>] (__irq_svc) from [<c0044724>]
> (finish_task_switch+0x40/0xec)
> [  125.807752] [<c0044724>] (finish_task_switch) from [<c04c3274>]
> (__schedule+0x1fc/0x51c)
> [  125.808175] [<c04c3274>] (__schedule) from [<c0037590>]
> (worker_thread+0x210/0x404)
> [  125.808570] [<c0037590>] (worker_thread) from [<c003cba0>]
> (kthread+0xd4/0xec)
> [  125.808952] [<c003cba0>] (kthread) from [<c000e338>]
> (ret_from_fork+0x14/0x3c)
> [  246.080014] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  246.080611]  (detected by 0, t=6972 jiffies, g=4294967160,
> c=4294967159, q=127)
> [  246.081576] INFO: Stall ended before state dump start
> [  246.082510] BUG: soft lockup - CPU#1 stuck for 69s! [grep:93]
> [  246.082849] Modules linked in:
> [  246.083046]
> [  246.083179] CPU: 1 PID: 93 Comm: grep Not tainted 3.14.0-rc4 #39
> [  246.083548] task: edbdf480 ti: c09e6000 task.ti: c09e6000
> [  246.083897] PC is at rebalance_domains+0x0/0x284
> [  246.084154] LR is at run_rebalance_domains+0x34/0x164
> [  246.084430] pc : [<c0050208>]    lr : [<c00504c0>]    psr: 20000113
> [  246.084430] sp : c09e7ef8  ip : c0dde340  fp : 40000005
> [  246.084992] r10: 00000007  r9 : c0ddf840  r8 : c0ddf840
> [  246.085267] r7 : edfe0840  r6 : 00000100  r5 : c0de209c  r4 : 00000001
> [  246.085606] r3 : c005048c  r2 : 2d201000  r1 : 00000001  r0 : edfe0840
> [  246.085944] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [  246.086315] Control: 30c7387d  Table: 808351c0  DAC: 00000000
> [  246.086626] CPU: 1 PID: 93 Comm: grep Not tainted 3.14.0-rc4 #39
> [  246.086992] [<c0015624>] (unwind_backtrace) from [<c00111a4>]
> (show_stack+0x10/0x14)
> [  246.087420] [<c00111a4>] (show_stack) from [<c04c1bd4>]
> (dump_stack+0x84/0x94)
> [  246.087825] [<c04c1bd4>] (dump_stack) from [<c007e7f0>]
> (watchdog_timer_fn+0x144/0x17c)
> [  246.088260] [<c007e7f0>] (watchdog_timer_fn) from [<c003f58c>]
> (__run_hrtimer.isra.32+0x54/0xe4)
> [  246.088726] [<c003f58c>] (__run_hrtimer.isra.32) from [<c003fea4>]
> (hrtimer_interrupt+0x11c/0x2d0)
> [  246.089190] [<c003fea4>] (hrtimer_interrupt) from [<c03c4f80>]
> (arch_timer_handler_virt+0x28/0x30)
> [  246.089670] [<c03c4f80>] (arch_timer_handler_virt) from
> [<c006280c>] (handle_percpu_devid_irq+0x6c/0x84)
> [  246.090155] [<c006280c>] (handle_percpu_devid_irq) from
> [<c005edec>] (generic_handle_irq+0x2c/0x3c)
> [  246.090822] [<c005edec>] (generic_handle_irq) from [<c000eb7c>]
> (handle_IRQ+0x40/0x90)
> [  246.091233] [<c000eb7c>] (handle_IRQ) from [<c0008568>]
> (gic_handle_irq+0x2c/0x5c)
> [  246.091631] [<c0008568>] (gic_handle_irq) from [<c0011d00>]
> (__irq_svc+0x40/0x50)
> [  246.092007] Exception stack(0xc09e7eb0 to 0xc09e7ef8)
> [  246.092291] 7ea0:                                     edfe0840
> 00000001 2d201000 c005048c
> [  246.092719] 7ec0: 00000001 c0de209c 00000100 edfe0840 c0ddf840
> c0ddf840 00000007 40000005
> [  246.093142] 7ee0: c0dde340 c09e7ef8 c00504c0 c0050208 20000113 ffffffff
> [  246.093647] [<c0011d00>] (__irq_svc) from [<c0050208>]
> (rebalance_domains+0x0/0x284)
> [  246.094422] [<c0050208>] (rebalance_domains) from [<c09e6028>] (0xc09e6028)
> 


WARNING: multiple messages have this Message-ID (diff)
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Florian Fainelli <f.fainelli@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-nfs <linux-nfs@vger.kernel.org>,
	"trond.myklebust" <trond.myklebust@primarydata.com>,
	netdev <netdev@vger.kernel.org>
Subject: Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
Date: Tue, 4 Mar 2014 17:43:10 -0800	[thread overview]
Message-ID: <20140305014310.GC3334@linux.vnet.ibm.com> (raw)
In-Reply-To: <CAGVrzcbRycBy0w64R9pV=JG6M3aJeARbOnh-xRrumYzzVDgWGQ@mail.gmail.com>

On Tue, Mar 04, 2014 at 05:16:27PM -0800, Florian Fainelli wrote:
> 2014-03-04 17:03 GMT-08:00 Florian Fainelli <f.fainelli@gmail.com>:
> > 2014-03-04 16:48 GMT-08:00 Eric Dumazet <eric.dumazet@gmail.com>:
> >> On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
> >>> Hi all,
> >>>
> >>> I am seeing the following RCU stalls messages appearing on an ARMv7
> >>> 4xCPUs system running 3.14-rc4:
> >>>
> >>> [   42.974327] INFO: rcu_sched detected stalls on CPUs/tasks:
> >>> [   42.979839]  (detected by 0, t=2102 jiffies, g=4294967082,
> >>> c=4294967081, q=516)
> >>> [   42.987169] INFO: Stall ended before state dump start
> >>>
> >>> this is happening under the following conditions:
> >>>
> >>> - the attached bumper.c binary alters various kernel thread priorities
> >>> based on the contents of bumpup.cfg and
> >>> - malloc_crazy is running from a NFS share
> >>> - malloc_crazy.c is running in a loop allocating chunks of memory but
> >>> never freeing it
> >>>
> >>> when the priorities are altered, instead of getting the OOM killer to
> >>> be invoked, the RCU stalls are happening. Taking NFS out of the
> >>> equation does not allow me to reproduce the problem even with the
> >>> priorities altered.
> >>>
> >>> This "problem" seems to have been there for quite a while now since I
> >>> was able to get 3.8.13 to trigger that bug as well, with a slightly
> >>> more detailed RCU debugging trace which points the finger at kswapd0.
> >>>
> >>> You should be able to get that reproduced under QEMU with the
> >>> Versatile Express platform emulating a Cortex A15 CPU and the attached
> >>> files.
> >>>
> >>> Any help or suggestions would be greatly appreciated. Thanks!
> >>
> >> Do you have a more complete trace, including stack traces ?
> >
> > Attatched is what I get out of SysRq-t, which is the only thing I have
> > (note that the kernel is built with CONFIG_RCU_CPU_STALL_INFO=y):
> 
> QEMU for Versatile Express w/ 2 CPUs yields something slightly
> different than the real HW platform this is happening with, but it
> does produce the RCU stall anyway:
> 
> [  125.762946] BUG: soft lockup - CPU#1 stuck for 53s! [malloc_crazy:91]

This soft-lockup condition can result in RCU CPU stall warnings.  Fix
the problem causing the soft lockup, and I bet that your RCU CPU stall
warnings go away.

							Thanx, Paul

> [  125.766841] Modules linked in:
> [  125.768389]
> [  125.769199] CPU: 1 PID: 91 Comm: malloc_crazy Not tainted 3.14.0-rc4 #39
> [  125.769883] task: edbded00 ti: c089c000 task.ti: c089c000
> [  125.771743] PC is at load_balance+0x4b0/0x760
> [  125.772069] LR is at cpumask_next_and+0x44/0x5c
> [  125.772387] pc : [<c004ff58>]    lr : [<c01db940>]    psr: 60000113
> [  125.772387] sp : c089db48  ip : 80000113  fp : edfd8cf4
> [  125.773128] r10: c0de871c  r9 : ed893840  r8 : 00000000
> [  125.773452] r7 : c0de8458  r6 : edfd8840  r5 : edfd8840  r4 : ed89389c
> [  125.773825] r3 : 000012d8  r2 : 80000113  r1 : 00000023  r0 : 00000000
> [  125.774332] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [  125.774753] Control: 30c7387d  Table: 80835d40  DAC: 00000000
> [  125.775266] CPU: 1 PID: 91 Comm: malloc_test_bcm Not tainted 3.14.0-rc4 #39
> [  125.776392] [<c0015624>] (unwind_backtrace) from [<c00111a4>]
> (show_stack+0x10/0x14)
> [  125.777026] [<c00111a4>] (show_stack) from [<c04c1bd4>]
> (dump_stack+0x84/0x94)
> [  125.777429] [<c04c1bd4>] (dump_stack) from [<c007e7f0>]
> (watchdog_timer_fn+0x144/0x17c)
> [  125.777865] [<c007e7f0>] (watchdog_timer_fn) from [<c003f58c>]
> (__run_hrtimer.isra.32+0x54/0xe4)
> [  125.778333] [<c003f58c>] (__run_hrtimer.isra.32) from [<c003fea4>]
> (hrtimer_interrupt+0x11c/0x2d0)
> [  125.778814] [<c003fea4>] (hrtimer_interrupt) from [<c03c4f80>]
> (arch_timer_handler_virt+0x28/0x30)
> [  125.779297] [<c03c4f80>] (arch_timer_handler_virt) from
> [<c006280c>] (handle_percpu_devid_irq+0x6c/0x84)
> [  125.779734] [<c006280c>] (handle_percpu_devid_irq) from
> [<c005edec>] (generic_handle_irq+0x2c/0x3c)
> [  125.780145] [<c005edec>] (generic_handle_irq) from [<c000eb7c>]
> (handle_IRQ+0x40/0x90)
> [  125.780513] [<c000eb7c>] (handle_IRQ) from [<c0008568>]
> (gic_handle_irq+0x2c/0x5c)
> [  125.780867] [<c0008568>] (gic_handle_irq) from [<c0011d00>]
> (__irq_svc+0x40/0x50)
> [  125.781312] Exception stack(0xc089db00 to 0xc089db48)
> [  125.781787] db00: 00000000 00000023 80000113 000012d8 ed89389c
> edfd8840 edfd8840 c0de8458
> [  125.782234] db20: 00000000 ed893840 c0de871c edfd8cf4 80000113
> c089db48 c01db940 c004ff58
> [  125.782594] db40: 60000113 ffffffff
> [  125.782864] [<c0011d00>] (__irq_svc) from [<c004ff58>]
> (load_balance+0x4b0/0x760)
> [  125.783215] [<c004ff58>] (load_balance) from [<c005035c>]
> (rebalance_domains+0x154/0x284)
> [  125.783595] [<c005035c>] (rebalance_domains) from [<c00504c0>]
> (run_rebalance_domains+0x34/0x164)
> [  125.784000] [<c00504c0>] (run_rebalance_domains) from [<c0025aac>]
> (__do_softirq+0x110/0x24c)
> [  125.784388] [<c0025aac>] (__do_softirq) from [<c0025e6c>]
> (irq_exit+0xac/0xf4)
> [  125.784726] [<c0025e6c>] (irq_exit) from [<c000eb80>] (handle_IRQ+0x44/0x90)
> [  125.785059] [<c000eb80>] (handle_IRQ) from [<c0008568>]
> (gic_handle_irq+0x2c/0x5c)
> [  125.785412] [<c0008568>] (gic_handle_irq) from [<c0011d00>]
> (__irq_svc+0x40/0x50)
> [  125.785742] Exception stack(0xc089dcf8 to 0xc089dd40)
> [  125.785983] dce0:
>     ee4e38c0 00000000
> [  125.786360] dd00: 000200da 00000001 ee4e38a0 c0de2340 2d201000
> edfe3358 c05c0c18 00000001
> [  125.786737] dd20: c05c0c2c c0e1e180 00000000 c089dd40 c0086050
> c0086140 40000113 ffffffff
> [  125.787120] [<c0011d00>] (__irq_svc) from [<c0086140>]
> (get_page_from_freelist+0x2bc/0x638)
> [  125.787507] [<c0086140>] (get_page_from_freelist) from [<c0087018>]
> (__alloc_pages_nodemask+0x114/0x8f4)
> [  125.787949] [<c0087018>] (__alloc_pages_nodemask) from [<c00a20c8>]
> (handle_mm_fault+0x9f8/0xcdc)
> [  125.788357] [<c00a20c8>] (handle_mm_fault) from [<c001793c>]
> (do_page_fault+0x194/0x27c)
> [  125.788726] [<c001793c>] (do_page_fault) from [<c000844c>]
> (do_DataAbort+0x30/0x90)
> [  125.789080] [<c000844c>] (do_DataAbort) from [<c0011e34>]
> (__dabt_usr+0x34/0x40)
> [  125.789403] Exception stack(0xc089dfb0 to 0xc089dff8)
> [  125.789652] dfa0:                                     ae55e008
> 11111111 000dc000 ae582000
> [  125.790029] dfc0: be967d18 00000000 00008390 00000000 00000000
> 00000000 b6f29000 be967d04
> [  125.790404] dfe0: 11111111 be967ce8 00008550 b6e5ce48 20000010 ffffffff
> [  125.791282] BUG: soft lockup - CPU#0 stuck for 53s! [kworker/0:1:41]
> [  125.791710] Modules linked in:
> [  125.792046]
> [  125.792836] CPU: 0 PID: 41 Comm: kworker/0:1 Not tainted 3.14.0-rc4 #39
> [  125.793832] task: ed893840 ti: c0826000 task.ti: c0826000
> [  125.795257] PC is at finish_task_switch+0x40/0xec
> [  125.795562] LR is at __schedule+0x1fc/0x51c
> [  125.795856] pc : [<c0044724>]    lr : [<c04c3274>]    psr: 600f0013
> [  125.795856] sp : c0827ec8  ip : 00000000  fp : c0827edc
> [  125.796416] r10: eda56e00  r9 : c0deac28  r8 : 00000000
> [  125.796698] r7 : c0de871c  r6 : c0826020  r5 : ed893840  r4 : 00000001
> [  125.797028] r3 : eda56e00  r2 : 000012d7  r1 : ed854780  r0 : edfd8840
> [  125.797488] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM
> Segment kernel
> [  125.797866] Control: 30c7387d  Table: 808353c0  DAC: fffffffd
> [  125.798321] CPU: 0 PID: 41 Comm: kworker/0:1 Not tainted 3.14.0-rc4 #39
> [  125.800604] [<c0015624>] (unwind_backtrace) from [<c00111a4>]
> (show_stack+0x10/0x14)
> [  125.801205] [<c00111a4>] (show_stack) from [<c04c1bd4>]
> (dump_stack+0x84/0x94)
> [  125.801786] [<c04c1bd4>] (dump_stack) from [<c007e7f0>]
> (watchdog_timer_fn+0x144/0x17c)
> [  125.802238] [<c007e7f0>] (watchdog_timer_fn) from [<c003f58c>]
> (__run_hrtimer.isra.32+0x54/0xe4)
> [  125.802679] [<c003f58c>] (__run_hrtimer.isra.32) from [<c003fea4>]
> (hrtimer_interrupt+0x11c/0x2d0)
> [  125.803108] [<c003fea4>] (hrtimer_interrupt) from [<c03c4f80>]
> (arch_timer_handler_virt+0x28/0x30)
> [  125.803530] [<c03c4f80>] (arch_timer_handler_virt) from
> [<c006280c>] (handle_percpu_devid_irq+0x6c/0x84)
> [  125.803965] [<c006280c>] (handle_percpu_devid_irq) from
> [<c005edec>] (generic_handle_irq+0x2c/0x3c)
> [  125.804380] [<c005edec>] (generic_handle_irq) from [<c000eb7c>]
> (handle_IRQ+0x40/0x90)
> [  125.804774] [<c000eb7c>] (handle_IRQ) from [<c0008568>]
> (gic_handle_irq+0x2c/0x5c)
> [  125.805181] [<c0008568>] (gic_handle_irq) from [<c0011d00>]
> (__irq_svc+0x40/0x50)
> [  125.805706] Exception stack(0xc0827e80 to 0xc0827ec8)
> [  125.806185] 7e80: edfd8840 ed854780 000012d7 eda56e00 00000001
> ed893840 c0826020 c0de871c
> [  125.806623] 7ea0: 00000000 c0deac28 eda56e00 c0827edc 00000000
> c0827ec8 c04c3274 c0044724
> [  125.807032] 7ec0: 600f0013 ffffffff
> [  125.807328] [<c0011d00>] (__irq_svc) from [<c0044724>]
> (finish_task_switch+0x40/0xec)
> [  125.807752] [<c0044724>] (finish_task_switch) from [<c04c3274>]
> (__schedule+0x1fc/0x51c)
> [  125.808175] [<c04c3274>] (__schedule) from [<c0037590>]
> (worker_thread+0x210/0x404)
> [  125.808570] [<c0037590>] (worker_thread) from [<c003cba0>]
> (kthread+0xd4/0xec)
> [  125.808952] [<c003cba0>] (kthread) from [<c000e338>]
> (ret_from_fork+0x14/0x3c)
> [  246.080014] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  246.080611]  (detected by 0, t=6972 jiffies, g=4294967160,
> c=4294967159, q=127)
> [  246.081576] INFO: Stall ended before state dump start
> [  246.082510] BUG: soft lockup - CPU#1 stuck for 69s! [grep:93]
> [  246.082849] Modules linked in:
> [  246.083046]
> [  246.083179] CPU: 1 PID: 93 Comm: grep Not tainted 3.14.0-rc4 #39
> [  246.083548] task: edbdf480 ti: c09e6000 task.ti: c09e6000
> [  246.083897] PC is at rebalance_domains+0x0/0x284
> [  246.084154] LR is at run_rebalance_domains+0x34/0x164
> [  246.084430] pc : [<c0050208>]    lr : [<c00504c0>]    psr: 20000113
> [  246.084430] sp : c09e7ef8  ip : c0dde340  fp : 40000005
> [  246.084992] r10: 00000007  r9 : c0ddf840  r8 : c0ddf840
> [  246.085267] r7 : edfe0840  r6 : 00000100  r5 : c0de209c  r4 : 00000001
> [  246.085606] r3 : c005048c  r2 : 2d201000  r1 : 00000001  r0 : edfe0840
> [  246.085944] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [  246.086315] Control: 30c7387d  Table: 808351c0  DAC: 00000000
> [  246.086626] CPU: 1 PID: 93 Comm: grep Not tainted 3.14.0-rc4 #39
> [  246.086992] [<c0015624>] (unwind_backtrace) from [<c00111a4>]
> (show_stack+0x10/0x14)
> [  246.087420] [<c00111a4>] (show_stack) from [<c04c1bd4>]
> (dump_stack+0x84/0x94)
> [  246.087825] [<c04c1bd4>] (dump_stack) from [<c007e7f0>]
> (watchdog_timer_fn+0x144/0x17c)
> [  246.088260] [<c007e7f0>] (watchdog_timer_fn) from [<c003f58c>]
> (__run_hrtimer.isra.32+0x54/0xe4)
> [  246.088726] [<c003f58c>] (__run_hrtimer.isra.32) from [<c003fea4>]
> (hrtimer_interrupt+0x11c/0x2d0)
> [  246.089190] [<c003fea4>] (hrtimer_interrupt) from [<c03c4f80>]
> (arch_timer_handler_virt+0x28/0x30)
> [  246.089670] [<c03c4f80>] (arch_timer_handler_virt) from
> [<c006280c>] (handle_percpu_devid_irq+0x6c/0x84)
> [  246.090155] [<c006280c>] (handle_percpu_devid_irq) from
> [<c005edec>] (generic_handle_irq+0x2c/0x3c)
> [  246.090822] [<c005edec>] (generic_handle_irq) from [<c000eb7c>]
> (handle_IRQ+0x40/0x90)
> [  246.091233] [<c000eb7c>] (handle_IRQ) from [<c0008568>]
> (gic_handle_irq+0x2c/0x5c)
> [  246.091631] [<c0008568>] (gic_handle_irq) from [<c0011d00>]
> (__irq_svc+0x40/0x50)
> [  246.092007] Exception stack(0xc09e7eb0 to 0xc09e7ef8)
> [  246.092291] 7ea0:                                     edfe0840
> 00000001 2d201000 c005048c
> [  246.092719] 7ec0: 00000001 c0de209c 00000100 edfe0840 c0ddf840
> c0ddf840 00000007 40000005
> [  246.093142] 7ee0: c0dde340 c09e7ef8 c00504c0 c0050208 20000113 ffffffff
> [  246.093647] [<c0011d00>] (__irq_svc) from [<c0050208>]
> (rebalance_domains+0x0/0x284)
> [  246.094422] [<c0050208>] (rebalance_domains) from [<c09e6028>] (0xc09e6028)
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2014-03-05  1:43 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-04 23:55 RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed Florian Fainelli
2014-03-05  0:48 ` Eric Dumazet
2014-03-05  0:48   ` Eric Dumazet
2014-03-05  0:48   ` Eric Dumazet
2014-03-05  1:03   ` Florian Fainelli
2014-03-05  1:16     ` Florian Fainelli
2014-03-05  1:16       ` Florian Fainelli
2014-03-05  1:43       ` Paul E. McKenney [this message]
2014-03-05  1:43         ` Paul E. McKenney
2014-03-05  3:55         ` Florian Fainelli
2014-03-05  3:55           ` Florian Fainelli
2014-03-05  5:34           ` Paul E. McKenney
2014-03-05  5:34             ` Paul E. McKenney
2014-03-06  0:42             ` Florian Fainelli
2014-03-06  0:42               ` Florian Fainelli
2014-03-06  0:42               ` Florian Fainelli
2014-03-06  1:42               ` Paul E. McKenney
2014-03-06  1:42                 ` Paul E. McKenney
2014-03-05  1:41     ` Paul E. McKenney
2014-03-05  1:41       ` Paul E. McKenney
2014-03-05  1:43       ` Florian Fainelli
2014-03-05  1:43         ` Florian Fainelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140305014310.GC3334@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=eric.dumazet@gmail.com \
    --cc=f.fainelli@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=trond.myklebust@primarydata.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.