* Filesystem lockup with CONFIG_PREEMPT_RT
@ 2014-07-07  8:48 Jan de Kruyf
  2014-07-07 13:00 ` Thomas Gleixner
  2014-07-07 16:23 ` Austin Schuh
  0 siblings, 2 replies; 43+ messages in thread
From: Jan de Kruyf @ 2014-07-07  8:48 UTC (permalink / raw)
  To: linux-rt-users

Hope I do not kick in an open door, but

Linux JanDell 3.2.0-0.bpo.3-rt-686-pae #1 SMP PREEMPT RT Thu Aug 23
09:55:27 UTC 2012 i686 GNU/Linux
and
Linux jan 3.2.0-4-rt-amd64 #1 SMP PREEMPT RT Debian 3.2.57-3+deb7u2
x86_64 GNU/Linux

do not give the same problem that I experienced with XFS.
So there is a hiccup between version 3.2-rt and 3.10-rt, I would say.

cheers,

j.

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-07  8:48 Filesystem lockup with CONFIG_PREEMPT_RT Jan de Kruyf
@ 2014-07-07 13:00 ` Thomas Gleixner
  2014-07-07 16:23 ` Austin Schuh
  1 sibling, 0 replies; 43+ messages in thread
From: Thomas Gleixner @ 2014-07-07 13:00 UTC (permalink / raw)
  To: Jan de Kruyf; +Cc: linux-rt-users

On Mon, 7 Jul 2014, Jan de Kruyf wrote:

> Hope I do not kick in an open door, but
> 
> Linux JanDell 3.2.0-0.bpo.3-rt-686-pae #1 SMP PREEMPT RT Thu Aug 23
> 09:55:27 UTC 2012 i686 GNU/Linux
> and
> Linux jan 3.2.0-4-rt-amd64 #1 SMP PREEMPT RT Debian 3.2.57-3+deb7u2
> x86_64 GNU/Linux
> 
> do not give the same problem that I experienced with XFS.
> So there is a hiccup between version 3.2-rt and 3.10-rt, I would say.

There is no hiccup. Stuff just changes over time and makes certain
problems more likely/obvious.

The workqueue issue has been there since we had to deal with the
workqueue wakeup from the guts of the scheduler.
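
For context, the hook Thomas refers to looks roughly like this (a
simplified sketch of the sched_submit_work() path, modeled on the patches
quoted later in this thread, not verbatim from any one tree):

	static inline void sched_submit_work(struct task_struct *tsk)
	{
		if (!tsk->state || tsk_is_pi_blocked(tsk))
			return;	/* on -rt this early exit is the trouble
				 * spot: a worker blocked on a converted
				 * spinlock never reaches the
				 * notification below */

		/*
		 * If a worker went to sleep, notify and ask workqueue
		 * whether it wants to wake up a task to maintain
		 * concurrency.
		 */
		if (tsk->flags & PF_WQ_WORKER)
			wq_worker_sleeping(tsk);

		/*
		 * If we are going to sleep and we have plugged IO queued,
		 * make sure to submit it to avoid deadlocks.
		 */
		if (blk_needs_flush_plug(tsk))
			blk_schedule_flush_plug(tsk);
	}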

Thanks,

	tglx

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-07  8:48 Filesystem lockup with CONFIG_PREEMPT_RT Jan de Kruyf
  2014-07-07 13:00 ` Thomas Gleixner
@ 2014-07-07 16:23 ` Austin Schuh
  2014-07-08  8:03   ` Jan de Kruyf
  1 sibling, 1 reply; 43+ messages in thread
From: Austin Schuh @ 2014-07-07 16:23 UTC (permalink / raw)
  To: Jan de Kruyf; +Cc: rt-users

On Mon, Jul 7, 2014 at 1:48 AM, Jan de Kruyf <jan.de.kruyf@gmail.com> wrote:
> do not give the same problem that I experienced with XFS.
> So there is a hiccup between version 3.2-rt and 3.10-rt, I would say.

I posted a simple kernel module that reliably triggers the lockup by
recreating the circumstances in a controlled manner.  Try testing with
that module.

Austin

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-07 16:23 ` Austin Schuh
@ 2014-07-08  8:03   ` Jan de Kruyf
  2014-07-08 16:09     ` Austin Schuh
  0 siblings, 1 reply; 43+ messages in thread
From: Jan de Kruyf @ 2014-07-08  8:03 UTC (permalink / raw)
  To: rt-users, austin

Thank you Austin,

One of these days we will have a Kernel Day to see what gives with
Thomas's new patches.
In the meantime I get 45 usec max latency under heavy disk I/O with the
3.2 kernel on a cheap five-year-old Foxconn board (1600 MHz dual core)
and no midnight surprises, so I am smiling. It averages between 3 and
4 usec.
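
Latency figures like these are typically gathered with cyclictest from
the rt-tests suite; a representative invocation (illustrative only; the
thread does not say what Jan used) would be:

	cyclictest -m -Sp99 -i200 -d0

i.e. lock memory, run one measurement thread per core at SCHED_FIFO
priority 99, with a 200 usec wakeup interval.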

When did you post that module? I did not spot it while going through
the discussion on rt-users.

cheers,

j.


On Mon, Jul 7, 2014 at 6:23 PM, Austin Schuh <austin@peloton-tech.com> wrote:
> On Mon, Jul 7, 2014 at 1:48 AM, Jan de Kruyf <jan.de.kruyf@gmail.com> wrote:
>> do not give the same problem that I experienced with XFS.
>> So there is a hiccup between version 3.2-rt and 3.10-rt, I would say.
>
> I posted a simple kernel module that reliably triggers the lockup by
> recreating the circumstances in a controlled manner.  Try testing with
> that module.
>
> Austin

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-08  8:03   ` Jan de Kruyf
@ 2014-07-08 16:09     ` Austin Schuh
  0 siblings, 0 replies; 43+ messages in thread
From: Austin Schuh @ 2014-07-08 16:09 UTC (permalink / raw)
  To: Jan de Kruyf; +Cc: rt-users

On Tue, Jul 8, 2014 at 1:03 AM, Jan de Kruyf <jan.de.kruyf@gmail.com> wrote:
> When did you post that module? I did not spot it going through the
> discussion on rt-users.


https://lkml.org/lkml/2014/6/26/436
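
For readers who cannot follow the link: the linked module recreates the
blocking circumstances in a controlled manner. A minimal sketch of the
general shape of such a trigger (hypothetical; this is not Austin's
actual code, which lives at the URL above):

	#include <linux/module.h>
	#include <linux/workqueue.h>
	#include <linux/mutex.h>
	#include <linux/completion.h>

	static DEFINE_MUTEX(test_mutex);
	static DECLARE_COMPLETION(test_done);
	static struct workqueue_struct *test_wq;

	/* Blocks on a mutex held by module init; the pool must wake a
	 * replacement worker so work_b can run, or nothing completes. */
	static void work_a_func(struct work_struct *work)
	{
		mutex_lock(&test_mutex);
		mutex_unlock(&test_mutex);
	}

	static void work_b_func(struct work_struct *work)
	{
		complete(&test_done);
	}

	static DECLARE_WORK(work_a, work_a_func);
	static DECLARE_WORK(work_b, work_b_func);

	static int __init lockup_test_init(void)
	{
		test_wq = alloc_workqueue("lockup_test", 0, 0);
		if (!test_wq)
			return -ENOMEM;

		mutex_lock(&test_mutex);
		queue_work(test_wq, &work_a);	/* will block on test_mutex */
		queue_work(test_wq, &work_b);	/* must run for us to proceed */
		/*
		 * With working concurrency management, the pool notices
		 * work_a blocking, wakes a replacement worker, work_b
		 * completes us, and we release the mutex.  If the
		 * wq_worker_sleeping() notification is skipped because
		 * the worker is pi-blocked, work_b never runs: lockup.
		 */
		wait_for_completion(&test_done);
		mutex_unlock(&test_mutex);
		return 0;
	}

	static void __exit lockup_test_exit(void)
	{
		destroy_workqueue(test_wq);
	}

	module_init(lockup_test_init);
	module_exit(lockup_test_exit);
	MODULE_LICENSE("GPL");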

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-05 20:26                 ` Thomas Gleixner
@ 2014-07-06  4:55                   ` Austin Schuh
  0 siblings, 0 replies; 43+ messages in thread
From: Austin Schuh @ 2014-07-06  4:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Sat, Jul 5, 2014 at 1:26 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 30 Jun 2014, Austin Schuh wrote:
>> I think I might have an answer for my own question, but I would
>> appreciate someone else double-checking it.  If list_empty erroneously
>> returns that there is work to do when there isn't work to do, we wake
>> up an extra worker which then goes back to sleep.  Not a big loss.  If
>> list_empty erroneously returns that there isn't work to do when there
>> is, this should only be because someone is modifying the work list.
>> When they finish, as far as I can tell, all callers then check to see
>> if a worker needs to be started up, and start one.
>
> Precisely.

A comment there when you put together a polished patch for inclusion
would be awesome.  I'm assuming that you will put the patch together?

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-05-21 19:30 ` John Blackwood
@ 2014-07-05 20:36   ` Thomas Gleixner
  -1 siblings, 0 replies; 43+ messages in thread
From: Thomas Gleixner @ 2014-07-05 20:36 UTC (permalink / raw)
  To: John Blackwood
  Cc: Richard Weinberger, Austin Schuh, linux-kernel, xfs, linux-rt-users

On Wed, 21 May 2014, John Blackwood wrote:
> I'm not 100% sure that the patch below will fix your problem, but we
> saw something that sounds pretty similar to your issue involving the
> nvidia driver and the preempt-rt patch.  The nvidia driver uses the
> completion support to implement its own internal notion of a
> semaphore.

And why should we care about that? If they abuse the completion code
then it's their problem. And it has nothing to do at all with the
problem Austin is observing.
 
> Some tasks were failing to ever wakeup from wait_for_completion() calls
> due to a race in the underlying do_wait_for_common() routine.

Why are you not fixing the nvidia crap instead of hacking utter shite
into the core kernel?

Thanks,

	tglx

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-01  0:53               ` Austin Schuh
@ 2014-07-05 20:26                 ` Thomas Gleixner
  2014-07-06  4:55                   ` Austin Schuh
  0 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2014-07-05 20:26 UTC (permalink / raw)
  To: Austin Schuh
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Mon, 30 Jun 2014, Austin Schuh wrote:
> I think I might have an answer for my own question, but I would
> appreciate someone else double-checking it.  If list_empty erroneously
> returns that there is work to do when there isn't work to do, we wake
> up an extra worker which then goes back to sleep.  Not a big loss.  If
> list_empty erroneously returns that there isn't work to do when there
> is, this should only be because someone is modifying the work list.
> When they finish, as far as I can tell, all callers then check to see
> if a worker needs to be started up, and start one.

Precisely.

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
@ 2014-07-05 19:30 Jan de Kruyf
  0 siblings, 0 replies; 43+ messages in thread
From: Jan de Kruyf @ 2014-07-05 19:30 UTC (permalink / raw)
  To: linux-rt-users

Good people,

Is it possible that the copy of the mail to the xfs mailing list below
is connected to this problem?
If so, is there anything I can do to help? I am not a kernel developer,
but I have run debugging stuff before; just give me instructions.

Greetings from dark Africa

Jan de Kruijf.


---------- Forwarded message ----------
From: Jan de Kruyf <jan.de.kruyf@gmail.com>
Date: Sat, Jul 5, 2014 at 2:41 PM
Subject: Data loss XFS with RT kernel on Debian.
To: xfs@oss.sgi.com


Hello,

While doing a reasonably high-density job like rsyncing a
subdirectory from one place to another, or tarring it to a pipe
and untarring it at the other end, I note that the CPU usage goes
to practically 100%, and when I reset the computer after 5 minutes
or so, the writing has not finished at all. However, on the stock
Debian kernel it works without a problem.
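
A concrete command along those lines (illustrative only; Jan's exact
invocation is not given in the thread, and the paths are made up) would be:

	tar -C /usr/src -cf - . | tar -C /mnt/xfs-test -xf -

i.e. stream a directory tree through a pipe onto the XFS volume and
watch the CPU usage climb.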

Could I still use this combination in an industrial environment,
reading and writing reasonably short text files? So far I have not
experienced this problem in normal day-to-day use. It reared its
head during installation of gnat-gpl-2014-x86_64-linux-bin
from the http://libre.adacore.com/download/ page. The offending
code is in the Makefile in the top directory. The xterm output will
show you where it gets stuck.

Regards,

Jan de Kruijf.



Here are the details of the installation:

root@jan:~# xfs_info -V
xfs_info version 3.1.7

root@jan:~# xfs_info /usr
meta-data=/dev/sda3              isize=256    agcount=4, agsize=732416 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=2929664, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

This combination does not work:
root@jan:~# uname -a
Linux jan 3.14-0.bpo.1-rt-amd64 #1 SMP PREEMPT RT Debian
3.14.7-1~bpo70+1 (2014-06-21) x86_64 GNU/Linux

Also kernel 3.10-0.bpo.3-rt-amd64 does not work

But this combination works:
root@jan:~# uname -a
Linux jan 3.2.0-4-amd64 #1 SMP Debian 3.2.57-3+deb7u2 x86_64 GNU/Linux

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-03 23:08                 ` Austin Schuh
@ 2014-07-04  4:42                   ` Mike Galbraith
  0 siblings, 0 replies; 43+ messages in thread
From: Mike Galbraith @ 2014-07-04  4:42 UTC (permalink / raw)
  To: Austin Schuh
  Cc: Thomas Gleixner, Richard Weinberger, LKML, rt-users, Steven Rostedt

On Thu, 2014-07-03 at 16:08 -0700, Austin Schuh wrote: 
> On Tue, Jul 1, 2014 at 12:32 PM, Austin Schuh <austin@peloton-tech.com> wrote:
> > On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh <austin@peloton-tech.com> wrote:
> >> On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> >>> Completely untested patch below.
> 
> I've tested it and looked it over now, and feel pretty confident in
> the patch.  Thanks Thomas!

FWIW, I flogged my boxen hard with the virgin tglx patch applied again
after merging 3.14-rt5 with 3.14.10.  It remained thoroughly boring.

-Mike


* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-01 19:32               ` Austin Schuh
@ 2014-07-03 23:08                 ` Austin Schuh
  2014-07-04  4:42                   ` Mike Galbraith
  0 siblings, 1 reply; 43+ messages in thread
From: Austin Schuh @ 2014-07-03 23:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Tue, Jul 1, 2014 at 12:32 PM, Austin Schuh <austin@peloton-tech.com> wrote:
> On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh <austin@peloton-tech.com> wrote:
>> On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>>> Completely untested patch below.

I've tested it and looked it over now, and feel pretty confident in
the patch.  Thanks Thomas!

I've been running this fix with my extra lock steps for a couple days
now on 3 machines, and haven't seen any issues.  On each of the
machines, I've got a CAN card generating about 1 CPU worth of load
from interrupts, and the rest of the cores are tied up putting work on
the work queues with the killer_module.

What next?

Austin

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-01  3:01             ` Austin Schuh
@ 2014-07-01 19:32               ` Austin Schuh
  2014-07-03 23:08                 ` Austin Schuh
  0 siblings, 1 reply; 43+ messages in thread
From: Austin Schuh @ 2014-07-01 19:32 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Mon, Jun 30, 2014 at 8:01 PM, Austin Schuh <austin@peloton-tech.com> wrote:
> On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> Completely untested patch below.
>
> By chance, I found this in my boot logs.  I'll do some more startup
> testing tomorrow.
>
> Jun 30 19:54:40 vpc5 kernel: [    0.670955] ------------[ cut here ]------------
> Jun 30 19:54:40 vpc5 kernel: [    0.670962] WARNING: CPU: 0 PID: 4 at kernel/workqueue.c:1604 worker_enter_idle+0x65/0x16b()
> Jun 30 19:54:40 vpc5 kernel: [    0.670970] Modules linked in:
> Jun 30 19:54:40 vpc5 kernel: [    0.670973] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.14.3-rt4abs+ #8
> Jun 30 19:54:40 vpc5 kernel: [    0.670974] Hardware name: CompuLab Intense-PC/Intense-PC, BIOS CR_2.2.0.377 X64 04/10/2013
> Jun 30 19:54:40 vpc5 kernel: [    0.670983]  0000000000000009 ffff88040ce75de8 ffffffff81510faf 0000000000000002
> Jun 30 19:54:40 vpc5 kernel: [    0.670985]  0000000000000000 ffff88040ce75e28 ffffffff81042085 0000000000000001
> Jun 30 19:54:40 vpc5 kernel: [    0.670987]  ffffffff81057a60 ffff88042d406900 ffff88042da63fc0 ffff88042da64030
> Jun 30 19:54:40 vpc5 kernel: [    0.670988] Call Trace:
> Jun 30 19:54:40 vpc5 kernel: [    0.670995]  [<ffffffff81510faf>] dump_stack+0x4f/0x7c
> Jun 30 19:54:40 vpc5 kernel: [    0.670999]  [<ffffffff81042085>] warn_slowpath_common+0x81/0x9c
> Jun 30 19:54:40 vpc5 kernel: [    0.671002]  [<ffffffff81057a60>] ? worker_enter_idle+0x65/0x16b
> Jun 30 19:54:40 vpc5 kernel: [    0.671005]  [<ffffffff810420ba>] warn_slowpath_null+0x1a/0x1c
> Jun 30 19:54:40 vpc5 kernel: [    0.671007]  [<ffffffff81057a60>] worker_enter_idle+0x65/0x16b
> Jun 30 19:54:40 vpc5 kernel: [    0.671010]  [<ffffffff8105a0a9>] worker_thread+0x1b3/0x22b
> Jun 30 19:54:40 vpc5 kernel: [    0.671013]  [<ffffffff81059ef6>] ? rescuer_thread+0x293/0x293
> Jun 30 19:54:40 vpc5 kernel: [    0.671015]  [<ffffffff81059ef6>] ? rescuer_thread+0x293/0x293
> Jun 30 19:54:40 vpc5 kernel: [    0.671018]  [<ffffffff8105f7ab>] kthread+0xdc/0xe4
> Jun 30 19:54:40 vpc5 kernel: [    0.671022]  [<ffffffff8105f6cf>] ? flush_kthread_worker+0xe1/0xe1
> Jun 30 19:54:40 vpc5 kernel: [    0.671025]  [<ffffffff8151586c>] ret_from_fork+0x7c/0xb0
> Jun 30 19:54:40 vpc5 kernel: [    0.671027]  [<ffffffff8105f6cf>] ? flush_kthread_worker+0xe1/0xe1
> Jun 30 19:54:40 vpc5 kernel: [    0.671029] ---[ end trace 0000000000000001 ]---

Bug in my extra locking...  Sorry for the noise.  The second diff is a
cleaner way of destroying the workers.

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 8900da8..590cc26 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1567,10 +1602,16 @@ static void worker_enter_idle(struct worker *worker)
 {
        struct worker_pool *pool = worker->pool;

-       if (WARN_ON_ONCE(worker->flags & WORKER_IDLE) ||
-           WARN_ON_ONCE(!list_empty(&worker->entry) &&
-                        (worker->hentry.next || worker->hentry.pprev)))
+       if (WARN_ON_ONCE(worker->flags & WORKER_IDLE)) return;
+
+       rt_lock_idle_list(pool);
+       if (WARN_ON_ONCE(!list_empty(&worker->entry) &&
+                        (worker->hentry.next || worker->hentry.pprev))) {
+               rt_unlock_idle_list(pool);
                return;
+       } else {
+               rt_unlock_idle_list(pool);
+       }

        /* can't use worker_set_flags(), also called from start_worker() */
        worker->flags |= WORKER_IDLE;
@@ -3584,8 +3637,14 @@ static void put_unbound_pool(struct worker_pool *pool)
        mutex_lock(&pool->manager_mutex);
        spin_lock_irq(&pool->lock);

-       while ((worker = first_worker(pool)))
+       rt_lock_idle_list(pool);
+       while ((worker = first_worker(pool))) {
+               rt_unlock_idle_list(pool);
                destroy_worker(worker);
+               rt_lock_idle_list(pool);
+       }
+       rt_unlock_idle_list(pool);
+
        WARN_ON(pool->nr_workers || pool->nr_idle);

        spin_unlock_irq(&pool->lock);

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 14:24           ` Thomas Gleixner
  2014-06-28  4:51             ` Mike Galbraith
  2014-07-01  0:12             ` Austin Schuh
@ 2014-07-01  3:01             ` Austin Schuh
  2014-07-01 19:32               ` Austin Schuh
  2 siblings, 1 reply; 43+ messages in thread
From: Austin Schuh @ 2014-07-01  3:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> Completely untested patch below.

By chance, I found this in my boot logs.  I'll do some more startup
testing tomorrow.

Jun 30 19:54:40 vpc5 kernel: [    0.670955] ------------[ cut here ]------------
Jun 30 19:54:40 vpc5 kernel: [    0.670962] WARNING: CPU: 0 PID: 4 at kernel/workqueue.c:1604 worker_enter_idle+0x65/0x16b()
Jun 30 19:54:40 vpc5 kernel: [    0.670970] Modules linked in:
Jun 30 19:54:40 vpc5 kernel: [    0.670973] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.14.3-rt4abs+ #8
Jun 30 19:54:40 vpc5 kernel: [    0.670974] Hardware name: CompuLab Intense-PC/Intense-PC, BIOS CR_2.2.0.377 X64 04/10/2013
Jun 30 19:54:40 vpc5 kernel: [    0.670983]  0000000000000009 ffff88040ce75de8 ffffffff81510faf 0000000000000002
Jun 30 19:54:40 vpc5 kernel: [    0.670985]  0000000000000000 ffff88040ce75e28 ffffffff81042085 0000000000000001
Jun 30 19:54:40 vpc5 kernel: [    0.670987]  ffffffff81057a60 ffff88042d406900 ffff88042da63fc0 ffff88042da64030
Jun 30 19:54:40 vpc5 kernel: [    0.670988] Call Trace:
Jun 30 19:54:40 vpc5 kernel: [    0.670995]  [<ffffffff81510faf>] dump_stack+0x4f/0x7c
Jun 30 19:54:40 vpc5 kernel: [    0.670999]  [<ffffffff81042085>] warn_slowpath_common+0x81/0x9c
Jun 30 19:54:40 vpc5 kernel: [    0.671002]  [<ffffffff81057a60>] ? worker_enter_idle+0x65/0x16b
Jun 30 19:54:40 vpc5 kernel: [    0.671005]  [<ffffffff810420ba>] warn_slowpath_null+0x1a/0x1c
Jun 30 19:54:40 vpc5 kernel: [    0.671007]  [<ffffffff81057a60>] worker_enter_idle+0x65/0x16b
Jun 30 19:54:40 vpc5 kernel: [    0.671010]  [<ffffffff8105a0a9>] worker_thread+0x1b3/0x22b
Jun 30 19:54:40 vpc5 kernel: [    0.671013]  [<ffffffff81059ef6>] ? rescuer_thread+0x293/0x293
Jun 30 19:54:40 vpc5 kernel: [    0.671015]  [<ffffffff81059ef6>] ? rescuer_thread+0x293/0x293
Jun 30 19:54:40 vpc5 kernel: [    0.671018]  [<ffffffff8105f7ab>] kthread+0xdc/0xe4
Jun 30 19:54:40 vpc5 kernel: [    0.671022]  [<ffffffff8105f6cf>] ? flush_kthread_worker+0xe1/0xe1
Jun 30 19:54:40 vpc5 kernel: [    0.671025]  [<ffffffff8151586c>] ret_from_fork+0x7c/0xb0
Jun 30 19:54:40 vpc5 kernel: [    0.671027]  [<ffffffff8105f6cf>] ? flush_kthread_worker+0xe1/0xe1
Jun 30 19:54:40 vpc5 kernel: [    0.671029] ---[ end trace 0000000000000001 ]---

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-07-01  0:12             ` Austin Schuh
@ 2014-07-01  0:53               ` Austin Schuh
  2014-07-05 20:26                 ` Thomas Gleixner
  0 siblings, 1 reply; 43+ messages in thread
From: Austin Schuh @ 2014-07-01  0:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Mon, Jun 30, 2014 at 5:12 PM, Austin Schuh <austin@peloton-tech.com> wrote:
> On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Thu, 26 Jun 2014, Austin Schuh wrote:
>>> If I'm reading the rt patch correctly, wq_worker_sleeping was moved
>>> out of __schedule to sched_submit_work.  It looks like that changes
>>> the conditions under which wq_worker_sleeping is called.  It used to
>>> be called whenever a task was going to sleep (I think).  It looks like
>>> it is called now if the task is going to sleep, and if the task isn't
>>> blocked on a PI mutex (I think).
>>>
>>> If I try the following experiment
>>>
>>>  static inline void sched_submit_work(struct task_struct *tsk)
>>>  {
>>> +   if (tsk->state && tsk->flags & PF_WQ_WORKER) {
>>> +     wq_worker_sleeping(tsk);
>>> +     return;
>>> +   }
>>>
>>> and then remove the call later in the function, I am able to pass my test.
>>>
>>> Unfortunately, I then get a recursive pool spinlock BUG_ON after a
>>> while (as I would expect), and it all blows up.
>>
>> Well, that's why the check for !pi_blocked_on is there.
>>
>>> I'm not sure where to go from there.  Any changes to the workpool to
>>> try to fix that will be hard, or could affect latency significantly.
>>
>> Completely untested patch below.
>>
>> Thanks,
>>
>>         tglx
>>
>> --------------------->
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 8749d20..621329a 100644
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index be0ef50..0ca9d97 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -849,25 +882,18 @@ void wq_worker_sleeping(struct task_struct *task)
>>                 return;
>>
>>         worker->sleeping = 1;
>> -       spin_lock_irq(&pool->lock);
>> +
>>         /*
>>          * The counterpart of the following dec_and_test, implied mb,
>>          * worklist not empty test sequence is in insert_work().
>>          * Please read comment there.
>> -        *
>> -        * NOT_RUNNING is clear.  This means that we're bound to and
>> -        * running on the local cpu w/ rq lock held and preemption
>> -        * disabled, which in turn means that none else could be
>> -        * manipulating idle_list, so dereferencing idle_list without pool
>> -        * lock is safe.
>>          */
>>         if (atomic_dec_and_test(&pool->nr_running) &&
>>             !list_empty(&pool->worklist)) {
>
> What guarantees that this pool->worklist access is safe?  Preemption
> isn't disabled.

I think I might have an answer for my own question, but I would
appreciate someone else double-checking it.  If list_empty erroneously
returns that there is work to do when there isn't work to do, we wake
up an extra worker which then goes back to sleep.  Not a big loss.  If
list_empty erroneously returns that there isn't work to do when there
is, this should only be because someone is modifying the work list.
When they finish, as far as I can tell, all callers then check to see
if a worker needs to be started up, and start one.
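
To make the ordering argument concrete, here is the pattern in miniature
(a paraphrase of the workqueue code under discussion, not a verbatim
quote):

	/* sleeping side, cf. wq_worker_sleeping() above */
	if (atomic_dec_and_test(&pool->nr_running) &&
	    !list_empty(&pool->worklist))	/* unlocked peek */
		wake_up_worker(pool);

A false "non-empty" wakes an extra worker, which finds nothing to do and
goes back to idle: harmless.  A false "empty" can only happen while
insert_work() is mid-update, and insert_work() pairs its list update with
an smp_mb() and its own nr_running check followed by a wakeup, so the
wakeup is not lost; it merely comes from the inserting side instead.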

Austin

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 14:24           ` Thomas Gleixner
  2014-06-28  4:51             ` Mike Galbraith
@ 2014-07-01  0:12             ` Austin Schuh
  2014-07-01  0:53               ` Austin Schuh
  2014-07-01  3:01             ` Austin Schuh
  2 siblings, 1 reply; 43+ messages in thread
From: Austin Schuh @ 2014-07-01  0:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Fri, Jun 27, 2014 at 7:24 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 26 Jun 2014, Austin Schuh wrote:
>> If I'm reading the rt patch correctly, wq_worker_sleeping was moved
>> out of __schedule to sched_submit_work.  It looks like that changes
>> the conditions under which wq_worker_sleeping is called.  It used to
>> be called whenever a task was going to sleep (I think).  It looks like
>> it is called now if the task is going to sleep, and if the task isn't
>> blocked on a PI mutex (I think).
>>
>> If I try the following experiment
>>
>>  static inline void sched_submit_work(struct task_struct *tsk)
>>  {
>> +   if (tsk->state && tsk->flags & PF_WQ_WORKER) {
>> +     wq_worker_sleeping(tsk);
>> +     return;
>> +   }
>>
>> and then remove the call later in the function, I am able to pass my test.
>>
>> Unfortunately, I then get a recursive pool spinlock BUG_ON after a
>> while (as I would expect), and it all blows up.
>
> Well, that's why the check for !pi_blocked_on is there.
>
>> I'm not sure where to go from there.  Any changes to the workpool to
>> try to fix that will be hard, or could affect latency significantly.
>
> Completely untested patch below.
>
> Thanks,
>
>         tglx
>
> --------------------->
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8749d20..621329a 100644
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index be0ef50..0ca9d97 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -849,25 +882,18 @@ void wq_worker_sleeping(struct task_struct *task)
>                 return;
>
>         worker->sleeping = 1;
> -       spin_lock_irq(&pool->lock);
> +
>         /*
>          * The counterpart of the following dec_and_test, implied mb,
>          * worklist not empty test sequence is in insert_work().
>          * Please read comment there.
> -        *
> -        * NOT_RUNNING is clear.  This means that we're bound to and
> -        * running on the local cpu w/ rq lock held and preemption
> -        * disabled, which in turn means that none else could be
> -        * manipulating idle_list, so dereferencing idle_list without pool
> -        * lock is safe.
>          */
>         if (atomic_dec_and_test(&pool->nr_running) &&
>             !list_empty(&pool->worklist)) {

What guarantees that this pool->worklist access is safe?  Preemption
isn't disabled.

I'm seeing some inconsistency when accessing the idle list.  You seem
to only disable preemption when writing to the idle list?  I'm unsure
whether I'm missing something.  With that question in mind, I took
a stab at adding locking around all the idle_list calls.

I went through and double-checked to make sure that rt_lock_idle_list
and rt_unlock_idle_list surround all idle_list or entry accesses.  The
following are places where I think you should be locking, but aren't.

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 8900da8..314a098 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -778,8 +805,12 @@ static bool too_many_workers(struct worker_pool *pool)
 	 * nr_idle and idle_list may disagree if idle rebinding is in
 	 * progress.  Never return %true if idle_list is empty.
 	 */
-	if (list_empty(&pool->idle_list))
+	rt_lock_idle_list(pool);
+	if (list_empty(&pool->idle_list)) {
+		rt_unlock_idle_list(pool);
 		return false;
+	}
+	rt_unlock_idle_list(pool);
 
 	return nr_idle > 2 && (nr_idle - 2) * MAX_IDLE_WORKERS_RATIO >= nr_busy;
 }
@@ -788,7 +819,7 @@ static bool too_many_workers(struct worker_pool *pool)
  * Wake up functions.
  */
 
-/* Return the first worker.  Safe with preemption disabled */
+/* Return the first worker.  Preemption must be disabled */
 static struct worker *first_worker(struct worker_pool *pool)
 {
 	if (unlikely(list_empty(&pool->idle_list)))
@@ -1567,10 +1598,16 @@ static void worker_enter_idle(struct worker *worker)
 {
 	struct worker_pool *pool = worker->pool;
 
-	if (WARN_ON_ONCE(worker->flags & WORKER_IDLE) ||
-	    WARN_ON_ONCE(!list_empty(&worker->entry) &&
-			 (worker->hentry.next || worker->hentry.pprev)))
-		return;
+	if (WARN_ON_ONCE(worker->flags & WORKER_IDLE)) return;
+
+	rt_lock_idle_list(pool);
+	if (WARN_ON_ONCE(!list_empty(&worker->entry))) {
+		rt_unlock_idle_list(pool);
+		if (worker->hentry.next || worker->hentry.pprev)
+			return;
+	} else {
+		rt_unlock_idle_list(pool);
+	}
 
 	/* can't use worker_set_flags(), also called from start_worker() */
 	worker->flags |= WORKER_IDLE;
@@ -1882,7 +1925,9 @@ static void idle_worker_timeout(unsigned long __pool)
 	unsigned long expires;
 
 	/* idle_list is kept in LIFO order, check the last one */
+	rt_lock_idle_list(pool);
 	worker = list_entry(pool->idle_list.prev, struct worker, entry);
+	rt_unlock_idle_list(pool);
 	expires = worker->last_active + IDLE_WORKER_TIMEOUT;
 
 	if (time_before(jiffies, expires))
@@ -2026,7 +2071,9 @@ static bool maybe_destroy_workers(struct worker_pool *pool)
 	struct worker *worker;
 	unsigned long expires;
 
+	rt_lock_idle_list(pool);
 	worker = list_entry(pool->idle_list.prev, struct worker, entry);
+	rt_unlock_idle_list(pool);
 	expires = worker->last_active + IDLE_WORKER_TIMEOUT;
 
 	if (time_before(jiffies, expires)) {
@@ -2299,7 +2346,9 @@ woke_up:
 	/* am I supposed to die? */
 	if (unlikely(worker->flags & WORKER_DIE)) {
 		spin_unlock_irq(&pool->lock);
+		rt_lock_idle_list(pool);
 		WARN_ON_ONCE(!list_empty(&worker->entry));
+		rt_unlock_idle_list(pool);
 		worker->task->flags &= ~PF_WQ_WORKER;
 		return 0;
 	}
@@ -3584,8 +3633,17 @@ static void put_unbound_pool(struct worker_pool *pool)
 	mutex_lock(&pool->manager_mutex);
 	spin_lock_irq(&pool->lock);
 
-	while ((worker = first_worker(pool)))
-		destroy_worker(worker);
+	while (true) {
+		rt_lock_idle_list(pool);
+		if ((worker = first_worker(pool))) {
+			destroy_worker(worker);
+		} else {
+			break;
+		}
+		rt_unlock_idle_list(pool);
+	}
+	rt_unlock_idle_list(pool);
+
 	WARN_ON(pool->nr_workers || pool->nr_idle);
 
 	spin_unlock_irq(&pool->lock);

So far, my test machines are staying up, which is progress.  I have a
couple of hours of runtime on the unmodified patch, and half an hour or
so on my modifications.

Thanks!
  Austin

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-28  6:20                           ` Austin Schuh
@ 2014-06-28  7:11                             ` Mike Galbraith
  0 siblings, 0 replies; 43+ messages in thread
From: Mike Galbraith @ 2014-06-28  7:11 UTC (permalink / raw)
  To: Austin Schuh
  Cc: Steven Rostedt, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, 2014-06-27 at 23:20 -0700, Austin Schuh wrote:

> For workqueues, as long as the helper doesn't block on a lock which
> requires the work queue to be freed up, it will eventually become
> unblocked and make progress.  The helper _should_ only need the pool
> lock, which will wake the helper back up when it is available again.
> Nothing should go to sleep in an un-recoverable way with the work pool
> lock held.

Well, Thomas killed taking a lock from within the core of a lock, so
that wart shrank back to microscopic.  Applying that same ointment to
the plug puller gizmo and schedule_work() would be most excellent.

-Mike


* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-28  3:32                         ` Mike Galbraith
@ 2014-06-28  6:20                           ` Austin Schuh
  2014-06-28  7:11                             ` Mike Galbraith
  0 siblings, 1 reply; 43+ messages in thread
From: Austin Schuh @ 2014-06-28  6:20 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Steven Rostedt, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, Jun 27, 2014 at 8:32 PM, Mike Galbraith
<umgwanakikbuti@gmail.com> wrote:
> On Fri, 2014-06-27 at 18:18 -0700, Austin Schuh wrote:
>
>> It would be more context switches, but I wonder if we could kick the
>> workqueue logic completely out of the scheduler into a thread.  Have
>> the scheduler increment/decrement an atomic pool counter, and wake up
>> the monitoring thread to spawn new threads when needed?  That would
>> get rid of the recursive pool lock problem, and should reduce
>> scheduler latency if we would need to spawn a new thread.
>
> I was wondering the same thing, and not only for workqueue, but also the
> plug pulling.  It's kind of a wart to have that stuff sitting in the
> heart of the scheduler in the first place; would be nice if it just went
> away.  When a task can't help itself, you _could_ wake a proxy to do that
> for you.  Trouble is, I can imagine that being a heck of a lot of
> context switches with some loads.. and who's gonna help the helper when
> he blocks while trying to help?
>
> -Mike

For workqueues, as long as the helper doesn't block on a lock which
requires the work queue to be freed up, it will eventually become
unblocked and make progress.  The helper _should_ only need the pool
lock, which will wake the helper back up when it is available again.
Nothing should go to sleep in an un-recoverable way with the work pool
lock held.

To drop the extra context switch, you could have a minimum of 2 worker
threads around at all times, and have the management thread start the
work and delegate to the next management thread.  That thread would
then wait for the first thread to block, spawn a new thread, and then
start the next piece of work.  Definitely a bit more complicated.

Austin

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 14:24           ` Thomas Gleixner
@ 2014-06-28  4:51             ` Mike Galbraith
  2014-07-01  0:12             ` Austin Schuh
  2014-07-01  3:01             ` Austin Schuh
  2 siblings, 0 replies; 43+ messages in thread
From: Mike Galbraith @ 2014-06-28  4:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Austin Schuh, Richard Weinberger, LKML, rt-users, Steven Rostedt

On Fri, 2014-06-27 at 16:24 +0200, Thomas Gleixner wrote:

> Completely untested patch below.

It's no longer completely untested: killer_module is no longer a killer.
I'll let the box (lockdep etc. is enabled) chew on it a while; no news is
good news, as usual.

-Mike


* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-28  1:18                       ` Austin Schuh
@ 2014-06-28  3:32                         ` Mike Galbraith
  2014-06-28  6:20                           ` Austin Schuh
  0 siblings, 1 reply; 43+ messages in thread
From: Mike Galbraith @ 2014-06-28  3:32 UTC (permalink / raw)
  To: Austin Schuh
  Cc: Steven Rostedt, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, 2014-06-27 at 18:18 -0700, Austin Schuh wrote:

> It would be more context switches, but I wonder if we could kick the
> workqueue logic completely out of the scheduler into a thread.  Have
> the scheduler increment/decrement an atomic pool counter, and wake up
> the monitoring thread to spawn new threads when needed?  That would
> get rid of the recursive pool lock problem, and should reduce
> scheduler latency if we would need to spawn a new thread.

I was wondering the same thing, and not only for workqueue, but also the
plug pulling.  It's kind of a wart to have that stuff sitting in the
heart of the scheduler in the first place; would be nice if it just went
away.  When a task can't help itself, you _could_ wake a proxy to do that
for you.  Trouble is, I can imagine that being a heck of a lot of
context switches with some loads.. and who's gonna help the helper when
he blocks while trying to help?

-Mike


* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 18:19                     ` Steven Rostedt
  2014-06-27 19:11                       ` Mike Galbraith
@ 2014-06-28  1:18                       ` Austin Schuh
  2014-06-28  3:32                         ` Mike Galbraith
  1 sibling, 1 reply; 43+ messages in thread
From: Austin Schuh @ 2014-06-28  1:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mike Galbraith, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, Jun 27, 2014 at 11:19 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 27 Jun 2014 20:07:54 +0200
> Mike Galbraith <umgwanakikbuti@gmail.com> wrote:
>
>> > Why do we need the wakeup? the owner of the lock should wake it up
>> > shouldn't it?
>>
>> True, but that can take ages.
>
> Can it? If the workqueue is of some higher priority, it should boost
> the process that owns the lock. Otherwise it just waits like anything
> else does.
>
> I'd much rather keep the paradigm of the mainline kernel than add a
> bunch of hacks that can cause more unforeseen side effects that may
> cause other issues.
>
> Remember, this would only be for spinlocks converted into an rtmutex,
> not for normal mutex or other sleeps. In mainline, the wake up still
> would not happen so why are we waking it up here?
>
> This seems similar to the BKL crap we had to deal with as well. If we
> were going to sleep because we were blocked on a spinlock-converted
> rtmutex, we could not release and retake the BKL because we would end up
> blocked on two locks. Instead, we made sure that the spinlock would not
> release or take the BKL. It kept with the paradigm of mainline and
> worked. Sucked, but it worked.
>
> -- Steve

Sounds like you are arguing that we should disable preemption (or
whatever the right mechanism is) while holding the pool lock?

Workqueues spin up more threads when work that they are executing
blocks.  This is done through hooks in the scheduler.  This means that
we have to acquire the pool lock when work blocks on a lock in order
to see if there is more work and whether or not we need to spin up a
new thread.

It would be more context switches, but I wonder if we could kick the
workqueue logic completely out of the scheduler into a thread.  Have
the scheduler increment/decrement an atomic pool counter, and wake up
the monitoring thread to spawn new threads when needed?  That would
get rid of the recursive pool lock problem, and should reduce
scheduler latency if we would need to spawn a new thread.
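
A rough sketch of that idea (hypothetical; nothing like this was posted
in the thread, and all the names are made up):

	static atomic_t sleeping_workers = ATOMIC_INIT(0);
	static struct task_struct *wq_monitor_thread;

	/* called from the scheduler instead of wq_worker_sleeping():
	 * no pool->lock here, just an atomic bump and a wakeup */
	static void note_worker_sleeping(struct task_struct *tsk)
	{
		atomic_inc(&sleeping_workers);
		wake_up_process(wq_monitor_thread);
	}

	static int wq_monitor_fn(void *unused)
	{
		while (!kthread_should_stop()) {
			set_current_state(TASK_INTERRUPTIBLE);
			if (!atomic_read(&sleeping_workers))
				schedule();
			__set_current_state(TASK_RUNNING);

			/* consume one notification at a time */
			while (atomic_add_unless(&sleeping_workers, -1, 0)) {
				/* safe to take pool->lock here, in plain
				 * thread context: check the worklist and
				 * wake or spawn a replacement worker */
			}
		}
		return 0;
	}

The scheduler side then costs one atomic op and one wakeup, and all the
pool->lock work moves into schedulable context, at the price of the extra
context switch Austin mentions.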

Austin

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 18:19                     ` Steven Rostedt
@ 2014-06-27 19:11                       ` Mike Galbraith
  2014-06-28  1:18                       ` Austin Schuh
  1 sibling, 0 replies; 43+ messages in thread
From: Mike Galbraith @ 2014-06-27 19:11 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Austin Schuh, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, 2014-06-27 at 14:19 -0400, Steven Rostedt wrote: 
> On Fri, 27 Jun 2014 20:07:54 +0200
> Mike Galbraith <umgwanakikbuti@gmail.com> wrote:
>  
> > > Why do we need the wakeup? the owner of the lock should wake it up
> > > shouldn't it?
> > 
> > True, but that can take ages.
> 
> Can it? If the workqueue is of some higher priority, it should boost
> the process that owns the lock. Otherwise it just waits like anything
> else does.

No, not like everything else: preempt a lock holder.  IO that starts
moving upon context switch in a stock kernel can rot for ages in rt when
an IO packing task bumps into a lock held by a preempted task.  When
there's a prio delta on a lock, sure, PI kicks in to help a high prio
task.. but there is no PI help for a high priority task waiting on IO
stuck behind a low prio task's plug.

(you're talking workqueue, I'm talking IO, but the two meet in things
like raid too.  You can't prioritize workqueues, and PI doesn't really
have a lot to do with the general issue of things happening or not
happening at context switch time, and the consequences thereof.)

> I'd much rather keep the paradigm of the mainline kernel than add a
> bunch of hacks that can cause more unforeseen side effects that may
> cause other issues.

The paradigm of mainline is to start IO on context switch; the reason
for that is IO deadlock prevention.  We need to follow that paradigm
somehow.  That somehow really wants to be a tad prettier and more
guaranteed than my somehow :)

-Mike


* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 18:07                   ` Mike Galbraith
@ 2014-06-27 18:19                     ` Steven Rostedt
  2014-06-27 19:11                       ` Mike Galbraith
  2014-06-28  1:18                       ` Austin Schuh
  0 siblings, 2 replies; 43+ messages in thread
From: Steven Rostedt @ 2014-06-27 18:19 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Austin Schuh, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, 27 Jun 2014 20:07:54 +0200
Mike Galbraith <umgwanakikbuti@gmail.com> wrote:
 
> > Why do we need the wakeup? the owner of the lock should wake it up
> > shouldn't it?
> 
> True, but that can take ages.

Can it? If the workqueue is of some higher priority, it should boost
the process that owns the lock. Otherwise it just waits like anything
else does.

I'd much rather keep the paradigm of the mainline kernel than add a
bunch of hacks that can cause more unforeseen side effects that may
cause other issues.

Remember, this would only be for spinlocks converted into an rtmutex,
not for normal mutex or other sleeps. In mainline, the wake up still
would not happen so why are we waking it up here?

This seems similar to the BKL crap we had to deal with as well. If we
were going to sleep because we were blocked on a spinlock-converted
rtmutex, we could not release and retake the BKL because we would end up
blocked on two locks. Instead, we made sure that the spinlock would not
release or take the BKL. It kept with the paradigm of mainline and
worked. Sucked, but it worked.

-- Steve

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 17:54                 ` Steven Rostedt
@ 2014-06-27 18:07                   ` Mike Galbraith
  2014-06-27 18:19                     ` Steven Rostedt
  0 siblings, 1 reply; 43+ messages in thread
From: Mike Galbraith @ 2014-06-27 18:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Austin Schuh, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, 2014-06-27 at 13:54 -0400, Steven Rostedt wrote: 
> On Fri, 27 Jun 2014 19:34:53 +0200
> Mike Galbraith <umgwanakikbuti@gmail.com> wrote:
> 
> > On Fri, 2014-06-27 at 10:01 -0400, Steven Rostedt wrote:
> > 
> > > This seems like a lot of hacks.
> > 
> > It is exactly that: lacking a proper pooper-scooper, showing the rt
> > kernel how to not step in it.
> > 
> > > I'm wondering if it would work if we
> > > just have the rt_spin_lock_slowlock not call schedule(), but call
> > > __schedule() directly. I mean it would keep with the mainline paradigm
> > > as spinlocks don't sleep there, and one going to sleep in the -rt
> > > kernel is similar to it being preempted by a very long NMI.
> > 
> > Problem being that we do sleep there, and do need a wakeup.  I have a hack
> > that turns them back into spinning locks, but it.. works too :)
> 
> Why do we need the wakeup? the owner of the lock should wake it up
> shouldn't it?

True, but that can take ages.

-Mike 


* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 17:34               ` Mike Galbraith
@ 2014-06-27 17:54                 ` Steven Rostedt
  2014-06-27 18:07                   ` Mike Galbraith
  0 siblings, 1 reply; 43+ messages in thread
From: Steven Rostedt @ 2014-06-27 17:54 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Austin Schuh, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, 27 Jun 2014 19:34:53 +0200
Mike Galbraith <umgwanakikbuti@gmail.com> wrote:

> On Fri, 2014-06-27 at 10:01 -0400, Steven Rostedt wrote:
> 
> > This seems like a lot of hacks.
> 
> It is exactly that: lacking a proper pooper-scooper, showing the rt
> kernel how to not step in it.
> 
> > I'm wondering if it would work if we
> > just have the rt_spin_lock_slowlock not call schedule(), but call
> > __schedule() directly. I mean it would keep with the mainline paradigm
> > as spinlocks don't sleep there, and one going to sleep in the -rt
> > kernel is similar to it being preempted by a very long NMI.
> 
> Problem being that we do sleep there, and do need a wakeup.  I have a hack
> that turns them back into spinning locks, but it.. works too :)

Why do we need the wakeup? the owner of the lock should wake it up
shouldn't it?


-- Steve

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 14:01             ` Steven Rostedt
@ 2014-06-27 17:34               ` Mike Galbraith
  2014-06-27 17:54                 ` Steven Rostedt
  0 siblings, 1 reply; 43+ messages in thread
From: Mike Galbraith @ 2014-06-27 17:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Austin Schuh, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, 2014-06-27 at 10:01 -0400, Steven Rostedt wrote:

> This seems like a lot of hacks.

It is exactly that: lacking a proper pooper-scooper, showing the rt
kernel how to not step in it.

> I'm wondering if it would work if we
> just have the rt_spin_lock_slowlock not call schedule(), but call
> __schedule() directly. I mean it would keep with the mainline paradigm
> as spinlocks don't sleep there, and one going to sleep in the -rt
> kernel is similar to it being preempted by a very long NMI.

Problem being that we do sleep there, and do need a wakeup.  I have a hack
that turns them back into spinning locks, but it.. works too :)

> Does a spin_lock going to sleep really need to do all the presched and
> postsched work?

It would be lovely if we didn't have to do any of that.  On the IO bit,
I haven't seen hard evidence that the spinlock bit is absolutely
required (better not be, it doesn't guarantee anything), but the
combined hack did kill IO deadlock of multiple filesystems.

-Mike


* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27  0:07         ` Austin Schuh
  2014-06-27  3:22           ` Mike Galbraith
  2014-06-27 12:57           ` Mike Galbraith
@ 2014-06-27 14:24           ` Thomas Gleixner
  2014-06-28  4:51             ` Mike Galbraith
                               ` (2 more replies)
  2 siblings, 3 replies; 43+ messages in thread
From: Thomas Gleixner @ 2014-06-27 14:24 UTC (permalink / raw)
  To: Austin Schuh
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Thu, 26 Jun 2014, Austin Schuh wrote:
> If I'm reading the rt patch correctly, wq_worker_sleeping was moved
> out of __schedule to sched_submit_work.  It looks like that changes
> the conditions under which wq_worker_sleeping is called.  It used to
> be called whenever a task was going to sleep (I think).  It looks like
> it is called now if the task is going to sleep, and if the task isn't
> blocked on a PI mutex (I think).
> 
> If I try the following experiment
> 
>  static inline void sched_submit_work(struct task_struct *tsk)
>  {
> +   if (tsk->state && tsk->flags & PF_WQ_WORKER) {
> +     wq_worker_sleeping(tsk);
> +     return;
> +   }
> 
> and then remove the call later in the function, I am able to pass my test.
> 
> Unfortunately, I then get a recursive pool spinlock BUG_ON after a
> while (as I would expect), and it all blows up.

Well, that's why the check for !pi_blocked_on is there.
 
> I'm not sure where to go from there.  Any changes to the workpool to
> try to fix that will be hard, or could affect latency significantly.

Completely untested patch below.

Thanks,

	tglx

--------------------->
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8749d20..621329a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2651,9 +2651,8 @@ need_resched:
 
 static inline void sched_submit_work(struct task_struct *tsk)
 {
-	if (!tsk->state || tsk_is_pi_blocked(tsk))
+	if (!tsk->state)
 		return;
-
 	/*
 	 * If a worker went to sleep, notify and ask workqueue whether
 	 * it wants to wake up a task to maintain concurrency.
@@ -2661,6 +2660,10 @@ static inline void sched_submit_work(struct task_struct *tsk)
 	if (tsk->flags & PF_WQ_WORKER)
 		wq_worker_sleeping(tsk);
 
+
+	if (tsk_is_pi_blocked(tsk))
+		return;
+
 	/*
 	 * If we are going to sleep and we have plugged IO queued,
 	 * make sure to submit it to avoid deadlocks.
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index be0ef50..0ca9d97 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -126,6 +126,8 @@ enum {
  *    cpu or grabbing pool->lock is enough for read access.  If
  *    POOL_DISASSOCIATED is set, it's identical to L.
  *
+ *    On RT we need the extra protection via rt_lock_idle_list()
+ *
  * MG: pool->manager_mutex and pool->lock protected.  Writes require both
  *     locks.  Reads can happen under either lock.
  *
@@ -409,6 +411,31 @@ static void copy_workqueue_attrs(struct workqueue_attrs *to,
 		if (({ assert_rcu_or_wq_mutex(wq); false; })) { }	\
 		else
 
+#ifdef CONFIG_PREEMPT_RT_BASE
+static inline void rt_lock_idle_list(struct worker_pool *pool)
+{
+	preempt_disable();
+}
+static inline void rt_unlock_idle_list(struct worker_pool *pool)
+{
+	preempt_enable();
+}
+static inline void sched_lock_idle_list(struct worker_pool *pool) { }
+static inline void sched_unlock_idle_list(struct worker_pool *pool) { }
+#else
+static inline void rt_lock_idle_list(struct worker_pool *pool) { }
+static inline void rt_unlock_idle_list(struct worker_pool *pool) { }
+static inline void sched_lock_idle_list(struct worker_pool *pool)
+{
+	spin_lock_irq(&pool->lock);
+}
+static inline void sched_unlock_idle_list(struct worker_pool *pool)
+{
+	spin_unlock_irq(&pool->lock);
+}
+#endif
+
+
 #ifdef CONFIG_DEBUG_OBJECTS_WORK
 
 static struct debug_obj_descr work_debug_descr;
@@ -801,10 +828,16 @@ static struct worker *first_worker(struct worker_pool *pool)
  */
 static void wake_up_worker(struct worker_pool *pool)
 {
-	struct worker *worker = first_worker(pool);
+	struct worker *worker;
+
+	rt_lock_idle_list(pool);
+
+	worker = first_worker(pool);
 
 	if (likely(worker))
 		wake_up_process(worker->task);
+
+	rt_unlock_idle_list(pool);
 }
 
 /**
@@ -832,7 +865,7 @@ void wq_worker_running(struct task_struct *task)
  */
 void wq_worker_sleeping(struct task_struct *task)
 {
-	struct worker *next, *worker = kthread_data(task);
+	struct worker *worker = kthread_data(task);
 	struct worker_pool *pool;
 
 	/*
@@ -849,25 +882,18 @@ void wq_worker_sleeping(struct task_struct *task)
 		return;
 
 	worker->sleeping = 1;
-	spin_lock_irq(&pool->lock);
+
 	/*
 	 * The counterpart of the following dec_and_test, implied mb,
 	 * worklist not empty test sequence is in insert_work().
 	 * Please read comment there.
-	 *
-	 * NOT_RUNNING is clear.  This means that we're bound to and
-	 * running on the local cpu w/ rq lock held and preemption
-	 * disabled, which in turn means that none else could be
-	 * manipulating idle_list, so dereferencing idle_list without pool
-	 * lock is safe.
 	 */
 	if (atomic_dec_and_test(&pool->nr_running) &&
 	    !list_empty(&pool->worklist)) {
-		next = first_worker(pool);
-		if (next)
-			wake_up_process(next->task);
+		sched_lock_idle_list(pool);
+		wake_up_worker(pool);
+		sched_unlock_idle_list(pool);
 	}
-	spin_unlock_irq(&pool->lock);
 }
 
 /**
@@ -1571,7 +1597,9 @@ static void worker_enter_idle(struct worker *worker)
 	worker->last_active = jiffies;
 
 	/* idle_list is LIFO */
+	rt_lock_idle_list(pool);
 	list_add(&worker->entry, &pool->idle_list);
+	rt_unlock_idle_list(pool);
 
 	if (too_many_workers(pool) && !timer_pending(&pool->idle_timer))
 		mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);
@@ -1604,7 +1632,9 @@ static void worker_leave_idle(struct worker *worker)
 		return;
 	worker_clr_flags(worker, WORKER_IDLE);
 	pool->nr_idle--;
+	rt_lock_idle_list(pool);
 	list_del_init(&worker->entry);
+	rt_unlock_idle_list(pool);
 }
 
 /**
@@ -1849,7 +1879,9 @@ static void destroy_worker(struct worker *worker)
 	 */
 	get_task_struct(worker->task);
 
+	rt_lock_idle_list(pool);
 	list_del_init(&worker->entry);
+	rt_unlock_idle_list(pool);
 	worker->flags |= WORKER_DIE;
 
 	idr_remove(&pool->worker_idr, worker->id);

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27 12:57           ` Mike Galbraith
@ 2014-06-27 14:01             ` Steven Rostedt
  2014-06-27 17:34               ` Mike Galbraith
  0 siblings, 1 reply; 43+ messages in thread
From: Steven Rostedt @ 2014-06-27 14:01 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Austin Schuh, Thomas Gleixner, Richard Weinberger, LKML, rt-users

On Fri, 27 Jun 2014 14:57:36 +0200
Mike Galbraith <umgwanakikbuti@gmail.com> wrote:

> On Thu, 2014-06-26 at 17:07 -0700, Austin Schuh wrote:
> 
> > I'm not sure where to go from there.  Any changes to the workpool to
> > try to fix that will be hard, or could affect latency significantly.
> 
> Oh what the hell, I'm out of frozen shark, may as well stock up.  Nobody
> else has posted spit yet.  I _know_ Steven has some shark, I saw tglx
> toss him a chunk.
> 
> It's the same root cause as -rt letting tasks schedule off with plugged
> IO, leading to deadlock scenarios.  Extending the way I dealt with that
> to deal with workqueue forward progress requirements.. works.. though it

> could perhaps use a face lift, tummy tuck.. and minor body-ectomy.

Yeah, I'd say ;-)

> 
> Subject: rtmutex: move pre/post schedule() progress guarantees to before we schedule
> 
> Queued IO can lead to IO deadlock should a task require wakeup from a task
> which is blocked on that queued IO, which is why we usually pull the plug
> while scheduling off.  In RT, pulling the plug could lead us to block on
> a contended sleeping lock while n the process of blocking.  Bad idea, so

"in the process"

> we don't, but that leaves us with various flavors of IO stall like below.
> 
> ext3: dbench1 queues a buffer, blocks on journal mutex, its plug is not
> pulled.  dbench2 mutex owner is waiting for kjournald, who is waiting for
> the buffer queued by dbench1.  Game over.
> 
> The exact same situation can occur with workqueues.  We must notify the
> workqueue management that a worker is blocking, as if we don't, we can
> lock the box up completely.  It's too late to do that once we have arrived
> in schedule(), as we can't take a sleeping lock.
> 
> Therefore, move progress guarantee work up to before we reach the point of
> no return, and tell the scheduler that we're handling it early.
> 
> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> ---
>  include/linux/sched.h    |    2 +
>  kernel/locking/rtmutex.c |   59 +++++++++++++++++++++++++++++++++++++++++++----
>  kernel/sched/core.c      |   19 +++++++++++----
>  3 files changed, 72 insertions(+), 8 deletions(-)
> 
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1256,6 +1256,8 @@ struct task_struct {
>  	/* Revert to default priority/policy when forking */
>  	unsigned sched_reset_on_fork:1;
>  	unsigned sched_contributes_to_load:1;
> +	unsigned sched_presched:1;
> +	unsigned rtmutex_presched:1;
>  
>  	pid_t pid;
>  	pid_t tgid;
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -23,6 +23,7 @@
>  #include <linux/sched/deadline.h>
>  #include <linux/timer.h>
>  #include <linux/ww_mutex.h>
> +#include <linux/blkdev.h>
>  
>  #include "rtmutex_common.h"
>  
> @@ -782,6 +783,41 @@ void rt_mutex_adjust_pi(struct task_stru
>  }
>  
>  #ifdef CONFIG_PREEMPT_RT_FULL
> +#include "../workqueue_internal.h"
> +
> +static inline void rt_mutex_presched(struct task_struct *tsk)
> +{
> +	/* Recursive handling of presched work is a very bad idea. */

The above comment can be simplified to just:

	/* Recursion protection */

> +	if (tsk->rtmutex_presched || tsk->sched_presched)
> +		return;
> +
> +	/* Tell the scheduler we handle pre/post schedule() work. */
> +	tsk->rtmutex_presched = 1;
> +
> +	/*
> +	 * If a worker went to sleep, notify and ask workqueue whether
> +	 * it wants to wake up a task to maintain concurrency.
> +	 */
> +	if (tsk->flags & PF_WQ_WORKER)
> +		wq_worker_sleeping(tsk);
> +
> +	/*
> +	 * If we are going to sleep and we have plugged IO queued,
> +	 * make sure to submit it to avoid deadlocks.
> +	 */
> +	if (blk_needs_flush_plug(tsk))
> +		blk_schedule_flush_plug(tsk);
> +}
> +
> +static inline void rt_mutex_postsched(struct task_struct *tsk)
> +{
> +	if (!tsk->rtmutex_presched)
> +		return;
> +	if (tsk->flags & PF_WQ_WORKER)
> +		wq_worker_running(tsk);
> +	tsk->rtmutex_presched = 0;
> +}
> +
>  /*
>   * preemptible spin_lock functions:
>   */
> @@ -857,13 +893,23 @@ static void  noinline __sched rt_spin_lo
>  	struct rt_mutex_waiter waiter, *top_waiter;
>  	int ret;
>  
> +	/*
> +	 * We can't do presched work if we're already holding a lock
> +	 * else we can deadlock.  eg, if we're holding slab_lock,
> +	 * ksoftirqd can block while processing BLOCK_SOFTIRQ after
> +	 * having acquired q->queue_lock.  If _we_ then block on
> +	 * that q->queue_lock while flushing our plug, deadlock.
> +	 */
> +	if (__migrate_disabled(self) < 2)
> +		rt_mutex_presched(self);
> +
>  	rt_mutex_init_waiter(&waiter, true);
>  
>  	raw_spin_lock(&lock->wait_lock);
>  
>  	if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
>  		raw_spin_unlock(&lock->wait_lock);
> -		return;
> +		goto out;
>  	}
>  
>  	BUG_ON(rt_mutex_owner(lock) == self);
> @@ -928,6 +974,8 @@ static void  noinline __sched rt_spin_lo
>  	raw_spin_unlock(&lock->wait_lock);
>  
>  	debug_rt_mutex_free_waiter(&waiter);
> +out:
> +	rt_mutex_postsched(self);
>  }
>  
>  /*
> @@ -1274,18 +1322,20 @@ rt_mutex_slowlock(struct rt_mutex *lock,
>  		  int detect_deadlock, struct ww_acquire_ctx *ww_ctx)
>  {
>  	struct rt_mutex_waiter waiter;
> +	struct task_struct *self = current;
>  	int ret = 0;
>  
> +	rt_mutex_presched(self);
>  	rt_mutex_init_waiter(&waiter, false);
>  
>  	raw_spin_lock(&lock->wait_lock);
>  
>  	/* Try to acquire the lock again: */
> -	if (try_to_take_rt_mutex(lock, current, NULL)) {
> +	if (try_to_take_rt_mutex(lock, self, NULL)) {
>  		if (ww_ctx)
>  			ww_mutex_account_lock(lock, ww_ctx);
>  		raw_spin_unlock(&lock->wait_lock);
> -		return 0;
> +		goto out;
>  	}
>  
>  	set_current_state(state);
> @@ -1322,7 +1372,8 @@ rt_mutex_slowlock(struct rt_mutex *lock,
>  		hrtimer_cancel(&timeout->timer);
>  
>  	debug_rt_mutex_free_waiter(&waiter);
> -
> +out:
> +	rt_mutex_postsched(self);
>  	return ret;
>  }
>  
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2915,11 +2915,18 @@ static void __sched __schedule(void)
>  		goto need_resched;
>  }
>  
> -static inline void sched_submit_work(struct task_struct *tsk)
> +static inline void sched_presched_work(struct task_struct *tsk)
>  {
> +	/* It's being handled by rtmutex code */
> +	if (tsk->rtmutex_presched)
> +		return;
> +
>  	if (!tsk->state || tsk_is_pi_blocked(tsk))
>  		return;
>  
> +	/* Tell rtmutex presched code that we're handling it. */
> +	tsk->sched_presched = 1;
> +
>  	/*
>  	 * If a worker went to sleep, notify and ask workqueue whether
>  	 * it wants to wake up a task to maintain concurrency.
> @@ -2935,19 +2942,23 @@ static inline void sched_submit_work(str
>  		blk_schedule_flush_plug(tsk);
>  }
>  
> -static inline void sched_update_worker(struct task_struct *tsk)
> +static inline void sched_postsched_work(struct task_struct *tsk)
>  {
> +	/* It's being handled by rtmutex code */
> +	if (tsk->rtmutex_presched)
> +		return;
>  	if (tsk->flags & PF_WQ_WORKER)
>  		wq_worker_running(tsk);
> +	tsk->sched_presched = 0;
>  }
>  
>  asmlinkage void __sched schedule(void)
>  {
>  	struct task_struct *tsk = current;
>  
> -	sched_submit_work(tsk);
> +	sched_presched_work(tsk);
>  	__schedule();
> -	sched_update_worker(tsk);
> +	sched_postsched_work(tsk);
>  }

This seems like a lot of hacks. I'm wondering if it would work if we
just have rt_spin_lock_slowlock() not call schedule(), but call
__schedule() directly. I mean it would keep with the mainline paradigm,
as spinlocks don't sleep there, and one going to sleep in the -rt
kernel is similar to it being preempted by a very long NMI.

Does a spin_lock going to sleep really need to do all the presched and
postsched work?
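
Something like this, completely untested, and assuming __schedule()
were made visible to the rtmutex code (it isn't today):

	/*
	 * Sketch only: let a sleeping "spinlock" skip the
	 * sched_submit_work()/sched_update_worker() wrappers and enter
	 * the core scheduler directly, like mainline where spinlocks
	 * never sleep.
	 */
	for (;;) {
		if (__try_to_take_rt_mutex(lock, self, &waiter, STEAL_LATERAL))
			break;
		raw_spin_unlock(&lock->wait_lock);
		__schedule();	/* not schedule(): skip pre/post work */
		raw_spin_lock(&lock->wait_lock);
	}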


-- Steve



>  EXPORT_SYMBOL(schedule);
>  
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27  0:07         ` Austin Schuh
  2014-06-27  3:22           ` Mike Galbraith
@ 2014-06-27 12:57           ` Mike Galbraith
  2014-06-27 14:01             ` Steven Rostedt
  2014-06-27 14:24           ` Thomas Gleixner
  2 siblings, 1 reply; 43+ messages in thread
From: Mike Galbraith @ 2014-06-27 12:57 UTC (permalink / raw)
  To: Austin Schuh
  Cc: Thomas Gleixner, Richard Weinberger, LKML, rt-users, Steven Rostedt

On Thu, 2014-06-26 at 17:07 -0700, Austin Schuh wrote:

> I'm not sure where to go from there.  Any changes to the workpool to
> try to fix that will be hard, or could affect latency significantly.

Oh what the hell, I'm out of frozen shark, may as well stock up.  Nobody
else has posted spit yet.  I _know_ Steven has some shark, I saw tglx
toss him a chunk.

It's the same root cause as -rt letting tasks schedule off with plugged
IO, leading to deadlock scenarios.  Extending the way I dealt with that
to deal with workqueue forward progress requirements.. works.. though it
could perhaps use a face lift, tummy tuck.. and minor body-ectomy.

Subject: rtmutex: move pre/post schedule() progress guarantees to before we schedule

Queued IO can lead to IO deadlock should a task require wakeup from a task
which is blocked on that queued IO, which is why we usually pull the plug
while scheduling off.  In RT, pulling the plug could lead us to block on
a contended sleeping lock while n the process of blocking.  Bad idea, so
we don't, but that leaves us with various flavors of IO stall like below.

ext3: dbench1 queues a buffer, blocks on journal mutex, its plug is not
pulled.  dbench2 mutex owner is waiting for kjournald, who is waiting for
the buffer queued by dbench1.  Game over.

The exact same situation can occur with workqueues.  We must notify the
workqueue management that a worker is blocking, as if we don't, we can
lock the box up completely.  It's too late to do that once we have arrived
in schedule(), as we can't take a sleeping lock.

Therefore, move progress guarantee work up to before we reach the point of
no return, and tell the scheduler that we're handling it early.

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
---
 include/linux/sched.h    |    2 +
 kernel/locking/rtmutex.c |   59 +++++++++++++++++++++++++++++++++++++++++++----
 kernel/sched/core.c      |   19 +++++++++++----
 3 files changed, 72 insertions(+), 8 deletions(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1256,6 +1256,8 @@ struct task_struct {
 	/* Revert to default priority/policy when forking */
 	unsigned sched_reset_on_fork:1;
 	unsigned sched_contributes_to_load:1;
+	unsigned sched_presched:1;
+	unsigned rtmutex_presched:1;
 
 	pid_t pid;
 	pid_t tgid;
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -23,6 +23,7 @@
 #include <linux/sched/deadline.h>
 #include <linux/timer.h>
 #include <linux/ww_mutex.h>
+#include <linux/blkdev.h>
 
 #include "rtmutex_common.h"
 
@@ -782,6 +783,41 @@ void rt_mutex_adjust_pi(struct task_stru
 }
 
 #ifdef CONFIG_PREEMPT_RT_FULL
+#include "../workqueue_internal.h"
+
+static inline void rt_mutex_presched(struct task_struct *tsk)
+{
+	/* Recursive handling of presched work is a very bad idea. */
+	if (tsk->rtmutex_presched || tsk->sched_presched)
+		return;
+
+	/* Tell the scheduler we handle pre/post schedule() work. */
+	tsk->rtmutex_presched = 1;
+
+	/*
+	 * If a worker went to sleep, notify and ask workqueue whether
+	 * it wants to wake up a task to maintain concurrency.
+	 */
+	if (tsk->flags & PF_WQ_WORKER)
+		wq_worker_sleeping(tsk);
+
+	/*
+	 * If we are going to sleep and we have plugged IO queued,
+	 * make sure to submit it to avoid deadlocks.
+	 */
+	if (blk_needs_flush_plug(tsk))
+		blk_schedule_flush_plug(tsk);
+}
+
+static inline void rt_mutex_postsched(struct task_struct *tsk)
+{
+	if (!tsk->rtmutex_presched)
+		return;
+	if (tsk->flags & PF_WQ_WORKER)
+		wq_worker_running(tsk);
+	tsk->rtmutex_presched = 0;
+}
+
 /*
  * preemptible spin_lock functions:
  */
@@ -857,13 +893,23 @@ static void  noinline __sched rt_spin_lo
 	struct rt_mutex_waiter waiter, *top_waiter;
 	int ret;
 
+	/*
+	 * We can't do presched work if we're already holding a lock
+	 * else we can deadlock.  eg, if we're holding slab_lock,
+	 * ksoftirqd can block while processing BLOCK_SOFTIRQ after
+	 * having acquired q->queue_lock.  If _we_ then block on
+	 * that q->queue_lock while flushing our plug, deadlock.
+	 */
+	if (__migrate_disabled(self) < 2)
+		rt_mutex_presched(self);
+
 	rt_mutex_init_waiter(&waiter, true);
 
 	raw_spin_lock(&lock->wait_lock);
 
 	if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
 		raw_spin_unlock(&lock->wait_lock);
-		return;
+		goto out;
 	}
 
 	BUG_ON(rt_mutex_owner(lock) == self);
@@ -928,6 +974,8 @@ static void  noinline __sched rt_spin_lo
 	raw_spin_unlock(&lock->wait_lock);
 
 	debug_rt_mutex_free_waiter(&waiter);
+out:
+	rt_mutex_postsched(self);
 }
 
 /*
@@ -1274,18 +1322,20 @@ rt_mutex_slowlock(struct rt_mutex *lock,
 		  int detect_deadlock, struct ww_acquire_ctx *ww_ctx)
 {
 	struct rt_mutex_waiter waiter;
+	struct task_struct *self = current;
 	int ret = 0;
 
+	rt_mutex_presched(self);
 	rt_mutex_init_waiter(&waiter, false);
 
 	raw_spin_lock(&lock->wait_lock);
 
 	/* Try to acquire the lock again: */
-	if (try_to_take_rt_mutex(lock, current, NULL)) {
+	if (try_to_take_rt_mutex(lock, self, NULL)) {
 		if (ww_ctx)
 			ww_mutex_account_lock(lock, ww_ctx);
 		raw_spin_unlock(&lock->wait_lock);
-		return 0;
+		goto out;
 	}
 
 	set_current_state(state);
@@ -1322,7 +1372,8 @@ rt_mutex_slowlock(struct rt_mutex *lock,
 		hrtimer_cancel(&timeout->timer);
 
 	debug_rt_mutex_free_waiter(&waiter);
-
+out:
+	rt_mutex_postsched(self);
 	return ret;
 }
 
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2915,11 +2915,18 @@ static void __sched __schedule(void)
 		goto need_resched;
 }
 
-static inline void sched_submit_work(struct task_struct *tsk)
+static inline void sched_presched_work(struct task_struct *tsk)
 {
+	/* It's being handled by rtmutex code */
+	if (tsk->rtmutex_presched)
+		return;
+
 	if (!tsk->state || tsk_is_pi_blocked(tsk))
 		return;
 
+	/* Tell rtmutex presched code that we're handling it. */
+	tsk->sched_presched = 1;
+
 	/*
 	 * If a worker went to sleep, notify and ask workqueue whether
 	 * it wants to wake up a task to maintain concurrency.
@@ -2935,19 +2942,23 @@ static inline void sched_submit_work(str
 		blk_schedule_flush_plug(tsk);
 }
 
-static inline void sched_update_worker(struct task_struct *tsk)
+static inline void sched_postsched_work(struct task_struct *tsk)
 {
+	/* It's being handled by rtmutex code */
+	if (tsk->rtmutex_presched)
+		return;
 	if (tsk->flags & PF_WQ_WORKER)
 		wq_worker_running(tsk);
+	tsk->sched_presched = 0;
 }
 
 asmlinkage void __sched schedule(void)
 {
 	struct task_struct *tsk = current;
 
-	sched_submit_work(tsk);
+	sched_presched_work(tsk);
 	__schedule();
-	sched_update_worker(tsk);
+	sched_postsched_work(tsk);
 }
 EXPORT_SYMBOL(schedule);
 



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-27  0:07         ` Austin Schuh
@ 2014-06-27  3:22           ` Mike Galbraith
  2014-06-27 12:57           ` Mike Galbraith
  2014-06-27 14:24           ` Thomas Gleixner
  2 siblings, 0 replies; 43+ messages in thread
From: Mike Galbraith @ 2014-06-27  3:22 UTC (permalink / raw)
  To: Austin Schuh
  Cc: Thomas Gleixner, Richard Weinberger, LKML, rt-users, Steven Rostedt

On Thu, 2014-06-26 at 17:07 -0700, Austin Schuh wrote:

> If I'm reading the rt patch correctly, wq_worker_sleeping was moved
> out of __schedule to sched_submit_work.  It looks like that changes
> the conditions under which wq_worker_sleeping is called.  It used to
> be called whenever a task was going to sleep (I think).  It looks like
> it is called now if the task is going to sleep, and if the task isn't
> blocked on a PI mutex (I think).

Re-entanglement does turn your killer into a happy camper, but that
patch is too lovely to just surrender without a squabble.

-Mike


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-26 22:35       ` Thomas Gleixner
@ 2014-06-27  0:07         ` Austin Schuh
  2014-06-27  3:22           ` Mike Galbraith
                             ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Austin Schuh @ 2014-06-27  0:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Thu, Jun 26, 2014 at 3:35 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 26 Jun 2014, Austin Schuh wrote:
>> On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger
>> <richard.weinberger@gmail.com> wrote:
>> > CC'ing RT folks
>> >
>> > On Wed, May 21, 2014 at 8:23 AM, Austin Schuh <austin@peloton-tech.com> wrote:
>> >> On Tue, May 13, 2014 at 7:29 PM, Austin Schuh <austin@peloton-tech.com> wrote:
>> >>> Hi,
>> >>>
>> >>> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
>> >>> patched kernel.  I have currently only triggered it using dpkg.  Dave
>> >>> Chinner on the XFS mailing list suggested that it was a rt-kernel
>> >>> workqueue issue as opposed to a XFS problem after looking at the
>> >>> kernel messages.
>>
>> I've got a 100% reproducible test case that doesn't involve a
>> filesystem.  I wrote a module that triggers the bug when the device is
>> written to, making it easy to enable tracing during the event and
>> capture everything.
>>
>> It looks like rw_semaphores don't trigger wq_worker_sleeping to run
>> when work goes to sleep on a rw_semaphore.  This only happens with the
>> RT patches, not with the mainline kernel.  I'm foreseeing a second
>> deadlock/bug coming into play shortly.  If a task holding the work
>> pool spinlock gets preempted, and we need to schedule more work from
>> another worker thread which was just blocked by a mutex, we'll then
>> end up trying to go to sleep on 2 locks at once.
>
> I vaguely remember that I've seen and analyzed that quite some time
> ago. I can't page in all the gory details right now, but I'll have a
> look at how the related code changed over the last couple of years
> tomorrow morning, with an awake brain.
>
> Steven, you did some analysis on that IIRC, or was that just related
> to rw_locks?
>
> Thanks,
>
>         tglx

If I'm reading the rt patch correctly, wq_worker_sleeping was moved
out of __schedule to sched_submit_work.  It looks like that changes
the conditions under which wq_worker_sleeping is called.  It used to
be called whenever a task was going to sleep (I think).  It looks like
it is called now if the task is going to sleep, and if the task isn't
blocked on a PI mutex (I think).

If I try the following experiment

 static inline void sched_submit_work(struct task_struct *tsk)
 {
+   if (tsk->state && (tsk->flags & PF_WQ_WORKER)) {
+     wq_worker_sleeping(tsk);
+     return;
+   }

and then remove the call later in the function, I am able to pass my test.
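
That is, roughly this removal further down in sched_submit_work() (the
context lines are reconstructed from the hunks quoted above, so they
may not match the tree exactly):

 	if (!tsk->state || tsk_is_pi_blocked(tsk))
 		return;
 
-	/*
-	 * If a worker went to sleep, notify and ask workqueue whether
-	 * it wants to wake up a task to maintain concurrency.
-	 */
-	if (tsk->flags & PF_WQ_WORKER)
-		wq_worker_sleeping(tsk);
-
 	/*
 	 * If we are going to sleep and we have plugged IO queued,
 	 * make sure to submit it to avoid deadlocks.
 	 */
 	if (blk_needs_flush_plug(tsk))
 		blk_schedule_flush_plug(tsk);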

Unfortunately, I then get a recursive pool spinlock BUG_ON after a
while (as I would expect), and it all blows up.

I'm not sure where to go from there.  Any changes to the workpool to
try to fix that will be hard, or could affect latency significantly.

Austin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-06-26 19:50     ` Austin Schuh
@ 2014-06-26 22:35       ` Thomas Gleixner
  2014-06-27  0:07         ` Austin Schuh
  0 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2014-06-26 22:35 UTC (permalink / raw)
  To: Austin Schuh
  Cc: Richard Weinberger, Mike Galbraith, LKML, rt-users, Steven Rostedt

On Thu, 26 Jun 2014, Austin Schuh wrote:
> On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger
> <richard.weinberger@gmail.com> wrote:
> > CC'ing RT folks
> >
> > On Wed, May 21, 2014 at 8:23 AM, Austin Schuh <austin@peloton-tech.com> wrote:
> >> On Tue, May 13, 2014 at 7:29 PM, Austin Schuh <austin@peloton-tech.com> wrote:
> >>> Hi,
> >>>
> >>> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
> >>> patched kernel.  I have currently only triggered it using dpkg.  Dave
> >>> Chinner on the XFS mailing list suggested that it was a rt-kernel
> >>> workqueue issue as opposed to a XFS problem after looking at the
> >>> kernel messages.
> 
> I've got a 100% reproducible test case that doesn't involve a
> filesystem.  I wrote a module that triggers the bug when the device is
> written to, making it easy to enable tracing during the event and
> capture everything.
> 
> It looks like rw_semaphores don't trigger wq_worker_sleeping to run
> when work goes to sleep on a rw_semaphore.  This only happens with the
> RT patches, not with the mainline kernel.  I'm foreseeing a second
> deadlock/bug coming into play shortly.  If a task holding the work
> pool spinlock gets preempted, and we need to schedule more work from
> another worker thread which was just blocked by a mutex, we'll then
> end up trying to go to sleep on 2 locks at once.

I vaguely remember that I've seen and analyzed that quite some time
ago. I can't page in all the gory details right now, but I'll have a
look at how the related code changed over the last couple of years
tomorrow morning, with an awake brain.

Steven, you did some analysis on that IIRC, or was that just related
to rw_locks?

Thanks,

	tglx





^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-05-21  7:33     ` Richard Weinberger
  (?)
@ 2014-06-26 19:50     ` Austin Schuh
  2014-06-26 22:35       ` Thomas Gleixner
  -1 siblings, 1 reply; 43+ messages in thread
From: Austin Schuh @ 2014-06-26 19:50 UTC (permalink / raw)
  To: Richard Weinberger, Thomas Gleixner, Mike Galbraith; +Cc: LKML, rt-users

[-- Attachment #1: Type: text/plain, Size: 1371 bytes --]

On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger
<richard.weinberger@gmail.com> wrote:
> CC'ing RT folks
>
> On Wed, May 21, 2014 at 8:23 AM, Austin Schuh <austin@peloton-tech.com> wrote:
>> On Tue, May 13, 2014 at 7:29 PM, Austin Schuh <austin@peloton-tech.com> wrote:
>>> Hi,
>>>
>>> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
>>> patched kernel.  I have currently only triggered it using dpkg.  Dave
>>> Chinner on the XFS mailing list suggested that it was a rt-kernel
>>> workqueue issue as opposed to a XFS problem after looking at the
>>> kernel messages.

I've got a 100% reproducible test case that doesn't involve a
filesystem.  I wrote a module that triggers the bug when the device is
written to, making it easy to enable tracing during the event and
capture everything.

It looks like rw_semaphores don't trigger wq_worker_sleeping to run
when work goes to sleep on a rw_semaphore.  This only happens with the
RT patches, not with the mainline kernel.  I'm foreseeing a second
deadlock/bug coming into play shortly.  If a task holding the work
pool spinlock gets preempted, and we need to schedule more work from
another worker thread which was just blocked by a mutex, we'll then
end up trying to go to sleep on 2 locks at once.

That is getting a bit deep into the scheduler for me...  Any
suggestions on how to fix it?

Austin

[-- Attachment #2: killer_module.c --]
[-- Type: text/x-csrc, Size: 4183 bytes --]

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/fs.h>
#include <asm/uaccess.h>
#include <linux/semaphore.h>
#include <linux/rwsem.h>      /* rw_semaphore / init_rwsem */
#include <linux/workqueue.h>  /* alloc_workqueue / INIT_WORK_ONSTACK */

static int device_open(struct inode *, struct file *);
static int device_release(struct inode *, struct file *);
static ssize_t device_read(struct file *, char *, size_t, loff_t *);
static ssize_t device_write(struct file *, const char *, size_t, loff_t *);

// Dev name as it appears in /proc/devices
#define DEVICE_NAME "aschuh"

// Major number assigned to our device driver
static int major;
static struct workqueue_struct *lockup_wq1;
static struct workqueue_struct *lockup_wq2;

static struct file_operations fops = {
  .read = device_read,
  .write = device_write,
  .open = device_open,
  .release = device_release
};

static int __init init_killer_module(void) {

  lockup_wq1 = alloc_workqueue("lockup_wq1", WQ_MEM_RECLAIM, 0);
  if (!lockup_wq1) return -ENOMEM;

  lockup_wq2 = alloc_workqueue("lockup_wq2", WQ_MEM_RECLAIM, 0);
  if (!lockup_wq2) {
    destroy_workqueue(lockup_wq1);
    return -ENOMEM;
  }

  major = register_chrdev(0, DEVICE_NAME, &fops);
  if (major < 0) {
    printk(KERN_ALERT "Registering char device failed with %d\n", major);
    destroy_workqueue(lockup_wq1);
    destroy_workqueue(lockup_wq2);

    return major;
  }

  printk(KERN_INFO "'mknod /dev/%s c %d 0'.\n", DEVICE_NAME, major);

  // A non-zero return means init_module failed; the module can't be loaded.
  return 0;
}

// Called when a process tries to open the device file.
static int device_open(struct inode *inode, struct file *file) {
  try_module_get(THIS_MODULE);
  return 0;
}

// Called when a process closes the device file.
static int device_release(struct inode *inode, struct file *file) {
  // Decrement the usage count, or else once you opened the file, you'll
  // never get rid of the module.
  module_put(THIS_MODULE);

  return 0;
}

static ssize_t device_read(struct file *filp, char *buffer, size_t length,
                           loff_t *offset) {
  return 0;
}

#if 0

#define SEM_INIT(sem) sema_init(sem, 1)
#define SEM_TYPE struct semaphore
#define SEM_DOWN(sem) down(sem)
#define SEM_UP(sem) up(sem)

#else

#define SEM_INIT(sem) init_rwsem(sem)
#define SEM_TYPE struct rw_semaphore
#define SEM_DOWN(sem) down_write_nested(sem, 0)
#define SEM_UP(sem) up_write(sem)

#endif

struct mywork {
  struct work_struct work;
  int index;
  SEM_TYPE *sem;
};

static void work1(struct work_struct *work) {
  struct mywork *my_work = container_of(work, struct mywork, work);
  trace_printk("work1 Called with index %d\n", my_work->index);
}

static void work2(struct work_struct *work) {
  struct mywork *my_work = container_of(work, struct mywork, work);
  trace_printk("work2 Called with index %d\n", my_work->index);
  SEM_DOWN(my_work->sem);
  SEM_UP(my_work->sem);
  trace_printk("work2 Finished with index %d\n", my_work->index);
}


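/*
 * Deadlock mechanism, per the analysis in the mail above: this writer
 * takes write_sem and queues work2, whose worker then blocks on
 * write_sem.  On -rt, blocking on the rw_semaphore does not call
 * wq_worker_sleeping(), so the worker pool never wakes a replacement
 * worker and work1 stays queued.  flush_work(&my_work1.work) below
 * then waits forever while this task still holds write_sem, so work2
 * can never finish either.
 */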
static ssize_t device_write(struct file *filp, const char *buff, size_t len,
                            loff_t *off) {
  // Declarations first; the kernel warns on declaration-after-statement.
  SEM_TYPE write_sem;
  struct mywork my_work1;
  struct mywork my_work2;

  SEM_INIT(&write_sem);

  trace_printk("lockup_wq1 %p lockup_wq2 %p\n", lockup_wq1, lockup_wq2);

  trace_printk("Got a write\n");

  SEM_DOWN(&write_sem);
  my_work1.index = len;
  my_work1.sem = &write_sem;
  INIT_WORK_ONSTACK(&my_work1.work, work1);

  my_work2.index = len;
  my_work2.sem = &write_sem;
  INIT_WORK_ONSTACK(&my_work2.work, work2);

  queue_work(lockup_wq2, &my_work2.work);

  queue_work(lockup_wq1, &my_work1.work);
  flush_work(&my_work1.work);
  destroy_work_on_stack(&my_work1.work);

  SEM_UP(&write_sem);

  flush_work(&my_work2.work);
  destroy_work_on_stack(&my_work2.work);
  trace_printk("Write done\n");
  tracing_off();

  return len;
}

static void __exit cleanup_killer_module(void) {
  printk(KERN_INFO "Goodbye world 1.\n");
  unregister_chrdev(major, DEVICE_NAME);
  destroy_workqueue(lockup_wq1);
  destroy_workqueue(lockup_wq2);
}

module_init(init_killer_module);
module_exit(cleanup_killer_module);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Austin Schuh <austin@peloton-tech.com>");
MODULE_DESCRIPTION("Triggers a workqueue bug on write.");
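
/*
 * For reference, one way to build and trigger this (assuming the file
 * is built out-of-tree as killer_module.ko; the device name and major
 * number come from the printk in init above):
 *
 *   make -C /lib/modules/$(uname -r)/build M=$PWD modules
 *   insmod killer_module.ko
 *   mknod /dev/aschuh c <major from dmesg> 0
 *   echo 1 > /dev/aschuh   # on -rt this write should hang in flush_work()
 */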

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-05-21 19:30 ` John Blackwood
@ 2014-05-21 21:59   ` Austin Schuh
  -1 siblings, 0 replies; 43+ messages in thread
From: Austin Schuh @ 2014-05-21 21:59 UTC (permalink / raw)
  To: John Blackwood; +Cc: Richard Weinberger, linux-kernel, xfs, linux-rt-users

On Wed, May 21, 2014 at 12:30 PM, John Blackwood
<john.blackwood@ccur.com> wrote:
>> Date: Wed, 21 May 2014 03:33:49 -0400
>> From: Richard Weinberger <richard.weinberger@gmail.com>
>> To: Austin Schuh <austin@peloton-tech.com>
>> CC: LKML <linux-kernel@vger.kernel.org>, xfs <xfs@oss.sgi.com>, rt-users
>>       <linux-rt-users@vger.kernel.org>
>> Subject: Re: Filesystem lockup with CONFIG_PREEMPT_RT
>
>>
>> CC'ing RT folks
>>
>> On Wed, May 21, 2014 at 8:23 AM, Austin Schuh <austin@peloton-tech.com>
>> wrote:
>> > > On Tue, May 13, 2014 at 7:29 PM, Austin Schuh
>> > > <austin@peloton-tech.com> wrote:
>> >> >> Hi,
>> >> >>
>> >> >> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
>> >> >> patched kernel.  I have currently only triggered it using dpkg.
>> >> >> Dave
>> >> >> Chinner on the XFS mailing list suggested that it was a rt-kernel
>> >> >> workqueue issue as opposed to a XFS problem after looking at the
>> >> >> kernel messages.
>> >> >>
>> >> >> The only modification to the kernel besides the RT patch is that I
>> >> >> have applied tglx's "genirq: Sanitize spurious interrupt detection
>> >> >> of
>> >> >> threaded irqs" patch.
>> > >
>> > > I upgraded to 3.14.3-rt4, and the problem still persists.
>> > >
>> > > I turned on event tracing and tracked it down further.  I'm able to
>> > > lock it up by scping a new kernel debian package to /tmp/ on the
>> > > machine.  scp is locking the inode, and then scheduling
>> > > xfs_bmapi_allocate_worker in the work queue.  The work then never gets
>> > > run.  The kworkers then lock up waiting for the inode lock.
>> > >
>> > > Here are the relevant events from the trace.  ffff8803e9f10288
>> > > (blk_delay_work) gets run later on in the trace, but ffff8803b4c158d0
>> > > (xfs_bmapi_allocate_worker) never does.  The kernel then warns about
>> > > blocked tasks 120 seconds later.
>
> Austin and Richard,
>
> I'm not 100% sure that the patch below will fix your problem, but we
> saw something that sounds pretty familiar to your issue involving the
> nvidia driver and the preempt-rt patch.  The nvidia driver uses the
> completion support to create their own driver's notion of an internally
> used semaphore.
>
> Some tasks were failing to ever wake up from wait_for_completion() calls
> due to a race in the underlying do_wait_for_common() routine.

Hi John,

Thanks for the suggestion and patch.  The issue is that the work never
gets run, not that the work finishes but the waiter never gets woken.
I applied it anyways to see if it helps, but I still get the lockup.

Thanks,
    Austin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
@ 2014-05-21 19:30 ` John Blackwood
  0 siblings, 0 replies; 43+ messages in thread
From: John Blackwood @ 2014-05-21 19:30 UTC (permalink / raw)
  To: Richard Weinberger, Austin Schuh; +Cc: linux-kernel, xfs, linux-rt-users

 > Date: Wed, 21 May 2014 03:33:49 -0400
 > From: Richard Weinberger <richard.weinberger@gmail.com>
 > To: Austin Schuh <austin@peloton-tech.com>
 > CC: LKML <linux-kernel@vger.kernel.org>, xfs <xfs@oss.sgi.com>, rt-users
 > 	<linux-rt-users@vger.kernel.org>
 > Subject: Re: Filesystem lockup with CONFIG_PREEMPT_RT
 >
 > CC'ing RT folks
 >
 > On Wed, May 21, 2014 at 8:23 AM, Austin Schuh <austin@peloton-tech.com> wrote:
 > > > On Tue, May 13, 2014 at 7:29 PM, Austin Schuh <austin@peloton-tech.com> wrote:
 > >> >> Hi,
 > >> >>
 > >> >> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
 > >> >> patched kernel.  I have currently only triggered it using dpkg.  Dave
 > >> >> Chinner on the XFS mailing list suggested that it was a rt-kernel
 > >> >> workqueue issue as opposed to a XFS problem after looking at the
 > >> >> kernel messages.
 > >> >>
 > >> >> The only modification to the kernel besides the RT patch is that I
 > >> >> have applied tglx's "genirq: Sanitize spurious interrupt detection of
 > >> >> threaded irqs" patch.
 > > >
 > > > I upgraded to 3.14.3-rt4, and the problem still persists.
 > > >
 > > > I turned on event tracing and tracked it down further.  I'm able to
 > > > lock it up by scping a new kernel debian package to /tmp/ on the
 > > > machine.  scp is locking the inode, and then scheduling
 > > > xfs_bmapi_allocate_worker in the work queue.  The work then never gets
 > > > run.  The kworkers then lock up waiting for the inode lock.
 > > >
 > > > Here are the relevant events from the trace.  ffff8803e9f10288
 > > > (blk_delay_work) gets run later on in the trace, but ffff8803b4c158d0
 > > > (xfs_bmapi_allocate_worker) never does.  The kernel then warns about
 > > > blocked tasks 120 seconds later.

Austin and Richard,

I'm not 100% sure that the patch below will fix your problem, but we
saw something that sounds pretty familiar to your issue involving the
nvidia driver and the preempt-rt patch.  The nvidia driver uses the
completion support to create their own driver's notion of an internally
used semaphore.

Some tasks were failing to ever wake up from wait_for_completion() calls
due to a race in the underlying do_wait_for_common() routine.

This is the patch that we used to fix this issue:

------------------- -------------------

Fix a race in the PRT wait for completion simple wait code.

A wait_for_completion() waiter task can be awoken by a task calling
complete(), but fail to consume the 'done' completion resource if it
loses a race with another task calling wait_for_completion() just as
it is waking up.

In this case, the awoken task will call schedule_timeout() again
without being in the simple wait queue.

So if the awoken task is unable to claim the 'done' completion resource,
check to see if it needs to be re-inserted into the wait list before
waiting again in schedule_timeout().

Fix-by: John Blackwood <john.blackwood@ccur.com>
Index: b/kernel/sched/core.c
===================================================================
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3529,11 +3529,19 @@ static inline long __sched
  do_wait_for_common(struct completion *x,
  		   long (*action)(long), long timeout, int state)
  {
+	int again = 0;
+
  	if (!x->done) {
  		DEFINE_SWAITER(wait);

  		swait_prepare_locked(&x->wait, &wait);
  		do {
+			/* Check to see if we lost race for 'done' and are
+			 * no longer in the wait list.
+			 */
+			if (unlikely(again) && list_empty(&wait.node))
+				swait_prepare_locked(&x->wait, &wait);
+
  			if (signal_pending_state(state, current)) {
  				timeout = -ERESTARTSYS;
  				break;
@@ -3542,6 +3550,7 @@ do_wait_for_common(struct completion *x,
  			raw_spin_unlock_irq(&x->wait.lock);
  			timeout = action(timeout);
  			raw_spin_lock_irq(&x->wait.lock);
+			again = 1;
  		} while (!x->done && timeout);
  		swait_finish_locked(&x->wait, &wait);
  		if (!x->done)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-05-21  6:23   ` Austin Schuh
@ 2014-05-21  7:33     ` Richard Weinberger
  -1 siblings, 0 replies; 43+ messages in thread
From: Richard Weinberger @ 2014-05-21  7:33 UTC (permalink / raw)
  To: Austin Schuh; +Cc: LKML, xfs, rt-users

CC'ing RT folks

On Wed, May 21, 2014 at 8:23 AM, Austin Schuh <austin@peloton-tech.com> wrote:
> On Tue, May 13, 2014 at 7:29 PM, Austin Schuh <austin@peloton-tech.com> wrote:
>> Hi,
>>
>> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
>> patched kernel.  I have currently only triggered it using dpkg.  Dave
>> Chinner on the XFS mailing list suggested that it was a rt-kernel
>> workqueue issue as opposed to a XFS problem after looking at the
>> kernel messages.
>>
>> The only modification to the kernel besides the RT patch is that I
>> have applied tglx's "genirq: Sanitize spurious interrupt detection of
>> threaded irqs" patch.
>
> I upgraded to 3.14.3-rt4, and the problem still persists.
>
> I turned on event tracing and tracked it down further.  I'm able to
> lock it up by scping a new kernel debian package to /tmp/ on the
> machine.  scp is locking the inode, and then scheduling
> xfs_bmapi_allocate_worker in the work queue.  The work then never gets
> run.  The kworkers then lock up waiting for the inode lock.
>
> Here are the relevant events from the trace.  ffff8803e9f10288
> (blk_delay_work) gets run later on in the trace, but ffff8803b4c158d0
> (xfs_bmapi_allocate_worker) never does.  The kernel then warns about
> blocked tasks 120 seconds later.
>
>
>              scp-5393  [001] ....1..   148.883383: xfs_writepage: dev
> 8:5 ino 0xd8e pgoff 0x4a3e000 size 0x16af02b0 offset 0 length 0
> delalloc 1 unwritten 0
>              scp-5393  [001] ....1..   148.883383: xfs_ilock_nowait:
> dev 8:5 ino 0xd8e flags ILOCK_SHARED caller xfs_map_blocks
>              scp-5393  [001] ....1..   148.883384: xfs_iunlock: dev
> 8:5 ino 0xd8e flags ILOCK_SHARED caller xfs_map_blocks
>              scp-5393  [001] ....1..   148.883386: xfs_log_reserve:
> dev 8:5 type STRAT_WRITE t_ocnt 2 t_cnt 2 t_curr_res 112332 t_unit_res
> 112332 t_flags XLOG_TIC_INITED|XLOG_TIC_PERM_RESERV reserveq empty
> writeq empty grant_reserve_cycle 57 grant_reserve_bytes 9250852
> grant_write_cycle 57 grant_write_bytes 9250852 curr_cycle 57
> curr_block 18031 tail_cycle 57 tail_block 17993
>              scp-5393  [001] ....1..   148.883386:
> xfs_log_reserve_exit: dev 8:5 type STRAT_WRITE t_ocnt 2 t_cnt 2
> t_curr_res 112332 t_unit_res 112332 t_flags
> XLOG_TIC_INITED|XLOG_TIC_PERM_RESERV reserveq empty writeq empty
> grant_reserve_cycle 57 grant_reserve_bytes 9475516 grant_write_cycle
> 57 grant_write_bytes 9475516 curr_cycle 57 curr_block 18031 tail_cycle
> 57 tail_block 17993
>              scp-5393  [001] ....1..   148.883386: xfs_ilock: dev 8:5
> ino 0xd8e flags ILOCK_EXCL caller xfs_iomap_write_allocate
>              scp-5393  [001] ....1.3   148.883390:
> workqueue_queue_work: work struct=ffff8803b4c158d0
> function=xfs_bmapi_allocate_worker workqueue=ffff8803ea85dc00
> req_cpu=512 cpu=1
>              scp-5393  [001] ....1.3   148.883390:
> workqueue_activate_work: work struct ffff8803b4c158d0
>              scp-5393  [001] ....1.4   148.883396:
> workqueue_queue_work: work struct=ffff8803e9f10288
> function=blk_delay_work workqueue=ffff8803ecf96200 req_cpu=512 cpu=1
>              scp-5393  [001] ....1.4   148.883396:
> workqueue_activate_work: work struct ffff8803e9f10288
>              scp-5393  [001] dN..3.4   148.883399: sched_wakeup:
> comm=kworker/1:1H pid=573 prio=100 success=1 target_cpu=001
>              scp-5393  [001] dN..3.4   148.883400: sched_stat_runtime:
> comm=scp pid=5393 runtime=32683 [ns] vruntime=658862317 [ns]
>              scp-5393  [001] d...3.4   148.883400: sched_switch:
> prev_comm=scp prev_pid=5393 prev_prio=120 prev_state=R+ ==>
> next_comm=kworker/1:1H next_pid=573 next_prio=100
>
>     irq/44-ahci-273   [000] d...3.5   148.883647: sched_wakeup:
> comm=kworker/0:2 pid=732 prio=120 success=1 target_cpu=000
>     irq/44-ahci-273   [000] d...3..   148.883667: sched_switch:
> prev_comm=irq/44-ahci prev_pid=273 prev_prio=49 prev_state=S ==>
> next_comm=kworker/0:2 next_pid=732 next_prio=120
>     kworker/0:2-732   [000] ....1..   148.883674:
> workqueue_execute_start: work struct ffff8800aea30cb8: function
> xfs_end_io
>     kworker/0:2-732   [000] ....1..   148.883674: xfs_ilock: dev 8:5
> ino 0xd8e flags ILOCK_EXCL caller xfs_setfilesize
>     kworker/0:2-732   [000] d...3..   148.883677: sched_stat_runtime:
> comm=kworker/0:2 pid=732 runtime=11392 [ns] vruntime=46907654869 [ns]
>     kworker/0:2-732   [000] d...3..   148.883679: sched_switch:
> prev_comm=kworker/0:2 prev_pid=732 prev_prio=120 prev_state=D ==>
> next_comm=swapper/0 next_pid=0 next_prio=120
>
>    kworker/1:1-99    [001] ....1..   148.884208:
> workqueue_execute_start: work struct ffff8800aea30c20: function
> xfs_end_io
>     kworker/1:1-99    [001] ....1..   148.884208: xfs_ilock: dev 8:5
> ino 0xd8e flags ILOCK_EXCL caller xfs_setfilesize
>     kworker/1:1-99    [001] d...3..   148.884210: sched_stat_runtime:
> comm=kworker/1:1 pid=99 runtime=36705 [ns] vruntime=48354424724 [ns]
>     kworker/1:1-99    [001] d...3..   148.884211: sched_switch:
> prev_comm=kworker/1:1 prev_pid=99 prev_prio=120 prev_state=R+ ==>
> next_comm=swapper/1 next_pid=0 next_prio=120
>
>   kworker/u16:4-296   [001] ....1..   151.560358:
> workqueue_execute_start: work struct ffff8803e9f10660: function
> bdi_writeback_workfn
>   kworker/u16:4-296   [001] ....1..   151.560389:
> workqueue_execute_end: work struct ffff8803e9f10660
>   kworker/u16:4-296   [001] d...3..   151.560398: sched_stat_runtime:
> comm=kworker/u16:4 pid=296 runtime=51174 [ns] vruntime=48345513312
> [ns]
>   kworker/u16:4-296   [001] d...3..   151.560403: sched_switch:
> prev_comm=kworker/u16:4 prev_pid=296 prev_prio=120 prev_state=R+ ==>
> next_comm=swapper/1 next_pid=0 next_prio=120
>
>
> I have a peak_pci can card in the machine that is consuming about 1.5
> CPU cores because it is not connected to a CAN bus and gets error
> interrupts every time it tries to send and resend.  If I disable the
> card (ifconfig can0 down; ifconfig can1 down), I have a lot more
> trouble reproducing the lockup.
>
> Any ideas why the scheduled work is never getting run?
>
> Thanks,
> Austin
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Filesystem lockup with CONFIG_PREEMPT_RT
  2014-05-14  2:29 ` Austin Schuh
@ 2014-05-21  6:23   ` Austin Schuh
  -1 siblings, 0 replies; 43+ messages in thread
From: Austin Schuh @ 2014-05-21  6:23 UTC (permalink / raw)
  To: linux-kernel; +Cc: xfs

On Tue, May 13, 2014 at 7:29 PM, Austin Schuh <austin@peloton-tech.com> wrote:
> Hi,
>
> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
> patched kernel.  I have currently only triggered it using dpkg.  Dave
> Chinner on the XFS mailing list suggested that it was a rt-kernel
> workqueue issue as opposed to a XFS problem after looking at the
> kernel messages.
>
> The only modification to the kernel besides the RT patch is that I
> have applied tglx's "genirq: Sanitize spurious interrupt detection of
> threaded irqs" patch.

I upgraded to 3.14.3-rt4, and the problem still persists.

I turned on event tracing and tracked it down further.  I'm able to
lock it up by scping a new kernel debian package to /tmp/ on the
machine.  scp is locking the inode, and then scheduling
xfs_bmapi_allocate_worker in the work queue.  The work then never gets
run.  The kworkers then lock up waiting for the inode lock.

Here are the relevant events from the trace.  ffff8803e9f10288
(blk_delay_work) gets run later on in the trace, but ffff8803b4c158d0
(xfs_bmapi_allocate_worker) never does.  The kernel then warns about
blocked tasks 120 seconds later.


             scp-5393  [001] ....1..   148.883383: xfs_writepage: dev
8:5 ino 0xd8e pgoff 0x4a3e000 size 0x16af02b0 offset 0 length 0
delalloc 1 unwritten 0
             scp-5393  [001] ....1..   148.883383: xfs_ilock_nowait:
dev 8:5 ino 0xd8e flags ILOCK_SHARED caller xfs_map_blocks
             scp-5393  [001] ....1..   148.883384: xfs_iunlock: dev
8:5 ino 0xd8e flags ILOCK_SHARED caller xfs_map_blocks
             scp-5393  [001] ....1..   148.883386: xfs_log_reserve:
dev 8:5 type STRAT_WRITE t_ocnt 2 t_cnt 2 t_curr_res 112332 t_unit_res
112332 t_flags XLOG_TIC_INITED|XLOG_TIC_PERM_RESERV reserveq empty
writeq empty grant_reserve_cycle 57 grant_reserve_bytes 9250852
grant_write_cycle 57 grant_write_bytes 9250852 curr_cycle 57
curr_block 18031 tail_cycle 57 tail_block 17993
             scp-5393  [001] ....1..   148.883386:
xfs_log_reserve_exit: dev 8:5 type STRAT_WRITE t_ocnt 2 t_cnt 2
t_curr_res 112332 t_unit_res 112332 t_flags
XLOG_TIC_INITED|XLOG_TIC_PERM_RESERV reserveq empty writeq empty
grant_reserve_cycle 57 grant_reserve_bytes 9475516 grant_write_cycle
57 grant_write_bytes 9475516 curr_cycle 57 curr_block 18031 tail_cycle
57 tail_block 17993
             scp-5393  [001] ....1..   148.883386: xfs_ilock: dev 8:5
ino 0xd8e flags ILOCK_EXCL caller xfs_iomap_write_allocate
             scp-5393  [001] ....1.3   148.883390:
workqueue_queue_work: work struct=ffff8803b4c158d0
function=xfs_bmapi_allocate_worker workqueue=ffff8803ea85dc00
req_cpu=512 cpu=1
             scp-5393  [001] ....1.3   148.883390:
workqueue_activate_work: work struct ffff8803b4c158d0
             scp-5393  [001] ....1.4   148.883396:
workqueue_queue_work: work struct=ffff8803e9f10288
function=blk_delay_work workqueue=ffff8803ecf96200 req_cpu=512 cpu=1
             scp-5393  [001] ....1.4   148.883396:
workqueue_activate_work: work struct ffff8803e9f10288
             scp-5393  [001] dN..3.4   148.883399: sched_wakeup:
comm=kworker/1:1H pid=573 prio=100 success=1 target_cpu=001
             scp-5393  [001] dN..3.4   148.883400: sched_stat_runtime:
comm=scp pid=5393 runtime=32683 [ns] vruntime=658862317 [ns]
             scp-5393  [001] d...3.4   148.883400: sched_switch:
prev_comm=scp prev_pid=5393 prev_prio=120 prev_state=R+ ==>
next_comm=kworker/1:1H next_pid=573 next_prio=100

    irq/44-ahci-273   [000] d...3.5   148.883647: sched_wakeup:
comm=kworker/0:2 pid=732 prio=120 success=1 target_cpu=000
    irq/44-ahci-273   [000] d...3..   148.883667: sched_switch:
prev_comm=irq/44-ahci prev_pid=273 prev_prio=49 prev_state=S ==>
next_comm=kworker/0:2 next_pid=732 next_prio=120
    kworker/0:2-732   [000] ....1..   148.883674:
workqueue_execute_start: work struct ffff8800aea30cb8: function
xfs_end_io
    kworker/0:2-732   [000] ....1..   148.883674: xfs_ilock: dev 8:5
ino 0xd8e flags ILOCK_EXCL caller xfs_setfilesize
    kworker/0:2-732   [000] d...3..   148.883677: sched_stat_runtime:
comm=kworker/0:2 pid=732 runtime=11392 [ns] vruntime=46907654869 [ns]
    kworker/0:2-732   [000] d...3..   148.883679: sched_switch:
prev_comm=kworker/0:2 prev_pid=732 prev_prio=120 prev_state=D ==>
next_comm=swapper/0 next_pid=0 next_prio=120

   kworker/1:1-99    [001] ....1..   148.884208:
workqueue_execute_start: work struct ffff8800aea30c20: function
xfs_end_io
    kworker/1:1-99    [001] ....1..   148.884208: xfs_ilock: dev 8:5
ino 0xd8e flags ILOCK_EXCL caller xfs_setfilesize
    kworker/1:1-99    [001] d...3..   148.884210: sched_stat_runtime:
comm=kworker/1:1 pid=99 runtime=36705 [ns] vruntime=48354424724 [ns]
    kworker/1:1-99    [001] d...3..   148.884211: sched_switch:
prev_comm=kworker/1:1 prev_pid=99 prev_prio=120 prev_state=R+ ==>
next_comm=swapper/1 next_pid=0 next_prio=120

  kworker/u16:4-296   [001] ....1..   151.560358:
workqueue_execute_start: work struct ffff8803e9f10660: function
bdi_writeback_workfn
  kworker/u16:4-296   [001] ....1..   151.560389:
workqueue_execute_end: work struct ffff8803e9f10660
  kworker/u16:4-296   [001] d...3..   151.560398: sched_stat_runtime:
comm=kworker/u16:4 pid=296 runtime=51174 [ns] vruntime=48345513312
[ns]
  kworker/u16:4-296   [001] d...3..   151.560403: sched_switch:
prev_comm=kworker/u16:4 prev_pid=296 prev_prio=120 prev_state=R+ ==>
next_comm=swapper/1 next_pid=0 next_prio=120


I have a peak_pci CAN card in the machine that consumes about 1.5 CPU
cores because it is not connected to a CAN bus and takes error
interrupts every time it tries to send and resend.  If I disable the
card (ifconfig can0 down; ifconfig can1 down), the lockup becomes
much harder to reproduce.

Any ideas why the scheduled work is never getting run?
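
To make the suspected pattern concrete, here is a minimal, untested
sketch of a module that recreates it (hypothetical code, not the
reproducer module referenced elsewhere in this thread): the loading
task takes a mutex, queues work to a bound workqueue, and waits for
it, while an earlier work item on the same per-CPU pool blocks on
that mutex.  If blocking in the worker fails to wake another pool
worker, the second item never runs and everything deadlocks.

#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/mutex.h>
#include <linux/completion.h>
#include <linux/delay.h>

static DEFINE_MUTEX(test_lock);        /* stands in for the XFS inode lock */
static DECLARE_COMPLETION(alloc_done); /* stands in for the bmapi completion */
static struct workqueue_struct *test_wq;

static void blocker_fn(struct work_struct *w)
{
        mutex_lock(&test_lock);        /* blocks: module init holds the lock */
        mutex_unlock(&test_lock);
}

static void alloc_fn(struct work_struct *w)
{
        complete(&alloc_done);         /* like xfs_bmapi_allocate_worker */
}

static DECLARE_WORK(blocker_work, blocker_fn);
static DECLARE_WORK(alloc_work, alloc_fn);

static int __init lockup_init(void)
{
        test_wq = alloc_workqueue("lockup_test", 0, 0);
        if (!test_wq)
                return -ENOMEM;

        mutex_lock(&test_lock);
        queue_work_on(0, test_wq, &blocker_work);
        msleep(100);                   /* let the worker block on the mutex */
        queue_work_on(0, test_wq, &alloc_work);
        /*
         * A healthy pool wakes an idle worker to run alloc_work when
         * blocker_work sleeps.  If that wakeup is missing, this waits
         * forever while the worker waits on test_lock, so on an
         * affected kernel insmod will hang here.
         */
        wait_for_completion(&alloc_done);
        mutex_unlock(&test_lock);
        destroy_workqueue(test_wq);
        return 0;
}

static void __exit lockup_exit(void)
{
}

module_init(lockup_init);
module_exit(lockup_exit);
MODULE_LICENSE("GPL");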

Thanks,
Austin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Filesystem lockup with CONFIG_PREEMPT_RT
@ 2014-05-14  2:29 ` Austin Schuh
  0 siblings, 0 replies; 43+ messages in thread
From: Austin Schuh @ 2014-05-14  2:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: xfs

[-- Attachment #1: Type: text/plain, Size: 831 bytes --]

Hi,

I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
patched kernel.  So far I have only triggered it using dpkg.  Dave
Chinner on the XFS mailing list suggested that, based on the kernel
messages, it was an rt-kernel workqueue issue rather than an XFS
problem.

$ uname -a
Linux vpc5 3.10.24-rt22abs #15 SMP PREEMPT RT Tue May 13 14:42:22 PDT
2014 x86_64 GNU/Linux

The only modification to the kernel besides the RT patch is that I
have applied tglx's "genirq: Sanitize spurious interrupt detection of
threaded irqs" patch.

Any ideas on what could be wrong?

Is there any information I can pull before I reboot the machine that
would be useful?  I have the output of triggering sysrq with l and t
if that helps.  Attached is the kernel blocked-task message output.
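
For reference, those dumps come from the standard sysrq interface,
e.g.:

  echo l > /proc/sysrq-trigger   # backtrace of all active CPUs
  echo t > /proc/sysrq-trigger   # state and stack of every task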

Thanks,
    Austin

[-- Attachment #2: vpc5_xfs_lockup_locks --]
[-- Type: application/octet-stream, Size: 31428 bytes --]

May 13 18:45:18 vpc5 kernel: [  959.966849] INFO: task kworker/2:1:82 blocked for more than 120 seconds.
May 13 18:45:18 vpc5 kernel: [  959.966917] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:45:18 vpc5 kernel: [  959.966972] kworker/2:1     D 0000000000000000     0    82      2 0x00000000
May 13 18:45:18 vpc5 kernel: [  959.966981] Workqueue: xfs-data/sda5 xfs_end_io
May 13 18:45:18 vpc5 kernel: [  959.966985]  ffff8803f64b9b58 0000000000000002 0000000000000000 ffff8803f659c600
May 13 18:45:18 vpc5 kernel: [  959.966987]  ffff880366153a40 ffff8803f64b9fd8 0000000000063800 ffff8803f64b9fd8
May 13 18:45:18 vpc5 kernel: [  959.966988]  0000000000063800 ffff8803f64b1180 ffff8803f64b9b78 ffff8803f64b1180
May 13 18:45:18 vpc5 kernel: [  959.966991] Call Trace:
May 13 18:45:18 vpc5 kernel: [  959.966997]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:45:18 vpc5 kernel: [  959.966999]  [<ffffffff814b7c74>] __rt_mutex_slowlock+0x8e/0xc7
May 13 18:45:18 vpc5 kernel: [  959.967001]  [<ffffffff814b7db6>] rt_mutex_slowlock+0x109/0x18a
May 13 18:45:18 vpc5 kernel: [  959.967005]  [<ffffffff8107a629>] ? __lock_acquire.isra.27+0x1ce/0x541
May 13 18:45:18 vpc5 kernel: [  959.967008]  [<ffffffff811b1f1f>] ? xfs_setfilesize+0x80/0x148
May 13 18:45:18 vpc5 kernel: [  959.967009]  [<ffffffff814b7e4e>] rt_mutex_lock+0x17/0x19
May 13 18:45:18 vpc5 kernel: [  959.967011]  [<ffffffff8107fdbd>] rt_down_write_nested+0x3a/0x41
May 13 18:45:18 vpc5 kernel: [  959.967014]  [<ffffffff811f18d0>] ? xfs_ilock+0x99/0xd6
May 13 18:45:18 vpc5 kernel: [  959.967015]  [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:45:18 vpc5 kernel: [  959.967017]  [<ffffffff811b1f1f>] xfs_setfilesize+0x80/0x148
May 13 18:45:18 vpc5 kernel: [  959.967019]  [<ffffffff811b1ea4>] ? xfs_setfilesize+0x5/0x148
May 13 18:45:18 vpc5 kernel: [  959.967021]  [<ffffffff8104f2e0>] ? process_one_work+0x16f/0x397
May 13 18:45:18 vpc5 kernel: [  959.967023]  [<ffffffff811b2bba>] xfs_end_io+0x83/0x99
May 13 18:45:18 vpc5 kernel: [  959.967025]  [<ffffffff8104f384>] process_one_work+0x213/0x397
May 13 18:45:18 vpc5 kernel: [  959.967026]  [<ffffffff8104f2e0>] ? process_one_work+0x16f/0x397
May 13 18:45:18 vpc5 kernel: [  959.967028]  [<ffffffff8104f925>] worker_thread+0x149/0x224
May 13 18:45:18 vpc5 kernel: [  959.967030]  [<ffffffff8104f7dc>] ? rescuer_thread+0x2a5/0x2a5
May 13 18:45:18 vpc5 kernel: [  959.967032]  [<ffffffff810550b2>] kthread+0xa2/0xaa
May 13 18:45:18 vpc5 kernel: [  959.967035]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:45:18 vpc5 kernel: [  959.967037]  [<ffffffff814be1dc>] ret_from_fork+0x7c/0xb0
May 13 18:45:18 vpc5 kernel: [  959.967038]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:45:18 vpc5 kernel: [  959.967040] 4 locks held by kworker/2:1/82:
May 13 18:45:18 vpc5 kernel: [  959.967051]  #0:  (xfs-data/%s){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:45:18 vpc5 kernel: [  959.967056]  #1:  ((&ioend->io_work)){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:45:18 vpc5 kernel: [  959.967061]  #2:  (sb_internal){......}, at: [<ffffffff811b1ea4>] xfs_setfilesize+0x5/0x148
May 13 18:45:18 vpc5 kernel: [  959.967066]  #3:  (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:45:18 vpc5 kernel: [  959.967073] INFO: task kworker/u16:5:241 blocked for more than 120 seconds.
May 13 18:45:18 vpc5 kernel: [  959.967127] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:45:18 vpc5 kernel: [  959.967181] kworker/u16:5   D 0000000000000002     0   241      2 0x00000000
May 13 18:45:18 vpc5 kernel: [  959.967187] Workqueue: writeback bdi_writeback_workfn (flush-8:0)
May 13 18:45:18 vpc5 kernel: [  959.967190]  ffff8803f572b3c8 0000000000000002 ffff88042db62b80 ffff8803f8f9c600
May 13 18:45:18 vpc5 kernel: [  959.967192]  ffff8803f63b6900 ffff8803f572bfd8 0000000000063800 ffff8803f572bfd8
May 13 18:45:18 vpc5 kernel: [  959.967193]  0000000000063800 ffff8803f63b6900 ffff880300000001 ffff8803f63b6900
May 13 18:45:18 vpc5 kernel: [  959.967195] Call Trace:
May 13 18:45:18 vpc5 kernel: [  959.967198]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:45:18 vpc5 kernel: [  959.967200]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:45:18 vpc5 kernel: [  959.967201]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:45:18 vpc5 kernel: [  959.967203]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:45:18 vpc5 kernel: [  959.967205]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:45:18 vpc5 kernel: [  959.967207]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:45:18 vpc5 kernel: [  959.967209]  [<ffffffff811da0fe>] xfs_bmapi_allocate+0xd8/0xea
May 13 18:45:18 vpc5 kernel: [  959.967211]  [<ffffffff811da453>] xfs_bmapi_write+0x343/0x59b
May 13 18:45:18 vpc5 kernel: [  959.967213]  [<ffffffff811d7d75>] ? __xfs_bmapi_allocate+0x23c/0x23c
May 13 18:45:18 vpc5 kernel: [  959.967216]  [<ffffffff811bfb94>] xfs_iomap_write_allocate+0x1b9/0x2c2
May 13 18:45:18 vpc5 kernel: [  959.967218]  [<ffffffff811b2218>] xfs_map_blocks+0x12b/0x203
May 13 18:45:18 vpc5 kernel: [  959.967221]  [<ffffffff810d451c>] ? rcu_read_unlock+0x23/0x23
May 13 18:45:18 vpc5 kernel: [  959.967223]  [<ffffffff811b326a>] xfs_vm_writepage+0x280/0x4b8
May 13 18:45:18 vpc5 kernel: [  959.967226]  [<ffffffff810dc27c>] __writepage+0x18/0x37
May 13 18:45:18 vpc5 kernel: [  959.967228]  [<ffffffff810dcdd0>] write_cache_pages+0x25a/0x37e
May 13 18:45:18 vpc5 kernel: [  959.967230]  [<ffffffff810dc264>] ? page_index+0x1a/0x1a
May 13 18:45:18 vpc5 kernel: [  959.967232]  [<ffffffff814b81d8>] ? rt_spin_lock_slowunlock+0x14/0x20
May 13 18:45:18 vpc5 kernel: [  959.967234]  [<ffffffff810dcf35>] generic_writepages+0x41/0x5b
May 13 18:45:18 vpc5 kernel: [  959.967236]  [<ffffffff811b1e29>] xfs_vm_writepages+0x51/0x5c
May 13 18:45:18 vpc5 kernel: [  959.967238]  [<ffffffff810de05b>] do_writepages+0x21/0x2f
May 13 18:45:18 vpc5 kernel: [  959.967239]  [<ffffffff81142c1c>] __writeback_single_inode+0x7b/0x238
May 13 18:45:18 vpc5 kernel: [  959.967240]  [<ffffffff81143e74>] writeback_sb_inodes+0x220/0x37a
May 13 18:45:18 vpc5 kernel: [  959.967242]  [<ffffffff81144042>] __writeback_inodes_wb+0x74/0xb9
May 13 18:45:18 vpc5 kernel: [  959.967244]  [<ffffffff811441c5>] wb_writeback+0x13e/0x2a3
May 13 18:45:18 vpc5 kernel: [  959.967245]  [<ffffffff8114463b>] wb_do_writeback+0x163/0x1d9
May 13 18:45:18 vpc5 kernel: [  959.967247]  [<ffffffff8114471d>] bdi_writeback_workfn+0x6c/0xfe
May 13 18:45:18 vpc5 kernel: [  959.967249]  [<ffffffff8104f384>] process_one_work+0x213/0x397
May 13 18:45:18 vpc5 kernel: [  959.967250]  [<ffffffff8104f2e0>] ? process_one_work+0x16f/0x397
May 13 18:45:18 vpc5 kernel: [  959.967252]  [<ffffffff8104f925>] worker_thread+0x149/0x224
May 13 18:45:18 vpc5 kernel: [  959.967254]  [<ffffffff8104f7dc>] ? rescuer_thread+0x2a5/0x2a5
May 13 18:45:18 vpc5 kernel: [  959.967255]  [<ffffffff810550b2>] kthread+0xa2/0xaa
May 13 18:45:18 vpc5 kernel: [  959.967258]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:45:18 vpc5 kernel: [  959.967259]  [<ffffffff814be1dc>] ret_from_fork+0x7c/0xb0
May 13 18:45:18 vpc5 kernel: [  959.967261]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:45:18 vpc5 kernel: [  959.967262] 5 locks held by kworker/u16:5/241:
May 13 18:45:18 vpc5 kernel: [  959.967278]  #0:  (writeback){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:45:18 vpc5 kernel: [  959.967283]  #1:  ((&(&wb->dwork)->work)){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:45:18 vpc5 kernel: [  959.967288]  #2:  (&type->s_umount_key#18){......}, at: [<ffffffff81123d4f>] grab_super_passive+0x60/0x8a
May 13 18:45:18 vpc5 kernel: [  959.967294]  #3:  (sb_internal){......}, at: [<ffffffff811ff638>] xfs_trans_alloc+0x24/0x3d
May 13 18:45:18 vpc5 kernel: [  959.967299]  #4:  (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:45:18 vpc5 kernel: [  959.967315] INFO: task dpkg:5825 blocked for more than 120 seconds.
May 13 18:45:18 vpc5 kernel: [  959.967367] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:45:18 vpc5 kernel: [  959.967422] dpkg            D 0000000000000002     0  5825   5789 0x00000000
May 13 18:45:18 vpc5 kernel: [  959.967425]  ffff8803f57cd778 0000000000000002 0000000000000001 ffff8803f8f9c600
May 13 18:45:18 vpc5 kernel: [  959.967427]  ffff8803f57cd778 ffff8803f57cdfd8 0000000000063800 ffff8803f57cdfd8
May 13 18:45:18 vpc5 kernel: [  959.967428]  0000000000063800 ffff8803f659c600 ffff880300000001 ffff8803f659c600
May 13 18:45:18 vpc5 kernel: [  959.967430] Call Trace:
May 13 18:45:18 vpc5 kernel: [  959.967433]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:45:18 vpc5 kernel: [  959.967435]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:45:18 vpc5 kernel: [  959.967436]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:45:18 vpc5 kernel: [  959.967438]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:45:18 vpc5 kernel: [  959.967440]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:45:18 vpc5 kernel: [  959.967441]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:45:18 vpc5 kernel: [  959.967443]  [<ffffffff811da0fe>] xfs_bmapi_allocate+0xd8/0xea
May 13 18:45:18 vpc5 kernel: [  959.967444]  [<ffffffff811da453>] xfs_bmapi_write+0x343/0x59b
May 13 18:45:18 vpc5 kernel: [  959.967447]  [<ffffffff811d7d75>] ? __xfs_bmapi_allocate+0x23c/0x23c
May 13 18:45:18 vpc5 kernel: [  959.967449]  [<ffffffff811bfb94>] xfs_iomap_write_allocate+0x1b9/0x2c2
May 13 18:45:18 vpc5 kernel: [  959.967451]  [<ffffffff811b2218>] xfs_map_blocks+0x12b/0x203
May 13 18:45:18 vpc5 kernel: [  959.967453]  [<ffffffff811b326a>] xfs_vm_writepage+0x280/0x4b8
May 13 18:45:18 vpc5 kernel: [  959.967456]  [<ffffffff810dc27c>] __writepage+0x18/0x37
May 13 18:45:18 vpc5 kernel: [  959.967458]  [<ffffffff810dcdd0>] write_cache_pages+0x25a/0x37e
May 13 18:45:18 vpc5 kernel: [  959.967460]  [<ffffffff810dc264>] ? page_index+0x1a/0x1a
May 13 18:45:18 vpc5 kernel: [  959.967462]  [<ffffffff810dcf35>] generic_writepages+0x41/0x5b
May 13 18:45:18 vpc5 kernel: [  959.967464]  [<ffffffff811b1e29>] xfs_vm_writepages+0x51/0x5c
May 13 18:45:18 vpc5 kernel: [  959.967466]  [<ffffffff810de05b>] do_writepages+0x21/0x2f
May 13 18:45:18 vpc5 kernel: [  959.967467]  [<ffffffff810d54b6>] __filemap_fdatawrite_range+0x53/0x55
May 13 18:45:18 vpc5 kernel: [  959.967469]  [<ffffffff810d5f57>] filemap_fdatawrite_range+0x13/0x15
May 13 18:45:18 vpc5 kernel: [  959.967471]  [<ffffffff811479c6>] SyS_sync_file_range+0xe9/0x12d
May 13 18:45:18 vpc5 kernel: [  959.967473]  [<ffffffff814be482>] tracesys+0xdd/0xe2
May 13 18:45:18 vpc5 kernel: [  959.967474] 2 locks held by dpkg/5825:
May 13 18:45:18 vpc5 kernel: [  959.967485]  #0:  (sb_internal){......}, at: [<ffffffff811ff638>] xfs_trans_alloc+0x24/0x3d
May 13 18:45:18 vpc5 kernel: [  959.967490]  #1:  (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:45:18 vpc5 kernel: [  959.967496] INFO: task shutdown_watche:5964 blocked for more than 120 seconds.
May 13 18:45:18 vpc5 kernel: [  959.967549] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:45:18 vpc5 kernel: [  959.967604] shutdown_watche D 0000000000000002     0  5964   2604 0x00000000
May 13 18:45:18 vpc5 kernel: [  959.967607]  ffff8803c1561ce8 0000000000000002 ffff8803c1561ce8 ffffffff81a1a400
May 13 18:45:18 vpc5 kernel: [  959.967609]  ffff8803f63b2300 ffff8803c1561fd8 0000000000063800 ffff8803c1561fd8
May 13 18:45:18 vpc5 kernel: [  959.967610]  0000000000063800 ffff8803f63b2300 ffff880300000001 ffff8803f63b2300
May 13 18:45:18 vpc5 kernel: [  959.967612] Call Trace:
May 13 18:45:18 vpc5 kernel: [  959.967615]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:45:18 vpc5 kernel: [  959.967616]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:45:18 vpc5 kernel: [  959.967618]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:45:18 vpc5 kernel: [  959.967620]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:45:18 vpc5 kernel: [  959.967621]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:45:18 vpc5 kernel: [  959.967623]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:45:18 vpc5 kernel: [  959.967625]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:45:18 vpc5 kernel: [  959.967626]  [<ffffffff8104dd2d>] flush_work+0x1e1/0x209
May 13 18:45:18 vpc5 kernel: [  959.967627]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:45:18 vpc5 kernel: [  959.967629]  [<ffffffff8104c4b4>] ? create_and_start_worker+0x6e/0x6e
May 13 18:45:18 vpc5 kernel: [  959.967631]  [<ffffffff810dfe68>] ? __pagevec_release+0x2c/0x2c
May 13 18:45:18 vpc5 kernel: [  959.967633]  [<ffffffff8104ff67>] schedule_on_each_cpu+0xca/0x104
May 13 18:45:18 vpc5 kernel: [  959.967635]  [<ffffffff810dfe8d>] lru_add_drain_all+0x15/0x18
May 13 18:45:18 vpc5 kernel: [  959.967637]  [<ffffffff810f7aa9>] SyS_mlockall+0x48/0x11d
May 13 18:45:18 vpc5 kernel: [  959.967638]  [<ffffffff814be482>] tracesys+0xdd/0xe2
May 13 18:45:18 vpc5 kernel: [  959.967639] no locks held by shutdown_watche/5964.
May 13 18:45:18 vpc5 kernel: [  959.967648] INFO: task j1939_vehicle_m:5965 blocked for more than 120 seconds.
May 13 18:45:18 vpc5 kernel: [  959.967702] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:45:18 vpc5 kernel: [  959.967757] j1939_vehicle_m D 0000000000000002     0  5965   2604 0x00000000
May 13 18:45:18 vpc5 kernel: [  959.967760]  ffff8803c14f9ce8 0000000000000002 ffff8803c14f9ce8 ffff8803f8f9d780
May 13 18:45:18 vpc5 kernel: [  959.967761]  ffff8803f63b4600 ffff8803c14f9fd8 0000000000063800 ffff8803c14f9fd8
May 13 18:45:18 vpc5 kernel: [  959.967763]  0000000000063800 ffff8803f63b4600 ffff880300000001 ffff8803f63b4600
May 13 18:45:18 vpc5 kernel: [  959.967765] Call Trace:
May 13 18:45:18 vpc5 kernel: [  959.967768]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:45:18 vpc5 kernel: [  959.967769]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:45:18 vpc5 kernel: [  959.967770]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:45:18 vpc5 kernel: [  959.967772]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:45:18 vpc5 kernel: [  959.967774]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:45:18 vpc5 kernel: [  959.967775]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:45:18 vpc5 kernel: [  959.967777]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:45:18 vpc5 kernel: [  959.967779]  [<ffffffff8104dd2d>] flush_work+0x1e1/0x209
May 13 18:45:18 vpc5 kernel: [  959.967780]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:45:18 vpc5 kernel: [  959.967783]  [<ffffffff8104c4b4>] ? create_and_start_worker+0x6e/0x6e
May 13 18:45:18 vpc5 kernel: [  959.967785]  [<ffffffff810dfe68>] ? __pagevec_release+0x2c/0x2c
May 13 18:45:18 vpc5 kernel: [  959.967786]  [<ffffffff8104ff67>] schedule_on_each_cpu+0xca/0x104
May 13 18:45:18 vpc5 kernel: [  959.967788]  [<ffffffff810dfe8d>] lru_add_drain_all+0x15/0x18
May 13 18:45:18 vpc5 kernel: [  959.967789]  [<ffffffff810f7aa9>] SyS_mlockall+0x48/0x11d
May 13 18:45:18 vpc5 kernel: [  959.967791]  [<ffffffff814be482>] tracesys+0xdd/0xe2
May 13 18:45:18 vpc5 kernel: [  959.967792] no locks held by j1939_vehicle_m/5965.
May 13 18:47:18 vpc5 kernel: [ 1079.835312] INFO: task kworker/2:1:82 blocked for more than 120 seconds.
May 13 18:47:18 vpc5 kernel: [ 1079.835369] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:47:18 vpc5 kernel: [ 1079.835426] kworker/2:1     D 0000000000000000     0    82      2 0x00000000
May 13 18:47:18 vpc5 kernel: [ 1079.835434] Workqueue: xfs-data/sda5 xfs_end_io
May 13 18:47:18 vpc5 kernel: [ 1079.835438]  ffff8803f64b9b58 0000000000000002 0000000000000000 ffff8803f659c600
May 13 18:47:18 vpc5 kernel: [ 1079.835439]  ffff880366153a40 ffff8803f64b9fd8 0000000000063800 ffff8803f64b9fd8
May 13 18:47:18 vpc5 kernel: [ 1079.835441]  0000000000063800 ffff8803f64b1180 ffff8803f64b9b78 ffff8803f64b1180
May 13 18:47:18 vpc5 kernel: [ 1079.835444] Call Trace:
May 13 18:47:18 vpc5 kernel: [ 1079.835449]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:47:18 vpc5 kernel: [ 1079.835451]  [<ffffffff814b7c74>] __rt_mutex_slowlock+0x8e/0xc7
May 13 18:47:18 vpc5 kernel: [ 1079.835453]  [<ffffffff814b7db6>] rt_mutex_slowlock+0x109/0x18a
May 13 18:47:18 vpc5 kernel: [ 1079.835456]  [<ffffffff8107a629>] ? __lock_acquire.isra.27+0x1ce/0x541
May 13 18:47:18 vpc5 kernel: [ 1079.835459]  [<ffffffff811b1f1f>] ? xfs_setfilesize+0x80/0x148
May 13 18:47:18 vpc5 kernel: [ 1079.835461]  [<ffffffff814b7e4e>] rt_mutex_lock+0x17/0x19
May 13 18:47:18 vpc5 kernel: [ 1079.835462]  [<ffffffff8107fdbd>] rt_down_write_nested+0x3a/0x41
May 13 18:47:18 vpc5 kernel: [ 1079.835465]  [<ffffffff811f18d0>] ? xfs_ilock+0x99/0xd6
May 13 18:47:18 vpc5 kernel: [ 1079.835466]  [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:47:18 vpc5 kernel: [ 1079.835468]  [<ffffffff811b1f1f>] xfs_setfilesize+0x80/0x148
May 13 18:47:18 vpc5 kernel: [ 1079.835470]  [<ffffffff811b1ea4>] ? xfs_setfilesize+0x5/0x148
May 13 18:47:18 vpc5 kernel: [ 1079.835473]  [<ffffffff8104f2e0>] ? process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835475]  [<ffffffff811b2bba>] xfs_end_io+0x83/0x99
May 13 18:47:18 vpc5 kernel: [ 1079.835476]  [<ffffffff8104f384>] process_one_work+0x213/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835478]  [<ffffffff8104f2e0>] ? process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835480]  [<ffffffff8104f925>] worker_thread+0x149/0x224
May 13 18:47:18 vpc5 kernel: [ 1079.835482]  [<ffffffff8104f7dc>] ? rescuer_thread+0x2a5/0x2a5
May 13 18:47:18 vpc5 kernel: [ 1079.835484]  [<ffffffff810550b2>] kthread+0xa2/0xaa
May 13 18:47:18 vpc5 kernel: [ 1079.835487]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:47:18 vpc5 kernel: [ 1079.835489]  [<ffffffff814be1dc>] ret_from_fork+0x7c/0xb0
May 13 18:47:18 vpc5 kernel: [ 1079.835490]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:47:18 vpc5 kernel: [ 1079.835492] 4 locks held by kworker/2:1/82:
May 13 18:47:18 vpc5 kernel: [ 1079.835503]  #0:  (xfs-data/%s){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835509]  #1:  ((&ioend->io_work)){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835514]  #2:  (sb_internal){......}, at: [<ffffffff811b1ea4>] xfs_setfilesize+0x5/0x148
May 13 18:47:18 vpc5 kernel: [ 1079.835519]  #3:  (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:47:18 vpc5 kernel: [ 1079.835526] INFO: task kworker/u16:5:241 blocked for more than 120 seconds.
May 13 18:47:18 vpc5 kernel: [ 1079.835590] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:47:18 vpc5 kernel: [ 1079.835665] kworker/u16:5   D 0000000000000002     0   241      2 0x00000000
May 13 18:47:18 vpc5 kernel: [ 1079.835670] Workqueue: writeback bdi_writeback_workfn (flush-8:0)
May 13 18:47:18 vpc5 kernel: [ 1079.835674]  ffff8803f572b3c8 0000000000000002 ffff88042db62b80 ffff8803f8f9c600
May 13 18:47:18 vpc5 kernel: [ 1079.835675]  ffff8803f63b6900 ffff8803f572bfd8 0000000000063800 ffff8803f572bfd8
May 13 18:47:18 vpc5 kernel: [ 1079.835677]  0000000000063800 ffff8803f63b6900 ffff880300000001 ffff8803f63b6900
May 13 18:47:18 vpc5 kernel: [ 1079.835679] Call Trace:
May 13 18:47:18 vpc5 kernel: [ 1079.835682]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:47:18 vpc5 kernel: [ 1079.835684]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:47:18 vpc5 kernel: [ 1079.835686]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.835688]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.835690]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:47:18 vpc5 kernel: [ 1079.835692]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:47:18 vpc5 kernel: [ 1079.835693]  [<ffffffff811da0fe>] xfs_bmapi_allocate+0xd8/0xea
May 13 18:47:18 vpc5 kernel: [ 1079.835695]  [<ffffffff811da453>] xfs_bmapi_write+0x343/0x59b
May 13 18:47:18 vpc5 kernel: [ 1079.835698]  [<ffffffff811d7d75>] ? __xfs_bmapi_allocate+0x23c/0x23c
May 13 18:47:18 vpc5 kernel: [ 1079.835701]  [<ffffffff811bfb94>] xfs_iomap_write_allocate+0x1b9/0x2c2
May 13 18:47:18 vpc5 kernel: [ 1079.835703]  [<ffffffff811b2218>] xfs_map_blocks+0x12b/0x203
May 13 18:47:18 vpc5 kernel: [ 1079.835705]  [<ffffffff810d451c>] ? rcu_read_unlock+0x23/0x23
May 13 18:47:18 vpc5 kernel: [ 1079.835707]  [<ffffffff811b326a>] xfs_vm_writepage+0x280/0x4b8
May 13 18:47:18 vpc5 kernel: [ 1079.835710]  [<ffffffff810dc27c>] __writepage+0x18/0x37
May 13 18:47:18 vpc5 kernel: [ 1079.835713]  [<ffffffff810dcdd0>] write_cache_pages+0x25a/0x37e
May 13 18:47:18 vpc5 kernel: [ 1079.835715]  [<ffffffff810dc264>] ? page_index+0x1a/0x1a
May 13 18:47:18 vpc5 kernel: [ 1079.835717]  [<ffffffff814b81d8>] ? rt_spin_lock_slowunlock+0x14/0x20
May 13 18:47:18 vpc5 kernel: [ 1079.835719]  [<ffffffff810dcf35>] generic_writepages+0x41/0x5b
May 13 18:47:18 vpc5 kernel: [ 1079.835721]  [<ffffffff811b1e29>] xfs_vm_writepages+0x51/0x5c
May 13 18:47:18 vpc5 kernel: [ 1079.835723]  [<ffffffff810de05b>] do_writepages+0x21/0x2f
May 13 18:47:18 vpc5 kernel: [ 1079.835724]  [<ffffffff81142c1c>] __writeback_single_inode+0x7b/0x238
May 13 18:47:18 vpc5 kernel: [ 1079.835726]  [<ffffffff81143e74>] writeback_sb_inodes+0x220/0x37a
May 13 18:47:18 vpc5 kernel: [ 1079.835728]  [<ffffffff81144042>] __writeback_inodes_wb+0x74/0xb9
May 13 18:47:18 vpc5 kernel: [ 1079.835729]  [<ffffffff811441c5>] wb_writeback+0x13e/0x2a3
May 13 18:47:18 vpc5 kernel: [ 1079.835731]  [<ffffffff8114463b>] wb_do_writeback+0x163/0x1d9
May 13 18:47:18 vpc5 kernel: [ 1079.835733]  [<ffffffff8114471d>] bdi_writeback_workfn+0x6c/0xfe
May 13 18:47:18 vpc5 kernel: [ 1079.835735]  [<ffffffff8104f384>] process_one_work+0x213/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835737]  [<ffffffff8104f2e0>] ? process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835739]  [<ffffffff8104f925>] worker_thread+0x149/0x224
May 13 18:47:18 vpc5 kernel: [ 1079.835741]  [<ffffffff8104f7dc>] ? rescuer_thread+0x2a5/0x2a5
May 13 18:47:18 vpc5 kernel: [ 1079.835742]  [<ffffffff810550b2>] kthread+0xa2/0xaa
May 13 18:47:18 vpc5 kernel: [ 1079.835744]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:47:18 vpc5 kernel: [ 1079.835746]  [<ffffffff814be1dc>] ret_from_fork+0x7c/0xb0
May 13 18:47:18 vpc5 kernel: [ 1079.835748]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:47:18 vpc5 kernel: [ 1079.835749] 5 locks held by kworker/u16:5/241:
May 13 18:47:18 vpc5 kernel: [ 1079.835767]  #0:  (writeback){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835772]  #1:  ((&(&wb->dwork)->work)){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835776]  #2:  (&type->s_umount_key#18){......}, at: [<ffffffff81123d4f>] grab_super_passive+0x60/0x8a
May 13 18:47:18 vpc5 kernel: [ 1079.835782]  #3:  (sb_internal){......}, at: [<ffffffff811ff638>] xfs_trans_alloc+0x24/0x3d
May 13 18:47:18 vpc5 kernel: [ 1079.835788]  #4:  (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:47:18 vpc5 kernel: [ 1079.835802] INFO: task dpkg:5825 blocked for more than 120 seconds.
May 13 18:47:18 vpc5 kernel: [ 1079.835863] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:47:18 vpc5 kernel: [ 1079.835938] dpkg            D 0000000000000002     0  5825   5789 0x00000000
May 13 18:47:18 vpc5 kernel: [ 1079.835942]  ffff8803f57cd778 0000000000000002 0000000000000001 ffff8803f8f9c600
May 13 18:47:18 vpc5 kernel: [ 1079.835944]  ffff8803f57cd778 ffff8803f57cdfd8 0000000000063800 ffff8803f57cdfd8
May 13 18:47:18 vpc5 kernel: [ 1079.835945]  0000000000063800 ffff8803f659c600 ffff880300000001 ffff8803f659c600
May 13 18:47:18 vpc5 kernel: [ 1079.835947] Call Trace:
May 13 18:47:18 vpc5 kernel: [ 1079.835950]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:47:18 vpc5 kernel: [ 1079.835952]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:47:18 vpc5 kernel: [ 1079.835953]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.835955]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.835957]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:47:18 vpc5 kernel: [ 1079.835959]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:47:18 vpc5 kernel: [ 1079.835960]  [<ffffffff811da0fe>] xfs_bmapi_allocate+0xd8/0xea
May 13 18:47:18 vpc5 kernel: [ 1079.835962]  [<ffffffff811da453>] xfs_bmapi_write+0x343/0x59b
May 13 18:47:18 vpc5 kernel: [ 1079.835965]  [<ffffffff811d7d75>] ? __xfs_bmapi_allocate+0x23c/0x23c
May 13 18:47:18 vpc5 kernel: [ 1079.835967]  [<ffffffff811bfb94>] xfs_iomap_write_allocate+0x1b9/0x2c2
May 13 18:47:18 vpc5 kernel: [ 1079.835970]  [<ffffffff811b2218>] xfs_map_blocks+0x12b/0x203
May 13 18:47:18 vpc5 kernel: [ 1079.835972]  [<ffffffff811b326a>] xfs_vm_writepage+0x280/0x4b8
May 13 18:47:18 vpc5 kernel: [ 1079.835975]  [<ffffffff810dc27c>] __writepage+0x18/0x37
May 13 18:47:18 vpc5 kernel: [ 1079.835977]  [<ffffffff810dcdd0>] write_cache_pages+0x25a/0x37e
May 13 18:47:18 vpc5 kernel: [ 1079.835978]  [<ffffffff810dc264>] ? page_index+0x1a/0x1a
May 13 18:47:18 vpc5 kernel: [ 1079.835981]  [<ffffffff810dcf35>] generic_writepages+0x41/0x5b
May 13 18:47:18 vpc5 kernel: [ 1079.835983]  [<ffffffff811b1e29>] xfs_vm_writepages+0x51/0x5c
May 13 18:47:18 vpc5 kernel: [ 1079.835985]  [<ffffffff810de05b>] do_writepages+0x21/0x2f
May 13 18:47:18 vpc5 kernel: [ 1079.835987]  [<ffffffff810d54b6>] __filemap_fdatawrite_range+0x53/0x55
May 13 18:47:18 vpc5 kernel: [ 1079.835989]  [<ffffffff810d5f57>] filemap_fdatawrite_range+0x13/0x15
May 13 18:47:18 vpc5 kernel: [ 1079.835991]  [<ffffffff811479c6>] SyS_sync_file_range+0xe9/0x12d
May 13 18:47:18 vpc5 kernel: [ 1079.835993]  [<ffffffff814be482>] tracesys+0xdd/0xe2
May 13 18:47:18 vpc5 kernel: [ 1079.835994] 2 locks held by dpkg/5825:
May 13 18:47:18 vpc5 kernel: [ 1079.836006]  #0:  (sb_internal){......}, at: [<ffffffff811ff638>] xfs_trans_alloc+0x24/0x3d
May 13 18:47:18 vpc5 kernel: [ 1079.836011]  #1:  (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:47:18 vpc5 kernel: [ 1079.836017] INFO: task shutdown_watche:5964 blocked for more than 120 seconds.
May 13 18:47:18 vpc5 kernel: [ 1079.836091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:47:18 vpc5 kernel: [ 1079.836165] shutdown_watche D 0000000000000002     0  5964   2604 0x00000000
May 13 18:47:18 vpc5 kernel: [ 1079.836169]  ffff8803c1561ce8 0000000000000002 ffff8803c1561ce8 ffffffff81a1a400
May 13 18:47:18 vpc5 kernel: [ 1079.836171]  ffff8803f63b2300 ffff8803c1561fd8 0000000000063800 ffff8803c1561fd8
May 13 18:47:18 vpc5 kernel: [ 1079.836172]  0000000000063800 ffff8803f63b2300 ffff880300000001 ffff8803f63b2300
May 13 18:47:18 vpc5 kernel: [ 1079.836174] Call Trace:
May 13 18:47:18 vpc5 kernel: [ 1079.836177]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:47:18 vpc5 kernel: [ 1079.836179]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:47:18 vpc5 kernel: [ 1079.836180]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.836183]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.836184]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:47:18 vpc5 kernel: [ 1079.836186]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836188]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:47:18 vpc5 kernel: [ 1079.836189]  [<ffffffff8104dd2d>] flush_work+0x1e1/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836191]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836193]  [<ffffffff8104c4b4>] ? create_and_start_worker+0x6e/0x6e
May 13 18:47:18 vpc5 kernel: [ 1079.836195]  [<ffffffff810dfe68>] ? __pagevec_release+0x2c/0x2c
May 13 18:47:18 vpc5 kernel: [ 1079.836197]  [<ffffffff8104ff67>] schedule_on_each_cpu+0xca/0x104
May 13 18:47:18 vpc5 kernel: [ 1079.836198]  [<ffffffff810dfe8d>] lru_add_drain_all+0x15/0x18
May 13 18:47:18 vpc5 kernel: [ 1079.836200]  [<ffffffff810f7aa9>] SyS_mlockall+0x48/0x11d
May 13 18:47:18 vpc5 kernel: [ 1079.836202]  [<ffffffff814be482>] tracesys+0xdd/0xe2
May 13 18:47:18 vpc5 kernel: [ 1079.836203] no locks held by shutdown_watche/5964.
May 13 18:47:18 vpc5 kernel: [ 1079.836212] INFO: task j1939_vehicle_m:5965 blocked for more than 120 seconds.
May 13 18:47:18 vpc5 kernel: [ 1079.836312] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:47:18 vpc5 kernel: [ 1079.836388] j1939_vehicle_m D 0000000000000002     0  5965   2604 0x00000000
May 13 18:47:18 vpc5 kernel: [ 1079.836394]  ffff8803c14f9ce8 0000000000000002 ffff8803c14f9ce8 ffff8803f8f9d780
May 13 18:47:18 vpc5 kernel: [ 1079.836396]  ffff8803f63b4600 ffff8803c14f9fd8 0000000000063800 ffff8803c14f9fd8
May 13 18:47:18 vpc5 kernel: [ 1079.836398]  0000000000063800 ffff8803f63b4600 ffff880300000001 ffff8803f63b4600
May 13 18:47:18 vpc5 kernel: [ 1079.836402] Call Trace:
May 13 18:47:18 vpc5 kernel: [ 1079.836407]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:47:18 vpc5 kernel: [ 1079.836410]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:47:18 vpc5 kernel: [ 1079.836412]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.836415]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.836417]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:47:18 vpc5 kernel: [ 1079.836420]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836423]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:47:18 vpc5 kernel: [ 1079.836425]  [<ffffffff8104dd2d>] flush_work+0x1e1/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836427]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836431]  [<ffffffff8104c4b4>] ? create_and_start_worker+0x6e/0x6e
May 13 18:47:18 vpc5 kernel: [ 1079.836434]  [<ffffffff810dfe68>] ? __pagevec_release+0x2c/0x2c
May 13 18:47:18 vpc5 kernel: [ 1079.836437]  [<ffffffff8104ff67>] schedule_on_each_cpu+0xca/0x104
May 13 18:47:18 vpc5 kernel: [ 1079.836439]  [<ffffffff810dfe8d>] lru_add_drain_all+0x15/0x18
May 13 18:47:18 vpc5 kernel: [ 1079.836441]  [<ffffffff810f7aa9>] SyS_mlockall+0x48/0x11d
May 13 18:47:18 vpc5 kernel: [ 1079.836444]  [<ffffffff814be482>] tracesys+0xdd/0xe2
May 13 18:47:18 vpc5 kernel: [ 1079.836446] no locks held by j1939_vehicle_m/5965.

^ permalink raw reply	[flat|nested] 43+ messages in thread

May 13 18:47:18 vpc5 kernel: [ 1079.835503]  #0:  (xfs-data/%s){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835509]  #1:  ((&ioend->io_work)){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835514]  #2:  (sb_internal){......}, at: [<ffffffff811b1ea4>] xfs_setfilesize+0x5/0x148
May 13 18:47:18 vpc5 kernel: [ 1079.835519]  #3:  (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:47:18 vpc5 kernel: [ 1079.835526] INFO: task kworker/u16:5:241 blocked for more than 120 seconds.
May 13 18:47:18 vpc5 kernel: [ 1079.835590] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:47:18 vpc5 kernel: [ 1079.835665] kworker/u16:5   D 0000000000000002     0   241      2 0x00000000
May 13 18:47:18 vpc5 kernel: [ 1079.835670] Workqueue: writeback bdi_writeback_workfn (flush-8:0)
May 13 18:47:18 vpc5 kernel: [ 1079.835674]  ffff8803f572b3c8 0000000000000002 ffff88042db62b80 ffff8803f8f9c600
May 13 18:47:18 vpc5 kernel: [ 1079.835675]  ffff8803f63b6900 ffff8803f572bfd8 0000000000063800 ffff8803f572bfd8
May 13 18:47:18 vpc5 kernel: [ 1079.835677]  0000000000063800 ffff8803f63b6900 ffff880300000001 ffff8803f63b6900
May 13 18:47:18 vpc5 kernel: [ 1079.835679] Call Trace:
May 13 18:47:18 vpc5 kernel: [ 1079.835682]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:47:18 vpc5 kernel: [ 1079.835684]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:47:18 vpc5 kernel: [ 1079.835686]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.835688]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.835690]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:47:18 vpc5 kernel: [ 1079.835692]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:47:18 vpc5 kernel: [ 1079.835693]  [<ffffffff811da0fe>] xfs_bmapi_allocate+0xd8/0xea
May 13 18:47:18 vpc5 kernel: [ 1079.835695]  [<ffffffff811da453>] xfs_bmapi_write+0x343/0x59b
May 13 18:47:18 vpc5 kernel: [ 1079.835698]  [<ffffffff811d7d75>] ? __xfs_bmapi_allocate+0x23c/0x23c
May 13 18:47:18 vpc5 kernel: [ 1079.835701]  [<ffffffff811bfb94>] xfs_iomap_write_allocate+0x1b9/0x2c2
May 13 18:47:18 vpc5 kernel: [ 1079.835703]  [<ffffffff811b2218>] xfs_map_blocks+0x12b/0x203
May 13 18:47:18 vpc5 kernel: [ 1079.835705]  [<ffffffff810d451c>] ? rcu_read_unlock+0x23/0x23
May 13 18:47:18 vpc5 kernel: [ 1079.835707]  [<ffffffff811b326a>] xfs_vm_writepage+0x280/0x4b8
May 13 18:47:18 vpc5 kernel: [ 1079.835710]  [<ffffffff810dc27c>] __writepage+0x18/0x37
May 13 18:47:18 vpc5 kernel: [ 1079.835713]  [<ffffffff810dcdd0>] write_cache_pages+0x25a/0x37e
May 13 18:47:18 vpc5 kernel: [ 1079.835715]  [<ffffffff810dc264>] ? page_index+0x1a/0x1a
May 13 18:47:18 vpc5 kernel: [ 1079.835717]  [<ffffffff814b81d8>] ? rt_spin_lock_slowunlock+0x14/0x20
May 13 18:47:18 vpc5 kernel: [ 1079.835719]  [<ffffffff810dcf35>] generic_writepages+0x41/0x5b
May 13 18:47:18 vpc5 kernel: [ 1079.835721]  [<ffffffff811b1e29>] xfs_vm_writepages+0x51/0x5c
May 13 18:47:18 vpc5 kernel: [ 1079.835723]  [<ffffffff810de05b>] do_writepages+0x21/0x2f
May 13 18:47:18 vpc5 kernel: [ 1079.835724]  [<ffffffff81142c1c>] __writeback_single_inode+0x7b/0x238
May 13 18:47:18 vpc5 kernel: [ 1079.835726]  [<ffffffff81143e74>] writeback_sb_inodes+0x220/0x37a
May 13 18:47:18 vpc5 kernel: [ 1079.835728]  [<ffffffff81144042>] __writeback_inodes_wb+0x74/0xb9
May 13 18:47:18 vpc5 kernel: [ 1079.835729]  [<ffffffff811441c5>] wb_writeback+0x13e/0x2a3
May 13 18:47:18 vpc5 kernel: [ 1079.835731]  [<ffffffff8114463b>] wb_do_writeback+0x163/0x1d9
May 13 18:47:18 vpc5 kernel: [ 1079.835733]  [<ffffffff8114471d>] bdi_writeback_workfn+0x6c/0xfe
May 13 18:47:18 vpc5 kernel: [ 1079.835735]  [<ffffffff8104f384>] process_one_work+0x213/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835737]  [<ffffffff8104f2e0>] ? process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835739]  [<ffffffff8104f925>] worker_thread+0x149/0x224
May 13 18:47:18 vpc5 kernel: [ 1079.835741]  [<ffffffff8104f7dc>] ? rescuer_thread+0x2a5/0x2a5
May 13 18:47:18 vpc5 kernel: [ 1079.835742]  [<ffffffff810550b2>] kthread+0xa2/0xaa
May 13 18:47:18 vpc5 kernel: [ 1079.835744]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:47:18 vpc5 kernel: [ 1079.835746]  [<ffffffff814be1dc>] ret_from_fork+0x7c/0xb0
May 13 18:47:18 vpc5 kernel: [ 1079.835748]  [<ffffffff81055010>] ? __kthread_parkme+0x65/0x65
May 13 18:47:18 vpc5 kernel: [ 1079.835749] 5 locks held by kworker/u16:5/241:
May 13 18:47:18 vpc5 kernel: [ 1079.835767]  #0:  (writeback){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835772]  #1:  ((&(&wb->dwork)->work)){......}, at: [<ffffffff8104f2e0>] process_one_work+0x16f/0x397
May 13 18:47:18 vpc5 kernel: [ 1079.835776]  #2:  (&type->s_umount_key#18){......}, at: [<ffffffff81123d4f>] grab_super_passive+0x60/0x8a
May 13 18:47:18 vpc5 kernel: [ 1079.835782]  #3:  (sb_internal){......}, at: [<ffffffff811ff638>] xfs_trans_alloc+0x24/0x3d
May 13 18:47:18 vpc5 kernel: [ 1079.835788]  #4:  (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:47:18 vpc5 kernel: [ 1079.835802] INFO: task dpkg:5825 blocked for more than 120 seconds.
May 13 18:47:18 vpc5 kernel: [ 1079.835863] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:47:18 vpc5 kernel: [ 1079.835938] dpkg            D 0000000000000002     0  5825   5789 0x00000000
May 13 18:47:18 vpc5 kernel: [ 1079.835942]  ffff8803f57cd778 0000000000000002 0000000000000001 ffff8803f8f9c600
May 13 18:47:18 vpc5 kernel: [ 1079.835944]  ffff8803f57cd778 ffff8803f57cdfd8 0000000000063800 ffff8803f57cdfd8
May 13 18:47:18 vpc5 kernel: [ 1079.835945]  0000000000063800 ffff8803f659c600 ffff880300000001 ffff8803f659c600
May 13 18:47:18 vpc5 kernel: [ 1079.835947] Call Trace:
May 13 18:47:18 vpc5 kernel: [ 1079.835950]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:47:18 vpc5 kernel: [ 1079.835952]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:47:18 vpc5 kernel: [ 1079.835953]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.835955]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.835957]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:47:18 vpc5 kernel: [ 1079.835959]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:47:18 vpc5 kernel: [ 1079.835960]  [<ffffffff811da0fe>] xfs_bmapi_allocate+0xd8/0xea
May 13 18:47:18 vpc5 kernel: [ 1079.835962]  [<ffffffff811da453>] xfs_bmapi_write+0x343/0x59b
May 13 18:47:18 vpc5 kernel: [ 1079.835965]  [<ffffffff811d7d75>] ? __xfs_bmapi_allocate+0x23c/0x23c
May 13 18:47:18 vpc5 kernel: [ 1079.835967]  [<ffffffff811bfb94>] xfs_iomap_write_allocate+0x1b9/0x2c2
May 13 18:47:18 vpc5 kernel: [ 1079.835970]  [<ffffffff811b2218>] xfs_map_blocks+0x12b/0x203
May 13 18:47:18 vpc5 kernel: [ 1079.835972]  [<ffffffff811b326a>] xfs_vm_writepage+0x280/0x4b8
May 13 18:47:18 vpc5 kernel: [ 1079.835975]  [<ffffffff810dc27c>] __writepage+0x18/0x37
May 13 18:47:18 vpc5 kernel: [ 1079.835977]  [<ffffffff810dcdd0>] write_cache_pages+0x25a/0x37e
May 13 18:47:18 vpc5 kernel: [ 1079.835978]  [<ffffffff810dc264>] ? page_index+0x1a/0x1a
May 13 18:47:18 vpc5 kernel: [ 1079.835981]  [<ffffffff810dcf35>] generic_writepages+0x41/0x5b
May 13 18:47:18 vpc5 kernel: [ 1079.835983]  [<ffffffff811b1e29>] xfs_vm_writepages+0x51/0x5c
May 13 18:47:18 vpc5 kernel: [ 1079.835985]  [<ffffffff810de05b>] do_writepages+0x21/0x2f
May 13 18:47:18 vpc5 kernel: [ 1079.835987]  [<ffffffff810d54b6>] __filemap_fdatawrite_range+0x53/0x55
May 13 18:47:18 vpc5 kernel: [ 1079.835989]  [<ffffffff810d5f57>] filemap_fdatawrite_range+0x13/0x15
May 13 18:47:18 vpc5 kernel: [ 1079.835991]  [<ffffffff811479c6>] SyS_sync_file_range+0xe9/0x12d
May 13 18:47:18 vpc5 kernel: [ 1079.835993]  [<ffffffff814be482>] tracesys+0xdd/0xe2
May 13 18:47:18 vpc5 kernel: [ 1079.835994] 2 locks held by dpkg/5825:
May 13 18:47:18 vpc5 kernel: [ 1079.836006]  #0:  (sb_internal){......}, at: [<ffffffff811ff638>] xfs_trans_alloc+0x24/0x3d
May 13 18:47:18 vpc5 kernel: [ 1079.836011]  #1:  (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff811f18d0>] xfs_ilock+0x99/0xd6
May 13 18:47:18 vpc5 kernel: [ 1079.836017] INFO: task shutdown_watche:5964 blocked for more than 120 seconds.
May 13 18:47:18 vpc5 kernel: [ 1079.836091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:47:18 vpc5 kernel: [ 1079.836165] shutdown_watche D 0000000000000002     0  5964   2604 0x00000000
May 13 18:47:18 vpc5 kernel: [ 1079.836169]  ffff8803c1561ce8 0000000000000002 ffff8803c1561ce8 ffffffff81a1a400
May 13 18:47:18 vpc5 kernel: [ 1079.836171]  ffff8803f63b2300 ffff8803c1561fd8 0000000000063800 ffff8803c1561fd8
May 13 18:47:18 vpc5 kernel: [ 1079.836172]  0000000000063800 ffff8803f63b2300 ffff880300000001 ffff8803f63b2300
May 13 18:47:18 vpc5 kernel: [ 1079.836174] Call Trace:
May 13 18:47:18 vpc5 kernel: [ 1079.836177]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:47:18 vpc5 kernel: [ 1079.836179]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:47:18 vpc5 kernel: [ 1079.836180]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.836183]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.836184]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:47:18 vpc5 kernel: [ 1079.836186]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836188]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:47:18 vpc5 kernel: [ 1079.836189]  [<ffffffff8104dd2d>] flush_work+0x1e1/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836191]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836193]  [<ffffffff8104c4b4>] ? create_and_start_worker+0x6e/0x6e
May 13 18:47:18 vpc5 kernel: [ 1079.836195]  [<ffffffff810dfe68>] ? __pagevec_release+0x2c/0x2c
May 13 18:47:18 vpc5 kernel: [ 1079.836197]  [<ffffffff8104ff67>] schedule_on_each_cpu+0xca/0x104
May 13 18:47:18 vpc5 kernel: [ 1079.836198]  [<ffffffff810dfe8d>] lru_add_drain_all+0x15/0x18
May 13 18:47:18 vpc5 kernel: [ 1079.836200]  [<ffffffff810f7aa9>] SyS_mlockall+0x48/0x11d
May 13 18:47:18 vpc5 kernel: [ 1079.836202]  [<ffffffff814be482>] tracesys+0xdd/0xe2
May 13 18:47:18 vpc5 kernel: [ 1079.836203] no locks held by shutdown_watche/5964.
May 13 18:47:18 vpc5 kernel: [ 1079.836212] INFO: task j1939_vehicle_m:5965 blocked for more than 120 seconds.
May 13 18:47:18 vpc5 kernel: [ 1079.836312] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 18:47:18 vpc5 kernel: [ 1079.836388] j1939_vehicle_m D 0000000000000002     0  5965   2604 0x00000000
May 13 18:47:18 vpc5 kernel: [ 1079.836394]  ffff8803c14f9ce8 0000000000000002 ffff8803c14f9ce8 ffff8803f8f9d780
May 13 18:47:18 vpc5 kernel: [ 1079.836396]  ffff8803f63b4600 ffff8803c14f9fd8 0000000000063800 ffff8803c14f9fd8
May 13 18:47:18 vpc5 kernel: [ 1079.836398]  0000000000063800 ffff8803f63b4600 ffff880300000001 ffff8803f63b4600
May 13 18:47:18 vpc5 kernel: [ 1079.836402] Call Trace:
May 13 18:47:18 vpc5 kernel: [ 1079.836407]  [<ffffffff814b7711>] schedule+0x75/0x87
May 13 18:47:18 vpc5 kernel: [ 1079.836410]  [<ffffffff814b6790>] schedule_timeout+0x37/0xf9
May 13 18:47:18 vpc5 kernel: [ 1079.836412]  [<ffffffff814b6f25>] ? __wait_for_common+0x2a/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.836415]  [<ffffffff814b6f78>] __wait_for_common+0x7d/0xda
May 13 18:47:18 vpc5 kernel: [ 1079.836417]  [<ffffffff814b6759>] ? console_conditional_schedule+0x19/0x19
May 13 18:47:18 vpc5 kernel: [ 1079.836420]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836423]  [<ffffffff814b6ff9>] wait_for_completion+0x24/0x26
May 13 18:47:18 vpc5 kernel: [ 1079.836425]  [<ffffffff8104dd2d>] flush_work+0x1e1/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836427]  [<ffffffff8104dce4>] ? flush_work+0x198/0x209
May 13 18:47:18 vpc5 kernel: [ 1079.836431]  [<ffffffff8104c4b4>] ? create_and_start_worker+0x6e/0x6e
May 13 18:47:18 vpc5 kernel: [ 1079.836434]  [<ffffffff810dfe68>] ? __pagevec_release+0x2c/0x2c
May 13 18:47:18 vpc5 kernel: [ 1079.836437]  [<ffffffff8104ff67>] schedule_on_each_cpu+0xca/0x104
May 13 18:47:18 vpc5 kernel: [ 1079.836439]  [<ffffffff810dfe8d>] lru_add_drain_all+0x15/0x18
May 13 18:47:18 vpc5 kernel: [ 1079.836441]  [<ffffffff810f7aa9>] SyS_mlockall+0x48/0x11d
May 13 18:47:18 vpc5 kernel: [ 1079.836444]  [<ffffffff814be482>] tracesys+0xdd/0xe2
May 13 18:47:18 vpc5 kernel: [ 1079.836446] no locks held by j1939_vehicle_m/5965.

[-- Attachment #3: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2014-07-08 16:09 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-07  8:48 Filesystem lockup with CONFIG_PREEMPT_RT Jan de Kruyf
2014-07-07 13:00 ` Thomas Gleixner
2014-07-07 16:23 ` Austin Schuh
2014-07-08  8:03   ` Jan de Kruyf
2014-07-08 16:09     ` Austin Schuh
  -- strict thread matches above, loose matches on Subject: below --
2014-07-05 19:30 Jan de Kruyf
2014-05-21 19:30 John Blackwood
2014-05-21 19:30 ` John Blackwood
2014-05-21 21:59 ` Austin Schuh
2014-05-21 21:59   ` Austin Schuh
2014-07-05 20:36 ` Thomas Gleixner
2014-07-05 20:36   ` Thomas Gleixner
2014-05-14  2:29 Austin Schuh
2014-05-14  2:29 ` Austin Schuh
2014-05-21  6:23 ` Austin Schuh
2014-05-21  6:23   ` Austin Schuh
2014-05-21  7:33   ` Richard Weinberger
2014-05-21  7:33     ` Richard Weinberger
2014-06-26 19:50     ` Austin Schuh
2014-06-26 22:35       ` Thomas Gleixner
2014-06-27  0:07         ` Austin Schuh
2014-06-27  3:22           ` Mike Galbraith
2014-06-27 12:57           ` Mike Galbraith
2014-06-27 14:01             ` Steven Rostedt
2014-06-27 17:34               ` Mike Galbraith
2014-06-27 17:54                 ` Steven Rostedt
2014-06-27 18:07                   ` Mike Galbraith
2014-06-27 18:19                     ` Steven Rostedt
2014-06-27 19:11                       ` Mike Galbraith
2014-06-28  1:18                       ` Austin Schuh
2014-06-28  3:32                         ` Mike Galbraith
2014-06-28  6:20                           ` Austin Schuh
2014-06-28  7:11                             ` Mike Galbraith
2014-06-27 14:24           ` Thomas Gleixner
2014-06-28  4:51             ` Mike Galbraith
2014-07-01  0:12             ` Austin Schuh
2014-07-01  0:53               ` Austin Schuh
2014-07-05 20:26                 ` Thomas Gleixner
2014-07-06  4:55                   ` Austin Schuh
2014-07-01  3:01             ` Austin Schuh
2014-07-01 19:32               ` Austin Schuh
2014-07-03 23:08                 ` Austin Schuh
2014-07-04  4:42                   ` Mike Galbraith
