All of lore.kernel.org
 help / color / mirror / Atom feed
* task blocked for more than 120 seconds
@ 2012-04-18 15:11 Josef 'Jeff' Sipek
  2012-04-18 18:28 ` Ben Myers
  2012-04-18 23:48 ` Dave Chinner
  0 siblings, 2 replies; 21+ messages in thread
From: Josef 'Jeff' Sipek @ 2012-04-18 15:11 UTC (permalink / raw)
  To: xfs

Greetings!  I have a file server that get a pretty nasty load (about 15
million files created every day).  After some time, I noticed that the load
average spiked up from the usual 30 to about 180.  dmesg revealed:

[434042.318401] INFO: task php:2185 blocked for more than 120 seconds.
[434042.318403] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[434042.318405] php             D 000000010675d6cd     0  2185  27306 0x00000000
[434042.318408]  ffff88008d735a48 0000000000000086 ffff88008d735938 ffffffff00000000
[434042.318412]  ffff88008d734010 ffff88000e28e340 0000000000012000 ffff88008d735fd8
[434042.318416]  ffff88008d735fd8 0000000000012000 ffff8807ef9966c0 ffff88000e28e340
[434042.318419] Call Trace:
[434042.318442]  [<ffffffffa0087a9b>] ? xfs_trans_brelse+0xee/0xf7 [xfs]
[434042.318464]  [<ffffffffa00689de>] ? xfs_da_brelse+0x71/0x96 [xfs]
[434042.318485]  [<ffffffffa006df10>] ? xfs_dir2_leaf_lookup_int+0x211/0x225 [xfs]
[434042.318489]  [<ffffffff8141481e>] schedule+0x55/0x57
[434042.318512]  [<ffffffffa0083de2>] xlog_reserveq_wait+0x115/0x1c0 [xfs]
[434042.318515]  [<ffffffff810381f1>] ? try_to_wake_up+0x23d/0x23d
[434042.318539]  [<ffffffffa0083f45>] xlog_grant_log_space+0xb8/0x1be [xfs]
[434042.318562]  [<ffffffffa0084164>] xfs_log_reserve+0x119/0x133 [xfs]
[434042.318585]  [<ffffffffa0080cf1>] xfs_trans_reserve+0xca/0x199 [xfs]
[434042.318605]  [<ffffffffa00500dc>] xfs_create+0x18d/0x467 [xfs]
[434042.318623]  [<ffffffffa00485be>] xfs_vn_mknod+0xa0/0xf9 [xfs]
[434042.318640]  [<ffffffffa0048632>] xfs_vn_create+0xb/0xd [xfs]
[434042.318644]  [<ffffffff810f0c5d>] vfs_create+0x6e/0x9e
[434042.318647]  [<ffffffff810f1c5e>] do_last+0x302/0x642
[434042.318651]  [<ffffffff810f2068>] path_openat+0xca/0x344
[434042.318654]  [<ffffffff810f23d1>] do_filp_open+0x38/0x87
[434042.318658]  [<ffffffff810fb22e>] ? alloc_fd+0x76/0x11e
[434042.318661]  [<ffffffff810e40b1>] do_sys_open+0x10b/0x1a4
[434042.318664]  [<ffffffff810e4173>] sys_open+0x1b/0x1d

It makes sense that'd the load average would spike up if some major lock got
held longer than it should have been.

The box has 32GB RAM, 6 cores, and it's running 3.2.2.

I've looked at the commits in the stable tree since 3.2.2 was tagged, and I
do see a couple of useful commits so I'll try to get the kernel updated
anyway but I don't quite see any of those fixes addressing this "hang".

Thanks,

Jeff.

-- 
Research, n.:
  Consider Columbus:
    He didn't know where he was going.
    When he got there he didn't know where he was.
    When he got back he didn't know where he had been.
    And he did it all on someone else's money.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread
* task blocked for more than 120 seconds
@ 2010-11-04 15:58 Sergey Senozhatsky
  2010-11-04 16:19 ` Markus Trippelsdorf
  0 siblings, 1 reply; 21+ messages in thread
From: Sergey Senozhatsky @ 2010-11-04 15:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: KOSAKI Motohiro, Roland McGrath, David Howells, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4855 bytes --]

Hello,
Got the following traces:

[42001.449295] INFO: task openbox:17761 blocked for more than 120 seconds.
[42001.449303] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[42001.449311] openbox       D 0000000000000003     0 17761   3723 0x00000000
[42001.449321]  ffff88004ce8dc38 0000000000000046 0000000000000000 ffff88004ce8c000
[42001.449333]  ffff88004ce8dfd8 0000000000012040 ffff880133010000 ffff88004ce8dfd8
[42001.449343]  0000000000012040 0000000000012040 ffff88004ce8dfd8 0000000000012040
[42001.449354] Call Trace:
[42001.449369]  [<ffffffff814212ff>] schedule_timeout+0x38/0x220
[42001.449381]  [<ffffffff81066f21>] ? mark_held_locks+0x50/0x72
[42001.449389]  [<ffffffff81423dae>] ? _raw_spin_unlock_irq+0x2b/0x59
[42001.449398]  [<ffffffff81036070>] ? get_parent_ip+0x11/0x41
[42001.449406]  [<ffffffff81036132>] ? sub_preempt_count+0x92/0xa5
[42001.449413]  [<ffffffff8142057e>] wait_for_common+0xca/0x144
[42001.449421]  [<ffffffff81038b13>] ? default_wake_function+0x0/0xf
[42001.449429]  [<ffffffff81423e4b>] ? _raw_spin_unlock_irqrestore+0x6f/0x74
[42001.449436]  [<ffffffff81420692>] wait_for_completion+0x18/0x1a
[42001.449445]  [<ffffffff81084808>] stop_one_cpu+0x8c/0xba
[42001.449453]  [<ffffffff810355cb>] ? migration_cpu_stop+0x0/0x3d
[42001.449461]  [<ffffffff81035aef>] sched_exec+0xc3/0xdc
[42001.449470]  [<ffffffff810e7dad>] do_execve+0xaa/0x267
[42001.449479]  [<ffffffff8100a58b>] sys_execve+0x3e/0x55
[42001.449488]  [<ffffffff8100265c>] stub_execve+0x6c/0xc0
[42001.449494] 1 lock held by openbox/17761:
[42001.449498]  #0:  (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff810e7b1d>] prepare_bprm_creds+0x28/0x64

[47761.448699] INFO: task make:29123 blocked for more than 120 seconds.
[47761.448704] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[47761.448710] make          D 0000000000000001     0 29123  24208 0x00000004
[47761.448719]  ffff880115dadcf0 0000000000000046 ffff880100000000 ffff880115dac000
[47761.448731]  ffff880115dadfd8 0000000000012040 ffff880133290000 ffff880115dadfd8
[47761.448741]  0000000000012040 0000000000012040 ffff880115dadfd8 0000000000012040
[47761.448752] Call Trace:
[47761.448766]  [<ffffffff814212ff>] schedule_timeout+0x38/0x220
[47761.448774]  [<ffffffff814204da>] ? wait_for_common+0x26/0x144
[47761.448784]  [<ffffffff81207b53>] ? do_raw_spin_lock+0x6b/0x122
[47761.448792]  [<ffffffff81423dae>] ? _raw_spin_unlock_irq+0x2b/0x59
[47761.448801]  [<ffffffff81036070>] ? get_parent_ip+0x11/0x41
[47761.448808]  [<ffffffff81036132>] ? sub_preempt_count+0x92/0xa5
[47761.448816]  [<ffffffff8142057e>] wait_for_common+0xca/0x144
[47761.448823]  [<ffffffff81038b13>] ? default_wake_function+0x0/0xf
[47761.448831]  [<ffffffff81036132>] ? sub_preempt_count+0x92/0xa5
[47761.448838]  [<ffffffff81420692>] wait_for_completion+0x18/0x1a
[47761.448845]  [<ffffffff8103cec0>] do_fork+0x2b0/0x32e
[47761.448856]  [<ffffffff810021ba>] ? sysret_check+0x2e/0x69
[47761.448864]  [<ffffffff8100a526>] sys_vfork+0x20/0x22
[47761.448871]  [<ffffffff81002563>] stub_vfork+0x13/0x20
[47761.448878]  [<ffffffff81002182>] ? system_call_fastpath+0x16/0x1b
[47761.448884] no locks held by make/29123.
[47761.448891] INFO: task make:31380 blocked for more than 120 seconds.
[47761.448895] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[47761.448899] make          D 0000000000000000     0 31380  29123 0x00000000
[47761.448908]  ffff880157f01c38 0000000000000046 0000000000000000 ffff880157f00000
[47761.448918]  ffff880157f01fd8 0000000000012040 ffff8800773abea0 ffff880157f01fd8
[47761.448929]  0000000000012040 0000000000012040 ffff880157f01fd8 0000000000012040
[47761.448939] Call Trace:
[47761.448947]  [<ffffffff814212ff>] schedule_timeout+0x38/0x220
[47761.448957]  [<ffffffff81066f21>] ? mark_held_locks+0x50/0x72
[47761.448965]  [<ffffffff81423dae>] ? _raw_spin_unlock_irq+0x2b/0x59
[47761.448972]  [<ffffffff81036070>] ? get_parent_ip+0x11/0x41
[47761.448979]  [<ffffffff81036132>] ? sub_preempt_count+0x92/0xa5
[47761.448986]  [<ffffffff8142057e>] wait_for_common+0xca/0x144
[47761.448994]  [<ffffffff81038b13>] ? default_wake_function+0x0/0xf
[47761.449001]  [<ffffffff81423e4b>] ? _raw_spin_unlock_irqrestore+0x6f/0x74
[47761.449008]  [<ffffffff81420692>] wait_for_completion+0x18/0x1a
[47761.449017]  [<ffffffff81084808>] stop_one_cpu+0x8c/0xba
[47761.449025]  [<ffffffff810355cb>] ? migration_cpu_stop+0x0/0x3d
[47761.449033]  [<ffffffff81035aef>] sched_exec+0xc3/0xdc
[47761.449041]  [<ffffffff810e7dad>] do_execve+0xaa/0x267
[47761.449049]  [<ffffffff8100a58b>] sys_execve+0x3e/0x55
[47761.449056]  [<ffffffff8100265c>] stub_execve+0x6c/0xc0
[47761.449062] 1 lock held by make/31380:
[47761.449065]  #0:  (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff810e7b1d>] prepare_bprm_creds+0x28/0x64



	Sergey

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread
* task blocked for more than 120 seconds.
@ 2010-10-07 23:18 Shawn Bohrer
  0 siblings, 0 replies; 21+ messages in thread
From: Shawn Bohrer @ 2010-10-07 23:18 UTC (permalink / raw)
  To: linux-nfs; +Cc: linux-kernel

Hello,

I've got some machines with nfs mounted home directories and recently
had the machines lock up with the following output below from the
kernel logs.  This machine was running 2.6.32.21 at the time, and
was locked up for at least 20 minutes before we rebooted.  We've had
this happen to us twice now, so while I haven't tried I believe we can
reproduce it in about a day.

Does anyone have any insight on what may be happening here or any
suggestions for debugging?

Thanks,
Shawn

INFO: task java:24970 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
java          D ffff88189c50e198     0 24970  24239 0x00000000
 ffff88205b7339c8 0000000000000086 0000000000000000 000000005ae2d09c
 0000000000000286 0000000000000030 ffff88205b733988 000000010174a797
 ffff88205acd4a80 ffff88205b733fd8 000000000000e198 ffff88205acd4a80
Call Trace:
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff81419df3>] io_schedule+0x73/0xc0
 [<ffffffffa03fe02e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
 [<ffffffff8141a61f>] __wait_on_bit+0x5f/0x90
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff8141a6c8>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff8107fe50>] ? wake_bit_function+0x0/0x40
 [<ffffffff810e11ce>] ? find_get_page+0x1e/0xa0
 [<ffffffffa03fe00f>] nfs_wait_on_request+0x2f/0x40 [nfs]
 [<ffffffffa0403d77>] nfs_updatepage+0x107/0x4f0 [nfs]
 [<ffffffffa03f3a6a>] nfs_write_end+0x5a/0x2c0 [nfs]
 [<ffffffff8100c9ce>] ? common_interrupt+0xe/0x13
 [<ffffffff810e0bb4>] generic_file_buffered_write+0x174/0x2a0
 [<ffffffff810e25f0>] __generic_file_aio_write+0x240/0x470
 [<ffffffff81047774>] ? __enqueue_entity+0x84/0x90
 [<ffffffff8104f7d5>] ? enqueue_task_fair+0x45/0x90
 [<ffffffff810e288f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa03f371a>] nfs_file_write+0xda/0x1e0 [nfs]
 [<ffffffff8112fb7a>] do_sync_write+0xfa/0x140
 [<ffffffff8107fe10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811bd246>] ? security_file_permission+0x16/0x20
 [<ffffffff8112fe78>] vfs_write+0xb8/0x1a0
 [<ffffffff81130711>] sys_write+0x51/0x90
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b
INFO: task java:24982 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
java          D ffff8800283ee198     0 24982  24239 0x00000000
 ffff88104f6339c8 0000000000000086 0000000000000000 00000000ffffff10
 0000000000000286 0000000000000030 0000000000000282 000000010174a681
 ffff88105a592580 ffff88104f633fd8 000000000000e198 ffff88105a592580
Call Trace:
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff81419df3>] io_schedule+0x73/0xc0
 [<ffffffffa03fe02e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
 [<ffffffff8141a61f>] __wait_on_bit+0x5f/0x90
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff8141a6c8>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff8107fe50>] ? wake_bit_function+0x0/0x40
 [<ffffffff810e11ce>] ? find_get_page+0x1e/0xa0
 [<ffffffffa03fe00f>] nfs_wait_on_request+0x2f/0x40 [nfs]
 [<ffffffffa0403d77>] nfs_updatepage+0x107/0x4f0 [nfs]
 [<ffffffffa03f3a6a>] nfs_write_end+0x5a/0x2c0 [nfs]
 [<ffffffff810e0bb4>] generic_file_buffered_write+0x174/0x2a0
 [<ffffffff810e25f0>] __generic_file_aio_write+0x240/0x470
 [<ffffffff810e288f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa03f371a>] nfs_file_write+0xda/0x1e0 [nfs]
 [<ffffffff8112fb7a>] do_sync_write+0xfa/0x140
 [<ffffffff810ffef9>] ? do_wp_page+0x109/0x7c0
 [<ffffffff8107fe10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811bd246>] ? security_file_permission+0x16/0x20
 [<ffffffff8112fe78>] vfs_write+0xb8/0x1a0
 [<ffffffff81130711>] sys_write+0x51/0x90
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b
INFO: task hbitimestamp:23184 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
hbitimestamp  D ffff88109c6ae198     0 23184  23175 0x00000000
 ffff880855d179c8 0000000000000086 0000000000000000 000000005bb2d600
 0000000000000282 0000000000000030 0000000000000282 00000001017445ab
 ffff8808577bc640 ffff880855d17fd8 000000000000e198 ffff8808577bc640
Call Trace:
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff81419df3>] io_schedule+0x73/0xc0
 [<ffffffffa03fe02e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
 [<ffffffff8141a61f>] __wait_on_bit+0x5f/0x90
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff8141a6c8>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff8107fe50>] ? wake_bit_function+0x0/0x40
 [<ffffffff810e11ce>] ? find_get_page+0x1e/0xa0
 [<ffffffffa03fe00f>] nfs_wait_on_request+0x2f/0x40 [nfs]
 [<ffffffffa0403d77>] nfs_updatepage+0x107/0x4f0 [nfs]
 [<ffffffffa03f3a6a>] nfs_write_end+0x5a/0x2c0 [nfs]
 [<ffffffff810e0bb4>] generic_file_buffered_write+0x174/0x2a0
 [<ffffffff810e25f0>] __generic_file_aio_write+0x240/0x470
 [<ffffffff810e288f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa03f371a>] nfs_file_write+0xda/0x1e0 [nfs]
 [<ffffffff8112fb7a>] do_sync_write+0xfa/0x140
 [<ffffffff8107fe10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8100cb4e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff811bd246>] ? security_file_permission+0x16/0x20
 [<ffffffff8112fe78>] vfs_write+0xb8/0x1a0
 [<ffffffff81130711>] sys_write+0x51/0x90
 [<ffffffff8141c74e>] ? do_device_not_available+0xe/0x10
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b
INFO: task java:24970 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
java          D ffff88189c50e198     0 24970  24239 0x00000000
 ffff88205b7339c8 0000000000000086 0000000000000000 000000005ae2d09c
 0000000000000286 0000000000000030 ffff88205b733988 000000010174a797
 ffff88205acd4a80 ffff88205b733fd8 000000000000e198 ffff88205acd4a80
Call Trace:
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff81419df3>] io_schedule+0x73/0xc0
 [<ffffffffa03fe02e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
 [<ffffffff8141a61f>] __wait_on_bit+0x5f/0x90
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff8141a6c8>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff8107fe50>] ? wake_bit_function+0x0/0x40
 [<ffffffff810e11ce>] ? find_get_page+0x1e/0xa0
 [<ffffffffa03fe00f>] nfs_wait_on_request+0x2f/0x40 [nfs]
 [<ffffffffa0403d77>] nfs_updatepage+0x107/0x4f0 [nfs]
 [<ffffffffa03f3a6a>] nfs_write_end+0x5a/0x2c0 [nfs]
 [<ffffffff8100c9ce>] ? common_interrupt+0xe/0x13
 [<ffffffff810e0bb4>] generic_file_buffered_write+0x174/0x2a0
 [<ffffffff810e25f0>] __generic_file_aio_write+0x240/0x470
 [<ffffffff81047774>] ? __enqueue_entity+0x84/0x90
 [<ffffffff8104f7d5>] ? enqueue_task_fair+0x45/0x90
 [<ffffffff810e288f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa03f371a>] nfs_file_write+0xda/0x1e0 [nfs]
 [<ffffffff8112fb7a>] do_sync_write+0xfa/0x140
 [<ffffffff8107fe10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811bd246>] ? security_file_permission+0x16/0x20
 [<ffffffff8112fe78>] vfs_write+0xb8/0x1a0
 [<ffffffff81130711>] sys_write+0x51/0x90
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b
INFO: task java:24982 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
java          D ffff8800283ee198     0 24982  24239 0x00000000
 ffff88104f6339c8 0000000000000086 0000000000000000 00000000ffffff10
 0000000000000286 0000000000000030 0000000000000282 000000010174a681
 ffff88105a592580 ffff88104f633fd8 000000000000e198 ffff88105a592580
Call Trace:
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff81419df3>] io_schedule+0x73/0xc0
 [<ffffffffa03fe02e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
 [<ffffffff8141a61f>] __wait_on_bit+0x5f/0x90
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff8141a6c8>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff8107fe50>] ? wake_bit_function+0x0/0x40
 [<ffffffff810e11ce>] ? find_get_page+0x1e/0xa0
 [<ffffffffa03fe00f>] nfs_wait_on_request+0x2f/0x40 [nfs]
 [<ffffffffa0403d77>] nfs_updatepage+0x107/0x4f0 [nfs]
 [<ffffffffa03f3a6a>] nfs_write_end+0x5a/0x2c0 [nfs]
 [<ffffffff810e0bb4>] generic_file_buffered_write+0x174/0x2a0
 [<ffffffff810e25f0>] __generic_file_aio_write+0x240/0x470
 [<ffffffff810e288f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa03f371a>] nfs_file_write+0xda/0x1e0 [nfs]
 [<ffffffff8112fb7a>] do_sync_write+0xfa/0x140
 [<ffffffff810ffef9>] ? do_wp_page+0x109/0x7c0
 [<ffffffff8107fe10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811bd246>] ? security_file_permission+0x16/0x20
 [<ffffffff8112fe78>] vfs_write+0xb8/0x1a0
 [<ffffffff81130711>] sys_write+0x51/0x90
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b
INFO: task tail:22115 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
tail          D ffff88189c4ee198     0 22115  14494 0x00000000
 ffff88085b95bbd8 0000000000000086 0000000000000000 ffffffffa035d896
 ffff88205be57cc8 ffff88205be57cc8 ffff88205be57cc8 000000010174d70e
 ffff8808564fe400 ffff88085b95bfd8 000000000000e198 ffff8808564fe400
Call Trace:
 [<ffffffffa035d896>] ? __rpc_execute+0xd6/0x2a0 [sunrpc]
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff81419df3>] io_schedule+0x73/0xc0
 [<ffffffffa03fe02e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
 [<ffffffff8141a61f>] __wait_on_bit+0x5f/0x90
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff8141a6c8>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff8107fe50>] ? wake_bit_function+0x0/0x40
 [<ffffffffa03fe00f>] nfs_wait_on_request+0x2f/0x40 [nfs]
 [<ffffffffa04037d2>] nfs_sync_mapping_wait+0x122/0x260 [nfs]
 [<ffffffffa0403aa9>] nfs_write_mapping+0x79/0xb0 [nfs]
 [<ffffffffa0403afa>] nfs_wb_nocommit+0x1a/0x20 [nfs]
 [<ffffffffa03f60f8>] nfs_getattr+0x128/0x140 [nfs]
 [<ffffffff811351d1>] vfs_getattr+0x51/0x80
 [<ffffffff8113548f>] vfs_fstat+0x3f/0x60
 [<ffffffff811354d4>] sys_newfstat+0x24/0x40
 [<ffffffff810841a4>] ? sys_nanosleep+0x74/0x80
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b
INFO: task hbitimestamp:23184 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
hbitimestamp  D ffff88109c6ae198     0 23184  23175 0x00000000
 ffff880855d179c8 0000000000000086 0000000000000000 000000005bb2d600
 0000000000000282 0000000000000030 0000000000000282 00000001017445ab
 ffff8808577bc640 ffff880855d17fd8 000000000000e198 ffff8808577bc640
Call Trace:
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff81419df3>] io_schedule+0x73/0xc0
 [<ffffffffa03fe02e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
 [<ffffffff8141a61f>] __wait_on_bit+0x5f/0x90
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff8141a6c8>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff8107fe50>] ? wake_bit_function+0x0/0x40
 [<ffffffff810e11ce>] ? find_get_page+0x1e/0xa0
 [<ffffffffa03fe00f>] nfs_wait_on_request+0x2f/0x40 [nfs]
 [<ffffffffa0403d77>] nfs_updatepage+0x107/0x4f0 [nfs]
 [<ffffffffa03f3a6a>] nfs_write_end+0x5a/0x2c0 [nfs]
 [<ffffffff810e0bb4>] generic_file_buffered_write+0x174/0x2a0
 [<ffffffff810e25f0>] __generic_file_aio_write+0x240/0x470
 [<ffffffff810e288f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa03f371a>] nfs_file_write+0xda/0x1e0 [nfs]
 [<ffffffff8112fb7a>] do_sync_write+0xfa/0x140
 [<ffffffff8107fe10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8100cb4e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff811bd246>] ? security_file_permission+0x16/0x20
 [<ffffffff8112fe78>] vfs_write+0xb8/0x1a0
 [<ffffffff81130711>] sys_write+0x51/0x90
 [<ffffffff8141c74e>] ? do_device_not_available+0xe/0x10
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b
INFO: task java:24970 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
java          D ffff88189c50e198     0 24970  24239 0x00000000
 ffff88205b7339c8 0000000000000086 0000000000000000 000000005ae2d09c
 0000000000000286 0000000000000030 ffff88205b733988 000000010174a797
 ffff88205acd4a80 ffff88205b733fd8 000000000000e198 ffff88205acd4a80
Call Trace:
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff81419df3>] io_schedule+0x73/0xc0
 [<ffffffffa03fe02e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
 [<ffffffff8141a61f>] __wait_on_bit+0x5f/0x90
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff8141a6c8>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff8107fe50>] ? wake_bit_function+0x0/0x40
 [<ffffffff810e11ce>] ? find_get_page+0x1e/0xa0
 [<ffffffffa03fe00f>] nfs_wait_on_request+0x2f/0x40 [nfs]
 [<ffffffffa0403d77>] nfs_updatepage+0x107/0x4f0 [nfs]
 [<ffffffffa03f3a6a>] nfs_write_end+0x5a/0x2c0 [nfs]
 [<ffffffff8100c9ce>] ? common_interrupt+0xe/0x13
 [<ffffffff810e0bb4>] generic_file_buffered_write+0x174/0x2a0
 [<ffffffff810e25f0>] __generic_file_aio_write+0x240/0x470
 [<ffffffff81047774>] ? __enqueue_entity+0x84/0x90
 [<ffffffff8104f7d5>] ? enqueue_task_fair+0x45/0x90
 [<ffffffff810e288f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa03f371a>] nfs_file_write+0xda/0x1e0 [nfs]
 [<ffffffff8112fb7a>] do_sync_write+0xfa/0x140
 [<ffffffff8107fe10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811bd246>] ? security_file_permission+0x16/0x20
 [<ffffffff8112fe78>] vfs_write+0xb8/0x1a0
 [<ffffffff81130711>] sys_write+0x51/0x90
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b
INFO: task java:24982 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
java          D ffff8800283ee198     0 24982  24239 0x00000000
 ffff88104f6339c8 0000000000000086 0000000000000000 00000000ffffff10
 0000000000000286 0000000000000030 0000000000000282 000000010174a681
 ffff88105a592580 ffff88104f633fd8 000000000000e198 ffff88105a592580
Call Trace:
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff81419df3>] io_schedule+0x73/0xc0
 [<ffffffffa03fe02e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
 [<ffffffff8141a61f>] __wait_on_bit+0x5f/0x90
 [<ffffffffa03fe020>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
 [<ffffffff8141a6c8>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff8107fe50>] ? wake_bit_function+0x0/0x40
 [<ffffffff810e11ce>] ? find_get_page+0x1e/0xa0
 [<ffffffffa03fe00f>] nfs_wait_on_request+0x2f/0x40 [nfs]
 [<ffffffffa0403d77>] nfs_updatepage+0x107/0x4f0 [nfs]
 [<ffffffffa03f3a6a>] nfs_write_end+0x5a/0x2c0 [nfs]
 [<ffffffff810e0bb4>] generic_file_buffered_write+0x174/0x2a0
 [<ffffffff810e25f0>] __generic_file_aio_write+0x240/0x470
 [<ffffffff810e288f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa03f371a>] nfs_file_write+0xda/0x1e0 [nfs]
 [<ffffffff8112fb7a>] do_sync_write+0xfa/0x140
 [<ffffffff810ffef9>] ? do_wp_page+0x109/0x7c0
 [<ffffffff8107fe10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811bd246>] ? security_file_permission+0x16/0x20
 [<ffffffff8112fe78>] vfs_write+0xb8/0x1a0
 [<ffffffff81130711>] sys_write+0x51/0x90
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b
INFO: task hbitimestamp:24427 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
hbitimestamp  D ffff88089c40e198     0 24427  24317 0x00000000
 ffff88205a0b3c88 0000000000000082 0000000000000000 0000000000000000
 ffff88205a0b3ca8 ffffffff814196a8 0000000000000000 00000001017777de
 ffff88205a6ec8c0 ffff88205a0b3fd8 000000000000e198 ffff88205a6ec8c0
Call Trace:
 [<ffffffff814196a8>] ? thread_return+0x4e/0x726
 [<ffffffff8141aac1>] __mutex_lock_slowpath+0xf1/0x170
 [<ffffffff8141a9ab>] mutex_lock+0x2b/0x50
 [<ffffffff810e2879>] generic_file_aio_write+0x59/0xe0
 [<ffffffffa03f371a>] nfs_file_write+0xda/0x1e0 [nfs]
 [<ffffffff8112fb7a>] do_sync_write+0xfa/0x140
 [<ffffffff810fcb91>] ? __do_fault+0x3e1/0x4c0
 [<ffffffff8107fe10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8116a68f>] ? inotify_inode_queue_event+0x2f/0x120
 [<ffffffff811bd246>] ? security_file_permission+0x16/0x20
 [<ffffffff8112fe78>] vfs_write+0xb8/0x1a0
 [<ffffffff81130711>] sys_write+0x51/0x90
 [<ffffffff8141c74e>] ? do_device_not_available+0xe/0x10
 [<ffffffff8100c11b>] system_call_fastpath+0x16/0x1b

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Task blocked for more than 120 seconds
@ 2008-03-24 10:21 Samuel Tardieu
  2008-03-25  3:50 ` Neil Brown
  0 siblings, 1 reply; 21+ messages in thread
From: Samuel Tardieu @ 2008-03-24 10:21 UTC (permalink / raw)
  To: linux-raid

On my laptop, doing heavy C++ compilations in parallel with -j3 (this
is a dual core) often generates the following trace:

INFO: task g++:25119 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
g++           D c03916c0     0 25119  25118
       ecfef200 00200086 c038ef18 c03916c0 c03916c0 f77b2230 f77b2378 c17fa6c0 
       00000000 f89817bc f7486280 000000ff f7a1eb20 00000000 00000000 00000000 
       c17fa6c0 00000000 d1401e9c c17f0408 c02a2aa1 d1401e94 c0147100 c02a2c93 
Call Trace:
 [<f89817bc>] dm_table_unplug_all+0x1e/0x2e [dm_mod]
 [<c02a2aa1>] io_schedule+0x1b/0x24
 [<c0147100>] sync_page+0x33/0x36
 [<c02a2c93>] __wait_on_bit+0x33/0x58
 [<c01470cd>] sync_page+0x0/0x36
 [<c01472fd>] wait_on_page_bit+0x59/0x60
 [<c012cf7b>] wake_bit_function+0x0/0x3c
 [<c014e20d>] truncate_inode_pages_range+0x238/0x29f
 [<c014e27d>] truncate_inode_pages+0x9/0xc
 [<f8c67187>] ext2_delete_inode+0x12/0x6e [ext2]
 [<f8c67175>] ext2_delete_inode+0x0/0x6e [ext2]
 [<c0170ea5>] generic_delete_inode+0x8f/0xf3
 [<c0170819>] iput+0x60/0x62
 [<c0169aa5>] do_unlinkat+0xb7/0xf9
 [<c0113a3e>] do_page_fault+0x1fa/0x4dc
 [<c0104822>] sysenter_past_esp+0x5f/0x85
 =======================

This is with 2.6.25-rc6 (SMP) and has been present, as far as I can
remember, since the beginning of the 2.6.25-rc series. It is not
always reproducible, but the trace is always the same.

My filesystem is stored on a ext3 (rw,noatime) dm_crypt'd partition
leaving in a LVM volume.

% lsmod | grep dm | grep -v ' 0 *$'
dm_crypt               14340  1 
crypto_blkcipher       18308  6 ecb,cbc,dm_crypt
dm_mod                 53008  26 dm_crypt,dm_mirror,dm_snapshot

Here is the code I have in dm-table.o:

00001042 <dm_table_unplug_all>:
    1042:       56                      push   %esi
    1043:       53                      push   %ebx
    1044:       8b 98 a0 00 00 00       mov    0xa0(%eax),%ebx
    104a:       8d b0 a0 00 00 00       lea    0xa0(%eax),%esi
    1050:       eb 10                   jmp    1062 <dm_table_unplug_all+0x20>
    1052:       8b 43 10                mov    0x10(%ebx),%eax
    1055:       8b 40 5c                mov    0x5c(%eax),%eax
    1058:       8b 40 34                mov    0x34(%eax),%eax
    105b:       e8 fc ff ff ff          call   105c <dm_table_unplug_all+0x1a>
    1060:       8b 1b                   mov    (%ebx),%ebx
    1062:       8b 03                   mov    (%ebx),%eax
    1064:       0f 1f 40 00             nopl   0x0(%eax)
    1068:       39 f3                   cmp    %esi,%ebx
    106a:       75 e6                   jne    1052 <dm_table_unplug_all+0x10>
    106c:       5b                      pop    %ebx
    106d:       5e                      pop    %esi
    106e:       c3                      ret    

The symbol in 105b call is, after relocation, blk_unplug.

It there anything else I can do to help debugging this?

 Sam
-- 
Samuel Tardieu -- sam@rfc1149.net -- http://www.rfc1149.net/


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-09-27 22:49 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-18 15:11 task blocked for more than 120 seconds Josef 'Jeff' Sipek
2012-04-18 18:28 ` Ben Myers
2012-04-18 23:48 ` Dave Chinner
2012-04-19 15:46   ` Josef 'Jeff' Sipek
2012-04-19 22:56     ` Dave Chinner
2012-04-20 13:58       ` Josef 'Jeff' Sipek
2012-04-21  0:29         ` Dave Chinner
2012-04-23 17:16           ` Josef 'Jeff' Sipek
2012-04-23 20:24           ` Josef 'Jeff' Sipek
2012-04-23 23:27             ` Dave Chinner
2012-04-24 15:10               ` Josef 'Jeff' Sipek
2012-09-27 12:49               ` Josef 'Jeff' Sipek
2012-09-27 22:50                 ` Dave Chinner
  -- strict thread matches above, loose matches on Subject: below --
2010-11-04 15:58 Sergey Senozhatsky
2010-11-04 16:19 ` Markus Trippelsdorf
2010-11-04 17:32   ` Peter Zijlstra
2010-11-04 18:16     ` Sergey Senozhatsky
2010-11-05 11:14     ` Sergey Senozhatsky
2010-10-07 23:18 Shawn Bohrer
2008-03-24 10:21 Task " Samuel Tardieu
2008-03-25  3:50 ` Neil Brown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.